Re: [PATCH v3 1/2] dt-bindings: can: add can-controller.yaml
On Thu, 22 Oct 2020 09:52:17 +0200, Oleksij Rempel wrote:
> For now we have only node name as common rule for all CAN controllers
>
> Signed-off-by: Oleksij Rempel
> Link: https://lore.kernel.org/r/20201016073315.16232-2-o.rem...@pengutronix.de
> Signed-off-by: Marc Kleine-Budde
> ---
>  .../bindings/net/can/can-controller.yaml | 18 ++
>  1 file changed, 18 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/net/can/can-controller.yaml
>

Reviewed-by: Rob Herring
Re: [PATCH v3 2/2] dt-bindings: can: flexcan: convert fsl,*flexcan bindings to yaml
On Thu, 22 Oct 2020 09:52:18 +0200, Oleksij Rempel wrote:
> In order to automate the verification of DT nodes convert
> fsl-flexcan.txt to fsl,flexcan.yaml
>
> Signed-off-by: Oleksij Rempel
> Link: https://lore.kernel.org/r/20201016073315.16232-3-o.rem...@pengutronix.de
> Signed-off-by: Marc Kleine-Budde
> ---
>  .../bindings/net/can/fsl,flexcan.yaml | 135 ++
>  .../bindings/net/can/fsl-flexcan.txt  |  57
>  2 files changed, 135 insertions(+), 57 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml
>  delete mode 100644 Documentation/devicetree/bindings/net/can/fsl-flexcan.txt
>

Reviewed-by: Rob Herring
Re: [PATCH 1/4] dt-bindings: add defines for i.MX8MN power domains
On Thu, 22 Oct 2020 10:08:04 -0500, Adam Ford wrote:
> The i.MX8M Nano has a similar power domain controller to that of the
> mini, but it isn't fully compatible, so it needs a separate binding
> and power domain tables.
>
> Add the bindings and tables.
>
> Signed-off-by: Adam Ford
>

Acked-by: Rob Herring
Re: [PATCH v2 08/10] dt-bindings: pwm: Add binding for RPi firmware PWM bus
On Thu, 22 Oct 2020 17:58:55 +0200, Nicolas Saenz Julienne wrote:
> The PWM bus controlling the fan in RPi's official PoE hat can only be
> controlled by the board's co-processor.
>
> Signed-off-by: Nicolas Saenz Julienne
>
> ---
> Changes since v1:
>  - Update bindings to use 2 #pwm-cells
>
>  .../arm/bcm/raspberrypi,bcm2835-firmware.yaml | 20 +++
>  .../pwm/raspberrypi,firmware-pwm.h            | 13
>  2 files changed, 33 insertions(+)
>  create mode 100644 include/dt-bindings/pwm/raspberrypi,firmware-pwm.h
>

Reviewed-by: Rob Herring
Re: [PATCH] Input: add SW_COVER_ATTACHED and SW_EXT_PEN_ATTACHED
Hi,

On Fri, Oct 30, 2020 at 10:15:52PM +0900, Jungrae Kim wrote:
> From 23aed4567e234b7e108c31abadb9f3a3f7d2 Mon Sep 17 00:00:00 2001
> From: Jungrae Kim
> Date: Fri, 30 Oct 2020 21:23:12 +0900
> Subject: [PATCH] Input: add SW_COVER_ATTACHED and SW_EXT_PEN_ATTACHED
>
> SW_COVER_ATTACHED represents the connected state of a removable cover
> of a device. Value 0 means cover was attached with device, value 1 means
> removed it.

Any reason against using SW_MACHINE_COVER? That was introduced for
Nokia N900, where you actually remove the cover to access battery/SD
card/SIM card (so there is state 0 = cover removed/open and state 1 =
cover attached/closed).

-- Sebastian

> SW_EXT_PEN_ATTACHED represents the state of the pen.
> Some device have internal pen slot. but other some device have external pen
> slot. These two cases has different use case in userspace. So need to
> separate a event. Value 0 means pen was detach on external pen slot on
> device, value 1 means pen was attached external pen slot on device.
>
> Signed-off-by: Jungrae Kim
> ---
>  include/linux/mod_devicetable.h        | 2 +-
>  include/uapi/linux/input-event-codes.h | 4 +++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
> index 5b08a473cdba..897f5a3e7721 100644
> --- a/include/linux/mod_devicetable.h
> +++ b/include/linux/mod_devicetable.h
> @@ -320,7 +320,7 @@ struct pcmcia_device_id {
>  #define INPUT_DEVICE_ID_LED_MAX		0x0f
>  #define INPUT_DEVICE_ID_SND_MAX		0x07
>  #define INPUT_DEVICE_ID_FF_MAX		0x7f
> -#define INPUT_DEVICE_ID_SW_MAX		0x10
> +#define INPUT_DEVICE_ID_SW_MAX		0x12
>  #define INPUT_DEVICE_ID_PROP_MAX	0x1f
>
>  #define INPUT_DEVICE_ID_MATCH_BUS	1
> diff --git a/include/uapi/linux/input-event-codes.h b/include/uapi/linux/input-event-codes.h
> index ee93428ced9a..a0506369de6d 100644
> --- a/include/uapi/linux/input-event-codes.h
> +++ b/include/uapi/linux/input-event-codes.h
> @@ -893,7 +893,9 @@
>  #define SW_MUTE_DEVICE		0x0e  /* set = device disabled */
>  #define SW_PEN_INSERTED		0x0f  /* set = pen inserted */
>  #define SW_MACHINE_COVER	0x10  /* set = cover closed */
> -#define SW_MAX			0x10
> +#define SW_COVER_ATTACHED	0x11  /* set = cover attached */
> +#define SW_EXT_PEN_ATTACHED	0x12  /* set = external pen attached */
> +#define SW_MAX			0x12
>  #define SW_CNT			(SW_MAX+1)
>
>  /*
> --
> 2.17.1
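For readers less familiar with how SW_* codes are consumed, here is a minimal userspace sketch of testing one switch bit in a switch-state bitmap, such as the one the EVIOCGSW ioctl fills in. The DEMO_* names and the helper are hypothetical; the numeric values are copied from the quoted patch context, and real code would take them from include/uapi/linux/input-event-codes.h instead.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical local copies of the values in the quoted patch context.
 * Real code should include <linux/input-event-codes.h> instead.
 */
#define DEMO_SW_MACHINE_COVER	0x10	/* set = cover closed */
#define DEMO_SW_MAX		0x10
#define DEMO_SW_CNT		(DEMO_SW_MAX + 1)

#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
/* Number of unsigned longs needed to hold n bits. */
#define DEMO_NLONGS(n)		(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Return 1 if switch `code` is set in the bitmap, 0 otherwise. */
static int demo_switch_is_set(const unsigned long *bits, unsigned int code)
{
	return (int)((bits[code / DEMO_BITS_PER_LONG] >>
		      (code % DEMO_BITS_PER_LONG)) & 1UL);
}
```

Because the bitmap is sized from SW_CNT, raising SW_MAX as the patch proposes also grows the number of bits userspace has to provide room for.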
Re: [PATCH v6 4/9] mm, kfence: insert KFENCE hooks for SLAB
On Fri, 30 Oct 2020 at 03:49, Jann Horn wrote:
> On Thu, Oct 29, 2020 at 2:17 PM Marco Elver wrote:
> > Inserts KFENCE hooks into the SLAB allocator.
> [...]
> > diff --git a/mm/slab.c b/mm/slab.c
> [...]
> > @@ -3416,6 +3427,11 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
> >  static __always_inline void __cache_free(struct kmem_cache *cachep, void *objp,
> >  					 unsigned long caller)
> >  {
> > +	if (kfence_free(objp)) {
> > +		kmemleak_free_recursive(objp, cachep->flags);
> > +		return;
> > +	}
>
> This looks dodgy. Normally kmemleak is told that an object is being
> freed *before* the object is actually released. I think that if this
> races really badly, we'll make kmemleak stumble over this bit in
> create_object():
>
> kmemleak_stop("Cannot insert 0x%lx into the object search tree (overlaps existing)\n",
>               ptr);

Good catch. Although extremely unlikely, let's just avoid it by moving
the freeing after.

> > +
> >  	/* Put the object into the quarantine, don't touch it for now. */
> >  	if (kasan_slab_free(cachep, objp, _RET_IP_))
> >  		return;
Re: [PATCH] sched/fair: remove the spin_lock operations
On Fri, Oct 30, 2020 at 10:46:21PM +0800 Hui Su wrote:
> Since 'ab93a4bc955b ("sched/fair: Remove
> distribute_running from CFS bandwidth")', there is
> nothing to protect between raw_spin_lock_irqsave/store()
> in do_sched_cfs_slack_timer().
>
> So remove it.
>
> Signed-off-by: Hui Su
> ---
>  kernel/sched/fair.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 290f9e38378c..5ecbf5e63198 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5105,9 +5105,6 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
>  		return;
>
>  	distribute_cfs_runtime(cfs_b);
> -
> -	raw_spin_lock_irqsave(&cfs_b->lock, flags);
> -	raw_spin_unlock_irqrestore(&cfs_b->lock, flags);
>  }
>
>  /*
> --
> 2.29.0
>

Nice :)

Reviewed-by: Phil Auld
Re: [PATCH v2 6/7] media: dt-bindings: media: qcom,camss: Add bindings for SDM660 camss
On Thu, 22 Oct 2020 19:47:05 +0200, khol...@gmail.com wrote:
> From: AngeloGioacchino Del Regno
>
> Add bindings for qcom,sdm660-camss in order to support the camera
> subsystem on SDM630/660 and SDA variants.
>
> Signed-off-by: AngeloGioacchino Del Regno
> Reviewed-by: Robert Foss
> ---
>  Documentation/devicetree/bindings/media/qcom,camss.txt | 7 +++
>  1 file changed, 7 insertions(+)
>

Acked-by: Rob Herring
Re: [Y2038][time namespaces] Question regarding CLOCK_REALTIME support plans in Linux time namespaces
Hi Thomas,

> Lukasz,
>
> On Fri, Oct 30 2020 at 11:02, Lukasz Majewski wrote:
> > I do have a question regarding the Linux time namespaces in respect
> > of adding support for virtualizing the CLOCK_REALTIME.
> >
> > According to patch description [1] and time_namespaces documentation
> > [2] the CLOCK_REALTIME is not supported (for now?) to avoid
> > complexity and overhead in the kernel.
> >
> > Is there any plan to add support for it in a near future?
>
> Not really. Just having an offset on clock realtime would be incorrect
> in a number of ways. Doing it correct is a massive trainwreck.
>
> For a debug aid, which is what you are looking for, the correctness
> would not really matter, but providing that is a really slippery
> slope.
>
> If at all we could hide it under a debug option which depends on
> CONFIG_BROKEN and emitting a big fat warning in dmesg with a clear
> statement that it _is_ broken, stays so forever and any attempt to
> "fix" it results in a permanent ban from all kernel lists.
>
> Preferrably we don't go there.

I see. Thanks for the explanation.

Now, I do use QEMU to emulate an ARM 32 bit system with a recent kernel
(5.1+). It works.

Another option would be to give QEMU "user mode" a shot to run
cross-compiled tests (using a glibc cross-compiled in an earlier CI
stage).

The problem with the above is the reliance on QEMU's emulation of ARM
syscalls (and on whether the syscalls supporting 64 bit time, i.e.
clock_settime64, are available).

>
> Thanks,
>
>         tglx
>

Best regards,

Lukasz Majewski

--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lu...@denx.de
[PATCH v4 0/9] ARM: remove set_fs callers and implementation
From: Arnd Bergmann Hi Christoph, Russell, This is the rebased version of my ARM set_fs patches on top of v5.10-rc1, dropping the TASK_SIZE_MAX patch but leaving everything else unchanged. I have tested the oabi-compat changes using the LTP tests for the three modified syscalls using an Armv7 kernel and a Debian 5 OABI user space. I also tested the syscall_get_nr() in all combinations of OABI/EABI kernel user space and fixed the bugs I found after Russell pointed out one of those issues. Russell, you can pull these from https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git arm-setfs_v4 or I can add them to the ARM patch tracker if you prefer. Arnd Arnd Bergmann (9): mm/maccess: fix unaligned copy_{from,to}_kernel_nofault ARM: traps: use get_kernel_nofault instead of set_fs() ARM: oabi-compat: add epoll_pwait handler ARM: syscall: always store thread_info->syscall ARM: oabi-compat: rework epoll_wait/epoll_pwait emulation ARM: oabi-compat: rework sys_semtimedop emulation ARM: oabi-compat: rework fcntl64() emulation ARM: uaccess: add __{get,put}_kernel_nofault ARM: uaccess: remove set_fs() implementation arch/arm/Kconfig | 1 - arch/arm/include/asm/ptrace.h | 1 - arch/arm/include/asm/syscall.h | 16 ++- arch/arm/include/asm/thread_info.h | 4 - arch/arm/include/asm/uaccess-asm.h | 6 - arch/arm/include/asm/uaccess.h | 169 ++- arch/arm/kernel/asm-offsets.c | 3 +- arch/arm/kernel/entry-common.S | 17 +-- arch/arm/kernel/process.c | 7 +- arch/arm/kernel/ptrace.c | 9 +- arch/arm/kernel/signal.c | 8 -- arch/arm/kernel/sys_oabi-compat.c | 181 - arch/arm/kernel/traps.c| 47 +++- arch/arm/lib/copy_from_user.S | 3 +- arch/arm/lib/copy_to_user.S| 3 +- arch/arm/tools/syscall.tbl | 2 +- fs/eventpoll.c | 5 +- include/linux/eventpoll.h | 18 +++ include/linux/syscalls.h | 3 + ipc/sem.c | 84 - mm/maccess.c | 28 - 21 files changed, 332 insertions(+), 283 deletions(-) Cc: linux-kernel@vger.kernel.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-a...@vger.kernel.org Cc: 
linux...@kvack.org Cc: Alexander Viro Cc: Linus Walleij Cc: Arnd Bergmann -- 2.27.0
Re: [PATCH v1 1/2] ptrace: Set PF_SUPERPRIV when checking capability
On Fri, Oct 30, 2020 at 1:39 PM Mickaël Salaün wrote: > Commit 69f594a38967 ("ptrace: do not audit capability check when outputing > /proc/pid/stat") replaced the use of ns_capable() with > has_ns_capability{,_noaudit}() which doesn't set PF_SUPERPRIV. > > Commit 6b3ad6649a4c ("ptrace: reintroduce usage of subjective credentials in > ptrace_has_cap()") replaced has_ns_capability{,_noaudit}() with > security_capable(), which doesn't set PF_SUPERPRIV neither. > > Since commit 98f368e9e263 ("kernel: Add noaudit variant of ns_capable()"), a > new ns_capable_noaudit() helper is available. Let's use it! > > As a result, the signature of ptrace_has_cap() is restored to its original > one. > > Cc: Christian Brauner > Cc: Eric Paris > Cc: Jann Horn > Cc: Kees Cook > Cc: Oleg Nesterov > Cc: Serge E. Hallyn > Cc: Tyler Hicks > Cc: sta...@vger.kernel.org > Fixes: 6b3ad6649a4c ("ptrace: reintroduce usage of subjective credentials in > ptrace_has_cap()") > Fixes: 69f594a38967 ("ptrace: do not audit capability check when outputing > /proc/pid/stat") > Signed-off-by: Mickaël Salaün Yeah... I guess this makes sense. (We'd have to undo or change it if we ever end up needing to use a different set of credentials, e.g. from ->f_cred, but I guess that's really something we should avoid anyway.) Reviewed-by: Jann Horn with one nit: [...] > /* Returns 0 on success, -errno on denial. */ > static int __ptrace_may_access(struct task_struct *task, unsigned int mode) > { > - const struct cred *cred = current_cred(), *tcred; > + const struct cred *const cred = current_cred(), *tcred; This is an unrelated change, and almost no kernel code marks local pointer variables as "const". I would drop this change from the patch. > struct mm_struct *mm; > kuid_t caller_uid; > kgid_t caller_gid;
Re: [PATCH v6 3/9] arm64, kfence: enable KFENCE for ARM64
On Thu, Oct 29, 2020 at 02:16:43PM +0100, Marco Elver wrote: > Add architecture specific implementation details for KFENCE and enable > KFENCE for the arm64 architecture. In particular, this implements the > required interface in . > > KFENCE requires that attributes for pages from its memory pool can > individually be set. Therefore, force the entire linear map to be mapped > at page granularity. Doing so may result in extra memory allocated for > page tables in case rodata=full is not set; however, currently > CONFIG_RODATA_FULL_DEFAULT_ENABLED=y is the default, and the common case > is therefore not affected by this change. > > Reviewed-by: Dmitry Vyukov > Co-developed-by: Alexander Potapenko > Signed-off-by: Alexander Potapenko > Signed-off-by: Marco Elver > --- > v5: > * Move generic page allocation code to core.c [suggested by Jann Horn]. > * Remove comment about HAVE_ARCH_KFENCE_STATIC_POOL, since we no longer > support static pools. > * Force page granularity for the linear map [suggested by Mark Rutland]. > --- > arch/arm64/Kconfig | 1 + > arch/arm64/include/asm/kfence.h | 19 +++ > arch/arm64/mm/fault.c | 4 > arch/arm64/mm/mmu.c | 7 ++- > 4 files changed, 30 insertions(+), 1 deletion(-) > create mode 100644 arch/arm64/include/asm/kfence.h > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > index f858c352f72a..2f8b328b 100644 > --- a/arch/arm64/Kconfig > +++ b/arch/arm64/Kconfig > @@ -135,6 +135,7 @@ config ARM64 > select HAVE_ARCH_JUMP_LABEL_RELATIVE > select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48) > select HAVE_ARCH_KASAN_SW_TAGS if HAVE_ARCH_KASAN > + select HAVE_ARCH_KFENCE if (!ARM64_16K_PAGES && !ARM64_64K_PAGES) Why does this depend on the page size? If this is functional, but has a larger overhead on 16K or 64K, I'd suggest removing the dependency, and just updating the Kconfig help text to explain that. Otherwise, this patch looks fine to me. Thanks, Mark.
[PATCH 4.14 ] ALSA: Corrects warning: missing braces around initializer
From: John Donnelly

The assignment statement of a local variable "struct hpi_pci pci = { 0 };"
is not valid for all versions of compiler.

Fixes: 9c3c9d37ae1e ("ALSA: asihpi: fix iounmap in error handler")
Signed-off-by: John Donnelly
---
 sound/pci/asihpi/hpioctl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sound/pci/asihpi/hpioctl.c b/sound/pci/asihpi/hpioctl.c
index b4ccd9f92400..1cd7c197ab96 100644
--- a/sound/pci/asihpi/hpioctl.c
+++ b/sound/pci/asihpi/hpioctl.c
@@ -350,8 +350,9 @@ int asihpi_adapter_probe(struct pci_dev *pci_dev,
 	struct hpi_message hm;
 	struct hpi_response hr;
 	struct hpi_adapter adapter;
-	struct hpi_pci pci = { 0 };
+	struct hpi_pci pci;
 
+	memset(&pci, 0, sizeof(pci));
 	memset(&adapter, 0, sizeof(adapter));
 
 	dev_printk(KERN_DEBUG, &pci_dev->dev,
-- 
2.27.0
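As background for the change above: some older GCC releases are known to warn about "missing braces around initializer" when `{ 0 }` initializes a struct whose first member is itself an aggregate. The sketch below shows the memset() alternative the patch switches to. The demo_* types are hypothetical stand-ins and do not reflect the real layout of struct hpi_pci.

```c
#include <string.h>

/* Hypothetical nested aggregate, standing in for a resource table. */
struct demo_resource {
	unsigned long base[4];
};

/* Hypothetical struct whose first member is an aggregate. */
struct demo_pci {
	struct demo_resource res;
	int irq;
};

/*
 * Zero every byte of *pci, padding included, without relying on a
 * brace initializer that some compilers warn about.
 */
static void demo_pci_clear(struct demo_pci *pci)
{
	memset(pci, 0, sizeof(*pci));
}
```

The trade-off is that the variable is briefly uninitialized between its declaration and the memset() call, so the call should stay directly after the declarations, as it does in the patch.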
[PATCH] arm64: qcom: sc7180: trogdor: Add ADC nodes and thermal zone for charger thermistor
From: Antony Wang Trogdor has a thermistor to monitor the temperature of the charger IC. Add the ADC (monitor) nodes and a thermal zone for this thermistor. Signed-off-by: Antony Wang [ mka: tweaked commit message ] Signed-off-by: Matthias Kaehlcke --- arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi | 36 1 file changed, 36 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi index bf875589d364..f68305c35c74 100644 --- a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi @@ -13,6 +13,23 @@ #include "pm6150.dtsi" #include "pm6150l.dtsi" +/ { + thermal-zones { + charger-thermal { + polling-delay-passive = <0>; + polling-delay = <0>; + + thermal-sensors = <_adc_tm 1>; + + trips { + temperature = <125000>; + hysteresis = <1000>; + type = "critical"; + }; + }; + }; +}; + /* * Reserved memory changes * @@ -733,6 +750,25 @@ { status = "okay"; }; +_adc { + charger-thermistor@4f { + reg = ; + qcom,ratiometric; + qcom,hw-settle-time = <200>; + }; +}; + +_adc_tm { + status = "okay"; + + charger-thermistor@1 { + reg = <1>; + io-channels = <_adc ADC5_AMUX_THM3_100K_PU>; + qcom,ratiometric; + qcom,hw-settle-time-us = <200>; + }; +}; + _pwrkey { status = "disabled"; }; -- 2.29.1.341.ge80a0c044ae-goog
Re: [PATCH v2 2/3] dt-bindings: mfd: Add QCOM PM8008 MFD bindings
On Thu, Oct 22, 2020 at 02:35:41PM -0700, Guru Das Srinagesh wrote: > Add device tree bindings for the driver for Qualcomm Technology Inc.'s > PM8008 MFD PMIC. > > Signed-off-by: Guru Das Srinagesh > --- > .../bindings/mfd/qcom,pm8008-irqchip.yaml | 102 > + > 1 file changed, 102 insertions(+) > create mode 100644 > Documentation/devicetree/bindings/mfd/qcom,pm8008-irqchip.yaml > > diff --git a/Documentation/devicetree/bindings/mfd/qcom,pm8008-irqchip.yaml > b/Documentation/devicetree/bindings/mfd/qcom,pm8008-irqchip.yaml > new file mode 100644 > index 000..31d7b68 > --- /dev/null > +++ b/Documentation/devicetree/bindings/mfd/qcom,pm8008-irqchip.yaml > @@ -0,0 +1,102 @@ > +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause > +%YAML 1.2 > +--- > +$id: http://devicetree.org/schemas/mfd/qcom,pm8008-irqchip.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: Qualcomm Technologies, Inc. PM8008 Multi-Function Device PMIC > + > +maintainers: > + - Guru Das Srinagesh > + > +description: | > + PM8008 is a PMIC that contains 7 LDOs, 2 GPIOs, temperature monitoring, and > + can be interfaced over I2C. No bindings for all those functions? Bindings should be complete. > + > +properties: > + compatible: > +items: > + - const: qcom,pm8008-irqchip Why irqchip? > + > + reg: > +maxItems: 1 > + > + interrupt-names: > +items: > + - const: pm8008 > + > + interrupts: > +maxItems: 1 > + > + interrupt-controller: true > + > + "#address-cells": > +const: 1 > +description: Must be specified if child nodes are specified. > + > + "#size-cells": > +const: 0 > +description: Must be specified if child nodes are specified. > + > + "#interrupt-cells": > +const: 2 > +description: | > + The first cell is the IRQ number, the second cell is the IRQ trigger > flag. > + > +patternProperties: > + "^.*@[0-9a-f]+$": '^.*' can be dropped. That's redundant. 
> +type: object > +# Each peripheral in PM8008 must be represented as a child node with an > +# optional label for referencing as phandle elsewhere. This is optional. > +properties: > + compatible: > +description: The compatible string for the peripheral's driver. > + > + reg: > +maxItems: 1 What does the address represent? It's non-standard, so it needs to be defined. > + > + interrupts: > +maxItems: 1 > + > +required: > + - compatible > + - reg > + - interrupts > + > +required: > + - compatible > + - reg > + - interrupts > + - "#interrupt-cells" > + > +additionalProperties: false > + > +examples: > + - | > +#include > +qupv3_se13_i2c { > +#address-cells = <1>; > +#size-cells = <0>; > + > +pm8008i@8 { > +compatible = "qcom,pm8008-irqchip"; > +reg = <0x8>; > +#address-cells = <1>; > +#size-cells = <0>; > +interrupt-controller; > +#interrupt-cells = <2>; > + > +interrupt-names = "pm8008"; > +interrupt-parent = <>; > +interrupts = <32 IRQ_TYPE_EDGE_RISING>; > + > +pm8008_tz: qcom,temp-alarm@2400 { Must be documented. And don't use vendor prefixes in node names. > +compatible = "qcom,spmi-temp-alarm"; > +reg = <0x2400>; > +interrupts = <0x5 IRQ_TYPE_EDGE_BOTH>; > +#thermal-sensor-cells = <0>; > +}; > +}; > +}; > + > +... > -- > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, > a Linux Foundation Collaborative Project >
[PATCH 1/9] mm/maccess: fix unaligned copy_{from,to}_kernel_nofault
From: Arnd Bergmann On machines such as ARMv5 that trap unaligned accesses, these two functions can be slow when each access needs to be emulated, or they might not work at all. Change them so that each loop is only used when both the src and dst pointers are naturally aligned. Reviewed-by: Christoph Hellwig Signed-off-by: Arnd Bergmann --- mm/maccess.c | 28 ++-- 1 file changed, 22 insertions(+), 6 deletions(-) diff --git a/mm/maccess.c b/mm/maccess.c index 3bd70405f2d8..d3f1a1f0b1c1 100644 --- a/mm/maccess.c +++ b/mm/maccess.c @@ -24,13 +24,21 @@ bool __weak copy_from_kernel_nofault_allowed(const void *unsafe_src, long copy_from_kernel_nofault(void *dst, const void *src, size_t size) { + unsigned long align = 0; + + if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) + align = (unsigned long)dst | (unsigned long)src; + if (!copy_from_kernel_nofault_allowed(src, size)) return -ERANGE; pagefault_disable(); - copy_from_kernel_nofault_loop(dst, src, size, u64, Efault); - copy_from_kernel_nofault_loop(dst, src, size, u32, Efault); - copy_from_kernel_nofault_loop(dst, src, size, u16, Efault); + if (!(align & 7)) + copy_from_kernel_nofault_loop(dst, src, size, u64, Efault); + if (!(align & 3)) + copy_from_kernel_nofault_loop(dst, src, size, u32, Efault); + if (!(align & 1)) + copy_from_kernel_nofault_loop(dst, src, size, u16, Efault); copy_from_kernel_nofault_loop(dst, src, size, u8, Efault); pagefault_enable(); return 0; @@ -50,10 +58,18 @@ EXPORT_SYMBOL_GPL(copy_from_kernel_nofault); long copy_to_kernel_nofault(void *dst, const void *src, size_t size) { + unsigned long align = 0; + + if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) + align = (unsigned long)dst | (unsigned long)src; + pagefault_disable(); - copy_to_kernel_nofault_loop(dst, src, size, u64, Efault); - copy_to_kernel_nofault_loop(dst, src, size, u32, Efault); - copy_to_kernel_nofault_loop(dst, src, size, u16, Efault); + if (!(align & 7)) + copy_to_kernel_nofault_loop(dst, src, size, u64, 
Efault); + if (!(align & 3)) + copy_to_kernel_nofault_loop(dst, src, size, u32, Efault); + if (!(align & 1)) + copy_to_kernel_nofault_loop(dst, src, size, u16, Efault); copy_to_kernel_nofault_loop(dst, src, size, u8, Efault); pagefault_enable(); return 0; -- 2.27.0
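The alignment gating in this patch can be tried out in userspace. The following is only a sketch of the idea, not the kernel code: it ORs the two addresses together and uses a given chunk width only if both pointers are aligned to it. The demo_* names are hypothetical, and memcpy() of a fixed size stands in for the kernel's fault-handling accessors.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `width`-byte chunks while at least `width` bytes remain. */
#define DEMO_COPY_LOOP(dst, src, len, width)		\
	do {						\
		while ((len) >= (width)) {		\
			memcpy((dst), (src), (width));	\
			(dst) += (width);		\
			(src) += (width);		\
			(len) -= (width);		\
		}					\
	} while (0)

static void demo_copy_by_alignment(void *dst_, const void *src_, size_t len)
{
	unsigned char *dst = dst_;
	const unsigned char *src = src_;
	/* A low bit set in either address shows up in the OR. */
	uintptr_t align = (uintptr_t)dst | (uintptr_t)src;

	if (!(align & 7))
		DEMO_COPY_LOOP(dst, src, len, 8);
	if (!(align & 3))
		DEMO_COPY_LOOP(dst, src, len, 4);
	if (!(align & 1))
		DEMO_COPY_LOOP(dst, src, len, 2);
	/* Byte copies are always safe and pick up any remainder. */
	DEMO_COPY_LOOP(dst, src, len, 1);
}
```

Each wider loop advances both pointers by a multiple of its width, so the alignment computed once at the start still holds for the narrower loops that follow.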
Re: [PATCH RFC v2 10/21] kasan: inline random_tag for HW_TAGS
On Wed, Oct 28, 2020 at 12:08 PM Dmitry Vyukov wrote: > > On Thu, Oct 22, 2020 at 3:19 PM Andrey Konovalov > wrote: > > > > Using random_tag() currently results in a function call. Move its > > definition to mm/kasan/kasan.h and turn it into a static inline function > > for hardware tag-based mode to avoid uneeded function call. > > > > Signed-off-by: Andrey Konovalov > > Link: > > https://linux-review.googlesource.com/id/Iac5b2faf9a912900e16cca6834d621f5d4abf427 > > --- > > mm/kasan/hw_tags.c | 5 - > > mm/kasan/kasan.h | 37 - > > 2 files changed, 20 insertions(+), 22 deletions(-) > > > > diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c > > index c3a0e83b5e7a..4c24bfcfeff9 100644 > > --- a/mm/kasan/hw_tags.c > > +++ b/mm/kasan/hw_tags.c > > @@ -36,11 +36,6 @@ void kasan_unpoison_memory(const void *address, size_t > > size) > > round_up(size, KASAN_GRANULE_SIZE), > > get_tag(address)); > > } > > > > -u8 random_tag(void) > > -{ > > - return get_random_tag(); > > -} > > - > > bool check_invalid_free(void *addr) > > { > > u8 ptr_tag = get_tag(addr); > > diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h > > index 0ccbb3c4c519..94ba15c2f860 100644 > > --- a/mm/kasan/kasan.h > > +++ b/mm/kasan/kasan.h > > @@ -188,6 +188,12 @@ static inline bool addr_has_metadata(const void *addr) > > > > #endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */ > > > > +#if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS) > > +void print_tags(u8 addr_tag, const void *addr); > > +#else > > +static inline void print_tags(u8 addr_tag, const void *addr) { } > > +#endif > > + > > bool check_invalid_free(void *addr); > > > > void *find_first_bad_addr(void *addr, size_t size); > > @@ -223,23 +229,6 @@ static inline void quarantine_reduce(void) { } > > static inline void quarantine_remove_cache(struct kmem_cache *cache) { } > > #endif > > > > -#if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS) > > - > > -void print_tags(u8 addr_tag, const void *addr); > > - > > 
-u8 random_tag(void); > > - > > -#else > > - > > -static inline void print_tags(u8 addr_tag, const void *addr) { } > > - > > -static inline u8 random_tag(void) > > -{ > > - return 0; > > -} > > - > > -#endif > > - > > #ifndef arch_kasan_set_tag > > static inline const void *arch_kasan_set_tag(const void *addr, u8 tag) > > { > > @@ -273,6 +262,20 @@ static inline const void *arch_kasan_set_tag(const > > void *addr, u8 tag) > > #define get_mem_tag(addr) arch_get_mem_tag(addr) > > #define set_mem_tag_range(addr, size, tag) > > arch_set_mem_tag_range((addr), (size), (tag)) > > > > +#ifdef CONFIG_KASAN_SW_TAGS > > +u8 random_tag(void); > > +#elif defined(CONFIG_KASAN_HW_TAGS) > > +static inline u8 random_tag(void) > > +{ > > + return get_random_tag(); > > What's the difference between random_tag() and get_random_tag()? Do we > need both? Not really. Will simplify this in the next version and give cleaner names.
[PATCH 3/9] ARM: oabi-compat: add epoll_pwait handler
From: Arnd Bergmann The epoll_wait() syscall has a special version for OABI compat mode to convert the arguments to the EABI structure layout of the kernel. However, the later epoll_pwait() syscall was added in arch/arm in linux-2.6.32 without this conversion. Use the same kind of handler for both. Fixes: 369842658a36 ("ARM: 5677/1: ARM support for TIF_RESTORE_SIGMASK/pselect6/ppoll/epoll_pwait") Cc: sta...@vger.kernel.org Reviewed-by: Christoph Hellwig Signed-off-by: Arnd Bergmann --- arch/arm/kernel/sys_oabi-compat.c | 37 --- arch/arm/tools/syscall.tbl| 2 +- 2 files changed, 35 insertions(+), 4 deletions(-) diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c index 0203e545bbc8..a2b1ae01e5bf 100644 --- a/arch/arm/kernel/sys_oabi-compat.c +++ b/arch/arm/kernel/sys_oabi-compat.c @@ -264,9 +264,8 @@ asmlinkage long sys_oabi_epoll_ctl(int epfd, int op, int fd, return do_epoll_ctl(epfd, op, fd, , false); } -asmlinkage long sys_oabi_epoll_wait(int epfd, - struct oabi_epoll_event __user *events, - int maxevents, int timeout) +static long do_oabi_epoll_wait(int epfd, struct oabi_epoll_event __user *events, + int maxevents, int timeout) { struct epoll_event *kbuf; struct oabi_epoll_event e; @@ -299,6 +298,38 @@ asmlinkage long sys_oabi_epoll_wait(int epfd, return err ? -EFAULT : ret; } +SYSCALL_DEFINE4(oabi_epoll_wait, int, epfd, + struct oabi_epoll_event __user *, events, + int, maxevents, int, timeout) +{ + return do_oabi_epoll_wait(epfd, events, maxevents, timeout); +} + +/* + * Implement the event wait interface for the eventpoll file. It is the kernel + * part of the user space epoll_pwait(2). + */ +SYSCALL_DEFINE6(oabi_epoll_pwait, int, epfd, + struct oabi_epoll_event __user *, events, int, maxevents, + int, timeout, const sigset_t __user *, sigmask, + size_t, sigsetsize) +{ + int error; + + /* +* If the caller wants a certain signal mask to be set during the wait, +* we apply it here. 
+*/ + error = set_user_sigmask(sigmask, sigsetsize); + if (error) + return error; + + error = do_oabi_epoll_wait(epfd, events, maxevents, timeout); + restore_saved_sigmask_unless(error == -EINTR); + + return error; +} + struct oabi_sembuf { unsigned short sem_num; short sem_op; diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index d056a548358e..330cf0aa04c4 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -360,7 +360,7 @@ 343common vmsplicesys_vmsplice 344common move_pages sys_move_pages 345common getcpu sys_getcpu -346common epoll_pwait sys_epoll_pwait +346common epoll_pwait sys_epoll_pwait sys_oabi_epoll_pwait 347common kexec_load sys_kexec_load 348common utimensat sys_utimensat_time32 349common signalfdsys_signalfd -- 2.27.0
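To illustrate why the OABI entry points need a conversion step at all, here is a small sketch of the two structure layouts. The demo_* types are hypothetical stand-ins for the real oabi_epoll_event and epoll_event definitions. The packed, 4-byte-aligned attribute mirrors the one oabi_flock64 uses in the same file, and the explicit 8-byte alignment is an assumption that models how EABI places 64-bit members.

```c
#include <stddef.h>
#include <stdint.h>

/* OABI-style layout: the 64-bit member only needs 4-byte alignment. */
struct demo_oabi_epoll_event {
	uint32_t events;
	uint64_t data;
} __attribute__((packed, aligned(4)));

/* EABI-style layout: the 64-bit member is forced to 8-byte alignment. */
struct demo_eabi_epoll_event {
	uint32_t events;
	uint64_t data __attribute__((aligned(8)));
};

/* Copy field by field; the two layouts cannot be copied as raw bytes. */
static void demo_event_to_oabi(struct demo_oabi_epoll_event *out,
			       const struct demo_eabi_epoll_event *in)
{
	out->events = in->events;
	out->data = in->data;
}
```

With these definitions the OABI struct is 12 bytes with `data` at offset 4, while the EABI-style struct is 16 bytes with `data` at offset 8, which is why an array of events has to be converted element by element.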
[PATCH 7/9] ARM: oabi-compat: rework fcntl64() emulation
From: Arnd Bergmann This is one of the last users of get_fs(), and this is fairly easy to change, since the infrastructure for it is already there. The replacement here is essentially a copy of the existing fcntl64() syscall entry function. Signed-off-by: Arnd Bergmann --- arch/arm/kernel/sys_oabi-compat.c | 93 --- 1 file changed, 60 insertions(+), 33 deletions(-) diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c index c3e63b73b6ae..3449e163ea88 100644 --- a/arch/arm/kernel/sys_oabi-compat.c +++ b/arch/arm/kernel/sys_oabi-compat.c @@ -194,56 +194,83 @@ struct oabi_flock64 { pid_t l_pid; } __attribute__ ((packed,aligned(4))); -static long do_locks(unsigned int fd, unsigned int cmd, -unsigned long arg) +static int get_oabi_flock(struct flock64 *kernel, struct oabi_flock64 __user *arg) { - struct flock64 kernel; struct oabi_flock64 user; - mm_segment_t fs; - long ret; if (copy_from_user(, (struct oabi_flock64 __user *)arg, sizeof(user))) return -EFAULT; - kernel.l_type = user.l_type; - kernel.l_whence = user.l_whence; - kernel.l_start = user.l_start; - kernel.l_len= user.l_len; - kernel.l_pid= user.l_pid; - - fs = get_fs(); - set_fs(KERNEL_DS); - ret = sys_fcntl64(fd, cmd, (unsigned long)); - set_fs(fs); - - if (!ret && (cmd == F_GETLK64 || cmd == F_OFD_GETLK)) { - user.l_type = kernel.l_type; - user.l_whence = kernel.l_whence; - user.l_start= kernel.l_start; - user.l_len = kernel.l_len; - user.l_pid = kernel.l_pid; - if (copy_to_user((struct oabi_flock64 __user *)arg, -, sizeof(user))) - ret = -EFAULT; - } - return ret; + + kernel->l_type = user.l_type; + kernel->l_whence = user.l_whence; + kernel->l_start = user.l_start; + kernel->l_len= user.l_len; + kernel->l_pid= user.l_pid; + + return 0; +} + +static int put_oabi_flock(struct flock64 *kernel, struct oabi_flock64 __user *arg) +{ + struct oabi_flock64 user; + + user.l_type = kernel->l_type; + user.l_whence = kernel->l_whence; + user.l_start= kernel->l_start; + user.l_len = 
kernel->l_len; + user.l_pid = kernel->l_pid; + + if (copy_to_user((struct oabi_flock64 __user *)arg, +, sizeof(user))) + return -EFAULT; + + return 0; } asmlinkage long sys_oabi_fcntl64(unsigned int fd, unsigned int cmd, unsigned long arg) { + void __user *argp = (void __user *)arg; + struct fd f = fdget_raw(fd); + struct flock64 flock; + long err = -EBADF; + + if (!f.file) + goto out; + switch (cmd) { - case F_OFD_GETLK: - case F_OFD_SETLK: - case F_OFD_SETLKW: case F_GETLK64: + case F_OFD_GETLK: + err = security_file_fcntl(f.file, cmd, arg); + if (err) + break; + err = get_oabi_flock(, argp); + if (err) + break; + err = fcntl_getlk64(f.file, cmd, ); + if (!err) + err = put_oabi_flock(, argp); + break; case F_SETLK64: case F_SETLKW64: - return do_locks(fd, cmd, arg); - + case F_OFD_SETLK: + case F_OFD_SETLKW: + err = security_file_fcntl(f.file, cmd, arg); + if (err) + break; + err = get_oabi_flock(, argp); + if (err) + break; + err = fcntl_setlk64(fd, f.file, cmd, ); + break; default: - return sys_fcntl64(fd, cmd, arg); + err = sys_fcntl64(fd, cmd, arg); + break; } + fdput(f); +out: + return err; } struct oabi_epoll_event { -- 2.27.0
[PATCH 9/9] ARM: uaccess: remove set_fs() implementation
From: Arnd Bergmann There are no remaining callers of set_fs(), so just remove it along with all associated code that operates on thread_info->addr_limit. There are still further optimizations that can be done: - In get_user(), the address check could be moved entirely into the out of line code, rather than passing a constant as an argument, - I assume the DACR handling can be simplified as we now only change it during user access when CONFIG_CPU_SW_DOMAIN_PAN is set, but not during set_fs(). Signed-off-by: Arnd Bergmann --- arch/arm/Kconfig | 1 - arch/arm/include/asm/ptrace.h | 1 - arch/arm/include/asm/thread_info.h | 4 --- arch/arm/include/asm/uaccess-asm.h | 6 arch/arm/include/asm/uaccess.h | 46 +++--- arch/arm/kernel/asm-offsets.c | 2 -- arch/arm/kernel/entry-common.S | 9 -- arch/arm/kernel/process.c | 7 + arch/arm/kernel/signal.c | 8 -- arch/arm/lib/copy_from_user.S | 3 +- arch/arm/lib/copy_to_user.S| 3 +- 11 files changed, 7 insertions(+), 83 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index fe2f17eb2b50..55a8892dd5d8 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -120,7 +120,6 @@ config ARM select PCI_SYSCALL if PCI select PERF_USE_VMALLOC select RTC_LIB - select SET_FS select SYS_SUPPORTS_APM_EMULATION # Above selects are sorted alphabetically; please add new ones # according to that. Thanks. 
diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h index 91d6b7856be4..93051e2f402c 100644 --- a/arch/arm/include/asm/ptrace.h +++ b/arch/arm/include/asm/ptrace.h @@ -19,7 +19,6 @@ struct pt_regs { struct svc_pt_regs { struct pt_regs regs; u32 dacr; - u32 addr_limit; }; #define to_svc_pt_regs(r) container_of(r, struct svc_pt_regs, regs) diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h index 536b6b979f63..8b705f611216 100644 --- a/arch/arm/include/asm/thread_info.h +++ b/arch/arm/include/asm/thread_info.h @@ -23,8 +23,6 @@ struct task_struct; #include -typedef unsigned long mm_segment_t; - struct cpu_context_save { __u32 r4; __u32 r5; @@ -46,7 +44,6 @@ struct cpu_context_save { struct thread_info { unsigned long flags; /* low level flags */ int preempt_count; /* 0 => preemptable, <0 => bug */ - mm_segment_taddr_limit; /* address limit */ struct task_struct *task; /* main task structure */ __u32 cpu;/* cpu */ __u32 cpu_domain; /* cpu domain */ @@ -72,7 +69,6 @@ struct thread_info { .task = , \ .flags = 0,\ .preempt_count = INIT_PREEMPT_COUNT, \ - .addr_limit = KERNEL_DS,\ } /* diff --git a/arch/arm/include/asm/uaccess-asm.h b/arch/arm/include/asm/uaccess-asm.h index 907571fd05c6..6451a433912c 100644 --- a/arch/arm/include/asm/uaccess-asm.h +++ b/arch/arm/include/asm/uaccess-asm.h @@ -84,12 +84,8 @@ * if \disable is set. 
*/ .macro uaccess_entry, tsk, tmp0, tmp1, tmp2, disable - ldr \tmp1, [\tsk, #TI_ADDR_LIMIT] - mov \tmp2, #TASK_SIZE - str \tmp2, [\tsk, #TI_ADDR_LIMIT] DACR( mrc p15, 0, \tmp0, c3, c0, 0) DACR( str \tmp0, [sp, #SVC_DACR]) - str \tmp1, [sp, #SVC_ADDR_LIMIT] .if \disable && IS_ENABLED(CONFIG_CPU_SW_DOMAIN_PAN) /* kernel=client, user=no access */ mov \tmp2, #DACR_UACCESS_DISABLE @@ -106,9 +102,7 @@ /* Restore the user access state previously saved by uaccess_entry */ .macro uaccess_exit, tsk, tmp0, tmp1 - ldr \tmp1, [sp, #SVC_ADDR_LIMIT] DACR( ldr \tmp0, [sp, #SVC_DACR]) - str \tmp1, [\tsk, #TI_ADDR_LIMIT] DACR( mcr p15, 0, \tmp0, c3, c0, 0) .endm diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h index 4f60638755c4..084d1c07c2d0 100644 --- a/arch/arm/include/asm/uaccess.h +++ b/arch/arm/include/asm/uaccess.h @@ -52,32 +52,8 @@ static __always_inline void uaccess_restore(unsigned int flags) extern int __get_user_bad(void); extern int __put_user_bad(void); -/* - * Note that this is actually 0x1,, - */ -#define KERNEL_DS 0x - #ifdef CONFIG_MMU -#define USER_DSTASK_SIZE -#define get_fs() (current_thread_info()->addr_limit) - -static inline void set_fs(mm_segment_t fs) -{ - current_thread_info()->addr_limit = fs; - - /* -* Prevent a mispredicted conditional call to set_fs from forwarding -* the wrong address limit to access_ok
Re: [PATCH v2 0/6] ARM: dts: sun8i: v3s: Enable video decoder
Hi! On Fri, Oct 30, 2020 at 12:06:10PM +0100, Hans Verkuil wrote: > Maxime, > > Are you OK with this series? It looks good to me. I am, you can take it. I'll merge the dt patches through arm-soc Thanks! Maxime
[PATCH 4/9] ARM: syscall: always store thread_info->syscall
From: Arnd Bergmann

The system call number is used in a couple of places, in particular ptrace, seccomp and /proc/<pid>/syscall.

The last one apparently never worked reliably on ARM for tasks that are not currently getting traced.

Storing the syscall number in the normal entry path makes it work, as well as allowing us to see if the current system call is for OABI compat mode, which is the next thing I want to hook into.

Signed-off-by: Arnd Bergmann
---
 arch/arm/include/asm/syscall.h | 5 -
 arch/arm/kernel/asm-offsets.c  | 1 +
 arch/arm/kernel/entry-common.S | 8 ++--
 arch/arm/kernel/ptrace.c       | 9 +
 4 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
index fd02761ba06c..89898497edd6 100644
--- a/arch/arm/include/asm/syscall.h
+++ b/arch/arm/include/asm/syscall.h
@@ -22,7 +22,10 @@ extern const unsigned long sys_call_table[];
 static inline int syscall_get_nr(struct task_struct *task,
 				 struct pt_regs *regs)
 {
-	return task_thread_info(task)->syscall;
+	if (IS_ENABLED(CONFIG_AEABI) && !IS_ENABLED(CONFIG_OABI_COMPAT))
+		return task_thread_info(task)->syscall;
+
+	return task_thread_info(task)->syscall & ~__NR_OABI_SYSCALL_BASE;
 }
 
 static inline void syscall_rollback(struct task_struct *task,
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index a1570c8bab25..97af6735172b 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -46,6 +46,7 @@ int main(void)
   DEFINE(TI_CPU,		offsetof(struct thread_info, cpu));
   DEFINE(TI_CPU_DOMAIN,	offsetof(struct thread_info, cpu_domain));
   DEFINE(TI_CPU_SAVE,		offsetof(struct thread_info, cpu_context));
+  DEFINE(TI_SYSCALL,		offsetof(struct thread_info, syscall));
   DEFINE(TI_USED_CP,		offsetof(struct thread_info, used_cp));
   DEFINE(TI_TP_VALUE,		offsetof(struct thread_info, tp_value));
   DEFINE(TI_FPSTATE,		offsetof(struct thread_info, fpstate));
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 271cb8a1eba1..9a76467bbb47 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -223,6 +223,7 @@ ENTRY(vector_swi)
 	/* saved_psr and saved_pc are now dead */
 
 	uaccess_disable tbl
+	get_thread_info tsk
 
 	adr	tbl, sys_call_table		@ load syscall table pointer
 
@@ -234,13 +235,17 @@ ENTRY(vector_swi)
 	 * get the old ABI syscall table address.
 	 */
 	bics	r10, r10, #0xff000000
+	strne	r10, [tsk, #TI_SYSCALL]
+	streq	scno, [tsk, #TI_SYSCALL]
 	eorne	scno, r10, #__NR_OABI_SYSCALL_BASE
 	ldrne	tbl, =sys_oabi_call_table
 #elif !defined(CONFIG_AEABI)
 	bic	scno, scno, #0xff000000		@ mask off SWI op-code
+	str	scno, [tsk, #TI_SYSCALL]
 	eor	scno, scno, #__NR_SYSCALL_BASE	@ check OS number
+#else
+	str	scno, [tsk, #TI_SYSCALL]
 #endif
-	get_thread_info tsk
 	/*
 	 * Reload the registers that may have been corrupted on entry to
 	 * the syscall assembly (by tracing or context tracking.)
@@ -285,7 +290,6 @@ ENDPROC(vector_swi)
 	 * context switches, and waiting for our parent to respond.
 	 */
 __sys_trace:
-	mov	r1, scno
 	add	r0, sp, #S_OFF
 	bl	syscall_trace_enter
 	mov	scno, r0
diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 2771e682220b..683edb8b627d 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -25,6 +25,7 @@
 #include
 #include
 
+#include
 #include
 
 #define CREATE_TRACE_POINTS
@@ -885,9 +886,9 @@ static void tracehook_report_syscall(struct pt_regs *regs,
 	regs->ARM_ip = ip;
 }
 
-asmlinkage int syscall_trace_enter(struct pt_regs *regs, int scno)
+asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 {
-	current_thread_info()->syscall = scno;
+	int scno;
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
@@ -898,11 +899,11 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs, int scno)
 		return -1;
 #else
 	/* XXX: remove this once OABI gets fixed */
-	secure_computing_strict(current_thread_info()->syscall);
+	secure_computing_strict(syscall_get_nr(current, regs));
 #endif
 
 	/* Tracer or seccomp may have changed syscall. */
-	scno = current_thread_info()->syscall;
+	scno = syscall_get_nr(current, regs);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
 		trace_sys_enter(regs, scno);
-- 
2.27.0
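The masking in syscall_get_nr() may be easier to follow with a small user-space sketch. The names below are hypothetical, and the base value 0x900000 is an assumption about __NR_OABI_SYSCALL_BASE rather than something stated in this mail. The idea sketched: an OABI call is stored with the base bits still set, an EABI call is stored as the plain number, and readers mask the base off.

```c
#include <stdbool.h>

/* Assumed value of __NR_OABI_SYSCALL_BASE; not taken from this patch. */
#define OABI_SYSCALL_BASE_SKETCH 0x900000u

/* What the entry path would store in thread_info->syscall in this model:
 * OABI calls keep the base bits, EABI calls store the bare number. */
static unsigned int stored_syscall_sketch(unsigned int nr, bool oabi)
{
	return oabi ? (nr | OABI_SYSCALL_BASE_SKETCH) : nr;
}

/* Counterpart of syscall_get_nr(): strip the base so callers always
 * see the plain syscall number. */
static unsigned int syscall_get_nr_sketch(unsigned int stored)
{
	return stored & ~OABI_SYSCALL_BASE_SKETCH;
}

/* The stored value also tells whether the call came in through OABI. */
static bool is_oabi_sketch(unsigned int stored)
{
	return (stored & OABI_SYSCALL_BASE_SKETCH) != 0;
}
```

Keeping the base bits in the stored value is what lets one field answer both questions: which syscall is running, and which ABI it was entered through.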
[PATCH 8/9] ARM: uaccess: add __{get,put}_kernel_nofault
From: Arnd Bergmann These mimic the behavior of get_user and put_user, except for domain switching, address limit checking and handling of mismatched sizes, none of which are relevant here. To work with pre-Armv6 kernels, this has to avoid TUSER() inside of the new macros, the new approach passes the "t" string along with the opcode, which is a bit uglier but avoids duplicating more code. As there is no __get_user_asm_dword(), I work around it by copying 32 bit at a time, which is possible because the output size is known. Signed-off-by: Arnd Bergmann --- arch/arm/include/asm/uaccess.h | 123 ++--- 1 file changed, 83 insertions(+), 40 deletions(-) diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h index a13d90206472..4f60638755c4 100644 --- a/arch/arm/include/asm/uaccess.h +++ b/arch/arm/include/asm/uaccess.h @@ -308,11 +308,11 @@ static inline void set_fs(mm_segment_t fs) #define __get_user(x, ptr) \ ({ \ long __gu_err = 0; \ - __get_user_err((x), (ptr), __gu_err); \ + __get_user_err((x), (ptr), __gu_err, TUSER()); \ __gu_err; \ }) -#define __get_user_err(x, ptr, err)\ +#define __get_user_err(x, ptr, err, __t) \ do { \ unsigned long __gu_addr = (unsigned long)(ptr); \ unsigned long __gu_val; \ @@ -321,18 +321,19 @@ do { \ might_fault(); \ __ua_flags = uaccess_save_and_enable(); \ switch (sizeof(*(ptr))) { \ - case 1: __get_user_asm_byte(__gu_val, __gu_addr, err); break; \ - case 2: __get_user_asm_half(__gu_val, __gu_addr, err); break; \ - case 4: __get_user_asm_word(__gu_val, __gu_addr, err); break; \ + case 1: __get_user_asm_byte(__gu_val, __gu_addr, err, __t); break; \ + case 2: __get_user_asm_half(__gu_val, __gu_addr, err, __t); break; \ + case 4: __get_user_asm_word(__gu_val, __gu_addr, err, __t); break; \ default: (__gu_val) = __get_user_bad(); \ } \ uaccess_restore(__ua_flags);\ (x) = (__typeof__(*(ptr)))__gu_val; \ } while (0) +#endif #define __get_user_asm(x, addr, err, instr)\ __asm__ __volatile__( \ - "1: " TUSER(instr) " %1, 
[%2], #0\n"\ + "1: " instr " %1, [%2], #0\n" \ "2:\n" \ " .pushsection .text.fixup,\"ax\"\n" \ " .align 2\n"\ @@ -348,40 +349,38 @@ do { \ : "r" (addr), "i" (-EFAULT) \ : "cc") -#define __get_user_asm_byte(x, addr, err) \ - __get_user_asm(x, addr, err, ldrb) +#define __get_user_asm_byte(x, addr, err, __t) \ + __get_user_asm(x, addr, err, "ldrb" __t) #if __LINUX_ARM_ARCH__ >= 6 -#define __get_user_asm_half(x, addr, err) \ - __get_user_asm(x, addr, err, ldrh) +#define __get_user_asm_half(x, addr, err, __t) \ + __get_user_asm(x, addr, err, "ldrh" __t) #else #ifndef __ARMEB__ -#define __get_user_asm_half(x, __gu_addr, err) \ +#define __get_user_asm_half(x, __gu_addr, err, __t)\ ({ \ unsigned long __b1, __b2; \ - __get_user_asm_byte(__b1, __gu_addr, err); \ - __get_user_asm_byte(__b2, __gu_addr + 1, err); \ + __get_user_asm_byte(__b1, __gu_addr, err, __t); \ + __get_user_asm_byte(__b2, __gu_addr + 1, err, __t); \ (x) = __b1 | (__b2 << 8); \ }) #else -#define __get_user_asm_half(x, __gu_addr, err) \ +#define __get_user_asm_half(x, __gu_addr, err, __t)\ ({ \ unsigned long __b1, __b2; \ - __get_user_asm_byte(__b1, __gu_addr, err); \ -
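Two of the tricks mentioned in the commit message can be sketched in plain C. This is only an illustration with hypothetical names, not the uaccess macros themselves: the first helper mirrors the little-endian `__b1 | (__b2 << 8)` composition quoted above, and the second shows what "copying 32 bit at a time" could look like for a 64-bit object, with `memcpy()` standing in for the actual load and store instructions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build a halfword from two single-byte reads, as the pre-ARMv6
 * little-endian branch of the quoted macro does. */
static uint16_t half_from_bytes_le_sketch(const uint8_t *addr)
{
	unsigned long b1 = addr[0];
	unsigned long b2 = addr[1];

	return (uint16_t)(b1 | (b2 << 8));
}

/* Copy a 64-bit object as two 32-bit words. In the kernel each word
 * would be one faultable access; here memcpy() stands in for it. */
static void copy64_as_words_sketch(void *dst, const void *src)
{
	size_t i;

	for (i = 0; i < 2; i++) {
		uint32_t word;

		memcpy(&word, (const char *)src + 4 * i, sizeof(word));
		memcpy((char *)dst + 4 * i, &word, sizeof(word));
	}
}
```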
[PATCH 5/9] ARM: oabi-compat: rework epoll_wait/epoll_pwait emulation
From: Arnd Bergmann The epoll_wait() system call wrapper is one of the remaining users of the set_fs() infrastructure for Arm. Changing it to not require set_fs() is rather complex unfortunately. The approach I'm taking here is to allow architectures to override the code that copies the output to user space, and let the oabi-compat implementation check whether it is getting called from an EABI or OABI system call based on the thread_info->syscall value. The in_oabi_syscall() check here mirrors the in_compat_syscall() and in_x32_syscall() helpers for 32-bit compat implementations on other architectures. Overall, the amount of code goes down, at least with the newly added sys_oabi_epoll_pwait() helper getting removed again. The downside is added complexity in the source code for the native implementation. There should be no difference in runtime performance except for Arm kernels with CONFIG_OABI_COMPAT enabled that now have to go through an external function call to check which of the two variants to use. 
Signed-off-by: Arnd Bergmann --- arch/arm/include/asm/syscall.h| 11 + arch/arm/kernel/sys_oabi-compat.c | 75 +++ arch/arm/tools/syscall.tbl| 4 +- fs/eventpoll.c| 5 +-- include/linux/eventpoll.h | 18 5 files changed, 49 insertions(+), 64 deletions(-) diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h index 89898497edd6..9efb7b3384e5 100644 --- a/arch/arm/include/asm/syscall.h +++ b/arch/arm/include/asm/syscall.h @@ -28,6 +28,17 @@ static inline int syscall_get_nr(struct task_struct *task, return task_thread_info(task)->syscall & ~__NR_OABI_SYSCALL_BASE; } +static inline bool __in_oabi_syscall(struct task_struct *task) +{ + return IS_ENABLED(CONFIG_OABI_COMPAT) && + (task_thread_info(task)->syscall & __NR_OABI_SYSCALL_BASE); +} + +static inline bool in_oabi_syscall(void) +{ + return __in_oabi_syscall(current); +} + static inline void syscall_rollback(struct task_struct *task, struct pt_regs *regs) { diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c index a2b1ae01e5bf..f9d8e5be6ba0 100644 --- a/arch/arm/kernel/sys_oabi-compat.c +++ b/arch/arm/kernel/sys_oabi-compat.c @@ -83,6 +83,8 @@ #include #include +#include + struct oldabi_stat64 { unsigned long long st_dev; unsigned int__pad1; @@ -264,70 +266,25 @@ asmlinkage long sys_oabi_epoll_ctl(int epfd, int op, int fd, return do_epoll_ctl(epfd, op, fd, , false); } -static long do_oabi_epoll_wait(int epfd, struct oabi_epoll_event __user *events, - int maxevents, int timeout) +struct epoll_event __user * +epoll_put_uevent(__poll_t revents, __u64 data, +struct epoll_event __user *uevent) { - struct epoll_event *kbuf; - struct oabi_epoll_event e; - mm_segment_t fs; - long ret, err, i; + if (in_oabi_syscall()) { + struct oabi_epoll_event __user *oevent = (void __user *)uevent; - if (maxevents <= 0 || - maxevents > (INT_MAX/sizeof(*kbuf)) || - maxevents > (INT_MAX/sizeof(*events))) - return -EINVAL; - if (!access_ok(events, sizeof(*events) * maxevents)) - return -EFAULT; 
- kbuf = kmalloc_array(maxevents, sizeof(*kbuf), GFP_KERNEL); - if (!kbuf) - return -ENOMEM; - fs = get_fs(); - set_fs(KERNEL_DS); - ret = sys_epoll_wait(epfd, kbuf, maxevents, timeout); - set_fs(fs); - err = 0; - for (i = 0; i < ret; i++) { - e.events = kbuf[i].events; - e.data = kbuf[i].data; - err = __copy_to_user(events, , sizeof(e)); - if (err) - break; - events++; - } - kfree(kbuf); - return err ? -EFAULT : ret; -} + if (__put_user(revents, >events) || + __put_user(data, >data)) + return NULL; -SYSCALL_DEFINE4(oabi_epoll_wait, int, epfd, - struct oabi_epoll_event __user *, events, - int, maxevents, int, timeout) -{ - return do_oabi_epoll_wait(epfd, events, maxevents, timeout); -} - -/* - * Implement the event wait interface for the eventpoll file. It is the kernel - * part of the user space epoll_pwait(2). - */ -SYSCALL_DEFINE6(oabi_epoll_pwait, int, epfd, - struct oabi_epoll_event __user *, events, int, maxevents, - int, timeout, const sigset_t __user *, sigmask, - size_t, sigsetsize) -{ - int error; - - /* -* If the caller wants a certain signal mask to be set during the wait, -* we apply it here. -*/ - error = set_user_sigmask(sigmask, sigsetsize); - if (error) - return error; +
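As a rough user-space model of the override described in this patch, the sketch below writes one event record in either of two layouts depending on a flag that plays the role of in_oabi_syscall(). All names are hypothetical, `memcpy()` stands in for `__put_user()`, and returning a pointer to the next slot is an assumption about the part of the diff that is cut off above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* OABI layout: data follows events directly, 12 bytes in total. */
struct oabi_event_sketch {
	uint32_t events;
	uint64_t data;
} __attribute__((packed, aligned(4)));

/* EABI-style layout: explicit padding puts data at offset 8, 16 bytes. */
struct eabi_event_sketch {
	uint32_t events;
	uint32_t pad;
	uint64_t data;
};

/* Store one event in the layout the caller expects and return the
 * address of the next output slot. */
static void *put_uevent_sketch(uint32_t revents, uint64_t data,
			       void *uevent, bool oabi)
{
	if (oabi) {
		struct oabi_event_sketch e;

		memset(&e, 0, sizeof(e));
		e.events = revents;
		e.data = data;
		memcpy(uevent, &e, sizeof(e));
		return (char *)uevent + sizeof(e);
	} else {
		struct eabi_event_sketch e = { revents, 0, data };

		memcpy(uevent, &e, sizeof(e));
		return (char *)uevent + sizeof(e);
	}
}
```

Deciding the layout at the point where each event is written is what removes the need for a temporary kernel buffer and a second copy loop.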
[PATCH 6/9] ARM: oabi-compat: rework sys_semtimedop emulation
From: Arnd Bergmann sys_oabi_semtimedop() is one of the last users of set_fs() on Arm. To remove this one, expose the internal code of the actual implementation that operates on a kernel pointer and call it directly after copying. There should be no measurable impact on the normal execution of this function, and it makes the overly long function a little shorter, which may help readability. While reworking the oabi version, make it behave a little more like the native one, using kvmalloc_array() and restructure the code flow in a similar way. The naming of __do_semtimedop() is not very good, I hope someone can come up with a better name. One regression was spotted by kernel test robot and fixed before the first mailing list submission. Signed-off-by: Arnd Bergmann --- arch/arm/kernel/sys_oabi-compat.c | 38 -- include/linux/syscalls.h | 3 ++ ipc/sem.c | 84 +++ 3 files changed, 77 insertions(+), 48 deletions(-) diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c index f9d8e5be6ba0..c3e63b73b6ae 100644 --- a/arch/arm/kernel/sys_oabi-compat.c +++ b/arch/arm/kernel/sys_oabi-compat.c @@ -80,6 +80,7 @@ #include #include #include +#include #include #include @@ -294,46 +295,51 @@ struct oabi_sembuf { unsigned short __pad; }; +#define sc_semopm sem_ctls[2] + asmlinkage long sys_oabi_semtimedop(int semid, struct oabi_sembuf __user *tsops, unsigned nsops, const struct old_timespec32 __user *timeout) { + struct ipc_namespace *ns; struct sembuf *sops; - struct old_timespec32 local_timeout; long err; int i; + ns = current->nsproxy->ipc_ns; + if (nsops > ns->sc_semopm) + return -E2BIG; if (nsops < 1 || nsops > SEMOPM) return -EINVAL; - if (!access_ok(tsops, sizeof(*tsops) * nsops)) - return -EFAULT; - sops = kmalloc_array(nsops, sizeof(*sops), GFP_KERNEL); + sops = kvmalloc_array(nsops, sizeof(*sops), GFP_KERNEL); if (!sops) return -ENOMEM; err = 0; for (i = 0; i < nsops; i++) { struct oabi_sembuf osb; - err |= __copy_from_user(, tsops, sizeof(osb)); 
+ err |= copy_from_user(, tsops, sizeof(osb)); sops[i].sem_num = osb.sem_num; sops[i].sem_op = osb.sem_op; sops[i].sem_flg = osb.sem_flg; tsops++; } - if (timeout) { - /* copy this as well before changing domain protection */ - err |= copy_from_user(_timeout, timeout, sizeof(*timeout)); - timeout = _timeout; - } if (err) { err = -EFAULT; - } else { - mm_segment_t fs = get_fs(); - set_fs(KERNEL_DS); - err = sys_semtimedop_time32(semid, sops, nsops, timeout); - set_fs(fs); + goto out; + } + + if (timeout) { + struct timespec64 ts; + err = get_old_timespec32(, timeout); + if (err) + goto out; + err = __do_semtimedop(semid, sops, nsops, , ns); + goto out; } - kfree(sops); + err = __do_semtimedop(semid, sops, nsops, NULL, ns); +out: + kvfree(sops); return err; } diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 37bea07c12f2..fc7340fa4702 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -1342,6 +1342,9 @@ long ksys_old_shmctl(int shmid, int cmd, struct shmid_ds __user *buf); long compat_ksys_semtimedop(int semid, struct sembuf __user *tsems, unsigned int nsops, const struct old_timespec32 __user *timeout); +long __do_semtimedop(int semid, struct sembuf *tsems, unsigned int nsops, +const struct timespec64 *timeout, +struct ipc_namespace *ns); int __sys_getsockopt(int fd, int level, int optname, char __user *optval, int __user *optlen); diff --git a/ipc/sem.c b/ipc/sem.c index f6c30a85dadf..c7725031c2ba 100644 --- a/ipc/sem.c +++ b/ipc/sem.c @@ -1978,46 +1978,34 @@ static struct sem_undo *find_alloc_undo(struct ipc_namespace *ns, int semid) return un; } -static long do_semtimedop(int semid, struct sembuf __user *tsops, - unsigned nsops, const struct timespec64 *timeout) +long __do_semtimedop(int semid, struct sembuf *sops, + unsigned nsops, const struct timespec64 *timeout, + struct ipc_namespace *ns) { int error = -EINVAL; struct sem_array *sma; - struct sembuf fast_sops[SEMOPM_FAST]; - struct sembuf *sops = fast_sops, *sop; +
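The control flow of the reworked wrapper (limit checks, allocate, convert each element, call the worker on a kernel-side array, free on a single path) can be sketched in user space as follows. Everything here is hypothetical: `do_ops_sketch()` is a toy stand-in for `__do_semtimedop()` that merely sums the operations, `calloc()`/`free()` stand in for `kvmalloc_array()`/`kvfree()`, and the `limit` parameter plays the role of `ns->sc_semopm`.

```c
#include <errno.h>
#include <stdlib.h>

/* OABI element carries trailing padding; the native one does not. */
struct oabi_sembuf_sketch {
	unsigned short sem_num;
	short sem_op;
	short sem_flg;
	unsigned short pad;
};

struct sembuf_sketch {
	unsigned short sem_num;
	short sem_op;
	short sem_flg;
};

/* Toy worker standing in for __do_semtimedop(): sums the sem_op values. */
static long do_ops_sketch(const struct sembuf_sketch *sops, unsigned int nsops)
{
	long sum = 0;
	unsigned int i;

	for (i = 0; i < nsops; i++)
		sum += sops[i].sem_op;
	return sum;
}

static long oabi_semop_sketch(const struct oabi_sembuf_sketch *tsops,
			      unsigned int nsops, unsigned int limit)
{
	struct sembuf_sketch *sops;
	unsigned int i;
	long err;

	if (nsops > limit)
		return -E2BIG;
	if (nsops < 1)
		return -EINVAL;

	sops = calloc(nsops, sizeof(*sops));
	if (!sops)
		return -ENOMEM;

	/* Convert element by element; the padding member is dropped. */
	for (i = 0; i < nsops; i++) {
		sops[i].sem_num = tsops[i].sem_num;
		sops[i].sem_op  = tsops[i].sem_op;
		sops[i].sem_flg = tsops[i].sem_flg;
	}

	err = do_ops_sketch(sops, nsops);
	free(sops);
	return err;
}
```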
[PATCH 2/9] ARM: traps: use get_kernel_nofault instead of set_fs()
From: Arnd Bergmann

ARM uses set_fs() and __get_user() to allow the stack dumping code to access possibly invalid pointers carefully. These can be changed to the simpler get_kernel_nofault(), and allow the eventual removal of set_fs().

dump_instr() will print either kernel or user space pointers, depending on how it was called. For dump_mem(), I assume we are only interested in kernel pointers, and the only time that this is called with user_mode(regs)==true is when the regs themselves are unreliable as a result of the condition that caused the trap.

Reviewed-by: Christoph Hellwig
Signed-off-by: Arnd Bergmann
---
 arch/arm/kernel/traps.c | 47 ++---
 1 file changed, 16 insertions(+), 31 deletions(-)

diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 17d5a785df28..c3964a283b63 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -122,17 +122,8 @@ static void dump_mem(const char *lvl, const char *str, unsigned long bottom,
 		     unsigned long top)
 {
 	unsigned long first;
-	mm_segment_t fs;
 	int i;
 
-	/*
-	 * We need to switch to kernel mode so that we can use __get_user
-	 * to safely read from kernel space. Note that we now dump the
-	 * code first, just in case the backtrace kills us.
-	 */
-	fs = get_fs();
-	set_fs(KERNEL_DS);
-
 	printk("%s%s(0x%08lx to 0x%08lx)\n", lvl, str, bottom, top);
 
 	for (first = bottom & ~31; first < top; first += 32) {
@@ -145,7 +136,7 @@ static void dump_mem(const char *lvl, const char *str, unsigned long bottom,
 		for (p = first, i = 0; i < 8 && p < top; i++, p += 4) {
 			if (p >= bottom && p < top) {
 				unsigned long val;
-				if (__get_user(val, (unsigned long *)p) == 0)
+				if (!get_kernel_nofault(val, (unsigned long *)p))
 					sprintf(str + i * 9, " %08lx", val);
 				else
 					sprintf(str + i * 9, " ????????");
@@ -153,11 +144,9 @@ static void dump_mem(const char *lvl, const char *str, unsigned long bottom,
 		}
 		printk("%s%04lx:%s\n", lvl, first & 0xffff, str);
 	}
-
-	set_fs(fs);
 }
 
-static void __dump_instr(const char *lvl, struct pt_regs *regs)
+static void dump_instr(const char *lvl, struct pt_regs *regs)
 {
 	unsigned long addr = instruction_pointer(regs);
 	const int thumb = thumb_mode(regs);
@@ -173,10 +162,20 @@ static void __dump_instr(const char *lvl, struct pt_regs *regs)
 	for (i = -4; i < 1 + !!thumb; i++) {
 		unsigned int val, bad;
 
-		if (thumb)
-			bad = get_user(val, &((u16 *)addr)[i]);
-		else
-			bad = get_user(val, &((u32 *)addr)[i]);
+		if (!user_mode(regs)) {
+			if (thumb) {
+				u16 val16;
+				bad = get_kernel_nofault(val16, &((u16 *)addr)[i]);
+				val = val16;
+			} else {
+				bad = get_kernel_nofault(val, &((u32 *)addr)[i]);
+			}
+		} else {
+			if (thumb)
+				bad = get_user(val, &((u16 *)addr)[i]);
+			else
+				bad = get_user(val, &((u32 *)addr)[i]);
+		}
 
 		if (!bad)
 			p += sprintf(p, i == 0 ? "(%0*x) " : "%0*x ",
@@ -189,20 +188,6 @@ static void __dump_instr(const char *lvl, struct pt_regs *regs)
 	printk("%sCode: %s\n", lvl, str);
 }
 
-static void dump_instr(const char *lvl, struct pt_regs *regs)
-{
-	mm_segment_t fs;
-
-	if (!user_mode(regs)) {
-		fs = get_fs();
-		set_fs(KERNEL_DS);
-		__dump_instr(lvl, regs);
-		set_fs(fs);
-	} else {
-		__dump_instr(lvl, regs);
-	}
-}
-
 #ifdef CONFIG_ARM_UNWIND
 static inline void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
 				  const char *loglvl)
-- 
2.27.0
Re: [PATCH v2 1/5] dt-bindings: display: mediatek: disp: add documentation for MT8167 SoC
On Fri, 23 Oct 2020 15:31:26 +0200, Fabien Parent wrote: > Add binding documentation for the MT8167 SoC > > Signed-off-by: Fabien Parent > Reviewed-by: Chun-Kuang Hu > --- > > Changelog: > > V2: No change > > .../devicetree/bindings/display/mediatek/mediatek,disp.txt| 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > Acked-by: Rob Herring
RE: [PATCH v11 00/10] NTFS read-write driver GPL implementation by Paragon Software
From: Pali Rohár Sent: Friday, October 30, 2020 6:25 PM > To: Konstantin Komarov > Cc: linux-fsde...@vger.kernel.org; v...@zeniv.linux.org.uk; > linux-kernel@vger.kernel.org; dste...@suse.cz; aap...@suse.com; > wi...@infradead.org; rdun...@infradead.org; j...@perches.com; > m...@harmstone.com; nbori...@suse.com; linux-ntfs- > d...@lists.sourceforge.net; an...@tuxera.com > Subject: Re: [PATCH v11 00/10] NTFS read-write driver GPL implementation by > Paragon Software > > Hello and thanks for update! > > I have just two comments for the last v11 version. > > I really do not like nls_alt mount option and I do not think we should > merge this mount option into ntfs kernel driver. Details I described in: > https://lore.kernel.org/linux-fsdevel/20201009154734.andv4es3azkkskm5@pali/ > > tl;dr it is not systematic solution and is incompatible with existing > in-kernel ntfs driver, also incompatible with in-kernel vfat, udf and > ext4 (with UNICODE support) drivers. In my opinion, all kernel fs > drivers which deals with UNICODE should handle it in similar way. > Hello Pali! First of all, apologies for not providing a feedback on your previous message regarding the 'nls_alt'. We had internal discussions on the topic and overall conclusion is that: we do not want to compromise Kernel standards with our submission. So we will remove the 'nls_alt' option in the next version. However, there are still few points we have on the topic, please read below. > It would be really bad if userspace application need to behave > differently for this new ntfs driver and differently for all other > UNICODE drivers. > The option does not anyhow affect userspace applications. For the "default" example of unzip/tar: 1 - if this option is not applied (e.g. "vfat case"), trying to unzip an archive with, e.g. 
CP-1251, names inside to the target fs volume, will return error, and issued file(s) won't be unzipped; 2 - if this option is applied and "nls_alt" is set, the above case will result in unzipping all the files; Also, this issue in general only applies to "non-native" filesystems. I.e. ext4 is not affected by it in any case, as it just stores the name as bytes, no matter what those bytes are. The above case won't give an unzip error on ext4. The only symptom of this would be, maybe, "incorrect encoding" marking within the listing of such files (in File Manager or Terminal, e.g. in Ubuntu), but there won't be an unzip process termination with incomplete unarchived fileset, unlike it is for vfat/exfat/ntfs without "nls_alt". > Second comment is simplification of usage nls_load() with UTF-8 parameter > which I described in older email: > https://lore.kernel.org/linux-fsdevel/948ac894450d494ea15496c2e5b8c...@paragon-software.com/ > > You wrote that you have applied it, but seems it was lost (maybe during > rebase?) as it is not present in the last v11 version. > > I suggested to not use nls_load() with UTF-8 at all. Your version of > ntfs driver does not use kernel's nls utf8 module for UTF-8 support, so > trying to load it should be avoided. Also kernel can be compiled without > utf8 nls module (which is moreover broken) and with my above suggestion, > ntfs driver would work correctly. Without that suggestion, mounting > would fail. Thanks for pointing that out. It is likely the "nls_load()" fixes were lost during rebase. Will recheck it and return them to the v12. Best regards!
Re: [PATCH -next] fs: Fix memory leaks in do_renameat2() error paths
On Fri, 2020-10-30 at 09:27 -0600, Jens Axboe wrote: > On 10/30/20 9:24 AM, Qian Cai wrote: > > We will need to call putname() before do_renameat2() returning -EINVAL > > to avoid memory leaks. > > Thanks, should mention that this isn't final by any stretch (which is > why it hasn't been posted yet), just pushed out for some exposure. I don't know what other people think about this, but I do find it a bit discouraging to test those half-baked patches in linux-next when they are not even ready to be posted for review. https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=3c5499fa56f568005648e6e38201f8ae9ab88015
[PATCH 4.14 ] platform/x86: Corrects warning: missing braces around initializer
From: John Donnelly

The assignment statement of a local variable "struct tp_nvram_state s[2] = {0};" is not valid for all versions of compilers (UEK6 on OL7).

Fixes: 515ded02bc4b ("platform/x86: thinkpad_acpi: initialize tp_nvram_state variable")
Signed-off-by: John Donnelly
---
 drivers/platform/x86/thinkpad_acpi.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c
index ffaaccded34e..c41ac0385304 100644
--- a/drivers/platform/x86/thinkpad_acpi.c
+++ b/drivers/platform/x86/thinkpad_acpi.c
@@ -2477,7 +2477,7 @@ static void hotkey_compare_and_issue_event(struct tp_nvram_state *oldn,
  */
 static int hotkey_kthread(void *data)
 {
-	struct tp_nvram_state s[2] = { 0 };
+	struct tp_nvram_state s[2];
 	u32 poll_mask, event_mask;
 	unsigned int si, so;
 	unsigned long t;
@@ -2488,6 +2488,8 @@ static int hotkey_kthread(void *data)
 	if (tpacpi_lifecycle == TPACPI_LIFE_EXITING)
 		goto exit;
 
+	memset(&s, 0, sizeof(s));
+
 	set_freezable();
 
 	so = 0;
-- 
2.27.0
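A stand-alone sketch of the pattern the patch switches to: declare the array without an initializer, then clear it with memset(). The struct below is hypothetical and only mimics an aggregate whose first member is itself an aggregate, which is the shape where, as far as I know, some older GCC versions emit "missing braces around initializer" for `= { 0 }`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct tp_nvram_state. */
struct nvram_state_sketch {
	struct {
		uint8_t level;
		uint8_t mute;
	} volume;
	uint16_t brightness;
};

/* Return 1 if every byte in the buffer is zero, 0 otherwise. */
static int all_zero_sketch(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i < len; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}

/* Declare first, then clear: no brace-initializer involved, and the
 * padding bytes are zeroed as well. */
static int zeroed_states_sketch(void)
{
	struct nvram_state_sketch s[2];

	memset(&s, 0, sizeof(s));
	return all_zero_sketch(&s, sizeof(s));
}
```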
RE: [PATCH v10 02/10] fs/ntfs3: Add initialization of super block
From: Matthew Wilcox Sent: Friday, October 23, 2020 9:25 PM > To: Konstantin Komarov > Cc: linux-fsde...@vger.kernel.org; v...@zeniv.linux.org.uk; > linux-kernel@vger.kernel.org; p...@kernel.org; dste...@suse.cz; > aap...@suse.com; rdun...@infradead.org; j...@perches.com; m...@harmstone.com; > nbori...@suse.com; linux-ntfs- > d...@lists.sourceforge.net; an...@tuxera.com > Subject: Re: [PATCH v10 02/10] fs/ntfs3: Add initialization of super block > > On Fri, Oct 23, 2020 at 06:44:23PM +0300, Konstantin Komarov wrote: > > + > > +/*ntfs_readpage*/ > > +/*ntfs_readpages*/ > > +/*ntfs_writepage*/ > > +/*ntfs_writepages*/ > > +/*ntfs_block_truncate_page*/ > > What are these for? > > > +int ntfs_readpage(struct file *file, struct page *page) > > +{ > > + int err; > > + struct address_space *mapping = page->mapping; > > + struct inode *inode = mapping->host; > > + struct ntfs_inode *ni = ntfs_i(inode); > > + u64 vbo = (u64)page->index << PAGE_SHIFT; > > + u64 valid; > > + struct ATTRIB *attr; > > + const char *data; > > + u32 data_size; > > + > [...] > > + > > + if (is_compressed(ni)) { > > + if (PageUptodate(page)) { > > + unlock_page(page); > > + return 0; > > + } > > You can skip this -- the readpage op won't be called for pages which > are Uptodate. > > > + /* normal + sparse files */ > > + err = mpage_readpage(page, ntfs_get_block); > > + if (err) > > + goto out; > > It would be nice to use iomap instead of mpage, but that's a big ask. > > > + valid = ni->i_valid; > > + if (vbo < valid && valid < vbo + PAGE_SIZE) { > > + if (PageLocked(page)) > > + wait_on_page_bit(page, PG_locked); > > + if (PageError(page)) { > > + ntfs_inode_warn(inode, "file garbage at 0x%llx", valid); > > + goto out; > > + } > > + zero_user_segment(page, valid & (PAGE_SIZE - 1), PAGE_SIZE); > > Nono, you can't zero data after the page has been unlocked. You can > handle this case in ntfs_get_block(). 
If the block is entirely beyond > i_size, returning a hole will cause mpage_readpage() to zero it. If it > straddles i_size, you can either ensure that the on-media block contains > zeroes after the EOF, or if you can't depend on that, you can read it > in synchronously in your get_block() and then zero the tail and set the > buffer Uptodate. Not the most appetising solution, but what you have here > is racy with the user writing to it after reading. Hello Matthew! Thanks a lot for this feedback. Fixed in v11, please check out. Cheers!
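To make the offset arithmetic in the quoted readpage code concrete, here is a user-space sketch with hypothetical names and an assumed 4096-byte page. It only models which bytes of a page end up zeroed relative to the valid size; it says nothing about the locking problem raised in the review, which is about when the zeroing happens rather than where.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE_SKETCH 4096u	/* assumed page size */

/* Zero the part of a page that lies at or beyond 'valid'.
 * vbo is the file offset of the first byte of the page.
 * Returns the number of bytes that were zeroed. */
static size_t zero_tail_sketch(unsigned char *page, uint64_t vbo,
			       uint64_t valid)
{
	size_t off;

	if (valid <= vbo) {
		/* Page is entirely beyond the valid size: treat as a hole. */
		memset(page, 0, PAGE_SIZE_SKETCH);
		return PAGE_SIZE_SKETCH;
	}
	if (valid >= vbo + PAGE_SIZE_SKETCH)
		return 0;	/* page is fully valid, nothing to do */

	/* Straddling case: same offset as valid & (PAGE_SIZE - 1). */
	off = (size_t)(valid & (PAGE_SIZE_SKETCH - 1));
	memset(page + off, 0, PAGE_SIZE_SKETCH - off);
	return PAGE_SIZE_SKETCH - off;
}
```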
Re:(2) [PATCH] Input: add SW_COVER_ATTACHED and SW_EXT_PEN_ATTACHED
> Hi, > > On Fri, Oct 30, 2020 at 10:15:52PM +0900, Jungrae Kim wrote: > > From 23aed4567e234b7e108c31abadb9f3a3f7d2 Mon Sep 17 00:00:00 2001 > > From: Jungrae Kim > > Date: Fri, 30 Oct 2020 21:23:12 +0900 > > Subject: [PATCH] Input: add SW_COVER_ATTACHED and SW_EXT_PEN_ATTACHED > > > > SW_COVER_ATTACHED represents the connected state of a removable cover > > of a device. Value 0 means cover was attached with device, value 1 means > > removed it. > > Any reason against using SW_MACHINE_COVER? That was introduced for Nokia > N900, where you actually remove the cover to access battery/SD card/ > SIM card (so there is state 0 = cover removed/open and state 1 = cover > attached/closed). > > -- Sebastian > > > SW_EXT_PEN_ATTACHED represents the state of the pen. > > Some device have internal pen slot. but other some device have external pen > > slot. These two cases has different use case in userspace. So need to > > separate a event. Value 0 means pen was detach on external pen slot on > > device, value 1 means pen was attached external pen slot on device. 
> >
> > Signed-off-by: Jungrae Kim
> > ---
> >  include/linux/mod_devicetable.h        | 2 +-
> >  include/uapi/linux/input-event-codes.h | 4 +++-
> >  2 files changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
> > index 5b08a473cdba..897f5a3e7721 100644
> > --- a/include/linux/mod_devicetable.h
> > +++ b/include/linux/mod_devicetable.h
> > @@ -320,7 +320,7 @@ struct pcmcia_device_id {
> >  #define INPUT_DEVICE_ID_LED_MAX		0x0f
> >  #define INPUT_DEVICE_ID_SND_MAX		0x07
> >  #define INPUT_DEVICE_ID_FF_MAX		0x7f
> > -#define INPUT_DEVICE_ID_SW_MAX		0x10
> > +#define INPUT_DEVICE_ID_SW_MAX		0x12
> >  #define INPUT_DEVICE_ID_PROP_MAX	0x1f
> >
> >  #define INPUT_DEVICE_ID_MATCH_BUS	1
> > diff --git a/include/uapi/linux/input-event-codes.h b/include/uapi/linux/input-event-codes.h
> > index ee93428ced9a..a0506369de6d 100644
> > --- a/include/uapi/linux/input-event-codes.h
> > +++ b/include/uapi/linux/input-event-codes.h
> > @@ -893,7 +893,9 @@
> >  #define SW_MUTE_DEVICE		0x0e  /* set = device disabled */
> >  #define SW_PEN_INSERTED		0x0f  /* set = pen inserted */
> >  #define SW_MACHINE_COVER	0x10  /* set = cover closed */
> > -#define SW_MAX			0x10
> > +#define SW_COVER_ATTACHED	0x11  /* set = cover attached */
> > +#define SW_EXT_PEN_ATTACHED	0x12  /* set = external pen attached */
> > +#define SW_MAX			0x12
> >  #define SW_CNT			(SW_MAX+1)
> >
> >  /*
> > --
> > 2.17.1

We need two kinds of events: cover open/close and cover attach/detach.
The open/close state of the cover is only meaningful while the cover is attached.
So we will check the cover open/close status using SW_MACHINE_COVER.

Thanks,
Jungrae Kim
Re: [PATCH v6 3/9] arm64, kfence: enable KFENCE for ARM64
On Fri, 30 Oct 2020 at 16:47, Mark Rutland wrote: > > On Thu, Oct 29, 2020 at 02:16:43PM +0100, Marco Elver wrote: > > Add architecture specific implementation details for KFENCE and enable > > KFENCE for the arm64 architecture. In particular, this implements the > > required interface in . > > > > KFENCE requires that attributes for pages from its memory pool can > > individually be set. Therefore, force the entire linear map to be mapped > > at page granularity. Doing so may result in extra memory allocated for > > page tables in case rodata=full is not set; however, currently > > CONFIG_RODATA_FULL_DEFAULT_ENABLED=y is the default, and the common case > > is therefore not affected by this change. > > > > Reviewed-by: Dmitry Vyukov > > Co-developed-by: Alexander Potapenko > > Signed-off-by: Alexander Potapenko > > Signed-off-by: Marco Elver > > --- > > v5: > > * Move generic page allocation code to core.c [suggested by Jann Horn]. > > * Remove comment about HAVE_ARCH_KFENCE_STATIC_POOL, since we no longer > > support static pools. > > * Force page granularity for the linear map [suggested by Mark Rutland]. > > --- > > arch/arm64/Kconfig | 1 + > > arch/arm64/include/asm/kfence.h | 19 +++ > > arch/arm64/mm/fault.c | 4 > > arch/arm64/mm/mmu.c | 7 ++- > > 4 files changed, 30 insertions(+), 1 deletion(-) > > create mode 100644 arch/arm64/include/asm/kfence.h > > > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > > index f858c352f72a..2f8b328b 100644 > > --- a/arch/arm64/Kconfig > > +++ b/arch/arm64/Kconfig > > @@ -135,6 +135,7 @@ config ARM64 > > select HAVE_ARCH_JUMP_LABEL_RELATIVE > > select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48) > > select HAVE_ARCH_KASAN_SW_TAGS if HAVE_ARCH_KASAN > > + select HAVE_ARCH_KFENCE if (!ARM64_16K_PAGES && !ARM64_64K_PAGES) > > Why does this depend on the page size? 
> > If this is functional, but has a larger overhead on 16K or 64K, I'd > suggest removing the dependency, and just updating the Kconfig help text > to explain that. Good point, I don't think anything is requiring us to force 4K pages. Let's remove it. Thanks, -- Marco > Otherwise, this patch looks fine to me. > > Thanks, > Mark.
Re: [PATCH net-next] net: phy: realtek: Add support for RTL8221B-CG series
On Fri, Oct 30, 2020 at 01:56:20PM +0800, Willy Liu wrote: > Realtek single-port 2.5Gbps Ethernet PHYs are list as below: > RTL8226-CG: the 1st generation 2.5Gbps single port PHY > RTL8226B-CG/RTL8221B-CG: the 2nd generation 2.5Gbps single port PHY > RTL8221B-VB-CG: the 3rd generation 2.5Gbps single port PHY > RTL8221B-VM-CG: the 2.5Gbps single port PHY with MACsec feature > > This patch adds the minimal drivers to manage these transceivers. > > Signed-off-by: Willy Liu Reviewed-by: Andrew Lunn Andrew
[PATCH 4.14 v2 ] platform/x86: Corrects warning: missing braces around initializer
From: John Donnelly

The initializer of the local variable "struct tp_nvram_state s[2] = {0};"
triggers a "missing braces around initializer" warning with some compiler
versions.

Fixes: 515ded02bc4b ("platform/x86: thinkpad_acpi: initialize tp_nvram_state variable")
Signed-off-by: John Donnelly
---
 drivers/platform/x86/thinkpad_acpi.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c
index ffaaccded34e..c41ac0385304 100644
--- a/drivers/platform/x86/thinkpad_acpi.c
+++ b/drivers/platform/x86/thinkpad_acpi.c
@@ -2477,7 +2477,7 @@ static void hotkey_compare_and_issue_event(struct tp_nvram_state *oldn,
  */
 static int hotkey_kthread(void *data)
 {
-	struct tp_nvram_state s[2] = { 0 };
+	struct tp_nvram_state s[2];
 	u32 poll_mask, event_mask;
 	unsigned int si, so;
 	unsigned long t;
@@ -2488,6 +2488,8 @@ static int hotkey_kthread(void *data)
 	if (tpacpi_lifecycle == TPACPI_LIFE_EXITING)
 		goto exit;
 
+	memset(&s, 0, sizeof(s));
+
 	set_freezable();
 
 	so = 0;
--
2.27.0
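The approach taken by the patch can be tried in user space with a small sketch. It assumes a made-up `struct demo_nvram_state` as a stand-in for the driver's `struct tp_nvram_state`; the helper names are hypothetical as well.

```c
#include <string.h>

/* Hypothetical stand-in for the driver's struct tp_nvram_state. */
struct demo_nvram_state {
	unsigned int flags;
	unsigned char brightness_level;
	unsigned char volume_level;
};

/* Return 1 if all 'len' bytes at 'p' are zero, 0 otherwise. */
static int demo_all_zero(const void *p, size_t len)
{
	const unsigned char *b = p;
	size_t i;

	for (i = 0; i < len; i++)
		if (b[i] != 0)
			return 0;
	return 1;
}

/*
 * Declare the array without an initializer and clear it with memset(),
 * as the patch does. sizeof(s) covers both elements because 's' is a
 * real array here, not a pointer parameter.
 */
int demo_zeroed_states_ok(void)
{
	struct demo_nvram_state s[2];

	memset(&s, 0, sizeof(s));
	return demo_all_zero(s, sizeof(s));
}
```

Unlike a brace initializer, memset() also clears any padding bytes, which is harmless here.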
Re: [PATCH 4.14 ] platform/x86: Corrects warning: missing braces around initializer
> On Oct 30, 2020, at 10:52 AM, john.p.donne...@oracle.com wrote:
>
> From: John Donnelly
>
> The assignment statement of a local variable "struct tp_nvram_state s[2] = {0}; "
> is not valid for all versions of compilers (UEK6 on OL7).
>
> Fixes: 515ded02bc4b ("platform/x86: thinkpad_acpi: initialize tp_nvram_state variable")
>
> Signed-off-by: John Donnelly
> ---
>  drivers/platform/x86/thinkpad_acpi.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c
> index ffaaccded34e..c41ac0385304 100644
> --- a/drivers/platform/x86/thinkpad_acpi.c
> +++ b/drivers/platform/x86/thinkpad_acpi.c
> @@ -2477,7 +2477,7 @@ static void hotkey_compare_and_issue_event(struct tp_nvram_state *oldn,
>   */
>  static int hotkey_kthread(void *data)
>  {
> -	struct tp_nvram_state s[2] = { 0 };
> +	struct tp_nvram_state s[2];
>  	u32 poll_mask, event_mask;
>  	unsigned int si, so;
>  	unsigned long t;
> @@ -2488,6 +2488,8 @@ static int hotkey_kthread(void *data)
>  	if (tpacpi_lifecycle == TPACPI_LIFE_EXITING)
>  		goto exit;
>
> +	memset(&s, 0, sizeof(s));
> +
>  	set_freezable();
>
>  	so = 0;
> --
> 2.27.0
>

Please ignore this one and use:

[PATCH 4.14 v2 ] platform/x86: Corrects warning: missing braces around initializer
[PATCH v2 2/2] mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate.
From: Zi Yan

In isolate_migratepages_block, if we have too many isolated pages and
nr_migratepages is not zero, we should try to migrate what we have without
wasting time on isolating.

Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”)
Suggested-by: Vlastimil Babka
Signed-off-by: Zi Yan
Cc:
---
 mm/compaction.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/compaction.c b/mm/compaction.c
index 3e834ac402f1..4d237a7c3830 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -817,6 +817,10 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 	 * delay for some time until fewer pages are isolated
 	 */
 	while (unlikely(too_many_isolated(pgdat))) {
+		/* stop isolation if there are still pages not migrated */
+		if (cc->nr_migratepages)
+			return 0;
+
 		/* async migration should just abort */
 		if (cc->mode == MIGRATE_ASYNC)
 			return 0;
--
2.28.0
[PATCH v2 1/2] mm/compaction: count pages and stop correctly during page isolation.
From: Zi Yan

In isolate_migratepages_block, when cc->alloc_contig is true, we are able
to isolate compound pages. However, nr_migratepages and nr_isolated did not
count compound pages correctly, causing us to isolate more pages than we
thought. Use compound_nr() to count pages. Otherwise, we might be trapped in
the too_many_isolated while loop, since the actual number of isolated pages
can go up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
because we only stop isolation once cc->nr_migratepages reaches
COMPACT_CLUSTER_MAX.

In addition, after we fix the issue above, cc->nr_migratepages might never
be exactly equal to COMPACT_CLUSTER_MAX if compound pages are isolated, so
page isolation would not stop as we intended. Change the isolation stop
condition to >=.

The issue can be triggered as follows:
In a system with 16GB memory and an 8GB CMA region reserved by hugetlb_cma,
if we first allocate 10GB THPs and mlock them (so some THPs are allocated in
the CMA region and mlocked), reserving 6 1GB hugetlb pages via
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages will get stuck
(looping in the too_many_isolated function) until we kill either task. With
the patch applied, oom will kill the application with 10GB THPs and let the
hugetlb page reservation finish.

Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”)
Signed-off-by: Zi Yan
Reviewed-by: Yang Shi
Cc:
---
 mm/compaction.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index ee1f8439369e..3e834ac402f1 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 
 isolate_success:
 		list_add(&page->lru, &cc->migratepages);
-		cc->nr_migratepages++;
-		nr_isolated++;
+		cc->nr_migratepages += compound_nr(page);
+		nr_isolated += compound_nr(page);
 
 		/*
 		 * Avoid isolating too much unless this block is being
@@ -1021,7 +1021,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		 * or a lock is contended. For contention, isolate quickly to
 		 * potentially remove one source of contention.
 		 */
-		if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
+		if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
 		    !cc->rescan && !cc->contended) {
 			++low_pfn;
 			break;
@@ -1132,7 +1132,7 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
 		if (!pfn)
 			break;
 
-		if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
+		if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
 			break;
 	}
 
--
2.28.0
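Why the stop condition has to become ">=" once compound pages are counted by their size can be shown with a small user-space simulation. This is a sketch under stated assumptions, not kernel code: `demo_isolate` and `DEMO_CLUSTER_MAX` are hypothetical names, and each array entry plays the role of compound_nr() for one isolated page (1 for a base page, 512 for a 2MB THP).

```c
#include <stddef.h>

#define DEMO_CLUSTER_MAX 32UL	/* mirrors COMPACT_CLUSTER_MAX from the message */

/*
 * Walk 'n' pages, adding each page's base-page count to the running
 * total, and stop when the chosen test fires. With use_ge == 0 the
 * test is "==", with use_ge != 0 it is ">=".
 * Returns the number of base pages isolated when the walk ends.
 */
unsigned long demo_isolate(const unsigned long *nr_pages, size_t n, int use_ge)
{
	unsigned long isolated = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		isolated += nr_pages[i];
		if (use_ge ? isolated >= DEMO_CLUSTER_MAX
			   : isolated == DEMO_CLUSTER_MAX)
			break;
	}
	return isolated;
}
```

With the input {1, 512, 512, 512}, the running total jumps from 1 to 513 and never equals 32, so the "==" variant walks the whole array, while the ">=" variant stops after the first THP.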
Re: [PATCH v6 3/9] arm64, kfence: enable KFENCE for ARM64
On Fri, Oct 30, 2020 at 03:49:26AM +0100, Jann Horn wrote:
> On Thu, Oct 29, 2020 at 2:17 PM Marco Elver wrote:
> > @@ -312,6 +313,9 @@ static void __do_kernel_fault(unsigned long addr, unsigned int esr,
> >  	    "Ignoring spurious kernel translation fault at virtual address %016lx\n", addr))
> >  		return;
> >
> > +	if (kfence_handle_page_fault(addr))
> > +		return;
>
> As in the X86 case, we may want to ensure that this doesn't run for
> permission faults, only for non-present pages. Maybe move this down
> into the third branch of the "if" block below (neither permission
> fault nor NULL deref)?

I think that'd make sense. Those cases *should* be mutually exclusive,
but it'd be more robust to do the KFENCE checks in that last block so
that if something goes wrong within KFENCE we can't get stuck in a loop
failing to service an instruction abort or similar.

Either that, or factor out an is_el1_translation_fault() and only do the
KFENCE check and is_spurious_el1_translation_fault() check under that.

Thanks,
Mark.
Re: [PATCH v2 1/2] init/Kconfig: Fix CPU number in LOG_CPU_MAX_BUF_SHIFT description
Dear Petr,

On 11.08.20 at 11:29, Paul Menzel wrote:
> Currently, LOG_BUF_SHIFT defaults to 17, which is 2 ^ 17 bytes = 128 KB,
> and LOG_CPU_MAX_BUF_SHIFT defaults to 12, which is 2 ^ 12 bytes = 4 KB.
>
> Half of 128 KB is 64 KB, so more than 16 CPUs are required for the value
> to be used, as then the sum of contributions is greater than 64 KB for
> the first time. My guess is, that the description was written with the
> configuration values used in the SUSE in mind.
>
> Fixes: 23b2899f7f ("printk: allow increasing the ring buffer depending on the number of CPUs")
> Cc: Luis R. Rodriguez
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Petr Mladek
> Signed-off-by: Paul Menzel
> ---
> v2: Add Reviewed-by tag
>
>  init/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index d6a0b31b13dc..9dc607e3806f 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -718,7 +718,7 @@ config LOG_CPU_MAX_BUF_SHIFT
>  	  with more CPUs. Therefore this value is used only when the sum of
>  	  contributions is greater than the half of the default kernel ring
>  	  buffer as defined by LOG_BUF_SHIFT. The default values are set
> -	  so that more than 64 CPUs are needed to trigger the allocation.
> +	  so that more than 16 CPUs are needed to trigger the allocation.
>
>  	  Also this option is ignored when "log_buf_len" kernel parameter is
>  	  used as it forces an exact (power of two) size of the ring buffer.

Could you please already apply this trivial patch, one of the two in the
series, so that I do not have to resend it?

Kind regards,

Paul
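The arithmetic in the commit message can be checked with a few lines of C. The sketch follows the simplified model used in the message, where every CPU contributes 2^LOG_CPU_MAX_BUF_SHIFT bytes; the kernel's real allocation logic may count CPUs slightly differently (for example by leaving out the boot CPU), so treat the function below, whose name is hypothetical, as illustrating the message only.

```c
/*
 * Smallest number of CPUs whose summed contribution is strictly greater
 * than half of the default ring buffer.
 * Assumption: each CPU contributes exactly 2^cpu_shift bytes.
 */
unsigned int demo_min_cpus(unsigned int log_buf_shift, unsigned int cpu_shift)
{
	unsigned long half = (1UL << log_buf_shift) / 2;
	unsigned long per_cpu = 1UL << cpu_shift;

	/* smallest n with n * per_cpu > half */
	return (unsigned int)(half / per_cpu) + 1;
}
```

For the defaults named in the message (17 and 12), half the buffer is 64 KB and each CPU adds 4 KB, so the function returns 17, which matches "more than 16 CPUs".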
Re: [RESEND PATCH v18 2/4] overlayfs: handle XATTR_NOSECURITY flag for get xattr method
On 10/30/20 8:07 AM, Miklos Szeredi wrote: On Wed, Oct 21, 2020 at 5:19 PM Mark Salyzyn wrote: Because of the overlayfs getxattr recursion, the incoming inode fails to update the selinux sid resulting in avc denials being reported against a target context of u:object_r:unlabeled:s0. Solution is to respond to the XATTR_NOSECURITY flag in get xattr method that calls the __vfs_getxattr handler instead so that the context can be read in, rather than being denied with an -EACCES when vfs_getxattr handler is called. For the use case where access is to be blocked by the security layer. The path then would be security(dentry) -> __vfs_getxattr({dentry...XATTR_NOSECURITY}) -> handler->get({dentry...XATTR_NOSECURITY}) -> __vfs_getxattr({realdentry...XATTR_NOSECURITY}) -> lower_handler->get({realdentry...XATTR_NOSECURITY}) which would report back through the chain data and success as expected, the logging security layer at the top would have the data to determine the access permissions and report back to the logs and the caller that the target context was blocked. For selinux this would solve the cosmetic issue of the selinux log and allow audit2allow to correctly report the rule needed to address the access problem. Check impure, opaque, origin & meta xattr with no sepolicy audit (using __vfs_getxattr) since these operations are internal to overlayfs operations and do not disclose any data. This became an issue for credential override off since sys_admin would have been required by the caller; whereas would have been inherently present for the creator since it performed the mount. This is a change in operations since we do not check in the new ovl_do_getxattr function if the credential override is off or not. Reasoning is that the sepolicy check is unnecessary overhead, especially since the check can be expensive. 
Because for override credentials off, this affects _everyone_ that underneath performs private xattr calls without the appropriate sepolicy permissions and sys_admin capability. Providing blanket support for sys_admin would be bad for all possible callers. For the override credentials on, this will affect only the mounter, should it lack sepolicy permissions. Not considered a security problem since mounting by definition has sys_admin capabilities, but sepolicy contexts would still need to be crafted. This would be a problem when unprivileged mounting of overlay is introduced. I'd really like to avoid weakening the current security model. The current security model does not deal with non-overlapping security contexts between init (which on android has MAC permissions only when necessary, only enough permissions to perform the mount and other mundane operations, missing exec and read permissions in key spots) and user calls. We are only weakening (that is actually an incorrect statement, security is there, just not double security of both mounter and caller) the security around calls that retrieve the xattr for administrative and internal purposes. No data is exposed to the caller that it would not otherwise have permissions for. This patch becomes necessary when matched with the PATCH v18 3/4 of the series which fixes the user space break introduced in ~4.6 that formerly used the callers credentials for all accesses in all places. Security is weakened already as-is in overlayfs with all the overriding of the credentials for internal accesses to overlayfs mechanics based on the mounter credentials. Using the mounter credentials as a wider security hole is the problem, at least with PATCH v18 3/4 of the series we go back optionally to only using the caller's credentials to perform the operations. 
Admittedly some of the internal operations like mknod are privileged, but at least in Android's use case we are not using them with callers without the necessary credentials. Android does not give the mounter more credentials than the callers, there is very little overlap in the MAC security. The big API churn in the 1/4 patch also seems excessive considering that this seems to be mostly a cosmetic issue for android. Am I missing something? Breaks sepolicy, it no longer has access to the context data at the overlayfs security boundary. unknown is a symptom of being denied based on the denial to xattr data from the underlying filesystem layer. Being denied the security context of the target is not a good thing within the sepolicy security layer. Thanks, Miklos
Re: [PATCH 1/4] prandom.h: add *_state variant of prandom_u32_max
On Sun 2020-10-25 22:48:39, Rasmus Villemoes wrote:
> It is useful for test modules that make use of random numbers to allow
> the exact same series of test cases to be repeated (e.g., after fixing
> a bug in the code being tested). For that, the test module needs to
> obtain its random numbers from a private state that can be seeded by a
> known seed, e.g. given as a module parameter (and using a random seed
> when that parameter is not given).
>
> There's a few test modules I'm going to modify to follow that
> scheme. As preparation, add a _state variant of the existing
> prandom_u32_max(), and for convenience, also add a variant that
> produces a value in a given range.
>
> Signed-off-by: Rasmus Villemoes
> ---
>  include/linux/prandom.h | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)
>
> diff --git a/include/linux/prandom.h b/include/linux/prandom.h
> index aa16e6468f91e79e1f31..58ffcd56c705be34fb98 100644
> --- a/include/linux/prandom.h
> +++ b/include/linux/prandom.h
> @@ -46,6 +46,35 @@ static inline u32 prandom_u32_max(u32 ep_ro)
>  	return (u32)(((u64) prandom_u32() * ep_ro) >> 32);
>  }
>
> +/**
> + * prandom_u32_max_state - get pseudo-random number in internal [0, hi)

s/internal/interval/

> + *
> + * Like prandom_u32_max, but use the given state structure.
> + * @state: pointer to state structure
> + * @hi: (exclusive) upper bound
> + *
> + * Exception: If @hi == 0, this returns 0.
> + */
> +static inline u32 prandom_u32_max_state(struct rnd_state *state, u32 hi)
> +{
> +	return ((u64)prandom_u32_state(state) * hi) >> 32;
> +}
> +
> +/**
> + * prandom_u32_range_state - get pseudo-random number in internal [lo, hi)

same here

> + *
> + * @state: pointer to state structure
> + * @lo: (inclusive) lower bound
> + * @hi: (exclusive) upper bound
> + *
> + * Exception: If @lo == @hi, this returns @lo. Results are unspecified
> + * for @lo > @hi.
> + */
> +static inline u32 prandom_u32_range_state(struct rnd_state *state, u32 lo, u32 hi)
> +{
> +	return lo + prandom_u32_max_state(state, hi - lo);
> +}

With the above typo fixes:

Reviewed-by: Petr Mladek

Well, I guess that we need ack from Willy.

Best Regards,
Petr
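The multiply-and-shift range reduction used by the two helpers, and the reproducibility a private seeded state gives, can be tried in user space. This is a sketch, not the kernel implementation: `struct demo_state` and `demo_u32()` (a plain xorshift32 generator) are hypothetical stand-ins for `struct rnd_state` and `prandom_u32_state()`.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rnd_state; the seed must be nonzero. */
struct demo_state {
	uint32_t x;
};

/* xorshift32 step, standing in for prandom_u32_state(). */
static uint32_t demo_u32(struct demo_state *st)
{
	uint32_t x = st->x;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	st->x = x;
	return x;
}

/* Value in [0, hi); returns 0 when hi == 0. */
uint32_t demo_u32_max(struct demo_state *st, uint32_t hi)
{
	return (uint32_t)(((uint64_t)demo_u32(st) * hi) >> 32);
}

/* Value in [lo, hi); returns lo when lo == hi. */
uint32_t demo_u32_range(struct demo_state *st, uint32_t lo, uint32_t hi)
{
	return lo + demo_u32_max(st, hi - lo);
}
```

Two states seeded with the same value yield the same sequence, which is the property the commit message wants for repeatable test runs.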
Re: [PATCH 2/4] kselftest_module.h: unconditionally expand the KSTM_MODULE_GLOBALS() macro
On Sun 2020-10-25 22:48:40, Rasmus Villemoes wrote: > Two out of three users of the kselftest_module.h header > manually define the failed_tests/total_tests variables instead of > making use of the KSTM_MODULE_GLOBALS() macro. However, instead of > just replacing those definitions with an invocation of that macro, > just unconditionally define them in the header file itself. > > A coming change will add a few more global variables, and at least one > of those will be referenced from kstm_report() - however, that's not > possible currently, since when the definition is postponed until the > test module invokes KSTM_MODULE_GLOBALS(), the variable is not defined > by the time the compiler parses kstm_report(). > > Signed-off-by: Rasmus Villemoes Reviewed-by: Petr Mladek Best Regards, Petr
Re: [PATCH v1 1/2] ptrace: Set PF_SUPERPRIV when checking capability
On 30/10/2020 16:47, Jann Horn wrote: > On Fri, Oct 30, 2020 at 1:39 PM Mickaël Salaün wrote: >> Commit 69f594a38967 ("ptrace: do not audit capability check when outputing >> /proc/pid/stat") replaced the use of ns_capable() with >> has_ns_capability{,_noaudit}() which doesn't set PF_SUPERPRIV. >> >> Commit 6b3ad6649a4c ("ptrace: reintroduce usage of subjective credentials in >> ptrace_has_cap()") replaced has_ns_capability{,_noaudit}() with >> security_capable(), which doesn't set PF_SUPERPRIV neither. >> >> Since commit 98f368e9e263 ("kernel: Add noaudit variant of ns_capable()"), a >> new ns_capable_noaudit() helper is available. Let's use it! >> >> As a result, the signature of ptrace_has_cap() is restored to its original >> one. >> >> Cc: Christian Brauner >> Cc: Eric Paris >> Cc: Jann Horn >> Cc: Kees Cook >> Cc: Oleg Nesterov >> Cc: Serge E. Hallyn >> Cc: Tyler Hicks >> Cc: sta...@vger.kernel.org >> Fixes: 6b3ad6649a4c ("ptrace: reintroduce usage of subjective credentials in >> ptrace_has_cap()") >> Fixes: 69f594a38967 ("ptrace: do not audit capability check when outputing >> /proc/pid/stat") >> Signed-off-by: Mickaël Salaün > > Yeah... I guess this makes sense. (We'd have to undo or change it if > we ever end up needing to use a different set of credentials, e.g. > from ->f_cred, but I guess that's really something we should avoid > anyway.) > > Reviewed-by: Jann Horn > > with one nit: > > > [...] >> /* Returns 0 on success, -errno on denial. */ >> static int __ptrace_may_access(struct task_struct *task, unsigned int mode) >> { >> - const struct cred *cred = current_cred(), *tcred; >> + const struct cred *const cred = current_cred(), *tcred; > > This is an unrelated change, and almost no kernel code marks local > pointer variables as "const". I would drop this change from the patch. 
This guarantees that the cred variable will not be used for anything other than current_cred(), which goes some way toward showing that this patch doesn't change the behavior of __ptrace_may_access() by not using cred in ptrace_has_cap(). It doesn't hurt, and I think it could be useful for spotting issues when backporting.

>
>> 	struct mm_struct *mm;
>> 	kuid_t caller_uid;
>> 	kgid_t caller_gid;
Re: [PATCH RFC v2 10/21] kasan: inline random_tag for HW_TAGS
On Fri, Oct 30, 2020 at 4:48 PM Andrey Konovalov wrote: > > On Wed, Oct 28, 2020 at 12:08 PM Dmitry Vyukov wrote: > > > > On Thu, Oct 22, 2020 at 3:19 PM Andrey Konovalov > > wrote: > > > > > > Using random_tag() currently results in a function call. Move its > > > definition to mm/kasan/kasan.h and turn it into a static inline function > > > for hardware tag-based mode to avoid uneeded function call. > > > > > > Signed-off-by: Andrey Konovalov > > > Link: > > > https://linux-review.googlesource.com/id/Iac5b2faf9a912900e16cca6834d621f5d4abf427 > > > --- > > > mm/kasan/hw_tags.c | 5 - > > > mm/kasan/kasan.h | 37 - > > > 2 files changed, 20 insertions(+), 22 deletions(-) > > > > > > diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c > > > index c3a0e83b5e7a..4c24bfcfeff9 100644 > > > --- a/mm/kasan/hw_tags.c > > > +++ b/mm/kasan/hw_tags.c > > > @@ -36,11 +36,6 @@ void kasan_unpoison_memory(const void *address, size_t > > > size) > > > round_up(size, KASAN_GRANULE_SIZE), > > > get_tag(address)); > > > } > > > > > > -u8 random_tag(void) > > > -{ > > > - return get_random_tag(); > > > -} > > > - > > > bool check_invalid_free(void *addr) > > > { > > > u8 ptr_tag = get_tag(addr); > > > diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h > > > index 0ccbb3c4c519..94ba15c2f860 100644 > > > --- a/mm/kasan/kasan.h > > > +++ b/mm/kasan/kasan.h > > > @@ -188,6 +188,12 @@ static inline bool addr_has_metadata(const void > > > *addr) > > > > > > #endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */ > > > > > > +#if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS) > > > +void print_tags(u8 addr_tag, const void *addr); > > > +#else > > > +static inline void print_tags(u8 addr_tag, const void *addr) { } > > > +#endif > > > + > > > bool check_invalid_free(void *addr); > > > > > > void *find_first_bad_addr(void *addr, size_t size); > > > @@ -223,23 +229,6 @@ static inline void quarantine_reduce(void) { } > > > static inline void quarantine_remove_cache(struct 
kmem_cache *cache) { } > > > #endif > > > > > > -#if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS) > > > - > > > -void print_tags(u8 addr_tag, const void *addr); > > > - > > > -u8 random_tag(void); > > > - > > > -#else > > > - > > > -static inline void print_tags(u8 addr_tag, const void *addr) { } > > > - > > > -static inline u8 random_tag(void) > > > -{ > > > - return 0; > > > -} > > > - > > > -#endif > > > - > > > #ifndef arch_kasan_set_tag > > > static inline const void *arch_kasan_set_tag(const void *addr, u8 tag) > > > { > > > @@ -273,6 +262,20 @@ static inline const void *arch_kasan_set_tag(const > > > void *addr, u8 tag) > > > #define get_mem_tag(addr) arch_get_mem_tag(addr) > > > #define set_mem_tag_range(addr, size, tag) > > > arch_set_mem_tag_range((addr), (size), (tag)) > > > > > > +#ifdef CONFIG_KASAN_SW_TAGS > > > +u8 random_tag(void); > > > +#elif defined(CONFIG_KASAN_HW_TAGS) > > > +static inline u8 random_tag(void) > > > +{ > > > + return get_random_tag(); > > > > What's the difference between random_tag() and get_random_tag()? Do we > > need both? > > Not really. Will simplify this in the next version and give cleaner names. Actually I think I'll keep both for the next version, but rename get_random_tag() into hw_get_random_tag() along with other hw-specific calls. The idea is to have hw_*() calls for things that are implemented by the hardware for HW_TAGS, and then define random_tag() based on that for HW_TAGS and based on a software implementation for SW_TAGS.
Re: [PATCH 1/2] mailbox: stm32-ipcc: add COMPILE_TEST dependency
Hi Martin, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on fujitsu-integration/mailbox-for-next] [also build test WARNING on stm32/stm32-next linus/master linux/master v5.10-rc1 next-20201030] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Martin-Kaiser/mailbox-stm32-ipcc-add-COMPILE_TEST-dependency/20201024-220512 base: https://git.linaro.org/landing-teams/working/fujitsu/integration.git mailbox-for-next config: x86_64-randconfig-r023-20201030 (attached as .config) compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 772aaa602383cf82795572ebcd86b0e660f3579f) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install x86_64 cross compiling tool for clang build # apt-get install binutils-x86-64-linux-gnu # https://github.com/0day-ci/linux/commit/6e22aaac25dcdd4c098c57d29363fa2c204e411e git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Martin-Kaiser/mailbox-stm32-ipcc-add-COMPILE_TEST-dependency/20201024-220512 git checkout 6e22aaac25dcdd4c098c57d29363fa2c204e411e # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All warnings (new ones prefixed by >>): >> drivers/mailbox/stm32-ipcc.c:147:22: warning: cast to smaller integer type >> 'unsigned int' from 'void *' [-Wvoid-pointer-to-int-cast] unsigned int chan = (unsigned int)link->con_priv; ^~~~ drivers/mailbox/stm32-ipcc.c:166:22: warning: cast to smaller integer type 'unsigned int' from 'void *' [-Wvoid-pointer-to-int-cast] unsigned int chan = (unsigned int)link->con_priv; ^~~~ 
drivers/mailbox/stm32-ipcc.c:186:22: warning: cast to smaller integer type 'unsigned int' from 'void *' [-Wvoid-pointer-to-int-cast] unsigned int chan = (unsigned int)link->con_priv; ^~~~ >> drivers/mailbox/stm32-ipcc.c:310:40: warning: cast to 'void *' from smaller >> integer type 'unsigned int' [-Wint-to-void-pointer-cast] ipcc->controller.chans[i].con_priv = (void *)i; ^ 4 warnings generated. vim +147 drivers/mailbox/stm32-ipcc.c ffbded7dee97563 Fabien Dessenne 2018-05-31 144 ffbded7dee97563 Fabien Dessenne 2018-05-31 145 static int stm32_ipcc_send_data(struct mbox_chan *link, void *data) ffbded7dee97563 Fabien Dessenne 2018-05-31 146 { ffbded7dee97563 Fabien Dessenne 2018-05-31 @147unsigned int chan = (unsigned int)link->con_priv; ffbded7dee97563 Fabien Dessenne 2018-05-31 148struct stm32_ipcc *ipcc = container_of(link->mbox, struct stm32_ipcc, ffbded7dee97563 Fabien Dessenne 2018-05-31 149 controller); ffbded7dee97563 Fabien Dessenne 2018-05-31 150 ffbded7dee97563 Fabien Dessenne 2018-05-31 151 dev_dbg(ipcc->controller.dev, "%s: chan:%d\n", __func__, chan); ffbded7dee97563 Fabien Dessenne 2018-05-31 152 ffbded7dee97563 Fabien Dessenne 2018-05-31 153/* set channel n occupied */ dba9a3dfe912dc4 Arnaud Pouliquen 2019-05-22 154 stm32_ipcc_set_bits(>lock, ipcc->reg_proc + IPCC_XSCR, dba9a3dfe912dc4 Arnaud Pouliquen 2019-05-22 155 TX_BIT_CHAN(chan)); ffbded7dee97563 Fabien Dessenne 2018-05-31 156 ffbded7dee97563 Fabien Dessenne 2018-05-31 157/* unmask 'tx channel free' interrupt */ dba9a3dfe912dc4 Arnaud Pouliquen 2019-05-22 158 stm32_ipcc_clr_bits(>lock, ipcc->reg_proc + IPCC_XMR, dba9a3dfe912dc4 Arnaud Pouliquen 2019-05-22 159 TX_BIT_CHAN(chan)); ffbded7dee97563 Fabien Dessenne 2018-05-31 160 ffbded7dee97563 Fabien Dessenne 2018-05-31 161return 0; ffbded7dee97563 Fabien Dessenne 2018-05-31 162 } ffbded7dee97563 Fabien Dessenne 2018-05-31 163 ffbded7dee97563 Fabien Dessenne 2018-05-31 164 static int stm32_ipcc_startup(struct mbox_chan *link) ffbded7dee97563 Fabien 
Dessenne 2018-05-31 165 { ffbded7dee97563 Fabien Dessenne 2018-05-31 166unsigned int chan = (unsigned int)link->con_priv; ffbded7dee97563 Fabien Dessenne 2018-05-31 167struct stm32_ipcc *ipcc = container_of(link->mbox, struct stm32_ipcc, ffbded7dee97563 Fabien Dess
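The four warnings reported above come from casting directly between `void *` and a 32-bit integer on a 64-bit build. A common way to keep such casts warning-free is to go through a pointer-sized integer type. The sketch below uses `uintptr_t` in user space; in-kernel code would more typically use `unsigned long`. `struct demo_chan` and both helpers are hypothetical stand-ins, not the driver's code or the fix that was eventually applied.

```c
#include <stdint.h>

/* Hypothetical stand-in for the con_priv field of struct mbox_chan. */
struct demo_chan {
	void *con_priv;
};

/* Store a small channel index in the private pointer. */
void demo_set_index(struct demo_chan *c, unsigned int idx)
{
	/* widen to pointer size first, then convert to a pointer */
	c->con_priv = (void *)(uintptr_t)idx;
}

/* Read the channel index back from the private pointer. */
unsigned int demo_get_index(const struct demo_chan *c)
{
	/* convert to a pointer-sized integer first, then narrow */
	return (unsigned int)(uintptr_t)c->con_priv;
}
```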
Re: [PATCH v10 01/15] ASoC: sun4i-i2s: Fix lrck_period computation for I2S justified mode
On Fri, Oct 30, 2020 at 03:46:34PM +0100, Clément Péron wrote:
> Left and Right justified mode are computed using the same formula
> as DSP_A and DSP_B mode.
> Which is wrong and the user manual explicitly says:
>
> LRCK_PERIOD:
> PCM Mode: Number of BCLKs within (Left + Right) channel width.
> I2S/Left-Justified/Right-Justified Mode: Number of BCLKs within each
> individual channel width (Left or Right)
>
> Fix this by using the same formula as the I2S mode.
>
> Fixes: 7ae7834ec446 ("ASoC: sun4i-i2s: Add support for DSP formats")
> Signed-off-by: Clément Péron

Acked-by: Maxime Ripard

Thanks!
Maxime
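The rule quoted from the user manual can be written down as a small function. This is a sketch under assumptions, not the driver's code: the enum, the function name and the generalisation of "(Left + Right)" to an arbitrary slot count are all hypothetical.

```c
/* Hypothetical format identifiers for the sketch. */
enum demo_fmt {
	DEMO_FMT_I2S,
	DEMO_FMT_LEFT_J,
	DEMO_FMT_RIGHT_J,
	DEMO_FMT_DSP_A,
	DEMO_FMT_DSP_B
};

/*
 * Number of BCLKs per LRCK period:
 *  - PCM (DSP_A/DSP_B): BCLKs across all slots of the frame
 *  - I2S, left-justified, right-justified: BCLKs within one channel
 */
unsigned int demo_lrck_period(enum demo_fmt fmt, unsigned int slot_width,
			      unsigned int slots)
{
	switch (fmt) {
	case DEMO_FMT_DSP_A:
	case DEMO_FMT_DSP_B:
		return slot_width * slots;
	default:
		return slot_width;
	}
}
```

With 32-bit slots and two channels, the justified modes get 32 BCLKs like I2S, while the DSP modes get 64.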
[PATCH 00/24] x86/resctrl: Merge the CDP resources
Hi folks, This series re-folds the resctrl code so the CDP resources (L3CODE et al) behaviour is all contained in the filesystem parts, with a minimum amount of arch specific code. Arm have some CPU support for dividing caches into portions, and applying bandwidth limits at various points in the SoC. The collective term for these features is MPAM: Memory Partitioning and Monitoring. MPAM is similar enough to Intel RDT, that it should use the defacto linux interface: resctrl. This filesystem currently lives under arch/x86, and is tightly coupled to the architecture. Ultimately, my plan is to split the existing resctrl code up to have an arch<->fs abstraction, then move all the bits out to fs/resctrl. From there MPAM can be wired up. x86 might have two resources with cache controls, (L2 and L3) but has extra copies for CDP: L{2,3}{CODE,DATA}, which are marked as enabled if CDP is enabled for the corresponding cache. MPAM has an equivalent feature to CDP, but its a property of the CPU, not the cache. Resctrl needs to have x86's odd/even behaviour, as that its the ABI, but this isn't how the MPAM hardware works. It is entirely possible that an in-kernel user of MPAM would not be using CDP, whereas resctrl is. Pretending L3CODE and L3DATA are entirely separate resources is a neat trick, but doing this is specific to x86. Doing this leaves the arch code in control of various parts of the filesystem ABI: the resources names, and the way the schemata are parsed. Allowing this stuff to vary between architectures is bad for user space. This series collapses the CODE/DATA resources, moving all the user-visible resctrl ABI into the filesystem code. CDP becomes the type of configuration being applied to a cache. This is done by adding a struct resctrl_schema to the parts of resctrl that will move to fs. This holds the arch-code resource that is in use for this schema, along with other properties like the name, and whether the configuration being applied is CODE/DATA/BOTH. 
This lets us fold the extra resources out of the arch code so that they don't need to be duplicated if the equivalent feature to CDP is missing, or implemented in a different way.

The first two patches split the resource and domain structs to have an arch specific 'hw' portion, and the rest that is visible to resctrl. Future series massage the resctrl code so there are no accesses to 'hw' structures in the parts of resctrl that will move to fs, providing helpers where necessary.

Since anyone last looked at this, the CDP property has been made per-resource instead of global. MPAM will need to make this global in the arch code, as CODE/DATA closid are based on how the CPU tags traffic, not how the cache interprets it. resctrl sets CDP enabled on a resource, but reads it back on each one.

The attempt to keep closids as-used-by-resctrl and closids as-written-to-hw apart has been dropped.

There are two copies of num_closid. The version private to the arch code is the value discovered from hardware. resctrl has its own version, which it may write to, which is exposed to user-space. This lets resctrl do its odd/even thing, even if that's not how the hardware works.

This series adds temporary scaffolding, which it removes a few patches later. This is to allow things like the ctrlval arrays and resources to be merged separately, which should make it easier to bisect. These things are marked temporary, and should all be gone by the end of the series.

This series is a little rough around the monitors: would a fake struct resctrl_schema for the monitors simplify things, or be a source of bugs?
This series is based on v5.10-rc1, and can be retrieved from:
git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/resctrl_merge_cdp/v1

Parts were previously posted as an RFC here:
https://lore.kernel.org/lkml/20200214182947.39194-1-james.mo...@arm.com/

Thanks,

James Morse (24):
  x86/resctrl: Split struct rdt_resource
  x86/resctrl: Split struct rdt_domain
  x86/resctrl: Add resctrl_arch_get_num_closid()
  x86/resctrl: Add a separate schema list for resctrl
  x86/resctrl: Pass the schema in resdir's private pointer
  x86/resctrl: Store the effective num_closid in the schema
  x86/resctrl: Label the resources with their configuration type
  x86/resctrl: Walk the resctrl schema list instead of an arch list
  x86/resctrl: Change rdt_resource to resctrl_schema in pseudo_lock_region
  x86/resctrl: Move the schema names into struct resctrl_schema
  x86/resctrl: Group staged configuration into a separate struct
  x86/resctrl: Add closid to the staged config
  x86/resctrl: Allow different CODE/DATA configurations to be staged
  x86/resctrl: Make update_domains() learn the affected closids
  x86/resctrl: Add a helper to read a closid's configuration
  x86/resctrl: Add a helper to read/set the CDP configuration
  x86/resctrl: Use cdp_enabled in rdt_domain_reconfigure_cdp()
  x86/resctrl: Pass configuration type to resctrl_arch_get_config()
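
For readers unfamiliar with the odd/even behaviour the cover letter refers to, here is a minimal user-space sketch, not code from the series. It assumes the index is computed as closid * cbm_idx_mult + cbm_idx_offset. The mult/offset values for L3 and L3DATA are the ones visible in patch 01; the offset of 1 for L3CODE, struct cache_cfg and effective_num_closid() are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the cache fields of struct rdt_resource. */
struct cache_cfg {
	uint32_t cbm_idx_mult;
	uint32_t cbm_idx_offset;
};

/* Map a closid as seen by resctrl to the index of the hardware bitmask. */
static uint32_t cbm_idx(const struct cache_cfg *c, uint32_t closid)
{
	return closid * c->cbm_idx_mult + c->cbm_idx_offset;
}

/*
 * With CDP enabled every closid consumes two hardware entries (one for
 * CODE, one for DATA), so only half of them are usable by resctrl.
 */
static uint32_t effective_num_closid(uint32_t hw_num_closid, int cdp_enabled)
{
	assert(hw_num_closid > 0);
	return cdp_enabled ? hw_num_closid / 2 : hw_num_closid;
}
```

Under these assumptions closid 3 uses hardware entry 6 for DATA and entry 7 for CODE, which is why the usable closid count halves once CDP is switched on.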
[PATCH 04/24] x86/resctrl: Add a separate schema list for resctrl
To support multiple architectures, the resctrl code needs to be split into a 'fs' specific part in core code, and an arch-specific backend.

It should be difficult for the arch-specific backends to diverge, supporting slightly different ABIs for user-space. For example, generating, parsing and validating the schema configuration values should be done in what becomes the core code to prevent divergence. Today, the schema emerge from which entries in the rdt_resources_all array the arch code has chosen to enable.

Start by creating a struct resctrl_schema, which will eventually hold the name and pending configuration values for resctrl.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/internal.h |  1 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 43 +-
 include/linux/resctrl.h                |  9 ++
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index f7aab9245259..682e84aebd14 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -106,6 +106,7 @@ extern unsigned int resctrl_cqm_threshold;
 extern bool rdt_alloc_capable;
 extern bool rdt_mon_capable;
 extern unsigned int rdt_mon_features;
+extern struct list_head resctrl_all_schema;
 
 enum rdt_group_type {
 	RDTCTRL_GROUP = 0,
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index df10135f021e..f79a5e548138 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -39,6 +39,9 @@ static struct kernfs_root *rdt_root;
 struct rdtgroup rdtgroup_default;
 LIST_HEAD(rdt_all_groups);
 
+/* list of entries for the schemata file */
+LIST_HEAD(resctrl_all_schema);
+
 /* Kernel fs node for "info" directory under root */
 static struct kernfs_node *kn_info;
 
@@ -2117,6 +2120,35 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
 	return ret;
 }
 
+static int create_schemata_list(void)
+{
+	struct resctrl_schema *s;
+	struct rdt_resource *r;
+
+	for_each_alloc_enabled_rdt_resource(r) {
+		s = kzalloc(sizeof(*s), GFP_KERNEL);
+		if (!s)
+			return -ENOMEM;
+
+		s->res = r;
+
+		INIT_LIST_HEAD(&s->list);
+		list_add(&s->list, &resctrl_all_schema);
+	}
+
+	return 0;
+}
+
+static void destroy_schemata_list(void)
+{
+	struct resctrl_schema *s, *tmp;
+
+	list_for_each_entry_safe(s, tmp, &resctrl_all_schema, list) {
+		list_del(&s->list);
+		kfree(s);
+	}
+}
+
 static int rdt_get_tree(struct fs_context *fc)
 {
 	struct rdt_fs_context *ctx = rdt_fc2context(fc);
@@ -2138,11 +2170,17 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (ret < 0)
 		goto out_cdp;
 
+	ret = create_schemata_list();
+	if (ret) {
+		destroy_schemata_list();
+		goto out_mba;
+	}
+
 	closid_init();
 
 	ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
 	if (ret < 0)
-		goto out_mba;
+		goto out_schemata_free;
 
 	if (rdt_mon_capable) {
 		ret = mongroup_create_dir(rdtgroup_default.kn,
@@ -2194,6 +2232,8 @@ static int rdt_get_tree(struct fs_context *fc)
 		kernfs_remove(kn_mongrp);
 out_info:
 	kernfs_remove(kn_info);
+out_schemata_free:
+	destroy_schemata_list();
 out_mba:
 	if (ctx->enable_mba_mbps)
 		set_mba_sc(false);
@@ -2439,6 +2479,7 @@ static void rdt_kill_sb(struct super_block *sb)
 	rmdir_all_sub();
 	rdt_pseudo_lock_release();
 	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
+	destroy_schemata_list();
 	static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
 	static_branch_disable_cpuslocked(&rdt_mon_enable_key);
 	static_branch_disable_cpuslocked(&rdt_enable_key);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index dfb0f32b73a1..de6cbc725753 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -163,6 +163,15 @@ struct rdt_resource {
 
 };
 
+/**
+ * @list:	Member of resctrl's schema list
+ * @res:	The rdt_resource for this entry
+ */
+struct resctrl_schema {
+	struct list_head		list;
+	struct rdt_resource		*res;
+};
+
 /* The number of closid supported by this resource regardless of CDP */
 u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
 
-- 
2.28.0
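
The create/destroy pairing in this patch can be tried outside the kernel. The following is a user-space sketch under stated assumptions, not the patch's code: struct resource and struct schema are cut-down hypothetical stand-ins, a singly linked list replaces list_head, and calloc()/free() replace kzalloc()/kfree().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-ins for the kernel structures. */
struct resource {
	const char *name;
	int alloc_enabled;
};

struct schema {
	struct schema *next;
	struct resource *res;
};

/* Free every entry and leave the list empty; safe on an empty list. */
static void destroy_schema_list(struct schema **head)
{
	while (*head) {
		struct schema *s = *head;

		*head = s->next;
		free(s);
	}
}

/* Create one schema per alloc-enabled resource, undoing partial work on failure. */
static int create_schema_list(struct schema **head, struct resource *res, size_t n)
{
	size_t i;

	assert(*head == NULL);
	for (i = 0; i < n; i++) {
		struct schema *s;

		if (!res[i].alloc_enabled)
			continue;

		s = calloc(1, sizeof(*s));
		if (!s) {
			destroy_schema_list(head);
			return -ENOMEM;
		}
		s->res = &res[i];
		s->next = *head;	/* insert at the head, as list_add() does */
		*head = s;
	}
	return 0;
}

/* Count the entries currently on the list. */
static size_t schema_count(const struct schema *head)
{
	size_t n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}
```

One difference from the patch: the sketch frees a partially built list inside the create function, whereas the patch leaves that to the caller, which calls destroy_schemata_list() before jumping to out_mba.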
[PATCH 07/24] x86/resctrl: Label the resources with their configuration type
Before the name for the schema can be generated, the type of the configuration being applied to the resource needs to be known. Label all the entries in rdt_resources_all[], and copy that value into struct resctrl_schema.

Subsequent patches will generate the schema names in what will become the fs code. Eventually the fs code will generate pairs of CODE/DATA if the platform supports CDP for this resource.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/core.c     | 7 +++
 arch/x86/kernel/cpu/resctrl/internal.h | 1 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 1 +
 include/linux/resctrl.h                | 8
 4 files changed, 17 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 5d5b566c4359..1ed5e04031e6 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -62,6 +62,7 @@ mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m,
 struct rdt_hw_resource rdt_resources_all[] = {
 	[RDT_RESOURCE_L3] =
 	{
+		.conf_type			= CDP_BOTH,
 		.resctrl = {
 			.rid			= RDT_RESOURCE_L3,
 			.name			= "L3",
@@ -81,6 +82,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
 	},
 	[RDT_RESOURCE_L3DATA] =
 	{
+		.conf_type			= CDP_DATA,
 		.resctrl = {
 			.rid			= RDT_RESOURCE_L3DATA,
 			.name			= "L3DATA",
@@ -100,6 +102,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
 	},
 	[RDT_RESOURCE_L3CODE] =
 	{
+		.conf_type			= CDP_CODE,
 		.resctrl = {
 			.rid			= RDT_RESOURCE_L3CODE,
 			.name			= "L3CODE",
@@ -119,6 +122,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
 	},
 	[RDT_RESOURCE_L2] =
 	{
+		.conf_type			= CDP_BOTH,
 		.resctrl = {
 			.rid			= RDT_RESOURCE_L2,
 			.name			= "L2",
@@ -138,6 +142,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
 	},
 	[RDT_RESOURCE_L2DATA] =
 	{
+		.conf_type			= CDP_DATA,
 		.resctrl = {
 			.rid			= RDT_RESOURCE_L2DATA,
 			.name			= "L2DATA",
@@ -157,6 +162,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
 	},
 	[RDT_RESOURCE_L2CODE] =
 	{
+		.conf_type			= CDP_CODE,
 		.resctrl = {
 			.rid			= RDT_RESOURCE_L2CODE,
 			.name			= "L2CODE",
@@ -176,6 +182,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
 	},
 	[RDT_RESOURCE_MBA] =
 	{
+		.conf_type			= CDP_BOTH,
 		.resctrl = {
 			.rid			= RDT_RESOURCE_MBA,
 			.name			= "MB",
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 682e84aebd14..6c87a81946b1 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -367,6 +367,7 @@ struct rdt_parse_data {
  * @mon_scale:		cqm counter * mon_scale = occupancy in bytes
  */
 struct rdt_hw_resource {
+	enum resctrl_conf_type	conf_type;
 	struct rdt_resource	resctrl;
 	int			num_closid;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 1bd785b1920c..628e5eb4d7a9 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2141,6 +2141,7 @@ static int create_schemata_list(void)
 
 		s->res = r;
 		s->num_closid = resctrl_arch_get_num_closid(r);
+		s->conf_type = resctrl_to_arch_res(r)->conf_type;
 
 		INIT_LIST_HEAD(&s->list);
 		list_add(&s->list, &resctrl_all_schema);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index b32152968bca..20d8b6dd4af4 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -15,6 +15,12 @@ int proc_resctrl_show(struct seq_file *m,
 
 #endif
 
+enum resctrl_conf_type {
+	CDP_BOTH,
+	CDP_CODE,
+	CDP_DATA,
+};
+
 /**
  * struct rdt_domain - group of cpus sharing an RDT resource
  * @list:		all instances of this resource
@@ -165,11 +171,13 @@ struct rdt_resource {
 
 /**
  * @list:	Member of resctrl's schema list
+ * @cdp_type:	Whether this entry is for code/data/both
  * @res:	The rdt_resource for this entry
  * @num_closid	Number of CLOSIDs available for
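
The commit message says later patches generate the schema names from the configuration type. Those patches are not shown here, so the following is only a guess at the idea: schema_name() is a hypothetical helper, and the "CODE"/"DATA" suffixes are assumed from the existing L3CODE/L3DATA resource names. The enum is copied from the diff above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum resctrl_conf_type {
	CDP_BOTH,
	CDP_CODE,
	CDP_DATA,
};

/*
 * Hypothetical helper: build the user-visible schema name from the
 * resource name and the type of configuration being applied.
 * Returns what snprintf() returns.
 */
static int schema_name(char *buf, size_t len, const char *res_name,
		       enum resctrl_conf_type type)
{
	const char *suffix = "";

	assert(buf != NULL && len > 0);

	switch (type) {
	case CDP_CODE:
		suffix = "CODE";
		break;
	case CDP_DATA:
		suffix = "DATA";
		break;
	case CDP_BOTH:
		break;
	}

	return snprintf(buf, len, "%s%s", res_name, suffix);
}
```

Generating the name in one place like this would keep the strings seen by user-space identical across architectures, which is the concern raised in the cover letter.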
[PATCH 03/24] x86/resctrl: Add resctrl_arch_get_num_closid()
resctrl chooses whether to enable CDP. Once it does, half the number of closid are available. MPAM doesn't behave like this: an in-kernel user of MPAM could be 'using CDP' while resctrl is not.

To move the 'half the closids' behaviour to be part of the core code, each schema would have a num_closid. This may be different from the single resource's num_closid if CDP is in use.

Add a helper to read the resource's num_closid. This should return the number of closid that the resource supports, regardless of whether CDP is in use.

For now return the hw_res->num_closid, which is already adjusted for CDP. Once the CODE/DATA/BOTH resources are merged, resctrl can make the adjustment when copying the value to the schema's num_closid.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/core.c        |  5 +
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  9 +++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 14 +-
 include/linux/resctrl.h                   |  3 +++
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 97040a54cc9a..5d5b566c4359 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -443,6 +443,11 @@ struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
 	return NULL;
 }
 
+u32 resctrl_arch_get_num_closid(struct rdt_resource *r)
+{
+	return resctrl_to_arch_res(r)->num_closid;
+}
+
 void rdt_ctrl_update(void *arg)
 {
 	struct msr_param *m = arg;
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 2e7466659af3..14ea6a40993f 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -286,12 +286,11 @@ int update_domains(struct rdt_resource *r, int closid)
 static int rdtgroup_parse_resource(char *resname, char *tok,
 				   struct rdtgroup *rdtgrp)
 {
-	struct rdt_hw_resource *hw_res;
 	struct rdt_resource *r;
 
 	for_each_alloc_enabled_rdt_resource(r) {
-		hw_res = resctrl_to_arch_res(r);
-		if (!strcmp(resname, r->name) && rdtgrp->closid < hw_res->num_closid)
+		if (!strcmp(resname, r->name) &&
+		    rdtgrp->closid < resctrl_arch_get_num_closid(r))
 			return parse_line(tok, r, rdtgrp);
 	}
 	rdt_last_cmd_printf("Unknown or unsupported resource name '%s'\n", resname);
@@ -400,7 +399,6 @@ static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
 int rdtgroup_schemata_show(struct kernfs_open_file *of,
 			   struct seq_file *s, void *v)
 {
-	struct rdt_hw_resource *hw_res;
 	struct rdtgroup *rdtgrp;
 	struct rdt_resource *r;
 	int ret = 0;
@@ -425,8 +423,7 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
 		} else {
 			closid = rdtgrp->closid;
 			for_each_alloc_enabled_rdt_resource(r) {
-				hw_res = resctrl_to_arch_res(r);
-				if (closid < hw_res->num_closid)
+				if (closid < resctrl_arch_get_num_closid(r))
 					show_doms(s, r, closid);
 			}
 		}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index b55861ff4e34..df10135f021e 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -100,15 +100,13 @@ int closids_supported(void)
 
 static void closid_init(void)
 {
-	struct rdt_hw_resource *hw_res;
+	u32 rdt_min_closid = 32;
 	struct rdt_resource *r;
-	int rdt_min_closid = 32;
 
 	/* Compute rdt_min_closid across all resources */
-	for_each_alloc_enabled_rdt_resource(r) {
-		hw_res = resctrl_to_arch_res(r);
-		rdt_min_closid = min(rdt_min_closid, hw_res->num_closid);
-	}
+	for_each_alloc_enabled_rdt_resource(r)
+		rdt_min_closid = min(rdt_min_closid,
+				     resctrl_arch_get_num_closid(r));
 
 	closid_free_map = BIT_MASK(rdt_min_closid) - 1;
 
@@ -847,10 +845,8 @@ static int rdt_num_closids_show(struct kernfs_open_file *of,
 				struct seq_file *seq, void *v)
 {
 	struct rdt_resource *r = of->kn->parent->priv;
-	struct rdt_hw_resource *hw_res;
 
-	hw_res = resctrl_to_arch_res(r);
-	seq_printf(seq, "%d\n", hw_res->num_closid);
+	seq_printf(seq, "%d\n", resctrl_arch_get_num_closid(r));
 	return 0;
 }
 
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index f5af59b8f2a9..dfb0f32b73a1 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -163,4 +163,7 @@ struct rdt_resource {
 
 };
 
+/* The number of closid
[PATCH 06/24] x86/resctrl: Store the effective num_closid in the schema
resctrl_schema holds properties that vary with the style of configuration that resctrl applies to a resource.

Once the arch code has a single resource per cache that can be configured, resctrl will need to keep track of the num_closid itself.

Add num_closid to resctrl_schema. Change callers like rdtgroup_schemata_show() to walk the schema instead.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 13 -
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 11 +--
 include/linux/resctrl.h                   |  2 ++
 3 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 14ea6a40993f..8ac104c634fe 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -286,11 +286,12 @@ int update_domains(struct rdt_resource *r, int closid)
 static int rdtgroup_parse_resource(char *resname, char *tok,
 				   struct rdtgroup *rdtgrp)
 {
+	struct resctrl_schema *s;
 	struct rdt_resource *r;
 
-	for_each_alloc_enabled_rdt_resource(r) {
-		if (!strcmp(resname, r->name) &&
-		    rdtgrp->closid < resctrl_arch_get_num_closid(r))
+	list_for_each_entry(s, &resctrl_all_schema, list) {
+		r = s->res;
+		if (!strcmp(resname, r->name) && rdtgrp->closid < s->num_closid)
 			return parse_line(tok, r, rdtgrp);
 	}
 	rdt_last_cmd_printf("Unknown or unsupported resource name '%s'\n", resname);
@@ -399,6 +400,7 @@ static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
 int rdtgroup_schemata_show(struct kernfs_open_file *of,
 			   struct seq_file *s, void *v)
 {
+	struct resctrl_schema *schema;
 	struct rdtgroup *rdtgrp;
 	struct rdt_resource *r;
 	int ret = 0;
@@ -422,8 +424,9 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
 			}
 		} else {
 			closid = rdtgrp->closid;
-			for_each_alloc_enabled_rdt_resource(r) {
-				if (closid < resctrl_arch_get_num_closid(r))
+			list_for_each_entry(schema, &resctrl_all_schema, list) {
+				r = schema->res;
+				if (closid < schema->num_closid)
 					show_doms(s, r, closid);
 			}
 		}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index cb16454a6b0e..1bd785b1920c 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -103,13 +103,12 @@ int closids_supported(void)
 
 static void closid_init(void)
 {
+	struct resctrl_schema *s;
 	u32 rdt_min_closid = 32;
-	struct rdt_resource *r;
 
 	/* Compute rdt_min_closid across all resources */
-	for_each_alloc_enabled_rdt_resource(r)
-		rdt_min_closid = min(rdt_min_closid,
-				     resctrl_arch_get_num_closid(r));
+	list_for_each_entry(s, &resctrl_all_schema, list)
+		rdt_min_closid = min(rdt_min_closid, s->num_closid);
 
 	closid_free_map = BIT_MASK(rdt_min_closid) - 1;
 
@@ -848,9 +847,8 @@ static int rdt_num_closids_show(struct kernfs_open_file *of,
 				struct seq_file *seq, void *v)
 {
 	struct resctrl_schema *s = of->kn->parent->priv;
-	struct rdt_resource *r = s->res;
 
-	seq_printf(seq, "%d\n", resctrl_arch_get_num_closid(r));
+	seq_printf(seq, "%d\n", s->num_closid);
 	return 0;
 }
 
@@ -2142,6 +2140,7 @@ static int create_schemata_list(void)
 			return -ENOMEM;
 
 		s->res = r;
+		s->num_closid = resctrl_arch_get_num_closid(r);
 
 		INIT_LIST_HEAD(&s->list);
 		list_add(&s->list, &resctrl_all_schema);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index de6cbc725753..b32152968bca 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -166,10 +166,12 @@ struct rdt_resource {
 /**
  * @list:	Member of resctrl's schema list
  * @res:	The rdt_resource for this entry
+ * @num_closid	Number of CLOSIDs available for this resource
  */
 struct resctrl_schema {
 	struct list_head		list;
 	struct rdt_resource		*res;
+	u32				num_closid;
 };
 
 /* The number of closid supported by this resource regardless of CDP */
-- 
2.28.0
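
What closid_init() computes from the per-schema num_closid values can be shown with a small user-space sketch. The array argument and both function names are assumptions for illustration, and a 64-bit shift stands in for BIT_MASK().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest num_closid across all schemata, capped at 32. */
static uint32_t min_num_closid(const uint32_t *num_closid, size_t n)
{
	uint32_t min = 32;	/* same starting value as rdt_min_closid */
	size_t i;

	for (i = 0; i < n; i++)
		if (num_closid[i] < min)
			min = num_closid[i];

	return min;
}

/* One bit per allocatable closid; a 64-bit shift avoids overflow at 32. */
static uint64_t closid_free_mask(uint32_t min_closid)
{
	assert(min_closid <= 32);
	return ((uint64_t)1 << min_closid) - 1;
}
```

The point of taking the minimum is that a closid is only usable if every enabled schema can accept it, so the free map is sized by the most constrained one.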
[PATCH 02/24] x86/resctrl: Split struct rdt_domain
resctrl is the de facto Linux ABI for SoC resource partitioning features. To support it on another architecture, it needs to be abstracted from Intel RDT, and moved to /fs/.

Split struct rdt_domain up too. Move everything that is particular to resctrl into a new header file. resctrl code paths touching a 'hw' struct indicate where an abstraction is needed.

No change in behaviour, this patch just moves types around.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/core.c        | 32 +++---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 --
 arch/x86/kernel/cpu/resctrl/internal.h    | 40 +--
 arch/x86/kernel/cpu/resctrl/monitor.c     |  8 +++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 29 ++--
 include/linux/resctrl.h                   | 35 +++-
 6 files changed, 94 insertions(+), 60 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 470661f2eb68..97040a54cc9a 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -385,10 +385,11 @@ static void
 mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
 	unsigned int i;
 
 	for (i = m->low; i < m->high; i++)
-		wrmsrl(hw_res->msr_base + i, d->ctrl_val[i]);
+		wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]);
 }
 
 /*
@@ -410,21 +411,23 @@ mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m,
 		struct rdt_resource *r)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
 	unsigned int i;
 
 	/* Write the delay values for mba. */
 	for (i = m->low; i < m->high; i++)
-		wrmsrl(hw_res->msr_base + i, delay_bw_map(d->ctrl_val[i], r));
+		wrmsrl(hw_res->msr_base + i, delay_bw_map(hw_dom->ctrl_val[i], r));
 }
 
 static void
 cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
 	unsigned int i;
 
 	for (i = m->low; i < m->high; i++)
-		wrmsrl(hw_res->msr_base + cbm_idx(r, i), d->ctrl_val[i]);
+		wrmsrl(hw_res->msr_base + cbm_idx(r, i), hw_dom->ctrl_val[i]);
 }
 
 struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
@@ -510,21 +513,22 @@ void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)
 static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
 	struct msr_param m;
 	u32 *dc, *dm;
 
-	dc = kmalloc_array(hw_res->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
+	dc = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->ctrl_val), GFP_KERNEL);
 	if (!dc)
 		return -ENOMEM;
 
-	dm = kmalloc_array(hw_res->num_closid, sizeof(*d->mbps_val), GFP_KERNEL);
+	dm = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->mbps_val), GFP_KERNEL);
 	if (!dm) {
 		kfree(dc);
 		return -ENOMEM;
 	}
 
-	d->ctrl_val = dc;
-	d->mbps_val = dm;
+	hw_dom->ctrl_val = dc;
+	hw_dom->mbps_val = dm;
 	setup_default_ctrlval(r, dc, dm);
 
 	m.low = 0;
@@ -586,6 +590,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
 {
 	int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
 	struct list_head *add_pos = NULL;
+	struct rdt_hw_domain *hw_dom;
 	struct rdt_domain *d;
 
 	d = rdt_find_domain(r, id, &add_pos);
@@ -599,10 +604,11 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
 		return;
 	}
 
-	d = kzalloc_node(sizeof(*d), GFP_KERNEL, cpu_to_node(cpu));
-	if (!d)
+	hw_dom = kzalloc_node(sizeof(*hw_dom), GFP_KERNEL, cpu_to_node(cpu));
+	if (!hw_dom)
 		return;
 
+	d = &hw_dom->resctrl;
 	d->id = id;
 	cpumask_set_cpu(cpu, &d->cpu_mask);
 
@@ -631,6 +637,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
 static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 {
 	int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
+	struct rdt_hw_domain *hw_dom;
 	struct rdt_domain *d;
 
 	d = rdt_find_domain(r, id, NULL);
@@ -638,6 +645,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 		pr_warn("Couldn't find cache id for CPU %d\n", cpu);
 		return;
 	}
+	hw_dom = resctrl_to_arch_dom(d);
 
 	cpumask_clear_cpu(cpu, &d->cpu_mask);
 	if (cpumask_empty(&d->cpu_mask)) {
@@ -670,12 +678,12
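
The diff relies on going from the resctrl-visible struct back to the arch-private one that embeds it. A minimal user-space sketch of that pattern follows. It assumes resctrl_to_arch_dom() is a container_of() wrapper, which the excerpt does not show; the structs are cut down, and the macro is a simplified version without the kernel's type checking.

```c
#include <assert.h>
#include <stddef.h>

/* The part visible to resctrl (cut down for illustration). */
struct rdt_domain {
	int id;
};

/* The arch-private wrapper that embeds the resctrl-visible part. */
struct rdt_hw_domain {
	unsigned int *ctrl_val;
	struct rdt_domain resctrl;
};

/* Simplified container_of(): step back from a member to its container. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct rdt_hw_domain *resctrl_to_arch_dom(struct rdt_domain *d)
{
	assert(d != NULL);
	return container_of(d, struct rdt_hw_domain, resctrl);
}
```

This is why domain_add_cpu() can allocate the 'hw' struct and hand out &hw_dom->resctrl: code that only sees the struct rdt_domain pointer cannot reach ctrl_val without going through the arch helper.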
[PATCH 08/24] x86/resctrl: Walk the resctrl schema list instead of an arch list
Now that resctrl has its own list of resources it is using, walk that list instead of the architecture's list. This means resctrl has somewhere to keep schema properties with the resource that is using them.

Most users of for_each_alloc_enabled_rdt_resource() are per-schema, and also want a schema property, like the conf_type. Switch these to walk the schema list. Schema were only created for alloc_enabled resources so these two lists are currently equivalent.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 38 ++-
 arch/x86/kernel/cpu/resctrl/internal.h    |  6 ++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 34 +---
 include/linux/resctrl.h                   |  5 +--
 4 files changed, 53 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 8ac104c634fe..d3f9d142f58a 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -57,9 +57,10 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
 	return true;
 }
 
-int parse_bw(struct rdt_parse_data *data, struct rdt_resource *r,
+int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
 	     struct rdt_domain *d)
 {
+	struct rdt_resource *r = s->res;
 	unsigned long bw_val;
 
 	if (d->have_new_ctrl) {
@@ -125,10 +126,11 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
  * Read one cache bit mask (hex). Check that it is valid for the current
  * resource type.
  */
-int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
+int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
 	      struct rdt_domain *d)
 {
 	struct rdtgroup *rdtgrp = data->rdtgrp;
+	struct rdt_resource *r = s->res;
 	u32 cbm_val;
 
 	if (d->have_new_ctrl) {
@@ -160,12 +162,12 @@ int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
 	 * The CBM may not overlap with the CBM of another closid if
 	 * either is exclusive.
 	 */
-	if (rdtgroup_cbm_overlaps(r, d, cbm_val, rdtgrp->closid, true)) {
+	if (rdtgroup_cbm_overlaps(s, d, cbm_val, rdtgrp->closid, true)) {
 		rdt_last_cmd_puts("Overlaps with exclusive group\n");
 		return -EINVAL;
 	}
 
-	if (rdtgroup_cbm_overlaps(r, d, cbm_val, rdtgrp->closid, false)) {
+	if (rdtgroup_cbm_overlaps(s, d, cbm_val, rdtgrp->closid, false)) {
 		if (rdtgrp->mode == RDT_MODE_EXCLUSIVE ||
 		    rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
 			rdt_last_cmd_puts("Overlaps with other group\n");
@@ -185,9 +187,10 @@ int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
  * separated by ";". The "id" is in decimal, and must match one of
  * the "id"s for this resource.
  */
-static int parse_line(char *line, struct rdt_resource *r,
+static int parse_line(char *line, struct resctrl_schema *s,
 		      struct rdtgroup *rdtgrp)
 {
+	struct rdt_resource *r = s->res;
 	struct rdt_parse_data data;
 	char *dom = NULL, *id;
 	struct rdt_domain *d;
@@ -213,7 +216,8 @@ static int parse_line(char *line, struct rdt_resource *r,
 		if (d->id == dom_id) {
 			data.buf = dom;
 			data.rdtgrp = rdtgrp;
-			if (r->parse_ctrlval(&data, r, d))
+
+			if (r->parse_ctrlval(&data, s, d))
 				return -EINVAL;
 			if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
 				/*
@@ -289,10 +293,12 @@ static int rdtgroup_parse_resource(char *resname, char *tok,
 	struct resctrl_schema *s;
 	struct rdt_resource *r;
 
+	lockdep_assert_held(&rdtgroup_mutex);
+
 	list_for_each_entry(s, &resctrl_all_schema, list) {
 		r = s->res;
 		if (!strcmp(resname, r->name) && rdtgrp->closid < s->num_closid)
-			return parse_line(tok, r, rdtgrp);
+			return parse_line(tok, s, rdtgrp);
 	}
 	rdt_last_cmd_printf("Unknown or unsupported resource name '%s'\n", resname);
 	return -EINVAL;
@@ -301,6 +307,7 @@ static int rdtgroup_parse_resource(char *resname, char *tok,
 ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
 				char *buf, size_t nbytes, loff_t off)
 {
+	struct resctrl_schema *s;
 	struct rdtgroup *rdtgrp;
 	struct rdt_domain *dom;
 	struct rdt_resource *r;
@@ -331,8 +338,8 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
 		goto out;
 	}
 
-	for_each_alloc_enabled_rdt_resource(r) {
-		list_for_each_entry(dom, &r->domains, list)
+	list_for_each_entry(s, &resctrl_all_schema, list) {
+		list_for_each_entry(dom,
[PATCH 05/24] x86/resctrl: Pass the schema in resdir's private pointer
Moving properties that resctrl exposes to user-space into the core 'fs' code (e.g. the name of the schema) means some of the functions that back the filesystem need the schema struct, but currently take the resource.

Once the CDP resources are merged, the resource doesn't reflect the right level of information.

For the info dirs that represent a control, the information needed is in the schema, as this is how the resource is being used. For the monitors, it's the resource, as L3CODE_MON doesn't make sense, and would monitor data too.

This difference means the type of the private pointers varies between control and monitor info dirs. If the flags are RF_MON_INFO, it's a struct rdt_resource. If the flags are RF_CTRL_INFO, it's a struct resctrl_schema. Nothing in res_common_files[] has both flags.

Signed-off-by: James Morse
---
Fake schema for monitors may simplify this if anyone thinks that is preferable.
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 37 +-
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f79a5e548138..cb16454a6b0e 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -847,7 +847,8 @@ static int rdt_last_cmd_status_show(struct kernfs_open_file *of,
 static int rdt_num_closids_show(struct kernfs_open_file *of,
 				struct seq_file *seq, void *v)
 {
-	struct rdt_resource *r = of->kn->parent->priv;
+	struct resctrl_schema *s = of->kn->parent->priv;
+	struct rdt_resource *r = s->res;
 
 	seq_printf(seq, "%d\n", resctrl_arch_get_num_closid(r));
 	return 0;
@@ -856,7 +857,8 @@ static int rdt_num_closids_show(struct kernfs_open_file *of,
 static int rdt_default_ctrl_show(struct kernfs_open_file *of,
 				 struct seq_file *seq, void *v)
 {
-	struct rdt_resource *r = of->kn->parent->priv;
+	struct resctrl_schema *s = of->kn->parent->priv;
+	struct rdt_resource *r = s->res;
 
 	seq_printf(seq, "%x\n", r->default_ctrl);
 	return 0;
@@ -865,7 +867,8 @@ static int rdt_default_ctrl_show(struct kernfs_open_file *of,
 static int rdt_min_cbm_bits_show(struct kernfs_open_file *of,
 				 struct seq_file *seq, void *v)
 {
-	struct rdt_resource *r = of->kn->parent->priv;
+	struct resctrl_schema *s = of->kn->parent->priv;
+	struct rdt_resource *r = s->res;
 
 	seq_printf(seq, "%u\n", r->cache.min_cbm_bits);
 	return 0;
@@ -874,7 +877,8 @@ static int rdt_min_cbm_bits_show(struct kernfs_open_file *of,
 static int rdt_shareable_bits_show(struct kernfs_open_file *of,
 				   struct seq_file *seq, void *v)
 {
-	struct rdt_resource *r = of->kn->parent->priv;
+	struct resctrl_schema *s = of->kn->parent->priv;
+	struct rdt_resource *r = s->res;
 
 	seq_printf(seq, "%x\n", r->cache.shareable_bits);
 	return 0;
@@ -897,13 +901,14 @@ static int rdt_shareable_bits_show(struct kernfs_open_file *of,
 static int rdt_bit_usage_show(struct kernfs_open_file *of,
 			      struct seq_file *seq, void *v)
 {
-	struct rdt_resource *r = of->kn->parent->priv;
+	struct resctrl_schema *s = of->kn->parent->priv;
 	/*
 	 * Use unsigned long even though only 32 bits are used to ensure
 	 * test_bit() is used safely.
 	 */
 	unsigned long sw_shareable = 0, hw_shareable = 0;
 	unsigned long exclusive = 0, pseudo_locked = 0;
+	struct rdt_resource *r = s->res;
 	struct rdt_domain *dom;
 	int i, hwb, swb, excl, psl;
 	enum rdtgrp_mode mode;
@@ -975,7 +980,8 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
 static int rdt_min_bw_show(struct kernfs_open_file *of,
 			   struct seq_file *seq, void *v)
 {
-	struct rdt_resource *r = of->kn->parent->priv;
+	struct resctrl_schema *s = of->kn->parent->priv;
+	struct rdt_resource *r = s->res;
 
 	seq_printf(seq, "%u\n", r->membw.min_bw);
 	return 0;
@@ -1006,7 +1012,8 @@ static int rdt_mon_features_show(struct kernfs_open_file *of,
 static int rdt_bw_gran_show(struct kernfs_open_file *of,
 			    struct seq_file *seq, void *v)
 {
-	struct rdt_resource *r = of->kn->parent->priv;
+	struct resctrl_schema *s = of->kn->parent->priv;
+	struct rdt_resource *r = s->res;
 
 	seq_printf(seq, "%u\n", r->membw.bw_gran);
 	return 0;
@@ -1015,7 +1022,8 @@ static int rdt_bw_gran_show(struct kernfs_open_file *of,
 static int rdt_delay_linear_show(struct kernfs_open_file *of,
 				 struct seq_file *seq, void *v)
 {
-	struct rdt_resource *r = of->kn->parent->priv;
+	struct resctrl_schema *s = of->kn->parent->priv;
+	struct rdt_resource *r = s->res;
 
 	seq_printf(seq, "%u\n",
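
The rule that the private pointer's type follows the directory's flags can be sketched in user-space as below. This is an illustration only: struct info_dir, info_dir_resource() and the numeric flag values are hypothetical, and the structs are reduced to the fields needed here.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values are placeholders, not the kernel's definitions. */
#define RF_CTRL_INFO	0x1UL
#define RF_MON_INFO	0x2UL

struct rdt_resource {
	const char *name;
};

struct resctrl_schema {
	struct rdt_resource *res;
};

/* Hypothetical stand-in for an info directory's kernfs node. */
struct info_dir {
	unsigned long fflags;
	void *priv;
};

/* Resolve the resource behind an info directory, whatever priv points at. */
static struct rdt_resource *info_dir_resource(const struct info_dir *dir)
{
	/* The commit message states nothing carries both flags. */
	assert(!((dir->fflags & RF_CTRL_INFO) && (dir->fflags & RF_MON_INFO)));

	if (dir->fflags & RF_CTRL_INFO)
		return ((struct resctrl_schema *)dir->priv)->res;
	if (dir->fflags & RF_MON_INFO)
		return dir->priv;

	return NULL;
}
```

A fake schema for the monitors, as floated in the note under the sign-off, would remove the second branch at the cost of creating schema entries that have no configuration.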
[PATCH 01/24] x86/resctrl: Split struct rdt_resource
resctrl is the defacto Linux ABI for SoC resource partitioning features. To support it on another architecture, it needs to be abstracted from Intel RDT, and moved it to /fs/. Start by splitting struct rdt_resource, (the name is kept to keep the noise down), and add some type-trickery to keep the foreach helpers working. Move everything that that is particular to resctrl into a new header file, keeping the x86 hardware accessors where they are. resctrl code paths touching a 'hw' struct indicates where an abstraction is needed. Splitting rdt_domain up in a similar way happens in the next patch. No change in behaviour, this patch just moves types around. Signed-off-by: James Morse --- arch/x86/kernel/cpu/resctrl/core.c| 258 -- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 14 +- arch/x86/kernel/cpu/resctrl/internal.h| 138 +++- arch/x86/kernel/cpu/resctrl/monitor.c | 32 +-- arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 4 +- arch/x86/kernel/cpu/resctrl/rdtgroup.c| 69 +++--- include/linux/resctrl.h | 117 ++ 7 files changed, 362 insertions(+), 270 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index e5f4ee8f4c3b..470661f2eb68 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -57,120 +57,134 @@ static void mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r); -#define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].domains) +#define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].resctrl.domains) -struct rdt_resource rdt_resources_all[] = { +struct rdt_hw_resource rdt_resources_all[] = { [RDT_RESOURCE_L3] = { - .rid= RDT_RESOURCE_L3, - .name = "L3", - .domains= domain_init(RDT_RESOURCE_L3), + .resctrl = { + .rid= RDT_RESOURCE_L3, + .name = "L3", + .cache_level= 3, + .cache = { + .min_cbm_bits = 1, + .cbm_idx_mult = 1, + .cbm_idx_offset = 0, + }, + .domains= domain_init(RDT_RESOURCE_L3), + .parse_ctrlval = parse_cbm, + .format_str = "%d=%0*x", + 
.fflags = RFTYPE_RES_CACHE, + }, .msr_base = MSR_IA32_L3_CBM_BASE, .msr_update = cat_wrmsr, - .cache_level= 3, - .cache = { - .min_cbm_bits = 1, - .cbm_idx_mult = 1, - .cbm_idx_offset = 0, - }, - .parse_ctrlval = parse_cbm, - .format_str = "%d=%0*x", - .fflags = RFTYPE_RES_CACHE, }, [RDT_RESOURCE_L3DATA] = { - .rid= RDT_RESOURCE_L3DATA, - .name = "L3DATA", - .domains= domain_init(RDT_RESOURCE_L3DATA), + .resctrl = { + .rid= RDT_RESOURCE_L3DATA, + .name = "L3DATA", + .cache_level= 3, + .cache = { + .min_cbm_bits = 1, + .cbm_idx_mult = 2, + .cbm_idx_offset = 0, + }, + .domains= domain_init(RDT_RESOURCE_L3DATA), + .parse_ctrlval = parse_cbm, + .format_str = "%d=%0*x", + .fflags = RFTYPE_RES_CACHE, + }, .msr_base = MSR_IA32_L3_CBM_BASE, .msr_update = cat_wrmsr, - .cache_level= 3, - .cache = { - .min_cbm_bits = 1, - .cbm_idx_mult = 2, - .cbm_idx_offset = 0, - }, - .parse_ctrlval = parse_cbm, - .format_str = "%d=%0*x", - .fflags = RFTYPE_RES_CACHE, }, [RDT_RESOURCE_L3CODE] = { - .rid= RDT_RESOURCE_L3CODE, - .name = "L3CODE", - .domains= domain_init(RDT_RESOURCE_L3CODE), + .resctrl = { + .rid=
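A note on the 'type-trickery' mentioned in the commit message: one common way to get from the embedded generic struct back to its arch wrapper is container_of(). The following is a minimal user-space sketch, not code from the patch. The reduced field set, the example_l3 instance and the helper name resctrl_to_arch_res() (chosen by analogy with resctrl_to_arch_dom(), which appears later in the series) are assumptions, and 0xc90 is only an illustrative value for msr_base.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; most fields are omitted. */
struct rdt_resource {
	int rid;
	const char *name;
	int cache_level;
};

struct rdt_hw_resource {
	struct rdt_resource resctrl;	/* the part intended to be arch-independent */
	unsigned int msr_base;		/* x86-only state stays in the wrapper */
};

/* User-space equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical helper: recover the x86 wrapper from the embedded struct. */
static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
{
	assert(r != NULL);
	return container_of(r, struct rdt_hw_resource, resctrl);
}

/* Example instance, laid out like the new rdt_resources_all[] entries. */
struct rdt_hw_resource example_l3 = {
	.resctrl = { .rid = 0, .name = "L3", .cache_level = 3 },
	.msr_base = 0xc90,	/* illustrative value only */
};
```

Generic code would hold only the struct rdt_resource pointer; anything that needs msr_base has to go through the helper, which is what makes the 'hw' accesses easy to spot.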
[PATCH 15/24] x86/resctrl: Add a helper to read a closid's configuration
The hardware configuration may look completely different to the values resctrl gets from user-space. The staged configuration and resctrl_arch_update_domains() allow the architecture to convert or translate these values. (e.g. Arm's MPAM may back MBA's percentage control using the 'BWPBM' bitmap) Resctrl shouldn't read or write these values directly. As a step towards taking direct access away, add a helper to read the current configuration. This will allow another architecture to scale the bitmaps if necessary, and possibly use controls that don't take the user-space control format at all. Signed-off-by: James Morse --- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 16 ++--- arch/x86/kernel/cpu/resctrl/monitor.c | 6 +++- arch/x86/kernel/cpu/resctrl/rdtgroup.c| 43 ++- include/linux/resctrl.h | 2 ++ 4 files changed, 37 insertions(+), 30 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index 91864c2e5795..0cf2f24e5c3b 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -428,22 +428,30 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of, return ret ?: nbytes; } +void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d, +u32 closid, u32 *value) +{ + struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d); + + if (!is_mba_sc(r)) + *value = hw_dom->ctrl_val[closid]; + else + *value = hw_dom->mbps_val[closid]; +} + static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid) { struct rdt_resource *r = schema->res; - struct rdt_hw_domain *hw_dom; struct rdt_domain *dom; bool sep = false; u32 ctrl_val; seq_printf(s, "%*s:", RESCTRL_NAME_LEN, schema->name); list_for_each_entry(dom, &r->domains, list) { - hw_dom = resctrl_to_arch_dom(dom); if (sep) seq_puts(s, ";"); - ctrl_val = (!is_mba_sc(r) ? 
hw_dom->ctrl_val[closid] : - hw_dom->mbps_val[closid]); + resctrl_arch_get_config(r, dom, closid, &ctrl_val); seq_printf(s, r->format_str, dom->id, max_data_width, ctrl_val); sep = true; diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 8b7d7ebfcd4b..6a62f1323b27 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -379,8 +379,12 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm) hw_dom_mba = resctrl_to_arch_dom(dom_mba); cur_bw = pmbm_data->prev_bw; - user_bw = hw_dom_mba->mbps_val[closid]; + resctrl_arch_get_config(r_mba, dom_mba, closid, &user_bw); delta_bw = pmbm_data->delta_bw; + /* +* resctrl_arch_get_config() chooses the mbps/ctrl value to return +* based on is_mba_sc(). For now, reach into the hw_dom. +*/ cur_msr_val = hw_dom_mba->ctrl_val[closid]; /* diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index c6689cad1ce7..f168f5a39242 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -911,27 +911,27 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of, int i, hwb, swb, excl, psl; enum rdtgrp_mode mode; bool sep = false; - u32 *ctrl; + u32 ctrl_val; mutex_lock(&rdtgroup_mutex); hw_shareable = r->cache.shareable_bits; list_for_each_entry(dom, &r->domains, list) { if (sep) seq_putc(seq, ';'); - ctrl = resctrl_to_arch_dom(dom)->ctrl_val; sw_shareable = 0; exclusive = 0; seq_printf(seq, "%d=", dom->id); - for (i = 0; i < closids_supported(); i++, ctrl++) { + for (i = 0; i < closids_supported(); i++) { if (!closid_allocated(i)) continue; + resctrl_arch_get_config(r, dom, i, &ctrl_val); mode = rdtgroup_mode_by_closid(i); switch (mode) { case RDT_MODE_SHAREABLE: - sw_shareable |= *ctrl; + sw_shareable |= ctrl_val; break; case RDT_MODE_EXCLUSIVE: - exclusive |= *ctrl; + exclusive |= ctrl_val; break; case RDT_MODE_PSEUDO_LOCKSETUP: /* @@ -1190,7 +1190,6 @@ static bool 
__rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain
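The accessor pattern this patch introduces can be sketched outside the kernel roughly as follows. This is a simplified illustration, not the patch's code: struct hw_domain, NUM_CLOSID and the mba_sc flag (standing in for is_mba_sc(r)) are assumptions made so the example is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CLOSID 4	/* illustrative; real hardware reports this */

/* Simplified stand-in for struct rdt_hw_domain. */
struct hw_domain {
	uint32_t ctrl_val[NUM_CLOSID];	/* values as programmed into hardware */
	uint32_t mbps_val[NUM_CLOSID];	/* MBps targets for the MBA software controller */
};

/*
 * Callers pass a closid and receive the value through an out-parameter,
 * without needing to know which array backs it.
 */
static inline void get_config(const struct hw_domain *d, bool mba_sc,
			      uint32_t closid, uint32_t *value)
{
	assert(closid < NUM_CLOSID);
	*value = mba_sc ? d->mbps_val[closid] : d->ctrl_val[closid];
}
```

Because the caller never indexes the arrays itself, another architecture would be free to store or scale the values differently behind the same call.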
[PATCH 09/24] x86/resctrl: Change rdt_resource to resctrl_schema in pseudo_lock_region
struct pseudo_lock_region points to the rdt_resource. Once the resources are merged, this won't be unique. The resource name is moving into the schema, so that eventually resctrl can generate it. Change pseudo_lock_region's rdt_resource pointer for a schema pointer. Signed-off-by: James Morse --- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 4 ++-- arch/x86/kernel/cpu/resctrl/internal.h| 6 +++--- arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 8 arch/x86/kernel/cpu/resctrl/rdtgroup.c| 4 ++-- 4 files changed, 11 insertions(+), 11 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index d3f9d142f58a..a65ff53394ed 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -228,7 +228,7 @@ static int parse_line(char *line, struct resctrl_schema *s, * the required initialization for single * region and return. */ - rdtgrp->plr->r = r; + rdtgrp->plr->s = s; rdtgrp->plr->d = d; rdtgrp->plr->cbm = d->new_ctrl; d->plr = rdtgrp->plr; @@ -429,7 +429,7 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of, ret = -ENODEV; } else { seq_printf(s, "%s:%d=%x\n", - rdtgrp->plr->r->name, + rdtgrp->plr->s->res->name, rdtgrp->plr->d->id, rdtgrp->plr->cbm); } diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 1e1f2493a87f..27671a654f8b 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -158,8 +158,8 @@ struct mongroup { /** * struct pseudo_lock_region - pseudo-lock region information - * @r: RDT resource to which this pseudo-locked region - * belongs + * @s: Resctrl schema for the resource to which this + * pseudo-locked region belongs * @d: RDT domain to which this pseudo-locked region * belongs * @cbm: bitmask of the pseudo-locked region @@ -179,7 +179,7 @@ struct mongroup { * @pm_reqs: Power management QoS requests related to this region */ struct pseudo_lock_region { - struct 
rdt_resource *r; + struct resctrl_schema *s; struct rdt_domain *d; u32 cbm; wait_queue_head_t lock_thread_wq; diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c index d58f2ffa65e0..d9d9861f244f 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -246,7 +246,7 @@ static void pseudo_lock_region_clear(struct pseudo_lock_region *plr) plr->line_size = 0; kfree(plr->kmem); plr->kmem = NULL; - plr->r = NULL; + plr->s = NULL; if (plr->d) plr->d->plr = NULL; plr->d = NULL; @@ -290,10 +290,10 @@ static int pseudo_lock_region_init(struct pseudo_lock_region *plr) ci = get_cpu_cacheinfo(plr->cpu); - plr->size = rdtgroup_cbm_to_size(plr->r, plr->d, plr->cbm); + plr->size = rdtgroup_cbm_to_size(plr->s->res, plr->d, plr->cbm); for (i = 0; i < ci->num_leaves; i++) { - if (ci->info_list[i].level == plr->r->cache_level) { + if (ci->info_list[i].level == plr->s->res->cache_level) { plr->line_size = ci->info_list[i].coherency_line_size; return 0; } @@ -796,7 +796,7 @@ bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned long cbm unsigned long cbm_b; if (d->plr) { - cbm_len = d->plr->r->cache.cbm_len; + cbm_len = d->plr->s->res->cache.cbm_len; cbm_b = d->plr->cbm; if (bitmap_intersects(&cbm, &cbm_b, cbm_len)) return true; diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 592a517afd6a..311a3890bc53 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -1441,8 +1441,8 @@ static int rdtgroup_size_show(struct kernfs_open_file *of, ret = -ENODEV; } else { seq_printf(s, "%*s:", max_name_width, - rdtgrp->plr->r->name); - size = rdtgroup_cbm_to_size(rdtgrp->plr->r, +
[PATCH 21/24] x86/resctrl: Calculate the index from the configuration type
resctrl uses cbm_idx() to map a closid to an index in the configuration array. This is based on whether this is a CODE, DATA or BOTH resource. To merge the resources, resctrl needs to make this decision based on something else, as there will only be one resource. Decide based on the staged configuration type. This makes the static mult and offset parameters set by the arch code redundant. Signed-off-by: James Morse --- arch/x86/kernel/cpu/resctrl/core.c| 12 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 17 +++-- include/linux/resctrl.h | 6 -- 3 files changed, 11 insertions(+), 24 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index 79b17ece4528..e2f5ea129be2 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -69,8 +69,6 @@ struct rdt_hw_resource rdt_resources_all[] = { .cache_level= 3, .cache = { .min_cbm_bits = 1, - .cbm_idx_mult = 1, - .cbm_idx_offset = 0, }, .domains= domain_init(RDT_RESOURCE_L3), .parse_ctrlval = parse_cbm, @@ -89,8 +87,6 @@ struct rdt_hw_resource rdt_resources_all[] = { .cache_level= 3, .cache = { .min_cbm_bits = 1, - .cbm_idx_mult = 2, - .cbm_idx_offset = 0, }, .domains= domain_init(RDT_RESOURCE_L3DATA), .parse_ctrlval = parse_cbm, @@ -109,8 +105,6 @@ struct rdt_hw_resource rdt_resources_all[] = { .cache_level= 3, .cache = { .min_cbm_bits = 1, - .cbm_idx_mult = 2, - .cbm_idx_offset = 1, }, .domains= domain_init(RDT_RESOURCE_L3CODE), .parse_ctrlval = parse_cbm, @@ -129,8 +123,6 @@ struct rdt_hw_resource rdt_resources_all[] = { .cache_level= 2, .cache = { .min_cbm_bits = 1, - .cbm_idx_mult = 1, - .cbm_idx_offset = 0, }, .domains= domain_init(RDT_RESOURCE_L2), .parse_ctrlval = parse_cbm, @@ -149,8 +141,6 @@ struct rdt_hw_resource rdt_resources_all[] = { .cache_level= 2, .cache = { .min_cbm_bits = 1, - .cbm_idx_mult = 2, - .cbm_idx_offset = 0, }, .domains= domain_init(RDT_RESOURCE_L2DATA), .parse_ctrlval = parse_cbm, @@ -169,8 +159,6 @@ struct rdt_hw_resource 
rdt_resources_all[] = { .cache_level= 2, .cache = { .min_cbm_bits = 1, - .cbm_idx_mult = 2, - .cbm_idx_offset = 1, }, .domains= domain_init(RDT_RESOURCE_L2CODE), .parse_ctrlval = parse_cbm, diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index 28a251cf3c60..cb91dcd0f329 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -249,12 +249,17 @@ static int parse_line(char *line, struct resctrl_schema *s, return -EINVAL; } -static unsigned int cbm_idx(struct rdt_resource *r, unsigned int closid) +static u32 get_config_index(u32 closid, enum resctrl_conf_type type) { - if (r->rid == RDT_RESOURCE_MBA) + switch (type) { + default: + case CDP_BOTH: return closid; - - return closid * r->cache.cbm_idx_mult + r->cache.cbm_idx_offset; + case CDP_CODE: + return (closid * 2) + 1; + case CDP_DATA: + return (closid * 2); + } } /* @@ -305,7 +310,7 @@ int resctrl_arch_update_domains(struct rdt_resource *r) if (!cfg->have_new_ctrl) continue; - idx = cbm_idx(r, cfg->closid); + idx = get_config_index(cfg->closid, t); if (!apply_config(hw_dom, cfg, cpu_mask, idx, mba_sc))
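To see why the static mult and offset parameters become redundant, the old and new index calculations can be compared directly. The sketch below is a user-space illustration under stated assumptions: cbm_idx() here takes the multiplier and offset as plain arguments rather than reading them from the resource, the MBA special case is left out, the enum ordering is arbitrary, and check_schemes_agree() is a hypothetical helper added only for the comparison.

```c
#include <assert.h>
#include <stdint.h>

enum resctrl_conf_type { CDP_BOTH, CDP_CODE, CDP_DATA };

/* Old scheme: per-resource multiplier and offset set by the arch code. */
static inline uint32_t cbm_idx(uint32_t closid, uint32_t mult, uint32_t offset)
{
	return closid * mult + offset;
}

/* New scheme, following the patch: derive the slot from the config type. */
static inline uint32_t get_config_index(uint32_t closid,
					enum resctrl_conf_type type)
{
	switch (type) {
	default:
	case CDP_BOTH:
		return closid;
	case CDP_CODE:
		return closid * 2 + 1;
	case CDP_DATA:
		return closid * 2;
	}
}

/* Hypothetical check: both schemes pick the same slot for closids below n. */
static inline void check_schemes_agree(uint32_t n)
{
	for (uint32_t c = 0; c < n; c++) {
		assert(cbm_idx(c, 1, 0) == get_config_index(c, CDP_BOTH));
		assert(cbm_idx(c, 2, 1) == get_config_index(c, CDP_CODE));
		assert(cbm_idx(c, 2, 0) == get_config_index(c, CDP_DATA));
	}
}
```

For closid 3 this gives slot 3 for BOTH, slot 6 for DATA and slot 7 for CODE, matching the (mult, offset) pairs (1, 0), (2, 0) and (2, 1) that the patch removes from the resource definitions.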
[PATCH 20/24] x86/resctrl: Apply offset correction when config is staged
When resctrl comes to write the CAT MSR values, it applies an adjustment based on the style of the resource. CODE and DATA resources have their closid mapped into an odd/even range. Previously the ctrlval array was increased to be the same size regardless of CODE/DATA/BOTH. Move the arithmetic into apply_config() so that odd/even slots in the ctrlval array are used. This makes it possible to merge the resources. In future, the arithmetic will be based on the style of the configuration, not the resource. Signed-off-by: James Morse --- arch/x86/kernel/cpu/resctrl/core.c| 15 +-- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 15 --- arch/x86/kernel/cpu/resctrl/rdtgroup.c| 7 --- 3 files changed, 13 insertions(+), 24 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index b2fda4cd88ba..79b17ece4528 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -195,11 +195,6 @@ struct rdt_hw_resource rdt_resources_all[] = { }, }; -static unsigned int cbm_idx(struct rdt_resource *r, unsigned int closid) -{ - return closid * r->cache.cbm_idx_mult + r->cache.cbm_idx_offset; -} - /* * cache_alloc_hsw_probe() - Have to probe for Intel haswell server CPUs * as they do not have CPUID enumeration support for Cache allocation. @@ -438,7 +433,7 @@ cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r) unsigned int i; for (i = m->low; i < m->high; i++) - wrmsrl(hw_res->msr_base + cbm_idx(r, i), hw_dom->ctrl_val[i]); + wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]); } struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r) @@ -549,14 +544,6 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d) m.low = 0; m.high = hw_res->num_closid; - - /* -* temporary: the array is full-size, but cat_wrmsr() still re-maps -* the index. 
-*/ - if (hw_res->conf_type != CDP_BOTH) - m.high /= 2; - hw_res->msr_update(d, &m, r); return 0; } diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index f6b4049c67c2..28a251cf3c60 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -249,6 +249,14 @@ static int parse_line(char *line, struct resctrl_schema *s, return -EINVAL; } +static unsigned int cbm_idx(struct rdt_resource *r, unsigned int closid) +{ + if (r->rid == RDT_RESOURCE_MBA) + return closid; + + return closid * r->cache.cbm_idx_mult + r->cache.cbm_idx_offset; +} + /* * Merge the staged config with the domains configuration array. * Return true if changes were made. @@ -297,7 +305,7 @@ int resctrl_arch_update_domains(struct rdt_resource *r) if (!cfg->have_new_ctrl) continue; - idx = cfg->closid; + idx = cbm_idx(r, cfg->closid); if (!apply_config(hw_dom, cfg, cpu_mask, idx, mba_sc)) continue; @@ -432,11 +440,12 @@ void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d, u32 closid, enum resctrl_conf_type type, u32 *value) { struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d); + u32 idx = cbm_idx(r, closid); if (!is_mba_sc(r)) - *value = hw_dom->ctrl_val[closid]; + *value = hw_dom->ctrl_val[idx]; else - *value = hw_dom->mbps_val[closid]; + *value = hw_dom->mbps_val[idx]; } static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid) diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 4fa6c386d751..162e415d5d09 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -2379,13 +2379,6 @@ static int reset_all_ctrls(struct rdt_resource *r) msr_param.low = 0; msr_param.high = hw_res->num_closid; - /* -* temporary: the array is full-sized, but cat_wrmsr() still re-maps -* the index. 
-*/ - if (hw_res->cdp_enabled) - msr_param.high /= 2; - /* * Disable resource control for this resource by setting all * CBMs in all domains to the maximum mask value. Pick one CPU -- 2.28.0
[PATCH 10/24] x86/resctrl: Move the schema names into struct resctrl_schema
Move the names used for the schemata file out of the resource and into struct resctrl_schema. This allows one resource to have two different names, based on the other schema properties. This patch copies the names; eventually resctrl will generate them. Remove the arch code's max_name_width, as this is now resctrl's problem. Signed-off-by: James Morse --- arch/x86/kernel/cpu/resctrl/core.c| 9 ++--- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 +++--- arch/x86/kernel/cpu/resctrl/internal.h| 2 +- arch/x86/kernel/cpu/resctrl/rdtgroup.c| 17 - include/linux/resctrl.h | 7 +++ 5 files changed, 25 insertions(+), 20 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index 1ed5e04031e6..cda071009fed 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -37,10 +37,10 @@ DEFINE_MUTEX(rdtgroup_mutex); DEFINE_PER_CPU(struct resctrl_pqr_state, pqr_state); /* - * Used to store the max resource name width and max resource data width + * Used to store the max resource data width * to display the schemata in a tabular format */ -int max_name_width, max_data_width; +int max_data_width; /* * Global boolean for rdt_alloc which is true if any @@ -776,13 +776,8 @@ static int resctrl_offline_cpu(unsigned int cpu) static __init void rdt_init_padding(void) { struct rdt_resource *r; - int cl; for_each_alloc_capable_rdt_resource(r) { - cl = strlen(r->name); - if (cl > max_name_width) - max_name_width = cl; - if (r->data_width > max_data_width) max_data_width = r->data_width; } diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index a65ff53394ed..28d69c78c29e 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -291,13 +291,11 @@ static int rdtgroup_parse_resource(char *resname, char *tok, struct rdtgroup *rdtgrp) { struct resctrl_schema *s; - struct rdt_resource *r; lockdep_assert_held(&rdtgroup_mutex); 
list_for_each_entry(s, &resctrl_all_schema, list) { - r = s->res; - if (!strcmp(resname, r->name) && rdtgrp->closid < s->num_closid) + if (!strcmp(resname, s->name) && rdtgrp->closid < s->num_closid) return parse_line(tok, s, rdtgrp); } rdt_last_cmd_printf("Unknown or unsupported resource name '%s'\n", resname); @@ -391,7 +389,7 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo bool sep = false; u32 ctrl_val; - seq_printf(s, "%*s:", max_name_width, r->name); + seq_printf(s, "%*s:", RESCTRL_NAME_LEN, schema->name); list_for_each_entry(dom, &r->domains, list) { hw_dom = resctrl_to_arch_dom(dom); if (sep) @@ -411,7 +409,6 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of, { struct resctrl_schema *schema; struct rdtgroup *rdtgrp; - struct rdt_resource *r; int ret = 0; u32 closid; @@ -419,8 +416,7 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of, if (rdtgrp) { if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) { list_for_each_entry(schema, &resctrl_all_schema, list) { - r = schema->res; - seq_printf(s, "%s:uninitialized\n", r->name); + seq_printf(s, "%s:uninitialized\n", schema->name); } } else if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED) { if (!rdtgrp->plr->d) { diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 27671a654f8b..5294ae0c3ed9 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -248,7 +248,7 @@ struct rdtgroup { /* List of all resource groups */ extern struct list_head rdt_all_groups; -extern int max_name_width, max_data_width; +extern int max_data_width; int __init rdtgroup_init(void); void __exit rdtgroup_exit(void); diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 311a3890bc53..48f4d6783647 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -1440,8 +1440,8 @@ static int rdtgroup_size_show(struct kernfs_open_file *of, 
rdt_last_cmd_puts("Cache domain offline\n"); ret = -ENODEV; } else { - seq_printf(s, "%*s:", max_name_width, - rdtgrp->plr->s->res->name); + seq_printf(s, "%*s:", RESCTRL_NAME_LEN, +
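The effect of keying the lookup on the schema name rather than the resource name can be illustrated with a small user-space sketch. It assumes a plain array instead of the kernel's linked list; EXAMPLE_NAME_LEN, find_schema(), merged_l3 and example_schemas are hypothetical names introduced for the illustration and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_NAME_LEN 8	/* hypothetical buffer size for this sketch */

struct rdt_resource {
	int rid;
};

struct resctrl_schema {
	char name[EXAMPLE_NAME_LEN];	/* name shown in the schemata file */
	struct rdt_resource *res;	/* resource the schema configures */
};

/* Find a schema by its schemata-file name; returns NULL if none matches. */
static inline struct resctrl_schema *find_schema(struct resctrl_schema *all,
						 size_t n, const char *resname)
{
	assert(all != NULL && resname != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!strcmp(resname, all[i].name))
			return &all[i];
	}
	return NULL;
}

/* One resource, reachable under two different schema names. */
struct rdt_resource merged_l3 = { .rid = 0 };
struct resctrl_schema example_schemas[] = {
	{ .name = "L3CODE", .res = &merged_l3 },
	{ .name = "L3DATA", .res = &merged_l3 },
};
```

Both lookups succeed and resolve to the same resource, which is the property a later merge of the CODE and DATA resources would rely on.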
[PATCH 14/24] x86/resctrl: Make update_domains() learn the affected closids
Now that the closid is present in the staged configuration, update_domains() can learn which low/high values it should update, instead of being explicitly told. This paves the way for multiple configuration changes being staged, affecting different indexes in the ctrlval array. Remove the single passed in closid, and update msr_param as each staged config is applied. Once the L2/L2CODE/L2DATA resources are merged this will allow update_domains() to be called once for the single resource, even when CDP is in use. This results in both CODE and DATA configurations being applied and the two consecutive closids being updated with a single smp_call_function_many(). This keeps the CDP odd/even behaviour inside the arch code for resctrl, so that architectures that don't do this don't need to emulate it. As update_domains() applies the staged configuration to the hw_dom's configuration array, and updates the hardware, make it part of the arch code interface. Signed-off-by: James Morse --- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 40 +-- arch/x86/kernel/cpu/resctrl/internal.h| 6 ++-- arch/x86/kernel/cpu/resctrl/rdtgroup.c| 2 +- include/linux/resctrl.h | 1 + 4 files changed, 35 insertions(+), 14 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index f7152c7fdc1b..91864c2e5795 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -249,37 +249,44 @@ static int parse_line(char *line, struct resctrl_schema *s, return -EINVAL; } -static void apply_config(struct rdt_hw_domain *hw_dom, +/* + * Merge the staged config with the domains configuration array. + * Return true if changes were made. + */ +static bool apply_config(struct rdt_hw_domain *hw_dom, struct resctrl_staged_config *cfg, -cpumask_var_t cpu_mask, bool mba_sc) +cpumask_var_t cpu_mask, u32 idx, bool mba_sc) { struct rdt_domain *dom = &hw_dom->resctrl; u32 *dc = mba_sc ? 
hw_dom->mbps_val : hw_dom->ctrl_val; - if (cfg->new_ctrl != dc[cfg->closid]) { + cfg->have_new_ctrl = false; + if (cfg->new_ctrl != dc[idx]) { cpumask_set_cpu(cpumask_any(&dom->cpu_mask), cpu_mask); - dc[cfg->closid] = cfg->new_ctrl; + dc[idx] = cfg->new_ctrl; + + return true; } - cfg->have_new_ctrl = false; + return false; } -int update_domains(struct rdt_resource *r, int closid) +int resctrl_arch_update_domains(struct rdt_resource *r) { struct resctrl_staged_config *cfg; struct rdt_hw_domain *hw_dom; + bool msr_param_init = false; struct msr_param msr_param; enum resctrl_conf_type t; cpumask_var_t cpu_mask; struct rdt_domain *d; bool mba_sc; int cpu; + u32 idx; if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL)) return -ENOMEM; - msr_param.low = closid; - msr_param.high = msr_param.low + 1; msr_param.res = r; mba_sc = is_mba_sc(r); @@ -290,10 +297,23 @@ int update_domains(struct rdt_resource *r, int closid) if (!cfg->have_new_ctrl) continue; - apply_config(hw_dom, cfg, cpu_mask, mba_sc); + idx = cfg->closid; + if (!apply_config(hw_dom, cfg, cpu_mask, idx, mba_sc)) + continue; + + if (!msr_param_init) { + msr_param.low = idx; + msr_param.high = idx; + msr_param_init = true; + } else { + msr_param.low = min(msr_param.low, idx); + msr_param.high = max(msr_param.high, idx); + } } } + msr_param.high += 1; + /* * Avoid writing the control msr with control values when * MBA software controller is enabled @@ -387,7 +407,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of, list_for_each_entry(s, &resctrl_all_schema, list) { r = s->res; - ret = update_domains(r, rdtgrp->closid); + ret = resctrl_arch_update_domains(r); if (ret) goto out; } diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 5294ae0c3ed9..e86550d888cc 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -324,8 +324,8 @@ static inline struct rdt_hw_domain *resctrl_to_arch_dom(struct rdt_domain *r) */ struct msr_param { 
struct rdt_resource *res; - int low; - int
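The low/high bookkeeping in resctrl_arch_update_domains() amounts to folding the indexes that actually changed into one half-open range. A minimal sketch of that idea follows; struct msr_range and collect_range() are hypothetical names, and returning false when nothing changed is a choice made for this example rather than something the patch shows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Range of slots to rewrite; high is exclusive, as with msr_param. */
struct msr_range {
	uint32_t low;
	uint32_t high;
};

/*
 * Fold the indexes of the configs that changed into [low, high).
 * Returns false, leaving *out untouched, if no index was supplied.
 */
static inline bool collect_range(const uint32_t *changed_idx, size_t n,
				 struct msr_range *out)
{
	bool init = false;

	assert(out != NULL);
	for (size_t i = 0; i < n; i++) {
		uint32_t idx = changed_idx[i];

		if (!init) {
			out->low = idx;
			out->high = idx;
			init = true;
		} else {
			if (idx < out->low)
				out->low = idx;
			if (idx > out->high)
				out->high = idx;
		}
	}
	if (init)
		out->high += 1;	/* convert the inclusive maximum to an exclusive bound */
	return init;
}
```

With CDP in use, a DATA change at slot 6 and a CODE change at slot 7 collapse into the single range [6, 8), so one cross-CPU call can cover both.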
[PATCH 12/24] x86/resctrl: Add closid to the staged config
Once the L2/L2CODE/L2DATA resources are merged, there may be two configurations staged for one resource when CDP is enabled. The closid should always be passed with the type of configuration to the arch code. Because update_domains() will eventually apply a set of configurations, it should take the closid from the same place, so they pair up. Move the closid to be a staged parameter. Signed-off-by: James Morse --- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 ++ arch/x86/kernel/cpu/resctrl/rdtgroup.c| 6 -- include/linux/resctrl.h | 2 ++ 3 files changed, 12 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index 0c95ed83eb05..b107c0202cfb 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -72,6 +72,7 @@ int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s, if (!bw_validate(data->buf, &bw_val, r)) return -EINVAL; cfg->new_ctrl = bw_val; + cfg->closid = data->rdtgrp->closid; cfg->have_new_ctrl = true; return 0; @@ -178,6 +179,7 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s, } cfg->new_ctrl = cbm_val; + cfg->closid = data->rdtgrp->closid; cfg->have_new_ctrl = true; return 0; @@ -245,15 +247,15 @@ static int parse_line(char *line, struct resctrl_schema *s, } static void apply_config(struct rdt_hw_domain *hw_dom, -struct resctrl_staged_config *cfg, int closid, +struct resctrl_staged_config *cfg, cpumask_var_t cpu_mask, bool mba_sc) { struct rdt_domain *dom = &hw_dom->resctrl; u32 *dc = mba_sc ? 
hw_dom->mbps_val : hw_dom->ctrl_val; - if (cfg->new_ctrl != dc[closid]) { + if (cfg->new_ctrl != dc[cfg->closid]) { cpumask_set_cpu(cpumask_any(&dom->cpu_mask), cpu_mask); - dc[closid] = cfg->new_ctrl; + dc[cfg->closid] = cfg->new_ctrl; } cfg->have_new_ctrl = false; @@ -284,7 +286,7 @@ int update_domains(struct rdt_resource *r, int closid) if (!cfg->have_new_ctrl) continue; - apply_config(hw_dom, cfg, closid, cpu_mask, mba_sc); + apply_config(hw_dom, cfg, cpu_mask, mba_sc); } } diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index c307170ee45f..1092631ac0b3 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -2806,6 +2806,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s, rdt_last_cmd_printf("No space on %s:%d\n", s->name, d->id); return -ENOSPC; } + cfg->closid = closid; cfg->have_new_ctrl = true; return 0; @@ -2836,7 +2837,7 @@ static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid) } /* Initialize MBA resource with default values. */ -static void rdtgroup_init_mba(struct rdt_resource *r) +static void rdtgroup_init_mba(struct rdt_resource *r, u32 closid) { struct resctrl_staged_config *cfg; struct rdt_domain *d; @@ -2844,6 +2845,7 @@ static void rdtgroup_init_mba(struct rdt_resource *r) list_for_each_entry(d, &r->domains, list) { cfg = &d->staged_config[0]; cfg->new_ctrl = is_mba_sc(r) ? 
MBA_MAX_MBPS : r->default_ctrl; + cfg->closid = closid; cfg->have_new_ctrl = true; } } @@ -2860,7 +2862,7 @@ static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp) list_for_each_entry(s, &resctrl_all_schema, list) { r = s->res; if (r->rid == RDT_RESOURCE_MBA) { - rdtgroup_init_mba(r); + rdtgroup_init_mba(r, rdtgrp->closid); } else { ret = rdtgroup_init_cat(s, rdtgrp->closid); if (ret < 0) diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index f1164bbb66c5..695247c08ba3 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -28,10 +28,12 @@ enum resctrl_conf_type { /** * struct resctrl_staged_config - parsed configuration to be applied + * @closid: The closid the new configuration applies to * @new_ctrl: new ctrl value to be loaded * @have_new_ctrl: did user provide new_ctrl for this domain */ struct resctrl_staged_config { + u32 closid; u32 new_ctrl; bool have_new_ctrl; }; -- 2.28.0
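Keeping the closid next to the value it belongs to means the apply step no longer needs a separate parameter that could fall out of step. The sketch below shows that pairing in user space. The struct mirrors the fields added to resctrl_staged_config, but stage_config(), the flat ctrl_val array, NUM_CLOSID and the bool return from apply_config() are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CLOSID 4	/* illustrative array size */

struct resctrl_staged_config {
	uint32_t closid;	/* the closid the new configuration applies to */
	uint32_t new_ctrl;	/* new control value to be loaded */
	bool have_new_ctrl;	/* set while a value is waiting to be applied */
};

/* Parser side: record the value together with the closid it is for. */
static inline void stage_config(struct resctrl_staged_config *cfg,
				uint32_t closid, uint32_t val)
{
	cfg->new_ctrl = val;
	cfg->closid = closid;
	cfg->have_new_ctrl = true;
}

/*
 * Apply side: the slot comes from the staged config itself.
 * Returns true if the stored value changed.
 */
static inline bool apply_config(uint32_t *ctrl_val,
				struct resctrl_staged_config *cfg)
{
	bool changed = false;

	assert(cfg->closid < NUM_CLOSID);
	if (cfg->new_ctrl != ctrl_val[cfg->closid]) {
		ctrl_val[cfg->closid] = cfg->new_ctrl;
		changed = true;
	}
	cfg->have_new_ctrl = false;
	return changed;
}
```

Staging the same value twice leaves the array untouched the second time, which is the case a caller can use to skip a hardware update.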
[GIT PULL] Btrfs fixes for 5.10-rc2
Hi,

please pull the following branch with fixes. Thanks.

- lockdep fixes
- drop path locks before manipulating sysfs objects or qgroups
- preliminary fixes before tree locks get switched to rwsem
- use annotated seqlock
- build warning fixes (printk format)
- fix relocation vs fallocate race
- tree checker properly validates number of stripes and parity
- readahead vs device replace fixes
- iomap dio fix for unnecessary buffered io fallback

The following changes since commit 1fd4033dd011a3525bacddf37ab9eac425d25c4f:

  btrfs: rename BTRFS_INODE_ORDERED_DATA_CLOSE flag (2020-10-07 12:18:00 +0200)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.10-rc1-tag

for you to fetch changes up to d5c8238849e7bae6063dfc16c08ed62cee7ee688:

  btrfs: convert data_seqcount to seqcount_mutex_t (2020-10-27 15:11:51 +0100)

Daniel Xu (1):
      btrfs: tree-checker: validate number of chunk stripes and parity

Davidlohr Bueso (1):
      btrfs: convert data_seqcount to seqcount_mutex_t

Filipe Manana (3):
      btrfs: fix relocation failure due to race with fallocate
      btrfs: fix use-after-free on readahead extent after failure to create it
      btrfs: fix readahead hang and use-after-free after removing a device

Johannes Thumshirn (1):
      btrfs: don't fallback to buffered read if we don't need to

Josef Bacik (3):
      btrfs: drop the path before adding block group sysfs files
      btrfs: drop the path before adding qgroup items when enabling qgroups
      btrfs: add a helper to read the tree_root commit root for backref lookup

Pujin Shi (1):
      btrfs: tree-checker: fix incorrect printk format

 fs/btrfs/backref.c      |  13 -
 fs/btrfs/block-group.c  |   1 +
 fs/btrfs/ctree.h        |   2 +
 fs/btrfs/dev-replace.c  |   5 ++
 fs/btrfs/disk-io.c      | 139 ++--
 fs/btrfs/disk-io.h      |   3 ++
 fs/btrfs/extent-tree.c  |   2 +-
 fs/btrfs/file.c         |   3 +-
 fs/btrfs/inode.c        |   8 ++-
 fs/btrfs/qgroup.c       |  18 +++
 fs/btrfs/reada.c        |  47
 fs/btrfs/tree-checker.c |  18 +++
 fs/btrfs/volumes.c      |   5 +-
 fs/btrfs/volumes.h      |  12 ++---
 14 files changed, 225 insertions(+), 51 deletions(-)
[PATCH 23/24] x86/resctrl: Remove rdt_cdp_peer_get()
Now that the resources share the ctrlval array, the configuration can be
read from either of them, so rdt_cdp_peer_get() is no longer needed to map
the resource and search for the corresponding domain.

Replace it with a helper to return the 'other' CODE/DATA type, and use the
existing get-config helper.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 99 --
 1 file changed, 14 insertions(+), 85 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 162e415d5d09..0d561679f7e8 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1094,82 +1094,17 @@ static int rdtgroup_mode_show(struct kernfs_open_file *of,
 	return 0;
 }
 
-/**
- * rdt_cdp_peer_get - Retrieve CDP peer if it exists
- * @r: RDT resource to which RDT domain @d belongs
- * @d: Cache instance for which a CDP peer is requested
- * @r_cdp: RDT resource that shares hardware with @r (RDT resource peer)
- *         Used to return the result.
- * @d_cdp: RDT domain that shares hardware with @d (RDT domain peer)
- *         Used to return the result.
- * @peer_type: The CDP configuration type of the peer resource.
- *
- * RDT resources are managed independently and by extension the RDT domains
- * (RDT resource instances) are managed independently also. The Code and
- * Data Prioritization (CDP) RDT resources, while managed independently,
- * could refer to the same underlying hardware. For example,
- * RDT_RESOURCE_L2CODE and RDT_RESOURCE_L2DATA both refer to the L2 cache.
- *
- * When provided with an RDT resource @r and an instance of that RDT
- * resource @d rdt_cdp_peer_get() will return if there is a peer RDT
- * resource and the exact instance that shares the same hardware.
- *
- * Return: 0 if a CDP peer was found, <0 on error or if no CDP peer exists.
- *         If a CDP peer was found, @r_cdp will point to the peer RDT resource
- *         and @d_cdp will point to the peer RDT domain.
- */
-static int rdt_cdp_peer_get(struct rdt_resource *r, struct rdt_domain *d,
-			    struct rdt_resource **r_cdp,
-			    struct rdt_domain **d_cdp,
-			    enum resctrl_conf_type *peer_type)
+static enum resctrl_conf_type resctrl_peer_type(enum resctrl_conf_type my_type)
 {
-	struct rdt_resource *_r_cdp = NULL;
-	struct rdt_domain *_d_cdp = NULL;
-	int ret = 0;
-
-	switch (r->rid) {
-	case RDT_RESOURCE_L3DATA:
-		_r_cdp = &rdt_resources_all[RDT_RESOURCE_L3CODE].resctrl;
-		*peer_type = CDP_CODE;
-		break;
-	case RDT_RESOURCE_L3CODE:
-		_r_cdp = &rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl;
-		*peer_type = CDP_DATA;
-		break;
-	case RDT_RESOURCE_L2DATA:
-		_r_cdp = &rdt_resources_all[RDT_RESOURCE_L2CODE].resctrl;
-		*peer_type = CDP_CODE;
-		break;
-	case RDT_RESOURCE_L2CODE:
-		_r_cdp = &rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl;
-		*peer_type = CDP_DATA;
-		break;
+	switch (my_type) {
+	case CDP_CODE:
+		return CDP_DATA;
+	case CDP_DATA:
+		return CDP_CODE;
 	default:
-		ret = -ENOENT;
-		goto out;
-	}
-
-	/*
-	 * When a new CPU comes online and CDP is enabled then the new
-	 * RDT domains (if any) associated with both CDP RDT resources
-	 * are added in the same CPU online routine while the
-	 * rdtgroup_mutex is held. It should thus not happen for one
-	 * RDT domain to exist and be associated with its RDT CDP
-	 * resource but there is no RDT domain associated with the
-	 * peer RDT CDP resource. Hence the WARN.
-	 */
-	_d_cdp = rdt_find_domain(_r_cdp, d->id, NULL);
-	if (WARN_ON(IS_ERR_OR_NULL(_d_cdp))) {
-		_r_cdp = NULL;
-		_d_cdp = NULL;
-		ret = -EINVAL;
+	case CDP_BOTH:
+		return CDP_BOTH;
 	}
-
-out:
-	*r_cdp = _r_cdp;
-	*d_cdp = _d_cdp;
-
-	return ret;
 }
 
 /**
@@ -1250,19 +1185,16 @@ static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d
 bool rdtgroup_cbm_overlaps(struct resctrl_schema *s, struct rdt_domain *d,
 			   unsigned long cbm, int closid, bool exclusive)
 {
-	enum resctrl_conf_type peer_type;
+	enum resctrl_conf_type peer_type = resctrl_peer_type(s->conf_type);
 	struct rdt_resource *r = s->res;
-	struct rdt_resource *r_cdp;
-	struct rdt_domain *d_cdp;
 
 	if (__rdtgroup_cbm_overlaps(r, d, cbm, closid, s->conf_type,
 				    exclusive))
 		return true;
 
-	if (rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp, &peer_type) < 0)
+	if (!resctrl_arch_get_cdp_enabled(r->rid))
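
The peer-type mapping introduced by this patch can be tried outside the kernel. The following is a minimal userspace sketch, not the kernel code: the enum `conf_type` and the function `peer_type()` are hypothetical stand-ins for `enum resctrl_conf_type` and `resctrl_peer_type()`, and the ordering of the enumerators is an assumption.

```c
/* Hypothetical stand-in for the kernel's enum resctrl_conf_type. */
enum conf_type {
	CDP_BOTH,
	CDP_CODE,
	CDP_DATA,
};

/*
 * Return the 'other' configuration type. CODE and DATA are each other's
 * peer; CDP_BOTH (and any unexpected value) has no distinct peer and
 * maps to CDP_BOTH.
 */
static enum conf_type peer_type(enum conf_type my_type)
{
	switch (my_type) {
	case CDP_CODE:
		return CDP_DATA;
	case CDP_DATA:
		return CDP_CODE;
	default:
		return CDP_BOTH;
	}
}
```

Because the mapping depends only on the configuration type, the caller no longer needs a resource or domain lookup to find the peer, which is what makes the large removal above possible.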
[PATCH 16/24] x86/resctrl: Add a helper to read/set the CDP configuration
Currently, whether CDP is enabled is described by the alloc_enabled and
alloc_capable flags, which are set differently between the L3 and
L3CODE+L3DATA resources.

To merge these resources and give us one configuration, the CDP state of
the resource needs tracking explicitly. Add cdp_capable as something
visible to resctrl, and cdp_enabled as something the arch code manages.

resctrl_arch_set_cdp_enabled() lets resctrl enable or disable CDP on a
resource. resctrl_arch_get_cdp_enabled() lets it read the current state.

With Arm's MPAM, separate code and data closids are part of the CPU
configuration. Enabling CDP for one resource means all resources see the
different closid values.

Signed-off-by: James Morse
---
It may be possible for MPAM to apply the same 'L3' configuration to the two
closids that are in use, giving the illusion that CDP is enabled for some
resources, but disabled for others ... but this will complicate monitoring.
---
 arch/x86/kernel/cpu/resctrl/core.c        |  4 ++
 arch/x86/kernel/cpu/resctrl/internal.h    | 11 +++-
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  4 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 67 +--
 4 files changed, 55 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index cda071009fed..7e98869ba006 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -369,11 +369,15 @@ static void rdt_get_cdp_config(int level, int type)
 	r->cache.shareable_bits = r_l->cache.shareable_bits;
 	r->data_width = (r->cache.cbm_len + 3) / 4;
 	r->alloc_capable = true;
+	hw_res_l->cdp_capable = true;
+	hw_res->cdp_capable = true;
 	/*
 	 * By default, CDP is disabled. CDP can be enabled by mount parameter
 	 * "cdp" during resctrl file system mount time.
 	 */
 	r->alloc_enabled = false;
+	hw_res_l->cdp_enabled = false;
+	hw_res->cdp_enabled = false;
 }
 
 static void rdt_get_cdp_l3_config(void)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index e86550d888cc..f039fd9f4f4f 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -365,6 +365,8 @@ struct rdt_parse_data {
  * @msr_base:		Base MSR address for CBMs
  * @msr_update:		Function pointer to update QOS MSRs
  * @mon_scale:		cqm counter * mon_scale = occupancy in bytes
+ * @cdp_capable:	Is the CDP feature available on this resource
+ * @cdp_enabled:	CDP state of this resource
  */
 struct rdt_hw_resource {
 	enum resctrl_conf_type	conf_type;
@@ -377,6 +379,8 @@ struct rdt_hw_resource {
 				 struct rdt_resource *r);
 	unsigned int		mon_scale;
 	unsigned int		mbm_width;
+	bool			cdp_capable;
+	bool			cdp_enabled;
 };
 
 static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
@@ -397,7 +401,7 @@ DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
 
 extern struct dentry *debugfs_resctrl;
 
-enum {
+enum resctrl_res_level {
 	RDT_RESOURCE_L3,
 	RDT_RESOURCE_L3DATA,
 	RDT_RESOURCE_L3CODE,
@@ -418,6 +422,11 @@ static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
 	return &hw_res->resctrl;
 }
 
+static inline bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
+{
+	return rdt_resources_all[l].cdp_enabled;
+}
+
 #define for_each_rdt_resource(r)					      \
 	for (r = &rdt_resources_all[0].resctrl;				      \
 	     r < &rdt_resources_all[RDT_NUM_RESOURCES].resctrl;		      \
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index d9d9861f244f..f126d442a65f 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -684,8 +684,8 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
 	 * resource, the portion of cache used by it should be made
 	 * unavailable to all future allocations from both resources.
 	 */
-	if (rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl.alloc_enabled ||
-	    rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl.alloc_enabled) {
+	if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L3) ||
+	    resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2)) {
 		rdt_last_cmd_puts("CDP enabled\n");
 		return -EINVAL;
 	}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f168f5a39242..6e150560c3c1 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1995,51 +1995,62 @@ static int cdp_enable(int level, int data_type, int code_type)
 	r_l->alloc_enabled = false;
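
The split between a capability flag and an arch-managed enabled flag can be illustrated with a small userspace model. This is only a sketch under assumptions: the names `res_level`, `hw_resource`, `get_cdp_enabled()` and `set_cdp_enabled()` are hypothetical, and the setter's behaviour (refusing resources that are not CDP capable) is an assumption, since the body of resctrl_arch_set_cdp_enabled() is not visible in the part of the diff shown here.

```c
#include <stdbool.h>

/* Hypothetical resource levels; the kernel uses enum resctrl_res_level. */
enum res_level { RES_L3, RES_L2, RES_MBA, NUM_RES };

/* Only the two flags this patch adds to struct rdt_hw_resource. */
struct hw_resource {
	bool cdp_capable;	/* is CDP available on this resource */
	bool cdp_enabled;	/* current CDP state, owned by arch code */
};

/* In this model the cache resources support CDP and MBA does not. */
static struct hw_resource resources[NUM_RES] = {
	[RES_L3] = { .cdp_capable = true },
	[RES_L2] = { .cdp_capable = true },
};

/* Read the current CDP state of a resource. */
static bool get_cdp_enabled(enum res_level l)
{
	return resources[l].cdp_enabled;
}

/* Assumed behaviour: fail with -1 when the resource cannot do CDP. */
static int set_cdp_enabled(enum res_level l, bool enable)
{
	if (!resources[l].cdp_capable)
		return -1;

	resources[l].cdp_enabled = enable;
	return 0;
}
```

With an explicit flag, callers such as the pseudo-lock setup check can ask about the L3 or L2 resource directly instead of inferring the CDP state from the alloc_enabled flag of the DATA alias.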
Re: [PATCH v11 01/14] s390/vfio-ap: No need to disable IRQ after queue reset
On 10/29/20 7:29 PM, Tony Krowiak wrote:
On 10/27/20 2:48 AM, Halil Pasic wrote:
On Thu, 22 Oct 2020 13:11:56 -0400 Tony Krowiak wrote:

The queues assigned to a matrix mediated device are currently reset when:

* The VFIO_DEVICE_RESET ioctl is invoked
* The mdev fd is closed by userspace (QEMU)
* The mdev is removed from sysfs.

What about the situation when vfio_ap_mdev_group_notifier() is called to
tell us that our pointer to KVM is about to become invalid? Do we need to
clean up the IRQ stuff there?

After reading this question, I decided to do some tracing using printk's
and learned that the vfio_ap_mdev_group_notifier() function does not get
called when the guest is shut down. The reason is that the
vfio_ap_mdev_release() function, which is called before the KVM pointer is
invalidated, unregisters the group notifier. I took a look at some of the
other drivers that register a group notifier in the mdev_parent_ops.open
callback and each unregistered the notifier in the mdev_parent_ops.release
callback.

So, to answer your question, there is no need to clean up the IRQ stuff in
the vfio_ap_mdev_group_notifier() function since it will not get called
when the KVM pointer is invalidated. The cleanup should be done in the
vfio_ap_mdev_release() function that gets called when the mdev fd is
closed.

Immediately after the reset of a queue, a call is made to disable
interrupts for the queue. This is entirely unnecessary because the reset
of a queue disables interrupts, so this will be removed.

Makes sense.

Since interrupt processing may have been enabled by the guest, it may also
be necessary to clean up the resources used for interrupt processing. Part
of the cleanup operation requires a reference to KVM, so a check is also
being added to ensure the reference to KVM exists. The reason is that the
release callback - invoked when userspace closes the mdev fd - removes the
reference to KVM. When the remove callback - called when the mdev is
removed from sysfs - is subsequently invoked, there will be no reference to
KVM when the cleanup is performed.

Please see below in the code.

This patch will also do a bit of refactoring due to the fact that the
remove callback, implemented in vfio_ap_drv.c, disables the queue after
resetting it. Instead of the remove callback making a call into
vfio_ap_ops.c to clean up the resources used for interrupt processing,
let's move the probe and remove callbacks into the vfio_ap_ops.c file to
keep all code related to managing queues in a single file.

It would have been helpful to split out the refactoring as a separate
patch. This way it is harder to review the code that got moved, because it
is intermingled with the changes that intend to change behavior.

I suppose I can do that.

Signed-off-by: Tony Krowiak
---
 drivers/s390/crypto/vfio_ap_drv.c     | 45 +--
 drivers/s390/crypto/vfio_ap_ops.c     | 63 +++
 drivers/s390/crypto/vfio_ap_private.h |  7 +--
 3 files changed, 52 insertions(+), 63 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
index be2520cc010b..73bd073fd5d3 100644
--- a/drivers/s390/crypto/vfio_ap_drv.c
+++ b/drivers/s390/crypto/vfio_ap_drv.c
@@ -43,47 +43,6 @@ static struct ap_device_id ap_queue_ids[] = {
 
 MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids);
 
-/**
- * vfio_ap_queue_dev_probe:
- *
- * Allocate a vfio_ap_queue structure and associate it
- * with the device as driver_data.
- */
-static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
-{
-	struct vfio_ap_queue *q;
-
-	q = kzalloc(sizeof(*q), GFP_KERNEL);
-	if (!q)
-		return -ENOMEM;
-	dev_set_drvdata(&apdev->device, q);
-	q->apqn = to_ap_queue(&apdev->device)->qid;
-	q->saved_isc = VFIO_AP_ISC_INVALID;
-	return 0;
-}
-
-/**
- * vfio_ap_queue_dev_remove:
- *
- * Takes the matrix lock to avoid actions on this device while removing
- * Free the associated vfio_ap_queue structure
- */
-static void vfio_ap_queue_dev_remove(struct ap_device *apdev)
-{
-	struct vfio_ap_queue *q;
-	int apid, apqi;
-
-	mutex_lock(&matrix_dev->lock);
-	q = dev_get_drvdata(&apdev->device);
-	dev_set_drvdata(&apdev->device, NULL);
-	apid = AP_QID_CARD(q->apqn);
-	apqi = AP_QID_QUEUE(q->apqn);
-	vfio_ap_mdev_reset_queue(apid, apqi, 1);
-	vfio_ap_irq_disable(q);
-	kfree(q);
-	mutex_unlock(&matrix_dev->lock);
-}
-
 static void vfio_ap_matrix_dev_release(struct device *dev)
 {
 	struct ap_matrix_dev *matrix_dev = dev_get_drvdata(dev);
@@ -186,8 +145,8 @@ static int __init vfio_ap_init(void)
 		return ret;
 
 	memset(&vfio_ap_drv, 0, sizeof(vfio_ap_drv));
-	vfio_ap_drv.probe = vfio_ap_queue_dev_probe;
-	vfio_ap_drv.remove = vfio_ap_queue_dev_remove;
+	vfio_ap_drv.probe = vfio_ap_mdev_probe_queue;
+	vfio_ap_drv.remove = vfio_ap_mdev_remove_queue;
 	vfio_ap_drv.ids = ap_queue_ids;
 
 	ret = ap_driver_register(&vfio_ap_drv, THIS_MODULE,
[PATCH 17/24] x86/resctrl: Use cdp_enabled in rdt_domain_reconfigure_cdp()
rdt_domain_reconfigure_cdp() infers whether CDP is enabled by checking the
alloc_capable and alloc_enabled flags of the data resources.

Now that there is an explicit cdp_enabled, use that.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6e150560c3c1..eeedafa7d5e7 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1944,14 +1944,16 @@ static int set_cache_qos_cfg(int level, bool enable)
 /* Restore the qos cfg state when a domain comes online */
 void rdt_domain_reconfigure_cdp(struct rdt_resource *r)
 {
-	if (!r->alloc_capable)
+	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+
+	if (!hw_res->cdp_capable)
 		return;
 
 	if (r == &rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl)
-		l2_qos_cfg_update(&r->alloc_enabled);
+		l2_qos_cfg_update(&hw_res->cdp_enabled);
 
 	if (r == &rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl)
-		l3_qos_cfg_update(&r->alloc_enabled);
+		l3_qos_cfg_update(&hw_res->cdp_enabled);
 }
 
 /*
-- 
2.28.0
[PATCH 11/24] x86/resctrl: Group staged configuration into a separate struct
Arm's MPAM may have surprisingly large bitmaps for its cache portions as
the architecture allows up to 4K portions. The size exposed via resctrl may
not be the same; some scaling may occur.

The values written to hardware may be unlike the values received from
resctrl, e.g. MBA percentages may be backed by a bitmap, or a maximum value
that isn't a percentage.

Today resctrl's ctrlval arrays are written to directly by the resctrl
filesystem code, e.g. apply_config(). This is a problem if scaling or
conversion is needed by the architecture.

The arch code should own the ctrlval array (to allow scaling and
conversion), and should only need a single copy of the array for the values
currently applied in hardware.

Move the new_ctrl bitmap value and flag into a struct for staged
configuration changes. This is created as an array to allow one per type of
configuration. Today there is only one element in the array, but eventually
resctrl will use the array slots for CODE/DATA/BOTH to detect a duplicate
schema being written.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 49 ---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 22 +-
 include/linux/resctrl.h                   | 17 +---
 3 files changed, 60 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 28d69c78c29e..0c95ed83eb05 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -60,18 +60,19 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
 int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
 	     struct rdt_domain *d)
 {
+	struct resctrl_staged_config *cfg = &d->staged_config[0];
 	struct rdt_resource *r = s->res;
 	unsigned long bw_val;
 
-	if (d->have_new_ctrl) {
+	if (cfg->have_new_ctrl) {
 		rdt_last_cmd_printf("Duplicate domain %d\n", d->id);
 		return -EINVAL;
 	}
 
 	if (!bw_validate(data->buf, &bw_val, r))
 		return -EINVAL;
-	d->new_ctrl = bw_val;
-	d->have_new_ctrl = true;
+	cfg->new_ctrl = bw_val;
+	cfg->have_new_ctrl = true;
 
 	return 0;
 }
@@ -129,11 +130,12 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
 int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
 	      struct rdt_domain *d)
 {
+	struct resctrl_staged_config *cfg = &d->staged_config[0];
 	struct rdtgroup *rdtgrp = data->rdtgrp;
 	struct rdt_resource *r = s->res;
 	u32 cbm_val;
 
-	if (d->have_new_ctrl) {
+	if (cfg->have_new_ctrl) {
 		rdt_last_cmd_printf("Duplicate domain %d\n", d->id);
 		return -EINVAL;
 	}
@@ -175,8 +177,8 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
 		}
 	}
 
-	d->new_ctrl = cbm_val;
-	d->have_new_ctrl = true;
+	cfg->new_ctrl = cbm_val;
+	cfg->have_new_ctrl = true;
 
 	return 0;
 }
@@ -190,6 +192,7 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
 static int parse_line(char *line, struct resctrl_schema *s,
 		      struct rdtgroup *rdtgrp)
 {
+	struct resctrl_staged_config *cfg;
 	struct rdt_resource *r = s->res;
 	struct rdt_parse_data data;
 	char *dom = NULL, *id;
@@ -220,6 +223,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
 			if (r->parse_ctrlval(&data, s, d))
 				return -EINVAL;
 			if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
+				cfg = &d->staged_config[0];
 				/*
 				 * In pseudo-locking setup mode and just
 				 * parsed a valid CBM that should be
@@ -230,7 +234,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
 				 */
 				rdtgrp->plr->s = s;
 				rdtgrp->plr->d = d;
-				rdtgrp->plr->cbm = d->new_ctrl;
+				rdtgrp->plr->cbm = cfg->new_ctrl;
 				d->plr = rdtgrp->plr;
 				return 0;
 			}
@@ -240,15 +244,30 @@ static int parse_line(char *line, struct resctrl_schema *s,
 	return -EINVAL;
 }
 
+static void apply_config(struct rdt_hw_domain *hw_dom,
+			 struct resctrl_staged_config *cfg, int closid,
+			 cpumask_var_t cpu_mask, bool mba_sc)
+{
+	struct rdt_domain *dom = &hw_dom->resctrl;
+	u32 *dc = mba_sc ? hw_dom->mbps_val : hw_dom->ctrl_val;
+
+	if (cfg->new_ctrl != dc[closid]) {
+		cpumask_set_cpu(cpumask_any(&dom->cpu_mask), cpu_mask);
+		dc[closid] =
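
The idea behind apply_config(), comparing a staged value against the array the arch code owns and only touching it when the value differs, can be sketched in userspace. Everything here is a simplified assumption rather than the kernel implementation: the `staged_config` and `hw_domain` structs are reduced models, `NUM_CLOSID` is an arbitrary size, and the return value replaces the cpumask bookkeeping that the real function uses to decide which CPUs must write the MSRs.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_CLOSID 4	/* arbitrary size for this sketch */

/* Reduced model of struct resctrl_staged_config. */
struct staged_config {
	uint32_t new_ctrl;
	bool have_new_ctrl;
};

/* Reduced model of the arch-owned per-domain arrays. */
struct hw_domain {
	uint32_t ctrl_val[NUM_CLOSID];
	uint32_t mbps_val[NUM_CLOSID];
};

/*
 * Copy a staged value into the array selected by mba_sc.
 * Returns true when the stored value changed, i.e. when the hardware
 * would need to be updated.
 */
static bool apply_config(struct hw_domain *hw_dom,
			 const struct staged_config *cfg,
			 int closid, bool mba_sc)
{
	uint32_t *dc = mba_sc ? hw_dom->mbps_val : hw_dom->ctrl_val;

	if (!cfg->have_new_ctrl || cfg->new_ctrl == dc[closid])
		return false;

	dc[closid] = cfg->new_ctrl;
	return true;
}
```

Keeping the staged value separate from the applied array is what lets an architecture scale or convert the value on the way in, since only the arch side ever writes the array.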
[PATCH 22/24] x86/resctrl: Merge the ctrlval arrays
Now that the CODE/DATA resources don't use overlapping slots in the ctrlval
arrays, they can be merged. This allows the cdp_peer configuration to be
read from any resource's domain, instead of searching for the matching
flavour.

Add a helper to allocate the ctrlval array, that returns the value on the
L2 or L3 resource if it already exists. This gets removed once the
resources are merged, and there really is only one ctrlval array.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/core.c | 79 +++---
 1 file changed, 72 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index e2f5ea129be2..01d010977367 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -509,6 +509,72 @@ void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)
 	}
 }
 
+/*
+ * temporary.
+ * This relies on L2 or L3 being allocated before their CODE/DATA aliases
+ */
+static u32 *alloc_ctrlval_array(struct rdt_resource *r, struct rdt_domain *d,
+				bool mba_sc)
+{
+	/* these are for the underlying hardware, they may not match r/d */
+	struct rdt_domain *underlying_domain;
+	struct rdt_hw_resource *hw_res;
+	struct rdt_hw_domain *hw_dom;
+	bool remapped;
+
+	switch (r->rid) {
+	case RDT_RESOURCE_L3DATA:
+	case RDT_RESOURCE_L3CODE:
+		hw_res = &rdt_resources_all[RDT_RESOURCE_L3];
+		remapped = true;
+		break;
+	case RDT_RESOURCE_L2DATA:
+	case RDT_RESOURCE_L2CODE:
+		hw_res = &rdt_resources_all[RDT_RESOURCE_L2];
+		remapped = true;
+		break;
+	default:
+		hw_res = resctrl_to_arch_res(r);
+		remapped = false;
+	}
+
+	/*
+	 * If we changed the resource, we need to search for the underlying
+	 * domain. Doing this for all resources would make it tricky to add the
+	 * first resource, as domains aren't added to a resource list until
+	 * after the ctrlval arrays have been allocated.
+	 */
+	if (remapped)
+		underlying_domain = rdt_find_domain(&hw_res->resctrl, d->id,
+						    NULL);
+	else
+		underlying_domain = d;
+	hw_dom = resctrl_to_arch_dom(underlying_domain);
+
+	if (mba_sc) {
+		if (hw_dom->mbps_val)
+			return hw_dom->mbps_val;
+		return kmalloc_array(hw_res->num_closid,
+				     sizeof(*hw_dom->mbps_val), GFP_KERNEL);
+	} else {
+		if (hw_dom->ctrl_val)
+			return hw_dom->ctrl_val;
+		return kmalloc_array(hw_res->num_closid,
+				     sizeof(*hw_dom->ctrl_val), GFP_KERNEL);
+	}
+}
+
+/* Only kfree() for L2/L3, not the CODE/DATA aliases */
+static void free_ctrlval_arrays(struct rdt_resource *r, struct rdt_domain *d)
+{
+	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
+
+	if (r->rid == RDT_RESOURCE_L2 || r->rid == RDT_RESOURCE_L3) {
+		kfree(hw_dom->ctrl_val);
+		kfree(hw_dom->mbps_val);
+	}
+}
+
 static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
@@ -516,18 +582,18 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
 	struct msr_param m;
 	u32 *dc, *dm;
 
-	dc = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->ctrl_val), GFP_KERNEL);
+	dc = alloc_ctrlval_array(r, d, false);
 	if (!dc)
 		return -ENOMEM;
+	hw_dom->ctrl_val = dc;
 
-	dm = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->mbps_val), GFP_KERNEL);
+	dm = alloc_ctrlval_array(r, d, true);
 	if (!dm) {
-		kfree(dc);
+		free_ctrlval_arrays(r, d);
 		return -ENOMEM;
 	}
-
-	hw_dom->ctrl_val = dc;
 	hw_dom->mbps_val = dm;
+
 	setup_default_ctrlval(r, dc, dm);
 
 	m.low = 0;
@@ -677,8 +743,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 		if (d->plr)
 			d->plr->d = NULL;
 
-		kfree(hw_dom->ctrl_val);
-		kfree(hw_dom->mbps_val);
+		free_ctrlval_arrays(r, d);
 		bitmap_free(d->rmid_busy_llc);
 		kfree(d->mbm_total);
 		kfree(d->mbm_local);
-- 
2.28.0
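
The "return the existing array, otherwise allocate one" behaviour of the temporary helper can be modelled in a few lines of userspace C. This sketch makes several assumptions: `hw_domain` is a reduced hypothetical struct, the caller passes the underlying (L2 or L3) domain directly instead of looking it up, and `calloc()` stands in for the kernel's `kmalloc_array()`, which unlike `calloc()` does not zero the memory.

```c
#include <stdint.h>
#include <stdlib.h>

/* Reduced model: one control-value array per hardware domain. */
struct hw_domain {
	uint32_t *ctrl_val;
};

/*
 * Return the array already attached to the underlying domain, or
 * allocate a new one when none exists yet. The caller stores the
 * result, so CODE/DATA aliases end up sharing the same array.
 */
static uint32_t *alloc_ctrlval_array(struct hw_domain *underlying,
				     size_t num_closid)
{
	if (underlying->ctrl_val)
		return underlying->ctrl_val;

	return calloc(num_closid, sizeof(*underlying->ctrl_val));
}
```

Since aliases only borrow the pointer, exactly one owner may free it, which is why free_ctrlval_arrays() in the patch restricts kfree() to the L2 and L3 resources.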
[PATCH 18/24] x86/resctrl: Pass configuration type to resctrl_arch_get_config()
Once the configuration arrays are merged, the get_config() helper needs to
be told whether the CODE, DATA or BOTH configuration is being retrieved.

Pass this information from the schema into resctrl_arch_get_config().

Nothing uses this yet, but it will later be used to map the closid to the
index in the configuration array.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  5 ++--
 arch/x86/kernel/cpu/resctrl/monitor.c     |  2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 35 +++
 include/linux/resctrl.h                   |  3 +-
 4 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 0cf2f24e5c3b..f6b4049c67c2 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -429,7 +429,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
 }
 
 void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
-			     u32 closid, u32 *value)
+			     u32 closid, enum resctrl_conf_type type, u32 *value)
 {
 	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
 
@@ -451,7 +451,8 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
 		if (sep)
 			seq_puts(s, ";");
 
-		resctrl_arch_get_config(r, dom, closid, &ctrl_val);
+		resctrl_arch_get_config(r, dom, closid, schema->conf_type,
+					&ctrl_val);
 		seq_printf(s, r->format_str, dom->id, max_data_width,
 			   ctrl_val);
 		sep = true;
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 6a62f1323b27..ab6630b466d5 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -379,7 +379,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 	hw_dom_mba = resctrl_to_arch_dom(dom_mba);
 
 	cur_bw = pmbm_data->prev_bw;
-	resctrl_arch_get_config(r_mba, dom_mba, closid, &user_bw);
+	resctrl_arch_get_config(r_mba, dom_mba, closid, CDP_BOTH, &user_bw);
 	delta_bw = pmbm_data->delta_bw;
 	/*
 	 * resctrl_arch_get_config() chooses the mbps/ctrl value to return
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index eeedafa7d5e7..cb9ca56ce2e6 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -924,7 +924,8 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
 		for (i = 0; i < closids_supported(); i++) {
 			if (!closid_allocated(i))
 				continue;
-			resctrl_arch_get_config(r, dom, i, &ctrl_val);
+			resctrl_arch_get_config(r, dom, i, s->conf_type,
+						&ctrl_val);
 			mode = rdtgroup_mode_by_closid(i);
 			switch (mode) {
 			case RDT_MODE_SHAREABLE:
@@ -1101,6 +1102,7 @@ static int rdtgroup_mode_show(struct kernfs_open_file *of,
  *         Used to return the result.
  * @d_cdp: RDT domain that shares hardware with @d (RDT domain peer)
  *         Used to return the result.
+ * @peer_type: The CDP configuration type of the peer resource.
  *
  * RDT resources are managed independently and by extension the RDT domains
  * (RDT resource instances) are managed independently also. The Code and
@@ -1118,7 +1120,8 @@ static int rdtgroup_mode_show(struct kernfs_open_file *of,
  */
 static int rdt_cdp_peer_get(struct rdt_resource *r, struct rdt_domain *d,
 			    struct rdt_resource **r_cdp,
-			    struct rdt_domain **d_cdp)
+			    struct rdt_domain **d_cdp,
+			    enum resctrl_conf_type *peer_type)
 {
 	struct rdt_resource *_r_cdp = NULL;
 	struct rdt_domain *_d_cdp = NULL;
@@ -1127,15 +1130,19 @@ static int rdt_cdp_peer_get(struct rdt_resource *r, struct rdt_domain *d,
 	switch (r->rid) {
 	case RDT_RESOURCE_L3DATA:
 		_r_cdp = &rdt_resources_all[RDT_RESOURCE_L3CODE].resctrl;
+		*peer_type = CDP_CODE;
 		break;
 	case RDT_RESOURCE_L3CODE:
 		_r_cdp = &rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl;
+		*peer_type = CDP_DATA;
 		break;
 	case RDT_RESOURCE_L2DATA:
 		_r_cdp = &rdt_resources_all[RDT_RESOURCE_L2CODE].resctrl;
+		*peer_type = CDP_CODE;
 		break;
 	case RDT_RESOURCE_L2CODE:
 		_r_cdp = &rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl;
+		*peer_type = CDP_DATA;
 		break;
 	default:
 		ret = -ENOENT;
@@ -1186,7 +1193,8 @@ static
[PATCH 13/24] x86/resctrl: Allow different CODE/DATA configurations to be staged
Now that the configuration is staged via an array, allow resctrl to stage
more than one configuration at a time for a single resource and closid.

To detect that the same schema is being specified twice when the schemata
file is written, the same slot in the staged_config array must be used for
each schema. Use the conf_type enum directly as an index.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 16 ++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  5 +++--
 include/linux/resctrl.h                   |  4 +++-
 3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index b107c0202cfb..f7152c7fdc1b 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -60,10 +60,11 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
 int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
 	     struct rdt_domain *d)
 {
-	struct resctrl_staged_config *cfg = &d->staged_config[0];
+	struct resctrl_staged_config *cfg;
 	struct rdt_resource *r = s->res;
 	unsigned long bw_val;
 
+	cfg = &d->staged_config[s->conf_type];
 	if (cfg->have_new_ctrl) {
 		rdt_last_cmd_printf("Duplicate domain %d\n", d->id);
 		return -EINVAL;
@@ -131,11 +132,12 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
 int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
 	      struct rdt_domain *d)
 {
-	struct resctrl_staged_config *cfg = &d->staged_config[0];
 	struct rdtgroup *rdtgrp = data->rdtgrp;
+	struct resctrl_staged_config *cfg;
 	struct rdt_resource *r = s->res;
 	u32 cbm_val;
 
+	cfg = &d->staged_config[s->conf_type];
 	if (cfg->have_new_ctrl) {
 		rdt_last_cmd_printf("Duplicate domain %d\n", d->id);
 		return -EINVAL;
@@ -194,6 +196,7 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
 static int parse_line(char *line, struct resctrl_schema *s,
 		      struct rdtgroup *rdtgrp)
 {
+	enum resctrl_conf_type t = s->conf_type;
 	struct resctrl_staged_config *cfg;
 	struct rdt_resource *r = s->res;
 	struct rdt_parse_data data;
@@ -225,7 +228,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
 			if (r->parse_ctrlval(&data, s, d))
 				return -EINVAL;
 			if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
-				cfg = &d->staged_config[0];
+				cfg = &d->staged_config[t];
 				/*
 				 * In pseudo-locking setup mode and just
 				 * parsed a valid CBM that should be
@@ -266,10 +269,11 @@ int update_domains(struct rdt_resource *r, int closid)
 	struct resctrl_staged_config *cfg;
 	struct rdt_hw_domain *hw_dom;
 	struct msr_param msr_param;
+	enum resctrl_conf_type t;
 	cpumask_var_t cpu_mask;
 	struct rdt_domain *d;
 	bool mba_sc;
-	int cpu, i;
+	int cpu;
 
 	if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
 		return -ENOMEM;
@@ -281,8 +285,8 @@ int update_domains(struct rdt_resource *r, int closid)
 	mba_sc = is_mba_sc(r);
 	list_for_each_entry(d, &r->domains, list) {
 		hw_dom = resctrl_to_arch_dom(d);
-		for (i = 0; i < ARRAY_SIZE(d->staged_config); i++) {
-			cfg = &hw_dom->resctrl.staged_config[i];
+		for (t = 0; t < ARRAY_SIZE(d->staged_config); t++) {
+			cfg = &hw_dom->resctrl.staged_config[t];
 			if (!cfg->have_new_ctrl)
 				continue;
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 1092631ac0b3..5eb14dc9c579 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2747,6 +2747,7 @@ static u32 cbm_ensure_valid(u32 _val, struct rdt_resource *r)
 static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
 				 u32 closid)
 {
+	enum resctrl_conf_type t = s-> conf_type;
 	struct rdt_resource *r_cdp = NULL;
 	struct resctrl_staged_config *cfg;
 	struct rdt_domain *d_cdp = NULL;
@@ -2758,7 +2759,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
 	int i;
 
 	rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp);
-	cfg = &d->staged_config[t];
+	cfg = &d->staged_config[t];
 	cfg->have_new_ctrl = false;
 	cfg->new_ctrl = r->cache.shareable_bits;
 	used_b = r->cache.shareable_bits;
@@ -2843,7 +2844,7 @@ static void rdtgroup_init_mba(struct rdt_resource *r, u32 closid)
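
Using the configuration type as the array index means a CODE and a DATA schema can be staged side by side, while a repeated schema of the same type is still rejected. A minimal userspace sketch of that duplicate check follows; the struct and function names (`domain`, `stage_config()`) and the `NUM_CONF_TYPES` sentinel are hypothetical, and the return value -1 merely stands in for the kernel's -EINVAL.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for enum resctrl_conf_type, used as an index. */
enum conf_type { CDP_BOTH, CDP_CODE, CDP_DATA, NUM_CONF_TYPES };

struct staged_config {
	uint32_t new_ctrl;
	bool have_new_ctrl;
};

/* One staging slot per configuration type. */
struct domain {
	struct staged_config staged_config[NUM_CONF_TYPES];
};

/*
 * Stage a value in the slot for configuration type t.
 * Returns -1 if that slot was already written (duplicate schema),
 * 0 on success.
 */
static int stage_config(struct domain *d, enum conf_type t, uint32_t val)
{
	struct staged_config *cfg = &d->staged_config[t];

	if (cfg->have_new_ctrl)
		return -1;

	cfg->new_ctrl = val;
	cfg->have_new_ctrl = true;
	return 0;
}
```

With a single slot, staging DATA after CODE for the same domain would look like a duplicate; one slot per type removes that false positive without weakening the check.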
[PATCH 24/24] x86/resctrl: Merge the CDP resources
Now that resctrl uses the schema's configuration type as the source of
CODE/DATA configuration styles, and there is only one configuration array
between the three views of the resource, remove the CODE and DATA aliases.

This means the arch code only needs to describe the hardware to resctrl,
which will then create the separate CODE/DATA schema for its ABI.

Add a helper to add schema with the CDP suffix if CDP is enabled.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/core.c     | 193 ++---
 arch/x86/kernel/cpu/resctrl/internal.h |   4 -
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 113 ---
 3 files changed, 72 insertions(+), 238 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 01d010977367..57d4131fdd80 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -78,42 +78,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
 		.msr_base		= MSR_IA32_L3_CBM_BASE,
 		.msr_update		= cat_wrmsr,
 	},
-	[RDT_RESOURCE_L3DATA] =
-	{
-		.conf_type			= CDP_DATA,
-		.resctrl = {
-			.rid			= RDT_RESOURCE_L3DATA,
-			.name			= "L3DATA",
-			.cache_level		= 3,
-			.cache = {
-				.min_cbm_bits	= 1,
-			},
-			.domains		= domain_init(RDT_RESOURCE_L3DATA),
-			.parse_ctrlval		= parse_cbm,
-			.format_str		= "%d=%0*x",
-			.fflags			= RFTYPE_RES_CACHE,
-		},
-		.msr_base		= MSR_IA32_L3_CBM_BASE,
-		.msr_update		= cat_wrmsr,
-	},
-	[RDT_RESOURCE_L3CODE] =
-	{
-		.conf_type			= CDP_CODE,
-		.resctrl = {
-			.rid			= RDT_RESOURCE_L3CODE,
-			.name			= "L3CODE",
-			.cache_level		= 3,
-			.cache = {
-				.min_cbm_bits	= 1,
-			},
-			.domains		= domain_init(RDT_RESOURCE_L3CODE),
-			.parse_ctrlval		= parse_cbm,
-			.format_str		= "%d=%0*x",
-			.fflags			= RFTYPE_RES_CACHE,
-		},
-		.msr_base		= MSR_IA32_L3_CBM_BASE,
-		.msr_update		= cat_wrmsr,
-	},
 	[RDT_RESOURCE_L2] =
 	{
 		.conf_type			= CDP_BOTH,
@@ -132,42 +96,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
 		.msr_base		= MSR_IA32_L2_CBM_BASE,
 		.msr_update		= cat_wrmsr,
 	},
-	[RDT_RESOURCE_L2DATA] =
-	{
-		.conf_type			= CDP_DATA,
-		.resctrl = {
-			.rid			= RDT_RESOURCE_L2DATA,
-			.name			= "L2DATA",
-			.cache_level		= 2,
-			.cache = {
-				.min_cbm_bits	= 1,
-			},
-			.domains		= domain_init(RDT_RESOURCE_L2DATA),
-			.parse_ctrlval		= parse_cbm,
-			.format_str		= "%d=%0*x",
-			.fflags			= RFTYPE_RES_CACHE,
-		},
-		.msr_base		= MSR_IA32_L2_CBM_BASE,
-		.msr_update		= cat_wrmsr,
-	},
-	[RDT_RESOURCE_L2CODE] =
-	{
-		.conf_type			= CDP_CODE,
-		.resctrl = {
-			.rid			= RDT_RESOURCE_L2CODE,
-			.name			= "L2CODE",
-			.cache_level		= 2,
-			.cache = {
-				.min_cbm_bits	= 1,
-			},
-			.domains		= domain_init(RDT_RESOURCE_L2CODE),
-			.parse_ctrlval		= parse_cbm,
-			.format_str		= "%d=%0*x",
-			.fflags			= RFTYPE_RES_CACHE,
-		},
-		.msr_base		= MSR_IA32_L2_CBM_BASE,
-		.msr_update		= cat_wrmsr,
-	},
 	[RDT_RESOURCE_MBA] =
 	{
 		.conf_type			= CDP_BOTH,
@@ -339,40 +267,16 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
[PATCH 19/24] x86/resctrl: Make ctrlval arrays the same size
The CODE and DATA resources have their own ctrlval arrays which are half
the size because num_closid was already adjusted.

Prior to having one ctrlval array for the resource, move the num_closid
correction into resctrl, so that the ctrlval arrays are all the same size.

A short-lived quirk of this is that the caches are reset twice, once for
CODE and once for DATA.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/core.c     | 10 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 ++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 7e98869ba006..b2fda4cd88ba 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -363,7 +363,7 @@ static void rdt_get_cdp_config(int level, int type)
 	struct rdt_resource *r = &rdt_resources_all[type].resctrl;
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 
-	hw_res->num_closid = hw_res_l->num_closid / 2;
+	hw_res->num_closid = hw_res_l->num_closid;
 	r->cache.cbm_len = r_l->cache.cbm_len;
 	r->default_ctrl = r_l->default_ctrl;
 	r->cache.shareable_bits = r_l->cache.shareable_bits;
@@ -549,6 +549,14 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
 
 	m.low = 0;
 	m.high = hw_res->num_closid;
+
+	/*
+	 * temporary: the array is full-size, but cat_wrmsr() still re-maps
+	 * the index.
+	 */
+	if (hw_res->conf_type != CDP_BOTH)
+		m.high /= 2;
+
 	hw_res->msr_update(d, &m, r);
 	return 0;
 }
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index cb9ca56ce2e6..4fa6c386d751 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2165,6 +2165,9 @@ static int create_schemata_list(void)
 		s->res = r;
 		s->num_closid = resctrl_arch_get_num_closid(r);
+		if (resctrl_arch_get_cdp_enabled(r->rid))
+			s->num_closid /= 2;
+
 		s->conf_type = resctrl_to_arch_res(r)->conf_type;
 
 		ret = snprintf(s->name, sizeof(s->name), r->name);
@@ -2376,6 +2379,13 @@ static int reset_all_ctrls(struct rdt_resource *r)
 	msr_param.low = 0;
 	msr_param.high = hw_res->num_closid;
 
+	/*
+	 * temporary: the array is full-sized, but cat_wrmsr() still re-maps
+	 * the index.
+	 */
+	if (hw_res->cdp_enabled)
+		msr_param.high /= 2;
+
 	/*
 	 * Disable resource control for this resource by setting all
 	 * CBMs in all domains to the maximum mask value. Pick one CPU
-- 
2.28.0
Re: [RFC PATCH v2] selinux: Fix kmemleak after disabling selinux runtime
On Fri, Oct 30, 2020 at 8:34 AM Casey Schaufler wrote:
> On 10/30/2020 12:57 AM, Hou Tao wrote:
> > Hi,
> >
> > On 2020/10/29 0:29, Casey Schaufler wrote:
> >> On 10/27/2020 7:06 PM, Chen Jun wrote:
> >>> From: Chen Jun
> >>>
> >>> Kmemleak will report a problem after using
> >>> "echo 1 > /sys/fs/selinux/disable" to disable selinux on runtime.
> >> Runtime disable of SELinux has been deprecated. It would be
> >> wasteful to make these changes in support of a facility that
> >> is going away.
> >>
> > But this sysfs file will still be present and workable on LTS kernel
> > versions, so
> > is the proposed fix OK for these LTS kernel versions?
>
> It's not my call to make. Paul Moore has the voice that matters here.
> I think that the trivial memory leak here is inconsequential compared
> to the overhead you're introducing by leaving the NO_DEL hooks enabled.

Disabling SELinux at runtime is deprecated and will be removed in a
future release; check Documentation/ABI/obsolete/sysfs-selinux-disable
in Linus' current tree for details. The recommended way to disable
SELinux is at boot using the kernel command line, as described in the
deprecation text:

  The preferred method of disabling SELinux is via the "selinux=0" boot
  parameter, but the selinuxfs "disable" node was created to make it
  easier for systems with primitive bootloaders that did not allow for
  easy modification of the kernel command line. Unfortunately, allowing
  for SELinux to be disabled at runtime makes it difficult to secure the
  kernel's LSM hooks using the "__ro_after_init" feature.

  Thankfully, the need for the SELinux runtime disable appears to be
  gone, the default Kconfig configuration disables this selinuxfs node,
  and only one of the major distributions, Fedora, supports disabling
  SELinux at runtime. Fedora is in the process of removing the
  selinuxfs "disable" node and once that is complete we will start the
  slow process of removing this code from the kernel.

Because of the upcoming removal as well as the drawbacks and minimal
gains provided by the patch in this thread, I would recommend against
merging this patch. I would further recommend that distros and those
building their own kernels leave CONFIG_SECURITY_SELINUX_DISABLE
disabled and use the kernel command line instead.

NACK.

-- 
paul moore
www.paul-moore.com
Re: [GIT PULL, staging, net-next] wimax: move to staging
On Fri, 30 Oct 2020 13:22:31 +0100 gregkh wrote:
> On Thu, Oct 29, 2020 at 10:06:14PM +0100, Arnd Bergmann wrote:
> > The following changes since commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec:
> >
> >   Linux 5.10-rc1 (2020-10-25 15:14:11 -0700)
> >
> > are available in the Git repository at:
> >
> >   git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground.git
> > tags/wimax-staging
>
> Line wrapping makes this hard :(
>
> Anyway, pulled into the staging-next branch now, so it's fine if this
> also gets pulled into the networking branch/tree as well, and then all
> should be fine.

..and pulled into net-next, thanks!
Re: [PATCH] vhost/vsock: add IOTLB API support
On Fri, Oct 30, 2020 at 07:44:43PM +0800, Jason Wang wrote:
> On 2020/10/30 6:54 PM, Stefano Garzarella wrote:
> > On Fri, Oct 30, 2020 at 06:02:18PM +0800, Jason Wang wrote:
> > > On 2020/10/30 1:43 AM, Stefano Garzarella wrote:
> > > > This patch enables the IOTLB API support for vhost-vsock devices,
> > > > allowing the userspace to emulate an IOMMU for the guest.
> > > >
> > > > These changes were made following vhost-net; in detail, this patch:
> > > > - exposes VIRTIO_F_ACCESS_PLATFORM feature and inits the iotlb
> > > >   device if the feature is acked
> > > > - implements VHOST_GET_BACKEND_FEATURES and
> > > >   VHOST_SET_BACKEND_FEATURES ioctls
> > > > - calls vq_meta_prefetch() before vq processing to prefetch vq
> > > >   metadata address in IOTLB
> > > > - provides .read_iter, .write_iter, and .poll callbacks for the
> > > >   chardev; they are used by the userspace to exchange IOTLB messages
> > > >
> > > > This patch was tested with QEMU and a patch applied [1] to fix a
> > > > simple issue:
> > > >     $ qemu -M q35,accel=kvm,kernel-irqchip=split \
> > > >            -drive file=fedora.qcow2,format=qcow2,if=virtio \
> > > >            -device intel-iommu,intremap=on \
> > > >            -device vhost-vsock-pci,guest-cid=3,iommu_platform=on
> > >
> > > Patch looks good, but a question:
> > >
> > > It looks to me you don't enable ATS which means vhost won't get any
> > > invalidation request or did I miss anything?
> >
> > You're right, I didn't see invalidation requests, only miss and
> > updates.
> > Now I have tried to enable 'ats' and 'device-iotlb' but I still don't
> > see any invalidation.
> >
> > How can I test it? (Sorry but I don't have much experience yet with
> > vIOMMU)
>
> I guess it's because of the batched unmap. Maybe you can try to use
> "intel_iommu=strict" in guest kernel command line to see if it works.
>
> Btw, make sure the qemu contains the patch [1]. Otherwise ATS won't be
> enabled for recent Linux Kernel in the guest.

I tried with "intel_iommu=strict" in the guest kernel and QEMU patch
applied, but I didn't see any invalidation.

Maybe I did something wrong, you know it is Friday, KVM Forum is
ending, etc... ;-)

I'll investigate better next week.

Thanks for the useful info,
Stefano
Re: [PATCH RFC v2 09/21] kasan: inline kasan_reset_tag for tag-based modes
On Wed, Oct 28, 2020 at 12:05 PM Dmitry Vyukov wrote: > > On Thu, Oct 22, 2020 at 3:19 PM Andrey Konovalov > wrote: > > > > Using kasan_reset_tag() currently results in a function call. As it's > > called quite often from the allocator code this leads to a noticeable > > slowdown. Move it to include/linux/kasan.h and turn it into a static > > inline function. > > > > Signed-off-by: Andrey Konovalov > > Link: > > https://linux-review.googlesource.com/id/I4d2061acfe91d480a75df00b07c22d8494ef14b5 > > --- > > include/linux/kasan.h | 5 - > > mm/kasan/hw_tags.c| 5 - > > mm/kasan/kasan.h | 6 ++ > > mm/kasan/sw_tags.c| 5 - > > 4 files changed, 6 insertions(+), 15 deletions(-) > > > > diff --git a/include/linux/kasan.h b/include/linux/kasan.h > > index 93d9834b7122..6377d7d3a951 100644 > > --- a/include/linux/kasan.h > > +++ b/include/linux/kasan.h > > @@ -187,7 +187,10 @@ static inline void kasan_record_aux_stack(void *ptr) {} > > > > void __init kasan_init_tags(void); > > > > -void *kasan_reset_tag(const void *addr); > > +static inline void *kasan_reset_tag(const void *addr) > > +{ > > + return (void *)arch_kasan_reset_tag(addr); > > It seems that all implementations already return (void *), so the cast > is not needed. arch_kasan_reset_tag() (->__tag_reset() -> __untagged_addr()) preserves the type of the argument, so the cast is needed. 
> > > +} > > > > bool kasan_report(unsigned long addr, size_t size, > > bool is_write, unsigned long ip); > > diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c > > index b372421258c8..c3a0e83b5e7a 100644 > > --- a/mm/kasan/hw_tags.c > > +++ b/mm/kasan/hw_tags.c > > @@ -24,11 +24,6 @@ void __init kasan_init_tags(void) > > pr_info("KernelAddressSanitizer initialized\n"); > > } > > > > -void *kasan_reset_tag(const void *addr) > > -{ > > - return reset_tag(addr); > > -} > > - > > void kasan_poison_memory(const void *address, size_t size, u8 value) > > { > > set_mem_tag_range(reset_tag(address), > > diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h > > index 456b264e5124..0ccbb3c4c519 100644 > > --- a/mm/kasan/kasan.h > > +++ b/mm/kasan/kasan.h > > @@ -246,15 +246,13 @@ static inline const void *arch_kasan_set_tag(const > > void *addr, u8 tag) > > return addr; > > } > > #endif > > -#ifndef arch_kasan_reset_tag > > -#define arch_kasan_reset_tag(addr) ((void *)(addr)) > > -#endif > > #ifndef arch_kasan_get_tag > > #define arch_kasan_get_tag(addr) 0 > > #endif > > > > +/* kasan_reset_tag() defined in include/linux/kasan.h. */ > > +#define reset_tag(addr)((void *)kasan_reset_tag(addr)) > > The cast is not needed. > > I would also now remove reset_tag entirely by replacing it with > kasan_reset_tag. Having 2 names for the same thing does not add > clarity. Will remove it.
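A small userspace sketch of the point made above about the cast, for readers following along. Everything named demo_* is hypothetical: demo_reset_tag() is NOT the arm64 __tag_reset()/__untagged_addr() implementation, it merely clears the top byte so that the type-preserving behaviour can be seen. It assumes a 64-bit target and the GNU __typeof__ extension.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: pointers are 64 bits wide. */
_Static_assert(sizeof(uintptr_t) == 8, "sketch assumes 64-bit pointers");

/*
 * Hypothetical stand-in for a type-preserving arch helper: the result
 * has the same type as the argument, so a 'const void *' stays const.
 */
#define demo_reset_tag(addr) \
	((__typeof__(addr))((uintptr_t)(addr) & ~((uintptr_t)0xff << 56)))

/*
 * Mirrors the shape of the inline wrapper in the patch: the argument is
 * 'const void *' but the return type is 'void *', so the explicit cast
 * is what drops the qualifier.
 */
static inline void *demo_kasan_reset_tag(const void *addr)
{
	return (void *)demo_reset_tag(addr);
}
```

Without the (void *) cast the return statement would hand a const-qualified pointer to a function returning plain void *, which compilers diagnose as discarding the qualifier.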
Re: [PATCH v3] soundwire: SDCA: add helper macro to access controls
+#define SDW_SDCA_CTL(fun, ent, ctl, ch)	(BIT(30) |			\
+					 (((fun) & 0x7) << 22) |	\
+					 (((ent) & 0x40) << 15) |	\
+					 (((ent) & 0x3f) << 7) |	\
+					 (((ctl) & 0x30) << 15) |	\
+					 (((ctl) & 0x0f) << 3) |	\
+					 (((ch) & 0x38) << 12) |	\
+					 ((ch) & 0x07))
+
+#define SDW_SDCA_MBQ_CTL(reg)		((reg) | BIT(13))
+#define SDW_SDCA_NEXT_CTL(reg)		((reg) | BIT(14))
+
 #endif /* __SDW_REGISTERS_H */

No users of these macros?

SDW_SDCA_CTL is used in sdca codec drivers which are not upstream yet.
SDW_SDCA_MBQ_CTL will be used in a new regmap method.
SDW_SDCA_NEXT_CTL can be used in sdca codec drivers, too.

Well, the point is that it's hard to review without seeing how the code
of actual users are.

Agree, but our job is not made easy by the three-way dependency on
regmap and SoundWire before we can submit ASoC codec drivers (developed
by Realtek and tested by Intel).

If you prefer us to send all patches for SDCA codec support in one
shot, that would be fine with us.

BTW, the bit definitions can be simplified with GENMASK(). I
personally don't think GENMASK() is necessarily good, but it may fit
better in a case like this.

We use this macro in switch cases, e.g. for regmap properties to define
read/volatile registers:

	case SDW_SDCA_CTL(FUN_JACK_CODEC, RT711_SDCA_ENT_GE49, RT711_SDCA_CTL_SELECTED_MODE, 0):
	case SDW_SDCA_CTL(FUN_JACK_CODEC, RT711_SDCA_ENT_GE49, RT711_SDCA_CTL_DETECTED_MODE, 0):
	case SDW_SDCA_CTL(FUN_HID, RT711_SDCA_ENT_HID01, RT711_SDCA_CTL_HIDTX_CURRENT_OWNER, 0) ...
		SDW_SDCA_CTL(FUN_HID, RT711_SDCA_ENT_HID01, RT711_SDCA_CTL_HIDTX_MESSAGE_LENGTH, 0):
	case RT711_BUF_ADDR_HID1 ... RT711_BUF_ADDR_HID2:
		return true;

https://github.com/thesofproject/linux/blob/70fe32e776dafb4b03581d62a4569f65c2f13ada/sound/soc/codecs/rt711-sdca-sdw.c#L35

and unfortunately all our attempts to use FIELD_PREP, FIELD_GET,
u32_encode, as suggested by Vinod, failed for this case due to
compilation issues (can't use these macros outside of a function
scope). The errors were shared with Vinod.

That's why we went back to the initial suggestion to deal with the
shifts/masks by hand. For now we don't have a better solution that
works in all cases where the macro is used.

Thanks
-Pierre
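To make the constraint discussed above concrete, here is a minimal userspace sketch. It copies the SDW_SDCA_CTL() definition from the patch and uses it in a case label and in a file-scope initializer, the two places where only an integer constant expression is accepted. BIT() is redefined locally, and the DEMO_* identifiers and demo_* functions are made up for illustration; they are not the RT711 driver's values.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT() helper (assumption of this sketch). */
#define BIT(n) (1u << (n))

/* Address encoding as proposed in the patch: plain shifts and masks. */
#define SDW_SDCA_CTL(fun, ent, ctl, ch)	(BIT(30) |			\
					 (((fun) & 0x7) << 22) |	\
					 (((ent) & 0x40) << 15) |	\
					 (((ent) & 0x3f) << 7) |	\
					 (((ctl) & 0x30) << 15) |	\
					 (((ctl) & 0x0f) << 3) |	\
					 (((ch) & 0x38) << 12) |	\
					 ((ch) & 0x07))

/* Hypothetical function/entity/control numbers, for illustration only. */
enum { DEMO_FUN = 1, DEMO_ENT = 0x49, DEMO_CTL_A = 0x01, DEMO_CTL_B = 0x02 };

/* File-scope initializer: works because the macro is a constant expression. */
static const uint32_t demo_first_reg = SDW_SDCA_CTL(DEMO_FUN, DEMO_ENT, DEMO_CTL_A, 0);

/* Case labels: same requirement, same reason. */
static int demo_readable_reg(uint32_t reg)
{
	switch (reg) {
	case SDW_SDCA_CTL(DEMO_FUN, DEMO_ENT, DEMO_CTL_A, 0):
	case SDW_SDCA_CTL(DEMO_FUN, DEMO_ENT, DEMO_CTL_B, 0):
		return 1;
	default:
		return 0;
	}
}
```

With these made-up inputs the first address works out to 0x40600488: bit 30, function 1 at bit 22, entity bit 6 moved to bit 21, entity bits 5:0 at bit 7, and control bits 3:0 at bit 3.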
Re: [PATCH 2/2] ASoC: intel: sof_rt5682: Add quirk for Dooly
On 10/30/20 1:36 AM, Brent Lu wrote:
> This DMI product family string of this board is "Google_Hatch" so the
> DMI quirk will take place. However, this board is using rt1015 speaker
> amp instead of max98357a specified in the quirk. Therefore, we need
> a new DMI quirk for this board.

Do you actually need a DMI quirk for this platform?

The .driver_data below uses the exact same settings as what you would
use with the generic solution based on ACPI IDs, see below.

Wondering if patch1 would be enough?

> Signed-off-by: Brent Lu
> ---
>  sound/soc/intel/boards/sof_rt5682.c | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/sound/soc/intel/boards/sof_rt5682.c b/sound/soc/intel/boards/sof_rt5682.c
> index 7701957e0eb7..dfcdf6d4b6c8 100644
> --- a/sound/soc/intel/boards/sof_rt5682.c
> +++ b/sound/soc/intel/boards/sof_rt5682.c
> @@ -100,6 +100,20 @@ static const struct dmi_system_id sof_rt5682_quirk_table[] = {
>  					SOF_RT5682_MCLK_24MHZ |
>  					SOF_RT5682_SSP_CODEC(1)),
>  	},
> +	{
> +		.callback = sof_rt5682_quirk_cb,
> +		.matches = {
> +			DMI_MATCH(DMI_SYS_VENDOR, "HP"),
> +			DMI_MATCH(DMI_PRODUCT_NAME, "Dooly"),
> +		},
> +		.driver_data = (void *)(SOF_RT5682_MCLK_EN |
> +					SOF_RT5682_MCLK_24MHZ |
> +					SOF_RT5682_SSP_CODEC(0) |
> +					SOF_SPEAKER_AMP_PRESENT |
> +					SOF_RT1015_SPEAKER_AMP_PRESENT |
> +					SOF_RT1015_SPEAKER_AMP_100FS |
> +					SOF_RT5682_SSP_AMP(1)),
> +	},

Is this really needed? It's the same as the .driver_data below:

@@ -875,6 +901,16 @@ static const struct platform_device_id board_ids[] = {
 					SOF_MAX98360A_SPEAKER_AMP_PRESENT |
 					SOF_RT5682_SSP_AMP(1)),
 	},
+	{
+		.name = "cml_rt1015_rt5682",
+		.driver_data = (kernel_ulong_t)(SOF_RT5682_MCLK_EN |
+					SOF_RT5682_MCLK_24MHZ |
+					SOF_RT5682_SSP_CODEC(0) |
+					SOF_SPEAKER_AMP_PRESENT |
+					SOF_RT1015_SPEAKER_AMP_PRESENT |
+					SOF_RT1015_SPEAKER_AMP_100FS |
+					SOF_RT5682_SSP_AMP(1)),
+	},
Re: [PATCH 3/4] kselftest_module.h: add struct rnd_state and seed parameter
On Sun 2020-10-25 22:48:41, Rasmus Villemoes wrote: > Some test suites make use of random numbers to increase the test > coverage when the test suite gets run on different machines and > increase the chance of some corner case bug being discovered - and I'm > planning on extending some existing ones in that direction as > well. However, should a bug be found this way, it's important that the > exact same series of tests can be repeated to verify the bug is > fixed. That means the random numbers must be obtained > deterministically from a generator private to the test module. > > To avoid adding boilerplate to various test modules, put some logic > into kselftest_module.h: If the module declares that it will use > random numbers, add a "seed" module parameter. If not explicitly given > when the module is loaded (or via kernel command line), obtain a > random one. In either case, print the seed used, and repeat that > information if there was at least one test failing. > > Signed-off-by: Rasmus Villemoes > --- > tools/testing/selftests/kselftest_module.h | 35 -- > 1 file changed, 32 insertions(+), 3 deletions(-) > > diff --git a/tools/testing/selftests/kselftest_module.h > b/tools/testing/selftests/kselftest_module.h > index c81c0b0c054befaf665b..43f3ca58fcd550b8ac83 100644 > --- a/tools/testing/selftests/kselftest_module.h > +++ b/tools/testing/selftests/kselftest_module.h > @@ -3,14 +3,31 @@ > #define __KSELFTEST_MODULE_H > > #include > +#include > +#include > > /* > * Test framework for writing test modules to be loaded by kselftest. > * See Documentation/dev-tools/kselftest.rst for an example test module. > */ > > +/* > + * If the test module makes use of random numbers, define KSTM_RANDOM > + * to 1 before including this header. Then a module parameter "seed" > + * will be defined. If not given, a random one will be obtained. 
In > + * either case, the used seed is reported, so the exact same series of > + * tests can be repeated by loading the module with that seed > + * given. > + */ > + > +#ifndef KSTM_RANDOM > +#define KSTM_RANDOM 0 > +#endif > + > static unsigned int total_tests __initdata; > static unsigned int failed_tests __initdata; > +static struct rnd_state rnd_state __initdata; > +static u64 seed __initdata; > > #define KSTM_CHECK_ZERO(x) do { > \ > total_tests++; \ > @@ -22,11 +39,13 @@ static unsigned int failed_tests __initdata; > > static inline int kstm_report(unsigned int total_tests, unsigned int > failed_tests) > { > - if (failed_tests == 0) > + if (failed_tests == 0) { > pr_info("all %u tests passed\n", total_tests); > - else > + } else { > pr_warn("failed %u out of %u tests\n", failed_tests, > total_tests); > - > + if (KSTM_RANDOM) > + pr_info("random seed used was 0x%016llx\n", seed); I have a bit mixed feelings about this. It is genial and dirty hack at the same time ;-) Well, it is basically the same approach as with IS_ENABLED(CONFIG_bla_bla). Reviewed-by: Petr Mladek Best Regards, Petr
Re: [PATCH v11 00/10] NTFS read-write driver GPL implementation by Paragon Software
Hello! On Friday 30 October 2020 15:51:10 Konstantin Komarov wrote: > From: Pali Rohár > Sent: Friday, October 30, 2020 6:25 PM > > To: Konstantin Komarov > > Cc: linux-fsde...@vger.kernel.org; v...@zeniv.linux.org.uk; > > linux-kernel@vger.kernel.org; dste...@suse.cz; aap...@suse.com; > > wi...@infradead.org; rdun...@infradead.org; j...@perches.com; > > m...@harmstone.com; nbori...@suse.com; linux-ntfs- > > d...@lists.sourceforge.net; an...@tuxera.com > > Subject: Re: [PATCH v11 00/10] NTFS read-write driver GPL implementation by > > Paragon Software > > > > Hello and thanks for update! > > > > I have just two comments for the last v11 version. > > > > I really do not like nls_alt mount option and I do not think we should > > merge this mount option into ntfs kernel driver. Details I described in: > > https://lore.kernel.org/linux-fsdevel/20201009154734.andv4es3azkkskm5@pali/ > > > > tl;dr it is not systematic solution and is incompatible with existing > > in-kernel ntfs driver, also incompatible with in-kernel vfat, udf and > > ext4 (with UNICODE support) drivers. In my opinion, all kernel fs > > drivers which deals with UNICODE should handle it in similar way. > > > > Hello Pali! First of all, apologies for not providing a feedback on your > previous > message regarding the 'nls_alt'. We had internal discussions on the topic and > overall conclusion is that: we do not want to compromise Kernel standards with > our submission. So we will remove the 'nls_alt' option in the next version. > > However, there are still few points we have on the topic, please read below. > > > It would be really bad if userspace application need to behave > > differently for this new ntfs driver and differently for all other > > UNICODE drivers. > > > > The option does not anyhow affect userspace applications. For the "default" > example > of unzip/tar: > 1 - if this option is not applied (e.g. "vfat case"), trying to unzip an > archive with, e.g. 
CP-1251, > names inside to the target fs volume, will return error, and issued file(s) > won't be unzipped; > 2 - if this option is applied and "nls_alt" is set, the above case will > result in unzipping all the files; I understand what is the point and I'm not against discussion how to fix it. But it should be implemented for all filesystems with UNICODE semantic, so behavior would be same. For user application point of view, behavior of vfat, ntfs, udf and ext4 (with UNICODE support; see below) in handling file names should be very similar (or exactly same if fs tech details allows it). > Also, this issue in general only applies to "non-native" filesystems. I.e. > ext4 is not affected by it > in any case, as it just stores the name as bytes, no matter what those bytes > are. The above case > won't give an unzip error on ext4. The only symptom of this would be, maybe, > "incorrect encoding" > marking within the listing of such files (in File Manager or Terminal, e.g. > in Ubuntu), but there won't > be an unzip process termination with incomplete unarchived fileset, unlike it > is for vfat/exfat/ntfs > without "nls_alt". When using ext4 in default mode then it really does not apply here. But I wrote that it applies for ext4 with UNICODE support. This mode needs to be first enabled for directory, it is relatively new feature and I do not know if there are users of it and how many people tried different crazy test scenarios with normalization, etc... > > Second comment is simplification of usage nls_load() with UTF-8 parameter > > which I described in older email: > > https://lore.kernel.org/linux-fsdevel/948ac894450d494ea15496c2e5b8c...@paragon-software.com/ > > > > You wrote that you have applied it, but seems it was lost (maybe during > > rebase?) as it is not present in the last v11 version. > > > > I suggested to not use nls_load() with UTF-8 at all. 
Your version of > > ntfs driver does not use kernel's nls utf8 module for UTF-8 support, so > > trying to load it should be avoided. Also kernel can be compiled without > > utf8 nls module (which is moreover broken) and with my above suggestion, > > ntfs driver would work correctly. Without that suggestion, mounting > > would fail. > > Thanks for pointing that out. It is likely the "nls_load()" fixes were lost > during rebase. > Will recheck it and return them to the v12. OK!
Re: [PATCH 4/4] lib/test_printf.c: use deterministic sequence of random numbers
On Sun 2020-10-25 22:48:42, Rasmus Villemoes wrote: > The printf test suite does each test with a few different buffer sizes > to ensure vsnprintf() behaves correctly with respect to truncation and > size reporting. It calls vsnprintf() with a buffer size that is > guaranteed to be big enough, a buffer size of 0 to ensure that nothing > gets written to the buffer, but it also calls vsnprintf() with a > buffer size chosen to guarantee the output gets truncated somewhere in > the middle. > > That buffer size is chosen randomly to increase the chance of finding > some corner case bug (for example, there used to be some %p > extension that would fail to produce any output if there wasn't room > enough for it all, despite the requirement of producing as much as > there's room for). I'm not aware of that having found anything yet, > but should it happen, it's annoying not to be able to repeat the > test with the same sequence of truncated lengths. > > For demonstration purposes, if we break one of the test cases > deliberately, we still get different buffer sizes if we don't pass the > seed parameter: > > root@(none):/# modprobe test_printf > [ 15.317783] test_printf: vsnprintf(buf, 18, "%piS|%pIS", ...) wrote > '127.000.000.001|1', expected '127-000.000.001|1' > [ 15.323182] test_printf: failed 3 out of 388 tests > [ 15.324034] test_printf: random seed used was 0x278bb9311979cc91 > modprobe: ERROR: could not insert 'test_printf': Invalid argument > > root@(none):/# modprobe test_printf > [ 13.940909] test_printf: vsnprintf(buf, 22, "%piS|%pIS", ...) 
wrote > '127.000.000.001|127.0', expected '127-000.000.001|127.0' > [ 13.944744] test_printf: failed 3 out of 388 tests > [ 13.945607] test_printf: random seed used was 0x9f72eee1c9dc02e5 > modprobe: ERROR: could not insert 'test_printf': Invalid argument > > but to repeat a specific sequence of tests, we can do > > root@(none):/# modprobe test_printf seed=0x9f72eee1c9dc02e5 > [ 448.328685] test_printf: vsnprintf(buf, 22, "%piS|%pIS", ...) wrote > '127.000.000.001|127.0', expected '127-000.000.001|127.0' > [ 448.331650] test_printf: failed 3 out of 388 tests > [ 448.332295] test_printf: random seed used was 0x9f72eee1c9dc02e5 > modprobe: ERROR: could not insert 'test_printf': Invalid argument > > Signed-off-by: Rasmus Villemoes Great feature! Reviewed-by: Petr Mladek Best Regards, Petr
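For readers who want to see why a private, explicitly seeded generator makes the truncation lengths repeatable, here is a small userspace sketch. It is not the kernel's struct rnd_state or its prandom helpers; demo_rnd_state, demo_seed(), demo_next() and demo_trunc_len() are hypothetical names, and the generator is a plain xorshift64 chosen only because it is short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical private generator state, owned by the test module. */
struct demo_rnd_state {
	uint64_t s;
};

/* Seed the generator; xorshift must not start from zero. */
static void demo_seed(struct demo_rnd_state *st, uint64_t seed)
{
	st->s = seed ? seed : 0x9e3779b97f4a7c15ull;
}

/* One xorshift64 step; the upper 32 bits are returned. */
static uint32_t demo_next(struct demo_rnd_state *st)
{
	uint64_t x = st->s;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	st->s = x;
	return (uint32_t)(x >> 32);
}

/*
 * Pick a buffer size in [1, full_len - 1], i.e. one that forces the
 * output to be truncated somewhere in the middle. Requires full_len >= 2.
 */
static unsigned int demo_trunc_len(struct demo_rnd_state *st, unsigned int full_len)
{
	assert(full_len >= 2);
	return 1 + demo_next(st) % (full_len - 1);
}
```

Because the state is private and fully determined by the seed, two runs seeded with the same value walk through the same sequence of lengths, which is the property the seed= parameter in the transcript above relies on.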
Re: [PATCH 2/3] mm, page_poison: use static key more efficiently
On Mon, Oct 26, 2020 at 06:33:57PM +0100, Vlastimil Babka wrote:
> Commit 11c9c7edae06 ("mm/page_poison.c: replace bool variable with static key")
> changed page_poisoning_enabled() to a static key check. However, the function
> is not inlined, so each check still involves a function call with overhead not
> eliminated when page poisoning is disabled.
>
> Analogically to how debug_pagealloc is handled, this patch converts
> page_poisoning_enabled() back to boolean check, and introduces
> page_poisoning_enabled_static() for fast paths. Both functions are inlined.
>
> Also optimize the check that enables page poisoning instead of debug_pagealloc
> for architectures without proper debug_pagealloc support. Move the check to
> init_mem_debugging() to enable a single static key instead of having two
> static branches in page_poisoning_enabled_static().
>
> Signed-off-by: Vlastimil Babka

This patchset causes a regression on x86_64 as a guest. I was able to
bisect this on the following linux-next tags:

next-20201015 OK
next-20201023 OK
next-20201026 OK
next-20201027 BAD
next-20201028 BAD

Bisection inside next-20201027 lands me on:

"mm, page_alloc: do not rely on the order of page_poison and
init_on_alloc/free parameters"

which is part of this patchset; however, reverting that patch causes a
conflict, likely due to a subsequent patch in this series. So I decided
to try before the patch set and after, and this confirms the bisection.

Before this patchset, on the patch titled "mm: forbid splitting special
mappings", I see no issue, but after this patch set, on the patch titled
"mm, page_alloc: reduce static keys in prep_new_page()", I get a crash.
The crash log is attached.

The good news is that this can be easily reproduced in a jiffy if you
use kdevops [0] on Debian (the default) on vagrant.

[0] https://github.com/mcgrof/kdevops

-- Logs begin at Wed 2020-10-28 01:30:20 UTC, end at Wed 2020-10-28 19:16:03 UTC. --
Oct 28 19:13:04 kdevops kernel: Linux version 5.10.0-rc1-next-20201027 (vagrant@kdevops) (gcc (Debian 10.2.0-15) 10.2.0, GNU ld (GNU Binutils for Debian) 2.35.1) #1 SMP Wed Oct 28 15:14:07 UTC 2020
Oct 28 19:13:04 kdevops kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-5.10.0-rc1-next-20201027 root=UUID=232d7b2f-c31e-4bbe-bffe-0ac429e4cb18 ro console=tty0 console=tty1 console=ttyS0,38400n8
Oct 28 19:13:04 kdevops kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Oct 28 19:13:04 kdevops kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Oct 28 19:13:04 kdevops kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Oct 28 19:13:04 kdevops kernel: x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
Oct 28 19:13:04 kdevops kernel: x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'compacted' format.
Oct 28 19:13:04 kdevops kernel: BIOS-provided physical RAM map:
Oct 28 19:13:04 kdevops kernel: BIOS-e820: [mem 0x-0x0009fbff] usable
Oct 28 19:13:04 kdevops kernel: BIOS-e820: [mem 0x0009fc00-0x0009] reserved
Oct 28 19:13:04 kdevops kernel: BIOS-e820: [mem 0x000f-0x000f] reserved
Oct 28 19:13:04 kdevops kernel: BIOS-e820: [mem 0x0010-0xbffdafff] usable
Oct 28 19:13:04 kdevops kernel: BIOS-e820: [mem 0xbffdb000-0xbfff] reserved
Oct 28 19:13:04 kdevops kernel: BIOS-e820: [mem 0xfeffc000-0xfeff] reserved
Oct 28 19:13:04 kdevops kernel: BIOS-e820: [mem 0xfffc-0x] reserved
Oct 28 19:13:04 kdevops kernel: BIOS-e820: [mem 0x0001-0x00023fff] usable
Oct 28 19:13:04 kdevops kernel: NX (Execute Disable) protection: active
Oct 28 19:13:04 kdevops kernel: SMBIOS 2.8 present.
Oct 28 19:13:04 kdevops kernel: DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1 04/01/2014
Oct 28 19:13:04 kdevops kernel: Hypervisor detected: KVM
Oct 28 19:13:04 kdevops kernel: kvm-clock: Using msrs 4b564d01 and 4b564d00
Oct 28 19:13:04 kdevops kernel: kvm-clock: cpu 0, msr 2110e2001, primary cpu clock
Oct 28 19:13:04 kdevops kernel: kvm-clock: using sched offset of 63760873909837 cycles
Oct 28 19:13:04 kdevops kernel: clocksource: kvm-clock: mask: 0x max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
Oct 28 19:13:04 kdevops kernel: tsc: Detected 1992.014 MHz processor
Oct 28 19:13:04 kdevops kernel: e820: update [mem 0x-0x0fff] usable ==> reserved
Oct 28 19:13:04 kdevops kernel: e820: remove [mem 0x000a-0x000f] usable
Oct 28 19:13:04 kdevops kernel: last_pfn = 0x24 max_arch_pfn = 0x4
Oct 28 19:13:04 kdevops kernel: MTRR default type: write-back
Oct 28 19:13:04 kdevops kernel: MTRR fixed ranges enabled:
Oct 28 19:13:04 kdevops kernel: 0-9 write-back
Oct 28 19:13:04 kdevops kernel: A-B uncachable
Oct 28 19:13:04 kdevops kernel: C-F write-protect
Oct 28 19:13:04 kdevops kernel:
[GIT PULL] Power management fixes for v5.10-rc2
Hi Linus,

Please pull from the tag

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 pm-5.10-rc2

with top-most commit dea47cf45a7f9bb94684830c47d4b259d5f8d6af

 Merge branches 'pm-cpuidle' and 'pm-sleep'

on top of commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec

 Linux 5.10-rc1

to receive power management fixes for 5.10-rc2.

These fix a few issues related to running intel_pstate in the
passive mode with HWP enabled, correct the handling of the
max_cstate module parameter in intel_idle and make a few
janitorial changes.

Specifics:

 - Modify Kconfig to prevent configuring either the "conservative"
   or the "ondemand" governor as the default cpufreq governor if
   intel_pstate is selected, in which case "schedutil" is the
   default choice for the default governor setting (Rafael Wysocki).

 - Modify the cpufreq core, intel_pstate and the schedutil governor
   to avoid missing updates of the HWP max limit when intel_pstate
   operates in the passive mode with HWP enabled (Rafael Wysocki).

 - Fix max_cstate module parameter handling in intel_idle for
   processor models with C-state tables coming from ACPI (Chen Yu).

 - Clean up assorted pieces of power management code (Jackie Zamow,
   Tom Rix, Zhang Qilong).

Thanks!

---

Chen Yu (1):
      intel_idle: Fix max_cstate for processor models without C-state tables

Jackie Zamow (1):
      PM: sleep: fix typo in kernel/power/process.c

Rafael J. Wysocki (5):
      cpufreq: Avoid configuring old governors as default with intel_pstate
      cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag
      cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode
      cpufreq: Introduce cpufreq_driver_test_flags()
      cpufreq: schedutil: Always call driver if CPUFREQ_NEED_UPDATE_LIMITS is set

Tom Rix (1):
      cpufreq: speedstep: remove unneeded semicolon

Zhang Qilong (1):
      cpufreq: e_powersaver: remove unreachable break

---

 drivers/cpufreq/Kconfig          |  2 ++
 drivers/cpufreq/cpufreq.c        | 15 ++-
 drivers/cpufreq/e_powersaver.c   |  1 -
 drivers/cpufreq/intel_pstate.c   | 13 ++---
 drivers/cpufreq/longhaul.c       |  1 -
 drivers/cpufreq/speedstep-lib.c  |  2 +-
 drivers/idle/intel_idle.c        |  2 +-
 include/linux/cpufreq.h          | 11 ++-
 kernel/power/process.c           |  2 +-
 kernel/sched/cpufreq_schedutil.c |  6 --
 10 files changed, 39 insertions(+), 16 deletions(-)
Re: [PATCH v2 1/2] mm: reorganize internal_get_user_pages_fast()
On Fri 30-10-20 11:46:20, Jason Gunthorpe wrote:
> The next patch in this series makes the lockless flow a little more
> complex, so move the entire block into a new function and remove a level
> of indention. Tidy a bit of cruft:
>
>  - addr is always the same as start, so use start
>
>  - Use the modern check_add_overflow() for computing end = start + len
>
>  - nr_pinned/pages << PAGE_SHIFT needs the LHS to be unsigned long to
>    avoid shift overflow, make the variables unsigned long to avoid coding
>    casts in both places. nr_pinned was missing its cast
>
>  - The handling of ret and nr_pinned can be streamlined a bit
>
> No functional change.
>
> Signed-off-by: Jason Gunthorpe

Looks good to me. You can add:

Reviewed-by: Jan Kara

								Honza

> ---
>  mm/gup.c | 99 ++--
>  1 file changed, 54 insertions(+), 45 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 102877ed77a4b4..150cc962c99201 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2671,13 +2671,43 @@ static int __gup_longterm_unlocked(unsigned long start, int nr_pages,
>  	return ret;
>  }
>
> -static int internal_get_user_pages_fast(unsigned long start, int nr_pages,
> +static unsigned long lockless_pages_from_mm(unsigned long start,
> +					    unsigned long end,
> +					    unsigned int gup_flags,
> +					    struct page **pages)
> +{
> +	unsigned long flags;
> +	int nr_pinned = 0;
> +
> +	if (!IS_ENABLED(CONFIG_HAVE_FAST_GUP) ||
> +	    !gup_fast_permitted(start, end))
> +		return 0;
> +
> +	/*
> +	 * Disable interrupts. The nested form is used, in order to allow full,
> +	 * general purpose use of this routine.
> +	 *
> +	 * With interrupts disabled, we block page table pages from being freed
> +	 * from under us. See struct mmu_table_batch comments in
> +	 * include/asm-generic/tlb.h for more details.
> +	 *
> +	 * We do not adopt an rcu_read_lock() here as we also want to block IPIs
> +	 * that come from THPs splitting.
> +	 */
> +	local_irq_save(flags);
> +	gup_pgd_range(start, end, gup_flags, pages, &nr_pinned);
> +	local_irq_restore(flags);
> +	return nr_pinned;
> +}
> +
> +static int internal_get_user_pages_fast(unsigned long start,
> +					unsigned long nr_pages,
>  					unsigned int gup_flags,
>  					struct page **pages)
>  {
> -	unsigned long addr, len, end;
> -	unsigned long flags;
> -	int nr_pinned = 0, ret = 0;
> +	unsigned long len, end;
> +	unsigned long nr_pinned;
> +	int ret;
>
>  	if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM |
>  				       FOLL_FORCE | FOLL_PIN | FOLL_GET |
> @@ -2691,54 +2721,33 @@ static int internal_get_user_pages_fast(unsigned long start, int nr_pages,
>  	might_lock_read(&current->mm->mmap_lock);
>
>  	start = untagged_addr(start) & PAGE_MASK;
> -	addr = start;
> -	len = (unsigned long) nr_pages << PAGE_SHIFT;
> -	end = start + len;
> -
> -	if (end <= start)
> +	len = nr_pages << PAGE_SHIFT;
> +	if (check_add_overflow(start, len, &end))
>  		return 0;
>  	if (unlikely(!access_ok((void __user *)start, len)))
>  		return -EFAULT;
>
> -	/*
> -	 * Disable interrupts. The nested form is used, in order to allow
> -	 * full, general purpose use of this routine.
> -	 *
> -	 * With interrupts disabled, we block page table pages from being
> -	 * freed from under us. See struct mmu_table_batch comments in
> -	 * include/asm-generic/tlb.h for more details.
> -	 *
> -	 * We do not adopt an rcu_read_lock(.) here as we also want to
> -	 * block IPIs that come from THPs splitting.
> -	 */
> -	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP) && gup_fast_permitted(start, end)) {
> -		unsigned long fast_flags = gup_flags;
> -
> -		local_irq_save(flags);
> -		gup_pgd_range(addr, end, fast_flags, pages, &nr_pinned);
> -		local_irq_restore(flags);
> -		ret = nr_pinned;
> -	}
> +	nr_pinned = lockless_pages_from_mm(start, end, gup_flags, pages);
> +	if (nr_pinned == nr_pages || gup_flags & FOLL_FAST_ONLY)
> +		return nr_pinned;
>
> -	if (nr_pinned < nr_pages && !(gup_flags & FOLL_FAST_ONLY)) {
> -		/* Try to get the remaining pages with get_user_pages */
> -		start += nr_pinned << PAGE_SHIFT;
> -		pages += nr_pinned;
> -
> -		ret = __gup_longterm_unlocked(start, nr_pages - nr_pinned,
> -					      gup_flags, pages);
> -
> -		/* Have to be a bit careful with return values */
> -
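For readers who have not used check_add_overflow(): the quoted patch replaces the open-coded "end = start + len; if (end <= start)" test with an overflow-checked addition. The following is only a userspace sketch of that idea, not the kernel code. It assumes a GCC or Clang compiler (for __builtin_add_overflow(), which, as far as I know, is what the kernel helper is built on), and the names sketch_range_end and SKETCH_PAGE_SHIFT are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Hypothetical userspace stand-in for the "end = start + len" step.
 * Returns true and stores the end address in *end when the addition
 * does not wrap; returns false on overflow.
 */
static bool sketch_range_end(unsigned long start, unsigned long nr_pages,
			     unsigned long *end)
{
	unsigned long len;

	assert(end != NULL);

	/* nr_pages is unsigned long, so the shift is done at full width. */
	len = nr_pages << SKETCH_PAGE_SHIFT;

	/* __builtin_add_overflow() returns true when the sum wrapped. */
	return !__builtin_add_overflow(start, len, end);
}
```

A caller would bail out when sketch_range_end() returns false, in the same spot where the patch returns 0 after check_add_overflow() reports an overflow. Note that in this sketch a zero length is not treated as an overflow.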
[GIT PULL] asm-generic: bugfix for v5.10
The following changes since commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec:

  Linux 5.10-rc1 (2020-10-25 15:14:11 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git tags/asm-generic-fixes-5.10

for you to fetch changes up to 0bcd0a2be8c9ef39d84d167ff85359a49f7be175:

  asm-generic: mark __{get,put}_user_fn as __always_inline (2020-10-27 16:13:09 +0100)

asm-generic: fixes for v5.10

There is one small bugfix, fixing a build regression for RISC-V

Signed-off-by: Arnd Bergmann

Christoph Hellwig (1):
      asm-generic: mark __{get,put}_user_fn as __always_inline

 include/asm-generic/uaccess.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)
[GIT PULL] ACPI fixes for v5.10-rc2
Hi Linus,

Please pull from the tag

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 acpi-5.10-rc2

with top-most commit 8f7304bb9113c95b256d3aa79a884b4c60a806e1

 Merge branches 'acpi-button' and 'acpi-dock'

on top of commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec

 Linux 5.10-rc1

to receive ACPI fixes for 5.10-rc2.

These fix three assorted minor issues.

Specifics:

 - Eliminate compiler warning emitted when building the ACPI dock
   driver (Arnd Bergmann).

 - Drop lid_init_state quirk for Acer SW5-012 that is not needed any
   more after recent changes (Hans de Goede).

 - Fix "missing minus" typo in the NFIT parsing code (Zhang Qilong).

Thanks!

---

Arnd Bergmann (1):
      ACPI: dock: fix enum-conversion warning

Hans de Goede (1):
      ACPI: button: Drop no longer necessary Acer SW5-012 lid_init_state quirk

Zhang Qilong (1):
      ACPI: NFIT: Fix comparison to '-ENXIO'

---

 drivers/acpi/button.c    | 13 -
 drivers/acpi/dock.c      |  3 ++-
 drivers/acpi/nfit/core.c |  2 +-
 3 files changed, 3 insertions(+), 15 deletions(-)
Re: [PATCH net-next v4 4/5] net: hdlc_fr: Do skb_reset_mac_header for skbs received on normal PVC devices
On Thu, Oct 29, 2020 at 10:33 PM Xie He wrote:
>
> When an skb is received on a normal (non-Ethernet-emulating) PVC device,
> call skb_reset_mac_header before we pass it to upper layers.
>
> This is because normal PVC devices don't have header_ops, so any header we
> have would not be visible to upper layer code when sending, so the header
> shouldn't be visible to upper layer code when receiving, either.
>
> Cc: Willem de Bruijn
> Cc: Krzysztof Halasa
> Signed-off-by: Xie He

Acked-by: Willem de Bruijn

Should this go to net if a bugfix though?
Re: [PATCH net-next v4 3/5] net: hdlc_fr: Improve the initial checks when we receive an skb
On Thu, Oct 29, 2020 at 10:33 PM Xie He wrote:
>
> 1.
> Change the skb->len check from "<= 4" to "< 4".
> At first we only need to ensure a 4-byte header is present. We indeed
> normally need the 5th byte, too, but it'd be more logical and cleaner
> to check its existence when we actually need it.
>
> 2.
> Add an fh->ea2 check to the initial checks in fr_rx. fh->ea2 == 1 means
> the second address byte is the final address byte. We only support the
> case where the address length is 2 bytes.

Can you elaborate a bit for readers not intimately familiar with the
codebase? Is there something in the following code that has this
implicit assumption on 2-byte address lengths?
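To make the two checks discussed above concrete, here is a rough userspace sketch. It is not the driver code: sketch_fr_header_ok and SKETCH_FR_EA_BIT are hypothetical names, and the sketch assumes that the extended-address (EA) flag is the low-order bit of each address octet, with EA = 0 in the first octet and EA = 1 in the last one. The check on the first octet is part of that assumption; the quoted patch description only mentions the second one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: EA is the low-order bit of each Frame Relay address octet. */
#define SKETCH_FR_EA_BIT 0x01

/*
 * Hypothetical helper: accept a frame only if it is at least 4 bytes long
 * and its address field is exactly 2 bytes.
 */
static bool sketch_fr_header_ok(const uint8_t *data, size_t len)
{
	assert(data != NULL);

	if (len < 4)				/* need the 4-byte header */
		return false;
	if (data[0] & SKETCH_FR_EA_BIT)		/* address ended after 1 byte */
		return false;
	if (!(data[1] & SKETCH_FR_EA_BIT))	/* address continues past 2 bytes */
		return false;
	return true;
}
```

With a check like this in front, later code can index the bytes after the address at fixed offsets, which is where a 2-byte assumption would otherwise be implicit.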
[GIT PULL] Device properties framework fixes for v5.10-rc2
Hi Linus,

Please pull from the tag

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 devprop-5.10-rc2

with top-most commit 99aed9227073fb34ce2880cbc7063e04185a65e1

 device property: Don't clear secondary pointer for shared primary firmware node

on top of commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec

 Linux 5.10-rc1

to receive device properties framework fixes for 5.10-rc2.

Fix the secondary firmware node handling while manipulating the
primary firmware node for a given device (Andy Shevchenko).

Thanks!

---

Andy Shevchenko (2):
      device property: Keep secondary firmware node secondary by type
      device property: Don't clear secondary pointer for shared primary firmware node

---

 drivers/base/core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
[GIT PULL] PNP fix for v5.10-rc2
Hi Linus,

Please pull from the tag

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 pnp-5.10-rc2

with top-most commit e510785f8aca4a7346497edd4d5aceefe5370960

 PNP: fix kernel-doc markups

on top of commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec

 Linux 5.10-rc1

to receive a PNP fix for 5.10-rc2.

Make function names in kerneldoc comments match the actual names of
the functions that they correspond to (Mauro Carvalho Chehab).

Thanks!

---

Mauro Carvalho Chehab (1):
      PNP: fix kernel-doc markups

---

 drivers/pnp/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
Re: [PATCH 07/14] dt-bindings: media: i2c: Add A31 MIPI CSI-2 bindings documentation
On Fri, Oct 23, 2020 at 07:45:39PM +0200, Paul Kocialkowski wrote: > This introduces YAML bindings documentation for the A31 MIPI CSI-2 > controller. > > Signed-off-by: Paul Kocialkowski > --- > .../media/allwinner,sun6i-a31-mipi-csi2.yaml | 168 ++ > 1 file changed, 168 insertions(+) > create mode 100644 > Documentation/devicetree/bindings/media/allwinner,sun6i-a31-mipi-csi2.yaml > > diff --git > a/Documentation/devicetree/bindings/media/allwinner,sun6i-a31-mipi-csi2.yaml > b/Documentation/devicetree/bindings/media/allwinner,sun6i-a31-mipi-csi2.yaml > new file mode 100644 > index ..9adc0bc27033 > --- /dev/null > +++ > b/Documentation/devicetree/bindings/media/allwinner,sun6i-a31-mipi-csi2.yaml > @@ -0,0 +1,168 @@ > +# SPDX-License-Identifier: GPL-2.0 Dual license new bindings. > +%YAML 1.2 > +--- > +$id: http://devicetree.org/schemas/media/allwinner,sun6i-a31-mipi-csi2.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: Allwinner A31 MIPI CSI-2 Device Tree Bindings > + > +maintainers: > + - Paul Kocialkowski > + > +properties: > + compatible: > +oneOf: > + - const: allwinner,sun6i-a31-mipi-csi2 > + - items: > + - const: allwinner,sun8i-v3s-mipi-csi2 > + - const: allwinner,sun6i-a31-mipi-csi2 > + > + reg: > +maxItems: 1 > + > + interrupts: > +maxItems: 1 > + > + clocks: > +items: > + - description: Bus Clock > + - description: Module Clock > + > + clock-names: > +items: > + - const: bus > + - const: mod > + > + phys: > +items: > + - description: MIPI D-PHY > + > + phy-names: > +items: > + - const: dphy > + > + resets: > +maxItems: 1 > + > + # See ./video-interfaces.txt for details > + ports: > +type: object > + > +properties: > + port@0: > +type: object > +description: Input port, connect to a MIPI CSI-2 sensor > + > +properties: > + reg: > +const: 0 > + > + endpoint: > +type: object > + > +properties: > + remote-endpoint: true > + > + bus-type: > +const: 4 > + > + clock-lanes: > +maxItems: 1 > + > + data-lanes: > +minItems: 1 > +maxItems: 
4 > + > +required: > + - bus-type > + - data-lanes > + - remote-endpoint > + > +additionalProperties: false > + > +required: > + - endpoint > + > +additionalProperties: false > + > + port@1: > +type: object > +description: Output port, connect to a CSI controller > + > +properties: > + reg: > +const: 1 > + > + endpoint: > +type: object > + > +properties: > + remote-endpoint: true > + > + bus-type: > +const: 4 > + > +additionalProperties: false > + > +required: > + - endpoint > + > +additionalProperties: false > + > +required: > + - compatible > + - reg > + - interrupts > + - clocks > + - clock-names > + - resets > + > +additionalProperties: false > + > +examples: > + - | > +#include > +#include > +#include > + > +mipi_csi2: mipi-csi2@1cb1000 { I agree with using 'csi' here as that is at least aligned with 'dsi' meaning the host side of the protocol. We've not been consistent here... > +compatible = "allwinner,sun8i-v3s-mipi-csi2", > + "allwinner,sun6i-a31-mipi-csi2"; > +reg = <0x01cb1000 0x1000>; > +interrupts = ; > +clocks = < CLK_BUS_CSI>, > + < CLK_CSI1_SCLK>; > +clock-names = "bus", "mod"; > +resets = < RST_BUS_CSI>; > + > +phys = <>; > +phy-names = "dphy"; > + > +ports { > +#address-cells = <1>; > +#size-cells = <0>; > + > +mipi_csi2_in: port@0 { > +reg = <0>; > + > +mipi_csi2_in_ov5648: endpoint { > +bus-type = <4>; /* MIPI CSI-2 D-PHY */ > +clock-lanes = <0>; > +data-lanes = <1 2 3 4>; > + > +remote-endpoint = <_out_mipi_csi2>; > +}; > +}; > + > +mipi_csi2_out: port@1 { > +reg = <1>; > + > +mipi_csi2_out_csi0: endpoint { > +bus-type = <4>; /* MIPI CSI-2 D-PHY */ > +remote-endpoint = <_in_mipi_csi2>; > +}; > +}; > +}; > +}; > + > +... > -- > 2.28.0 >
Re: [PATCH] arm64/smp: Move rcu_cpu_starting() earlier
On Wed, 28 Oct 2020 14:26:14 -0400, Qian Cai wrote:
> The call to rcu_cpu_starting() in secondary_start_kernel() is not early
> enough in the CPU-hotplug onlining process, which results in lockdep
> splats as follows:
>
>  WARNING: suspicious RCU usage
>  -
>  kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!!
>
> [...]

Applied to arm64 (for-next/fixes), thanks!

[1/1] arm64/smp: Move rcu_cpu_starting() earlier
      https://git.kernel.org/arm64/c/ce3d31ad3cac

Cheers,
--
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
Re: [PATCH net-next v4 5/5] net: hdlc_fr: Add support for any Ethertype
On Thu, Oct 29, 2020 at 10:32 PM Xie He wrote:
>
> Change the fr_rx function to make this driver support any Ethertype
> when receiving skbs on normal (non-Ethernet-emulating) PVC devices.
> (This driver is already able to handle any Ethertype when sending.)
>
> Originally in the fr_rx function, the code that parses the long (10-byte)
> header only recognizes a few Ethertype values and drops frames with other
> Ethertype values. This patch replaces this code to make fr_rx support
> any Ethertype. This patch also creates a new function fr_snap_parse as
> part of the new code.
>
> Cc: Willem de Bruijn
> Cc: Krzysztof Halasa
> Signed-off-by: Xie He
> ---
>  drivers/net/wan/hdlc_fr.c | 75 +--
>  1 file changed, 49 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c
> index 9a37575686b9..e95efc14bc97 100644
> --- a/drivers/net/wan/hdlc_fr.c
> +++ b/drivers/net/wan/hdlc_fr.c
> @@ -871,6 +871,45 @@ static int fr_lmi_recv(struct net_device *dev, struct sk_buff *skb)
>  	return 0;
>  }
>
>  static int fr_rx(struct sk_buff *skb)
>  {
> @@ -945,35 +984,19 @@ static int fr_rx(struct sk_buff *skb)
>  		skb->protocol = htons(ETH_P_IPV6);
>  		skb_reset_mac_header(skb);
>
> -	} else if (skb->len > 10 && data[3] == FR_PAD &&
> -		   data[4] == NLPID_SNAP && data[5] == FR_PAD) {
> -		u16 oui = ntohs(*(__be16*)(data + 6));
> -		u16 pid = ntohs(*(__be16*)(data + 8));
> -		skb_pull(skb, 10);
> -
> -		switch ((((u32)oui) << 16) | pid) {
> -		case ETH_P_ARP: /* routed frame with SNAP */
> -		case ETH_P_IPX:
> -		case ETH_P_IP:	/* a long variant */
> -		case ETH_P_IPV6:
> -			if (!pvc->main)
> -				goto rx_drop;
> -			skb->dev = pvc->main;
> -			skb->protocol = htons(pid);
> -			skb_reset_mac_header(skb);
> -			break;
> -
> -		case 0x80C20007: /* bridged Ethernet frame */
> -			if (!pvc->ether)
> +	} else if (data[3] == FR_PAD) {
> +		if (skb->len < 5)
> +			goto rx_error;
> +		if (data[4] == NLPID_SNAP) { /* A SNAP header follows */

Should this still check data[5] == FR_PAD?

> +			skb_pull(skb, 5);
> +			if (skb->len < 5) /* Incomplete SNAP header */
> +				goto rx_error;
> +			if (fr_snap_parse(skb, pvc))
>  				goto rx_drop;
> -			skb->protocol = eth_type_trans(skb, pvc->ether);
> -			break;
> -
> -		default:
> -			netdev_info(frad, "Unsupported protocol, OUI=%x PID=%x\n",
> -				    oui, pid);
> +		} else {
>  			goto rx_drop;
>  		}
> +
>  	} else {
>  		netdev_info(frad, "Unsupported protocol, NLPID=%x length=%i\n",
>  			    data[3], skb->len);
> --
> 2.27.0
>
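The body of fr_snap_parse is trimmed from the quote above, so the following is only a guess at the general shape of such a parser, written as a userspace sketch. It assumes the usual 5-byte SNAP layout (a 3-byte OUI followed by a 2-byte PID) and takes the two cases visible in the removed code: OUI 00-00-00 with the PID carrying the Ethertype, and OUI 00-80-C2 with PID 0x0007 for bridged Ethernet. All names starting with sketch_ or SKETCH_ are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_snap_kind {
	SKETCH_SNAP_ROUTED,		/* OUI 00-00-00, PID is the Ethertype */
	SKETCH_SNAP_BRIDGED_ETH,	/* OUI 00-80-C2, PID 0x0007 */
	SKETCH_SNAP_UNKNOWN,		/* any other OUI/PID combination */
	SKETCH_SNAP_TRUNCATED,		/* fewer than 5 bytes available */
};

/*
 * Hypothetical parser: p points at the first OUI byte, i.e. just after
 * the NLPID byte. On SKETCH_SNAP_ROUTED, *ethertype receives the PID.
 */
static enum sketch_snap_kind sketch_snap_parse(const uint8_t *p, size_t len,
					       uint16_t *ethertype)
{
	uint32_t oui;
	uint16_t pid;

	assert(p != NULL && ethertype != NULL);

	if (len < 5)
		return SKETCH_SNAP_TRUNCATED;

	/* Both fields are big-endian on the wire. */
	oui = ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | p[2];
	pid = (uint16_t)((p[3] << 8) | p[4]);

	if (oui == 0x000000) {
		*ethertype = pid;	/* any Ethertype is accepted */
		return SKETCH_SNAP_ROUTED;
	}
	if (oui == 0x0080C2 && pid == 0x0007)
		return SKETCH_SNAP_BRIDGED_ETH;

	return SKETCH_SNAP_UNKNOWN;
}
```

In this sketch the first OUI octet is the byte that the removed code compared against FR_PAD, so requiring a known OUI value also constrains that byte. Whether the real fr_snap_parse does the same cannot be told from the quoted hunk.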
Re: [PATCH net-next v4 1/5] net: hdlc_fr: Simpify fr_rx by using "goto rx_drop" to drop frames
On Thu, Oct 29, 2020 at 10:31 PM Xie He wrote:
>
> 1.
> When the fr_rx function drops a received frame (because the protocol type
> is not supported, or because the PVC virtual device that corresponds to
> the DLCI number and the protocol type doesn't exist), the function frees
> the skb and returns.
>
> The code for freeing the skb and returning is repeated several times, this
> patch uses "goto rx_drop" to replace them so that the code looks cleaner.
>
> 2.
> Add code to increase the stats.rx_dropped count whenever we drop a frame.
> Increase the stats.rx_dropped count both after "goto rx_drop" and after
> "goto rx_error" because I think we should increase this value whenever an
> skb is dropped.

In general we try to avoid changing counter behavior like that, as
existing users may depend on current behavior, e.g., in dashboards or
automated monitoring.

I don't know how realistic that is in this specific case, no strong
objections. Use good judgment.
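The counter behaviour under discussion can be illustrated with a small userspace sketch. This is not the driver code: the struct, the function and its parameters are invented for illustration, and the fall-through from the rx_error label into rx_drop is an assumption about how "increase rx_dropped after both gotos" might be laid out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters, loosely modelled on the stats named in the mail. */
struct sketch_rx_stats {
	unsigned long rx_packets;
	unsigned long rx_errors;
	unsigned long rx_dropped;
};

/*
 * Hypothetical receive path: a malformed frame takes the rx_error exit,
 * a well-formed frame with no matching device takes the rx_drop exit.
 * Returns 0 when the frame is accepted and 1 when it is dropped.
 */
static int sketch_rx(size_t len, int have_dev, struct sketch_rx_stats *st)
{
	assert(st != NULL);

	if (len < 4)
		goto rx_error;
	if (!have_dev)
		goto rx_drop;

	st->rx_packets++;
	return 0;

rx_error:
	st->rx_errors++;
	/* falls through: an errored frame is also counted as dropped */
rx_drop:
	st->rx_dropped++;
	/* a single place to free the buffer would go here */
	return 1;
}
```

With this layout every dropped frame bumps rx_dropped exactly once, and malformed frames additionally bump rx_errors, which is the kind of change in counter behaviour the reply above is cautious about.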
[PATCH 0/2] pinctrl: qcom: Add sm8250 lpass lpi pinctrl support
This series adds support for the LPASS (Low Power Audio SubSystem) LPI
(Low Power Island) pinctrl on SM8250.

It has been tested on the Qualcomm Robotics RB5 Development Kit, which
is based on the QRB5165 Robotics SoC. This board has 2 WSA881X smart
speakers with onboard DMIC connected to the internal LPASS codec via
the WSA and VA macros respectively.

Most of the work is derived from downstream Qualcomm kernels. Credits
to the various Qualcomm authors from Patrick Lai's team who have
contributed to this code.

Srinivas Kandagatla (2):
  dt-bindings: pinctrl: qcom: Add sm8250 lpass lpi pinctrl bindings
  pinctrl: qcom: Add sm8250 lpass lpi pinctrl driver

 .../pinctrl/qcom,lpass-lpi-pinctrl.yaml  | 129 +++
 drivers/pinctrl/qcom/Kconfig             |   8 +
 drivers/pinctrl/qcom/Makefile            |   1 +
 drivers/pinctrl/qcom/pinctrl-lpass-lpi.c | 781 ++
 4 files changed, 919 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pinctrl/qcom,lpass-lpi-pinctrl.yaml
 create mode 100644 drivers/pinctrl/qcom/pinctrl-lpass-lpi.c

--
2.21.0
[PATCH 1/2] dt-bindings: pinctrl: qcom: Add sm8250 lpass lpi pinctrl bindings
Add device tree binding Documentation details for Qualcomm SM8250 LPASS(Low Power Audio Sub System) LPI(Low Power Island) pinctrl driver. Signed-off-by: Srinivas Kandagatla --- .../pinctrl/qcom,lpass-lpi-pinctrl.yaml | 129 ++ 1 file changed, 129 insertions(+) create mode 100644 Documentation/devicetree/bindings/pinctrl/qcom,lpass-lpi-pinctrl.yaml diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,lpass-lpi-pinctrl.yaml b/Documentation/devicetree/bindings/pinctrl/qcom,lpass-lpi-pinctrl.yaml new file mode 100644 index ..8a0732574aee --- /dev/null +++ b/Documentation/devicetree/bindings/pinctrl/qcom,lpass-lpi-pinctrl.yaml @@ -0,0 +1,129 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/pinctrl/qcom,lpass-lpi-pinctrl.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Qualcomm Technologies, Inc. Low Power Audio SubSystem (LPASS) + Low Power Island (LPI) TLMM block + +maintainers: + - Srinivas Kandagatla + +description: | + This binding describes the Top Level Mode Multiplexer block found in the + LPASS LPI IP on most Qualcomm SoCs + +properties: + compatible: +const: qcom,sm8250-lpass-lpi-pinctrl + + reg: +minItems: 2 +maxItems: 2 + + clocks: +items: + - description: LPASS Core voting clock + - description: LPASS Audio voting clock + + clock-names: +items: + - const: core + - const: audio + + gpio-controller: true + + '#gpio-cells': +description: Specifying the pin number and flags, as defined in + include/dt-bindings/gpio/gpio.h +const: 2 + + gpio-ranges: +maxItems: 1 + +#PIN CONFIGURATION NODES +patternProperties: + '^.*$': +if: + type: object +then: + properties: +pins: + description: +List of gpio pins affected by the properties specified in this +subnode. 
+ items: +oneOf: + - pattern: "^gpio([0-9]|[1-9][0-9])$" + minItems: 1 + maxItems: 14 + +function: + enum: [ gpio, swr_tx_clk, qua_mi2s_sclk, swr_tx_data1, qua_mi2s_ws, + swr_tx_data2, qua_mi2s_data0, swr_rx_clk, qua_mi2s_data1, + swr_rx_data1, qua_mi2s_data2, swr_tx_data3, swr_rx_data2, + dmic1_clk, i2s1_clk, dmic1_data, i2s1_ws, dmic2_clk, + i2s1_data0, dmic2_data, i2s1_data1, i2s2_clk, wsa_swr_clk, + i2s2_ws, wsa_swr_data, dmic3_clk, i2s2_data0, dmic3_data, + i2s2_data1 ] + description: +Specify the alternative function to be configured for the specified +pins. + +drive-strength: + enum: [2, 4, 6, 8, 10, 12, 14, 16] + default: 2 + description: +Selects the drive strength for the specified pins, in mA. + +slew-rate: + enum: [0, 1, 2, 3] + default: 0 + description: | + 0: No adjustments + 1: Higher Slew rate (faster edges) + 2: Lower Slew rate (slower edges) + 3: Reserved (No adjustments) + +bias-pull-down: true + +bias-pull-up: true + +bias-disable: true + +output-high: true + +output-low: true + + required: +- pins +- function + + additionalProperties: false + +required: + - compatible + - reg + - clocks + - clock-names + - gpio-controller + - '#gpio-cells' + - gpio-ranges + +additionalProperties: false + +examples: + - | +#include +lpi_tlmm: pinctrl@33c { +compatible = "qcom,sm8250-lpass-lpi-pinctrl"; +reg = <0x33c 0x2>, + <0x355a000 0x1000>; +clocks = < LPASS_HW_MACRO_VOTE LPASS_CLK_ATTRIBUTE_COUPLE_NO>, + < LPASS_HW_DCODEC_VOTE LPASS_CLK_ATTRIBUTE_COUPLE_NO>; +clock-names = "core", "audio"; +gpio-controller; +#gpio-cells = <2>; +gpio-ranges = <_tlmm 0 0 14>; +}; -- 2.21.0
[PATCH 2/2] pinctrl: qcom: Add sm8250 lpass lpi pinctrl driver
Add initial pinctrl driver to support pin configuration for LPASS (Low Power Audio SubSystem) LPI (Low Power Island) pinctrl on SM8250. Signed-off-by: Srinivas Kandagatla --- drivers/pinctrl/qcom/Kconfig | 8 + drivers/pinctrl/qcom/Makefile| 1 + drivers/pinctrl/qcom/pinctrl-lpass-lpi.c | 781 +++ 3 files changed, 790 insertions(+) create mode 100644 drivers/pinctrl/qcom/pinctrl-lpass-lpi.c diff --git a/drivers/pinctrl/qcom/Kconfig b/drivers/pinctrl/qcom/Kconfig index 5fe7b8aaf69d..af26f4c51f77 100644 --- a/drivers/pinctrl/qcom/Kconfig +++ b/drivers/pinctrl/qcom/Kconfig @@ -236,4 +236,12 @@ config PINCTRL_SM8250 Qualcomm Technologies Inc TLMM block found on the Qualcomm Technologies Inc SM8250 platform. +config PINCTRL_LPASS_LPI + tristate "Qualcomm Technologies Inc LPASS LPI pin controller driver" + depends on GPIOLIB && OF + help + This is the pinctrl, pinmux, pinconf and gpiolib driver for the + Qualcomm Technologies Inc LPASS (Low Power Audio SubSystem) LPI + (Low Power Island) found on the Qualcomm Technologies Inc SoCs. + endif diff --git a/drivers/pinctrl/qcom/Makefile b/drivers/pinctrl/qcom/Makefile index 9e3d9c91a444..c8520155fb1b 100644 --- a/drivers/pinctrl/qcom/Makefile +++ b/drivers/pinctrl/qcom/Makefile @@ -28,3 +28,4 @@ obj-$(CONFIG_PINCTRL_SDM660) += pinctrl-sdm660.o obj-$(CONFIG_PINCTRL_SDM845) += pinctrl-sdm845.o obj-$(CONFIG_PINCTRL_SM8150) += pinctrl-sm8150.o obj-$(CONFIG_PINCTRL_SM8250) += pinctrl-sm8250.o +obj-$(CONFIG_PINCTRL_LPASS_LPI) += pinctrl-lpass-lpi.o diff --git a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c new file mode 100644 index ..88937485e3bb --- /dev/null +++ b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c @@ -0,0 +1,781 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (c) 2016-2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2020 Linaro Ltd. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "../core.h" +#include "../pinctrl-utils.h" + +#define LPI_GPIO_REG_VAL_CTL 0x00 +#define LPI_GPIO_REG_DIR_CTL 0x04 +#define LPI_SLEW_REG_VAL_CTL 0x00 +#define LPI_SLEW_RATE_MAX0x03 +#define LPI_SLEW_BITS_SIZE 0x02 +#define LPI_GPIO_REG_PULL_SHIFT0x0 +#define LPI_GPIO_REG_PULL_MASK GENMASK(1, 0) +#define LPI_GPIO_REG_FUNCTION_SHIFT0x2 +#define LPI_GPIO_REG_FUNCTION_MASK GENMASK(5, 2) +#define LPI_GPIO_REG_OUT_STRENGTH_SHIFT0x6 +#define LPI_GPIO_REG_OUT_STRENGTH_MASK GENMASK(8, 6) +#define LPI_GPIO_REG_OE_SHIFT 0x9 +#define LPI_GPIO_REG_OE_MASK BIT(9) +#define LPI_GPIO_REG_DIR_SHIFT 0x1 +#define LPI_GPIO_REG_DIR_MASK 0x2 +#define LPI_GPIO_BIAS_DISABLE 0x0 +#define LPI_GPIO_PULL_DOWN 0x1 +#define LPI_GPIO_KEEPER0x2 +#define LPI_GPIO_PULL_UP 0x3 + +#define LPI_FUNCTION(fname)\ + [LPI_MUX_##fname] = { \ + .name = #fname, \ + .groups = fname##_groups, \ + .ngroups = ARRAY_SIZE(fname##_groups), \ + } + +#define LPI_PINGROUP(id, f1, f2, f3, f4) \ + { \ + .name = "gpio" #id, \ + .pins = gpio##id##_pins,\ + .pin = id, \ + .npins = ARRAY_SIZE(gpio##id##_pins), \ + .funcs = (int[]){ \ + LPI_MUX_gpio, \ + LPI_MUX_##f1, \ + LPI_MUX_##f2, \ + LPI_MUX_##f3, \ + LPI_MUX_##f4, \ + }, \ + .nfuncs = 5,\ + } +struct lpi_pingroup { + const char *name; + const unsigned int *pins; + unsigned int npins; + unsigned int pin; + unsigned int *funcs; + unsigned int nfuncs; +}; + +struct lpi_function { + const char *name; + const char * const *groups; + unsigned int ngroups; +}; + +struct lpi_pinctrl_variant_data { + int tlmm_reg_offset; + const struct pinctrl_pin_desc *pins; + int npins; + const struct lpi_pingroup *groups; + int ngroups; + const struct lpi_function *functions; + int nfunctions; + int *slew_reg_pin_offsets; +}; + +struct lpi_pinctrl { + struct device *dev; + struct pinctrl_dev *ctrl; + struct
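The register layout macros in the (truncated) patch above pair each field with a shift and a GENMASK() mask, for example bits 5:2 for the function select and bits 8:6 for the output strength. The following userspace sketch shows how such shift/mask pairs are typically used to read and update one field without disturbing the others. It is not taken from the driver: SKETCH_GENMASK is a local stand-in for the kernel macro, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(h, l), 32-bit values only. */
#define SKETCH_GENMASK(h, l) \
	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

/* Field positions copied from the macros quoted in the patch. */
#define SKETCH_FUNCTION_SHIFT		2
#define SKETCH_FUNCTION_MASK		SKETCH_GENMASK(5, 2)
#define SKETCH_OUT_STRENGTH_SHIFT	6
#define SKETCH_OUT_STRENGTH_MASK	SKETCH_GENMASK(8, 6)

/* Hypothetical helper: extract one field from a register value. */
static uint32_t sketch_field_get(uint32_t reg, uint32_t mask, unsigned int shift)
{
	return (reg & mask) >> shift;
}

/* Hypothetical helper: replace one field, leaving all other bits untouched. */
static uint32_t sketch_field_set(uint32_t reg, uint32_t mask,
				 unsigned int shift, uint32_t val)
{
	assert(mask != 0);
	return (reg & ~mask) | ((val << shift) & mask);
}
```

A read-modify-write of the function select would then read the register, call sketch_field_set() with SKETCH_FUNCTION_MASK and SKETCH_FUNCTION_SHIFT, and write the result back. How the driver itself maps a drive strength in mA to the raw field value is not shown in the quoted text and is not assumed here.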
Re: [PATCH RFC v2 12/21] kasan: inline and rename kasan_unpoison_memory
On Wed, Oct 28, 2020 at 12:36 PM Dmitry Vyukov wrote:
>
> On Thu, Oct 22, 2020 at 3:19 PM Andrey Konovalov wrote:
> >
> > Currently kasan_unpoison_memory() is used as both an external annotation
> > and as internal memory poisoning helper. Rename external annotation to
> > kasan_unpoison_data() and inline the internal helper for hardware
> > tag-based mode to avoid unneeded function calls.
> >
> > There's the external annotation kasan_unpoison_slab() that is currently
> > defined as static inline and uses kasan_unpoison_memory(). With this
> > change it's turned into a function call. Overall, this results in the
> > same number of calls for hardware tag-based mode as
> > kasan_unpoison_memory() is now inlined.
>
> Can't we leave kasan_unpoison_slab as is? Or there are other reasons
> to uninline it?

Just to have cleaner kasan.h callbacks definitions.

> It seems that uninlining it is orthogonal to the rest of this patch.

I can split it out into a separate patch if you think this makes sense?