Re: [-next] strace tests fail because of "y2038: socket: Add compat_sys_recvmmsg_time64"
On Mon, Dec 17, 2018 at 11:05:06PM +0100, Arnd Bergmann wrote:
> On Mon, Dec 17, 2018 at 10:40 PM Arnd Bergmann wrote:
> >
> > On Mon, Dec 17, 2018 at 2:06 PM Heiko Carstens
> > wrote:
> > >
> > > Hi Arnd,
> > >
> > > in linux-next as of today 16 strace self tests fail on s390. I could
> > > bisect this to b136972b063b ("y2038: socket: Add
> > > compat_sys_recvmmsg_time64").
> > >
> > > The following tests fail:
> >
> > Hi Heiko,
> >
> > Thanks for the report and sorry I broke things. I'll have a closer look
> > tomorrow if I don't find it right away. I suppose the regression was in
> > native system calls, not the compat syscalls with 31-bit user space,
> > right?

Yes, I was talking about 64 bit native system calls.

> I found a bug in my patch by inspection. Can you try if the patch
> below makes it all work (apologies for the garbled whitespace),
> I'm considering a rewrite of that function now (to split it into two
> again), but want to make sure there isn't another problem in my
> original patch.

With your patch below applied, the tests pass again. Thanks!

>
> diff --git a/net/socket.c b/net/socket.c
> index 3bb2ee083f97..7f9f225d0b6c 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -2486,12 +2486,12 @@ int __sys_recvmmsg(int fd, struct mmsghdr __user
> *mmsg,
> return -EFAULT;
>
> if (!timeout && !timeout32)
> -do_recvmmsg(fd, mmsg, vlen, flags, NULL);
> +return do_recvmmsg(fd, mmsg, vlen, flags, NULL);
>
> datagrams = do_recvmmsg(fd, mmsg, vlen, flags, &timeout_sys);
>
> -if (!datagrams)
> -return 0;
> +if (datagrams <= 0)
> +return datagrams;
>
> if (timeout && put_timespec64(&timeout_sys, timeout))
> datagrams = -EFAULT;
>
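For readers following the thread, here is a minimal user-space sketch of the control-flow problem the quoted patch addresses. It is not kernel code: struct fake_sock, fake_do_recvmmsg() and both wrappers are hypothetical stand-ins, and the timeout copy-out is reduced to a flag. It only shows that dropping the return value makes the no-timeout path receive twice, and that the fixed variant returns errors before the copy-out step.

```c
#include <errno.h>

/* Hypothetical stand-in for a socket with queued datagrams. */
struct fake_sock {
	int pending;	/* datagrams waiting to be received */
	int err;	/* if non-zero, every receive fails with this value */
	int calls;	/* how many times the receive helper ran */
};

/* Hypothetical stand-in for do_recvmmsg(): consume up to vlen datagrams. */
static int fake_do_recvmmsg(struct fake_sock *s, int vlen)
{
	int n;

	s->calls++;
	if (s->err)
		return s->err;
	n = s->pending < vlen ? s->pending : vlen;
	s->pending -= n;
	return n;
}

/* Control flow before the fix: the first result is discarded. */
static int recvmmsg_buggy(struct fake_sock *s, int vlen,
			  int have_timeout, int put_fails)
{
	int datagrams;

	if (!have_timeout)
		fake_do_recvmmsg(s, vlen);	/* result lost, falls through */

	datagrams = fake_do_recvmmsg(s, vlen);
	if (!datagrams)
		return 0;

	if (have_timeout && put_fails)
		datagrams = -EFAULT;
	return datagrams;
}

/* Control flow after the fix: return early, and bail out on errors. */
static int recvmmsg_fixed(struct fake_sock *s, int vlen,
			  int have_timeout, int put_fails)
{
	int datagrams;

	if (!have_timeout)
		return fake_do_recvmmsg(s, vlen);

	datagrams = fake_do_recvmmsg(s, vlen);
	if (datagrams <= 0)
		return datagrams;

	if (put_fails)
		datagrams = -EFAULT;
	return datagrams;
}
```

With three datagrams queued and no timeout, the buggy variant reports 0 after two receive calls, while the fixed variant reports 3 after one.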
[PATCH v4 3/6] arm64/kvm: add a userspace option to enable pointer authentication
This feature allows a KVM guest to handle pointer authentication
instructions, or to have them treated as undefined if the flag is not
set. It uses the existing vcpu API KVM_ARM_VCPU_INIT to supply this
parameter instead of creating a new API.

No new register is created to pass this parameter via the SET/GET_ONE_REG
interface, as a single flag (KVM_ARM_VCPU_PTRAUTH) is enough to select
this feature.

Signed-off-by: Amit Daniel Kachhap
Cc: Mark Rutland
Cc: Marc Zyngier
Cc: Christoffer Dall
Cc: kvm...@lists.cs.columbia.edu
---
 Documentation/arm64/pointer-authentication.txt | 9 +
 Documentation/virtual/kvm/api.txt              | 4
 arch/arm/include/asm/kvm_host.h                | 4
 arch/arm64/include/asm/kvm_host.h              | 7 ---
 arch/arm64/include/uapi/asm/kvm.h              | 1 +
 arch/arm64/kvm/handle_exit.c                   | 2 +-
 arch/arm64/kvm/hyp/ptrauth-sr.c                | 16
 arch/arm64/kvm/reset.c                         | 3 +++
 include/uapi/linux/kvm.h                       | 1 +
 9 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/Documentation/arm64/pointer-authentication.txt b/Documentation/arm64/pointer-authentication.txt
index 5baca42..8c0f338 100644
--- a/Documentation/arm64/pointer-authentication.txt
+++ b/Documentation/arm64/pointer-authentication.txt
@@ -87,7 +87,8 @@ used to get and set the keys for a thread.
 Virtualization
 --------------
 
-Pointer authentication is not currently supported in KVM guests. KVM
-will mask the feature bits from ID_AA64ISAR1_EL1, and attempted use of
-the feature will result in an UNDEFINED exception being injected into
-the guest.
+Pointer authentication is enabled in KVM guest when virtual machine is
+created by passing a flag (KVM_ARM_VCPU_PTRAUTH) requesting this feature
+to be enabled. Without this flag, pointer authentication is not enabled
+in KVM guests and attempted use of the feature will result in an UNDEFINED
+exception being injected into the guest.
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index cd209f7..e20583a 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2634,6 +2634,10 @@ Possible features:
 	  Depends on KVM_CAP_ARM_PSCI_0_2.
 	- KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
 	  Depends on KVM_CAP_ARM_PMU_V3.
+	- KVM_ARM_VCPU_PTRAUTH: Emulate Pointer authentication for the CPU.
+	  Depends on KVM_CAP_ARM_PTRAUTH and only on arm64 architecture. If
+	  set, then the KVM guest allows the execution of pointer authentication
+	  instructions or treats them as undefined if not set.
 
 
 4.83 KVM_ARM_PREFERRED_TARGET
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 02d9bfc..62a85d9 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -352,6 +352,10 @@ static inline int kvm_arm_have_ssbd(void)
 static inline void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu) {}
 static inline void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arm_vcpu_ptrauth_config(struct kvm_vcpu *vcpu) {}
+static inline bool kvm_arm_vcpu_ptrauth_allowed(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
 
 #define __KVM_HAVE_ARCH_VM_ALLOC
 struct kvm *kvm_arch_alloc_vm(void);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 629712d..f853a95 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -43,7 +43,7 @@
 
 #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
 
-#define KVM_VCPU_MAX_FEATURES 4
+#define KVM_VCPU_MAX_FEATURES 5
 
 #define KVM_REQ_SLEEP \
 	KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
@@ -453,14 +453,15 @@ static inline bool kvm_arch_check_sve_has_vhe(void)
 
 void kvm_arm_vcpu_ptrauth_enable(struct kvm_vcpu *vcpu);
 void kvm_arm_vcpu_ptrauth_disable(struct kvm_vcpu *vcpu);
+bool kvm_arm_vcpu_ptrauth_allowed(struct kvm_vcpu *vcpu);
 
 static inline void kvm_arm_vcpu_ptrauth_config(struct kvm_vcpu *vcpu)
 {
 	/* Disable ptrauth and use it in a lazy context via traps */
-	if (has_vhe() && system_supports_ptrauth())
+	if (has_vhe() && system_supports_ptrauth()
+			&& kvm_arm_vcpu_ptrauth_allowed(vcpu))
 		kvm_arm_vcpu_ptrauth_disable(vcpu);
 }
-
 void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);
 
 static inline void kvm_arch_hardware_unsetup(void) {}
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 97c3478..5f82ca1 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -102,6 +102,7 @@ struct kvm_regs {
 #define KVM_ARM_VCPU_EL1_32BIT		1 /* CPU running a 32bit VM */
 #define KVM_ARM_VCPU_PSCI_0_2		2 /* CPU uses PSCI v0.2 */
 #define KVM_ARM_VCPU_PMU_V3		3 /* Support guest PMUv3 */
+#define KVM_ARM_VCPU_PTRAUTH
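A minimal sketch of how a VMM might request the feature through the features bitmap passed to KVM_ARM_VCPU_INIT. The struct below only mirrors the shape of struct kvm_vcpu_init (target plus features[7]) so that the example builds without kernel headers. The value 4 for KVM_ARM_VCPU_PTRAUTH is taken from the kvmtool patch (6/6) of this series, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value as used by the kvmtool patch (6/6); treat it as an assumption here. */
#define KVM_ARM_VCPU_PTRAUTH 4

/* Hypothetical mirror of struct kvm_vcpu_init, for illustration only. */
struct vcpu_init_sketch {
	uint32_t target;
	uint32_t features[7];
};

/*
 * Set the ptrauth feature bit only when the host advertises the
 * capability and the user asked for it.
 */
static void request_ptrauth(struct vcpu_init_sketch *init,
			    bool cap_present, bool user_wants)
{
	if (cap_present && user_wants)
		init->features[0] |= UINT32_C(1) << KVM_ARM_VCPU_PTRAUTH;
}

/* Report whether the ptrauth bit is set in the first features word. */
static bool ptrauth_requested(const struct vcpu_init_sketch *init)
{
	return (init->features[0] >> KVM_ARM_VCPU_PTRAUTH) & 1;
}
```

In a real VMM the populated structure would then be handed to the KVM_ARM_VCPU_INIT ioctl; that call is left out so the sketch stays self-contained.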
[PATCH v4 1/6] arm64/kvm: preserve host HCR_EL2 value
When restoring HCR_EL2 for the host, KVM uses HCR_HOST_VHE_FLAGS, which
is a constant value. This works today, as the host HCR_EL2 value is
always the same, but this will get in the way of supporting extensions
that require HCR_EL2 bits to be set conditionally for the host.

To allow such features to work without KVM having to explicitly handle
every possible host feature combination, this patch has KVM save/restore
the host HCR when switching to/from a guest HCR. The register is saved
once, during per-cpu hypervisor initialization, and is simply restored
after switching from the guest.

To fetch HCR_EL2 during KVM initialisation, a hyp call is made using
kvm_call_hyp; this is helpful in the nVHE case.

For the hyp TLB maintenance code, __tlb_switch_to_host_vhe() is updated
to toggle the TGE bit with a RMW sequence, as we already do in
__tlb_switch_to_guest_vhe().

Signed-off-by: Mark Rutland
Signed-off-by: Amit Daniel Kachhap
Cc: Marc Zyngier
Cc: Christoffer Dall
Cc: kvm...@lists.cs.columbia.edu
---
 arch/arm/include/asm/kvm_host.h   | 2 ++
 arch/arm64/include/asm/kvm_asm.h  | 2 ++
 arch/arm64/include/asm/kvm_host.h | 14 --
 arch/arm64/kvm/hyp/switch.c       | 15 +--
 arch/arm64/kvm/hyp/sysreg-sr.c    | 11 +++
 arch/arm64/kvm/hyp/tlb.c          | 6 +-
 virt/kvm/arm/arm.c                | 2 ++
 7 files changed, 43 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 5ca5d9a..0f012c8 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -273,6 +273,8 @@ static inline void __cpu_init_stage2(void)
 	kvm_call_hyp(__init_stage2_translation);
 }
 
+static inline void __cpu_copy_host_registers(void) {}
+
 static inline int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
 	return 0;
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index aea01a0..25ac9fa 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -73,6 +73,8 @@ extern void __vgic_v3_init_lrs(void);
 
 extern u32 __kvm_get_mdcr_el2(void);
 
+extern u64 __read_hyp_hcr_el2(void);
+
 /* Home-grown __this_cpu_{ptr,read} variants that always work at HYP */
 #define __hyp_this_cpu_ptr(sym) \
 	({ \
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 52fbc82..1b9eed9 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -196,13 +196,17 @@ enum vcpu_sysreg {
 
 #define NR_COPRO_REGS (NR_SYS_REGS * 2)
 
+struct kvm_cpu_init_host_regs {
+	u64 hcr_el2;
+};
+
 struct kvm_cpu_context {
 	struct kvm_regs gp_regs;
 	union {
 		u64 sys_regs[NR_SYS_REGS];
 		u32 copro[NR_COPRO_REGS];
 	};
-
+	struct kvm_cpu_init_host_regs init_regs;
 	struct kvm_vcpu *__hyp_running_vcpu;
 };
 
@@ -211,7 +215,7 @@ typedef struct kvm_cpu_context kvm_cpu_context_t;
 struct kvm_vcpu_arch {
 	struct kvm_cpu_context ctxt;
 
-	/* HYP configuration */
+	/* Guest HYP configuration */
 	u64 hcr_el2;
 	u32 mdcr_el2;
 
@@ -455,6 +459,12 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
 
 static inline void __cpu_init_stage2(void) {}
 
+static inline void __cpu_copy_host_registers(void)
+{
+	kvm_cpu_context_t *host_cxt = this_cpu_ptr(&kvm_host_cpu_state);
+	host_cxt->init_regs.hcr_el2 = kvm_call_hyp(__read_hyp_hcr_el2);
+}
+
 /* Guest/host FPSIMD coordination helpers */
 int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index f6e02cc..85a2a5c 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -139,15 +139,15 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
 		__activate_traps_nvhe(vcpu);
 }
 
-static void deactivate_traps_vhe(void)
+static void deactivate_traps_vhe(struct kvm_cpu_context *host_ctxt)
 {
 	extern char vectors[];	/* kernel exception vectors */
-	write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
+	write_sysreg(host_ctxt->init_regs.hcr_el2, hcr_el2);
 	write_sysreg(CPACR_EL1_DEFAULT, cpacr_el1);
 	write_sysreg(vectors, vbar_el1);
 }
 
-static void __hyp_text __deactivate_traps_nvhe(void)
+static void __hyp_text __deactivate_traps_nvhe(struct kvm_cpu_context *host_ctxt)
 {
 	u64 mdcr_el2 = read_sysreg(mdcr_el2);
 
@@ -157,12 +157,15 @@ static void __hyp_text __deactivate_traps_nvhe(void)
 	mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
 
 	write_sysreg(mdcr_el2, mdcr_el2);
-	write_sysreg(HCR_HOST_NVHE_FLAGS, hcr_el2);
+	write_sysreg(host_ctxt->init_regs.hcr_el2, hcr_el2);
 	write_sysreg(CPTR_EL2_DEFAULT, cptr_el2);
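A minimal user-space sketch of the idea behind this patch: save the host HCR_EL2 value once, then restore that saved value on guest exit instead of writing back a constant. The register is simulated with a plain variable, and all names and bit values below are hypothetical illustrations, not the architectural HCR_EL2 layout.

```c
#include <stdint.h>

/* Simulated HCR_EL2; real code would use read_sysreg()/write_sysreg(). */
static uint64_t sim_hcr_el2;

/* Illustrative values only, not real HCR_EL2 bits. */
#define SKETCH_HOST_CONST_FLAGS	UINT64_C(0x3)
#define SKETCH_HOST_EXTRA_BIT	UINT64_C(0x100)	/* set conditionally by the host */
#define SKETCH_GUEST_FLAGS	UINT64_C(0x80)

/* Models struct kvm_cpu_init_host_regs from the patch. */
struct host_regs_sketch {
	uint64_t hcr_el2;
};

/* Done once at per-cpu hypervisor init: remember what the host runs with. */
static void copy_host_registers(struct host_regs_sketch *host)
{
	host->hcr_el2 = sim_hcr_el2;
}

static void switch_to_guest(void)
{
	sim_hcr_el2 = SKETCH_GUEST_FLAGS;
}

/* Old behaviour: any conditionally-set host bit is lost on guest exit. */
static void restore_host_constant(void)
{
	sim_hcr_el2 = SKETCH_HOST_CONST_FLAGS;
}

/* New behaviour: the host gets back exactly what it had before. */
static void restore_host_saved(const struct host_regs_sketch *host)
{
	sim_hcr_el2 = host->hcr_el2;
}
```

If the host had SKETCH_HOST_EXTRA_BIT set before entering the guest, only restore_host_saved() brings it back.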
[PATCH v4 4/6] arm64/kvm: enable pointer authentication cpufeature conditionally
Depending on the userspace setting, the pointer authentication
cpufeature is enabled or disabled for guests.

Signed-off-by: Amit Daniel Kachhap
Cc: Mark Rutland
Cc: Christoffer Dall
Cc: Marc Zyngier
Cc: kvm...@lists.cs.columbia.edu
---
 Documentation/arm64/pointer-authentication.txt | 3 +++
 arch/arm64/kvm/sys_regs.c                      | 33 --
 2 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/Documentation/arm64/pointer-authentication.txt b/Documentation/arm64/pointer-authentication.txt
index 8c0f338..a65dca2 100644
--- a/Documentation/arm64/pointer-authentication.txt
+++ b/Documentation/arm64/pointer-authentication.txt
@@ -92,3 +92,6 @@ created by passing a flag (KVM_ARM_VCPU_PTRAUTH) requesting this feature
 to be enabled. Without this flag, pointer authentication is not enabled
 in KVM guests and attempted use of the feature will result in an UNDEFINED
 exception being injected into the guest.
+
+Additionally, when KVM_ARM_VCPU_PTRAUTH is not set then KVM will mask the
+feature bits from ID_AA64ISAR1_EL1.
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 6af6c7d..ce6144a 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1055,7 +1055,7 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu, } /* Read a sanitised cpufeature ID register by sys_reg_desc */ -static u64 read_id_reg(struct sys_reg_desc const *r, bool raz) +static u64 read_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_desc const *r, bool raz) { u32 id = sys_reg((u32)r->Op0, (u32)r->Op1, (u32)r->CRn, (u32)r->CRm, (u32)r->Op2); @@ -1066,6 +1066,15 @@ static u64 read_id_reg(struct sys_reg_desc const *r, bool raz) kvm_debug("SVE unsupported for guests, suppressing\n"); val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT); + } else if (id == SYS_ID_AA64ISAR1_EL1) { + const u64 ptrauth_mask = (0xfUL << ID_AA64ISAR1_APA_SHIFT) | +(0xfUL << ID_AA64ISAR1_API_SHIFT) | +(0xfUL << ID_AA64ISAR1_GPA_SHIFT) | +(0xfUL << ID_AA64ISAR1_GPI_SHIFT); + if (!kvm_arm_vcpu_ptrauth_allowed(vcpu)) { + kvm_debug("ptrauth unsupported for guests, suppressing\n"); + val &= ~ptrauth_mask; + } } else if (id == SYS_ID_AA64MMFR1_EL1) { if (val & (0xfUL << ID_AA64MMFR1_LOR_SHIFT)) kvm_debug("LORegions unsupported for guests, suppressing\n"); @@ -1086,7 +1095,7 @@ static bool __access_id_reg(struct kvm_vcpu *vcpu, if (p->is_write) return write_to_read_only(vcpu, p, r); - p->regval = read_id_reg(r, raz); + p->regval = read_id_reg(vcpu, r, raz); return true; } @@ -1115,17 +1124,17 @@ static u64 sys_reg_to_index(const struct sys_reg_desc *reg); * are stored, and for set_id_reg() we don't allow the effective value * to be changed. 
*/ -static int __get_id_reg(const struct sys_reg_desc *rd, void __user *uaddr, - bool raz) +static int __get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, + void __user *uaddr, bool raz) { const u64 id = sys_reg_to_index(rd); - const u64 val = read_id_reg(rd, raz); + const u64 val = read_id_reg(vcpu, rd, raz); return reg_to_user(uaddr, &val, id); } -static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr, - bool raz) +static int __set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, + void __user *uaddr, bool raz) { const u64 id = sys_reg_to_index(rd); int err; @@ -1136,7 +1145,7 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr, return err; /* This is what we mean by invariant: you can't change it. */ - if (val != read_id_reg(rd, raz)) + if (val != read_id_reg(vcpu, rd, raz)) return -EINVAL; return 0; @@ -1145,25 +1154,25 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr, static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, const struct kvm_one_reg *reg, void __user *uaddr) { - return __get_id_reg(rd, uaddr, false); + return __get_id_reg(vcpu, rd, uaddr, false); } static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, const struct kvm_one_reg *reg, void __user *uaddr) { - return __set_id_reg(rd, uaddr, false); + return __set_id_reg(vcpu, rd, uaddr, false); } static int get_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, const struct kvm_one_reg *reg, void __user *uaddr) { - return __get_id_reg(rd, uaddr, true); + ret
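A minimal sketch of the ID register masking this patch performs for ID_AA64ISAR1_EL1. The shift values are written out here as assumptions (APA at bit 4, API at bit 8, GPA at bit 24, GPI at bit 28, as I understand the architectural field positions), and mask_isar1() is a hypothetical helper, not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field positions assumed from the architecture; check against sysreg.h. */
#define ID_AA64ISAR1_APA_SHIFT	4
#define ID_AA64ISAR1_API_SHIFT	8
#define ID_AA64ISAR1_GPA_SHIFT	24
#define ID_AA64ISAR1_GPI_SHIFT	28

/*
 * Return the ID_AA64ISAR1_EL1 value the guest should see: unchanged when
 * ptrauth is allowed for this vcpu, with all four ptrauth fields cleared
 * otherwise.
 */
static uint64_t mask_isar1(uint64_t val, bool ptrauth_allowed)
{
	const uint64_t ptrauth_mask =
		(UINT64_C(0xf) << ID_AA64ISAR1_APA_SHIFT) |
		(UINT64_C(0xf) << ID_AA64ISAR1_API_SHIFT) |
		(UINT64_C(0xf) << ID_AA64ISAR1_GPA_SHIFT) |
		(UINT64_C(0xf) << ID_AA64ISAR1_GPI_SHIFT);

	if (!ptrauth_allowed)
		val &= ~ptrauth_mask;
	return val;
}
```

Because the decision depends on the vcpu, the patch has to thread a struct kvm_vcpu pointer through read_id_reg() and its callers, which is what most of the diff does.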
[PATCH net] net: phy: Fix the issue that netif always links up after resuming
The network interface always links up after resuming from hibernation,
even if the link was down before entering hibernation.

If the link is still down before the network interface is enabled, then
after resuming from hibernation phydev->state is forcibly set to PHY_UP
in mdio_bus_phy_restore(), and the link comes up.

In the suspend sequence, mdio_bus_phy_suspend() calls phy_stop_machine()
only if the PHY is attached, and mdio_bus_phy_resume() calls
phy_start_machine() under the same condition. When resuming from
hibernation, it is enough to do the same as mdio_bus_phy_resume(),
because the state has been preserved.

This patch fixes the issue by calling phy_start_machine() in
mdio_bus_phy_restore() in the same way as mdio_bus_phy_resume().

Suggested-by: Heiner Kallweit
Signed-off-by: Kunihiko Hayashi
---
 drivers/net/phy/phy_device.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

This patch is based on the RFC patch discussion [1].
[1] https://www.spinics.net/lists/netdev/msg537326.html

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 7d5d698..3685be4 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -315,11 +315,8 @@ static int mdio_bus_phy_restore(struct device *dev)
 	if (ret < 0)
 		return ret;
 
-	/* The PHY needs to renegotiate. */
-	phydev->link = 0;
-	phydev->state = PHY_UP;
-
-	phy_start_machine(phydev);
+	if (phydev->attached_dev && phydev->adjust_link)
+		phy_start_machine(phydev);
 
 	return 0;
 }
-- 
2.7.4
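A minimal sketch of the behavioural difference, using a toy model rather than the real phylib structures. The state enum, the struct and both functions are hypothetical; they only model whether the preserved state survives restore and whether the state machine is started.

```c
#include <stdbool.h>

/* Hypothetical subset of PHY states, for illustration only. */
enum phy_state_sketch { SK_PHY_READY, SK_PHY_UP, SK_PHY_NOLINK, SK_PHY_RUNNING };

struct phy_sketch {
	bool attached;			/* models phydev->attached_dev != NULL */
	bool has_adjust_link;		/* models phydev->adjust_link != NULL */
	enum phy_state_sketch state;
	int link;
	int machine_starts;		/* how often the state machine was started */
};

/* Before the patch: state is overwritten and the machine always starts. */
static void restore_old(struct phy_sketch *p)
{
	p->link = 0;
	p->state = SK_PHY_UP;
	p->machine_starts++;
}

/* After the patch: state is preserved, start only for an attached PHY. */
static void restore_new(struct phy_sketch *p)
{
	if (p->attached && p->has_adjust_link)
		p->machine_starts++;
}
```

A PHY that was in SK_PHY_READY before hibernation stays there with restore_new(), whereas restore_old() forces it to SK_PHY_UP.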
[PATCH v8 1/2] dt-bindings: PCI: meson: add DT bindings for Amlogic Meson PCIe controller
From: Yue Wang The Amlogic Meson PCIe host controller is based on the Synopsys DesignWare PCI core. This patch adds documentation for the DT bindings in Meson PCIe controller. Signed-off-by: Yue Wang Signed-off-by: Hanjie Lin Reviewed-by: Rob Herring --- .../devicetree/bindings/pci/amlogic,meson-pcie.txt | 70 ++ 1 file changed, 70 insertions(+) create mode 100644 Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt diff --git a/Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt b/Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt new file mode 100644 index 000..12b18f8 --- /dev/null +++ b/Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt @@ -0,0 +1,70 @@ +Amlogic Meson AXG DWC PCIE SoC controller + +Amlogic Meson PCIe host controller is based on the Synopsys DesignWare PCI core. +It shares common functions with the PCIe DesignWare core driver and +inherits common properties defined in +Documentation/devicetree/bindings/pci/designware-pci.txt. + +Additional properties are described here: + +Required properties: +- compatible: + should contain "amlogic,axg-pcie" to identify the core. +- reg: + should contain the configuration address space. +- reg-names: Must be + - "elbi"External local bus interface registers + - "cfg" Meson specific registers + - "phy" Meson PCIE PHY registers + - "config" PCIe configuration space +- reset-gpios: The GPIO to generate PCIe PERST# assert and deassert signal. +- clocks: Must contain an entry for each entry in clock-names. +- clock-names: Must include the following entries: + - "pclk" PCIe GEN 100M PLL clock + - "port" PCIe_x(A or B) RC clock gate + - "general"PCIe Phy clock + - "mipi" PCIe_x(A or B) 100M ref clock gate +- resets: phandle to the reset lines. +- reset-names: must contain "phy" "port" and "apb" + - "phy" Share PHY reset + - "port"Port A or B reset + - "apb" Share APB reset +- device_type: + should be "pci". 
As specified in designware-pcie.txt + + +Example configuration: + + pcie: pcie@f980 { + compatible = "amlogic,axg-pcie", "snps,dw-pcie"; + reg = <0x0 0xf980 0x0 0x40 + 0x0 0xff646000 0x0 0x2000 + 0x0 0xff644000 0x0 0x2000 + 0x0 0xf9f0 0x0 0x10>; + reg-names = "elbi", "cfg", "phy", "config"; + reset-gpios = <&gpio GPIOX_19 GPIO_ACTIVE_HIGH>; + interrupts = ; + #interrupt-cells = <1>; + interrupt-map-mask = <0 0 0 0>; + interrupt-map = <0 0 0 0 &gic GIC_SPI 179 IRQ_TYPE_EDGE_RISING>; + bus-range = <0x0 0xff>; + #address-cells = <3>; + #size-cells = <2>; + device_type = "pci"; + ranges = <0x8200 0 0 0x0 0xf9c0 0 0x0030>; + + clocks = <&clkc CLKID_USB + &clkc CLKID_MIPI_ENABLE + &clkc CLKID_PCIE_A + &clkc CLKID_PCIE_CML_EN0>; + clock-names = "general", + "mipi", + "pclk", + "port"; + resets = <&reset RESET_PCIE_PHY>, + <&reset RESET_PCIE_A>, + <&reset RESET_PCIE_APB>; + reset-names = "phy", + "port", + "apb"; + }; -- 2.7.4
[PATCH v7 0/2] add the Amlogic Meson PCIe controller driver
The Amlogic Meson PCIe host controller is based on the Synopsys DesignWare
PCI core. This patchset adds the driver and dt-bindings of the controller.

Changes since v7: [6]
- include files in alphabetical order
- get rid of unused macros and variables
- optimize meson_pcie_link_up() while loop

Changes since v6: [5]
- fix bad usage of ERR_PTR(ENXIO)
- fix meson_pcie_rd_own_conf() when reading the PCI_CLASS_DEVICE reg

Changes since v5: [4]
- update MAINTAINER file in alphabetical order
- remove meaningless comment
- use ERR_PTR function instead of (void *) cast
- use is_power_of_2(size) instead of size & (size - 1)
- add comment for PCI_CLASS_REVISION register operation

Changes since v4: [3]
- fix kbuild test robot and compile warnings

Changes since v3: [2]
- modify subject format
- update Kconfig
- update MAINTAINER file
- add comment and error handling for meson_pcie_get_mem_shared()
- drop useless initialization code
- add comment for meson_size_to_payload()
- optimize meson_pcie_establish_link() return code
- optimize meson_pcie_enable_interrupts() redundant function
- drop device_attch related code
- drop dw_pcie_ops read_dbi and write_dbi function
- add error handling for meson_add_pcie_port() when probing

Changes since v2: [1]
- abandon phy driver, move reset to the controller
- use devm_add_action_or_reset() to use clock res
- formatting corrections

Changes since v1: [0]
- use gpio lib instead of open-coding
- move 'apb' and 'port' reset from phy driver
- formatting corrections

[0] : https://lkml.kernel.org/r/1534227522-186798-1-git-send-email-hanjie@amlogic.com
[1] : https://lkml.kernel.org/r/1535096165-45827-1-git-send-email-hanjie@amlogic.com
[2] : https://lkml.kernel.org/r/1537509820-52040-1-git-send-email-hanjie@amlogic.com
[3] : https://lkml.kernel.org/r/1538999834-156423-3-git-send-email-hanjie@amlogic.com
[4] : https://lkml.kernel.org/r/1539049990-30810-1-git-send-email-hanjie@amlogic.com
[5] : https://lkml.kernel.org/r/1542876836-191355-1-git-send-email-hanjie@amlogic.com

Yue Wang (2):
  dt-bindings: PCI: meson: add DT bindings for Amlogic Meson PCIe controller
  PCI: amlogic: Add the Amlogic Meson PCIe controller driver

 .../devicetree/bindings/pci/amlogic,meson-pcie.txt | 70 +++
 MAINTAINERS                                        | 7 +
 drivers/pci/controller/dwc/Kconfig                 | 10 +
 drivers/pci/controller/dwc/Makefile                | 1 +
 drivers/pci/controller/dwc/pci-meson.c             | 595 +
 5 files changed, 683 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt
 create mode 100644 drivers/pci/controller/dwc/pci-meson.c

-- 
2.7.4
[PATCH v4 5/6] arm64/kvm: control accessibility of ptrauth key registers
The ptrauth key registers are present in the guest system register list
only when userspace sets the KVM_ARM_VCPU_PTRAUTH flag.

Signed-off-by: Amit Daniel Kachhap
Cc: Mark Rutland
Cc: Christoffer Dall
Cc: Marc Zyngier
Cc: kvm...@lists.cs.columbia.edu
---
 Documentation/arm64/pointer-authentication.txt | 3 +-
 arch/arm64/kvm/sys_regs.c                      | 42 +++---
 2 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/Documentation/arm64/pointer-authentication.txt b/Documentation/arm64/pointer-authentication.txt
index a65dca2..729055a 100644
--- a/Documentation/arm64/pointer-authentication.txt
+++ b/Documentation/arm64/pointer-authentication.txt
@@ -94,4 +94,5 @@ in KVM guests and attempted use of the feature will result in an UNDEFINED
 exception being injected into the guest.
 
 Additionally, when KVM_ARM_VCPU_PTRAUTH is not set then KVM will mask the
-feature bits from ID_AA64ISAR1_EL1.
+feature bits from ID_AA64ISAR1_EL1 and pointer authentication key registers
+are hidden from userspace.
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index ce6144a..09302b2 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1343,12 +1343,6 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
 	{ SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 },
 
-	PTRAUTH_KEY(APIA),
-	PTRAUTH_KEY(APIB),
-	PTRAUTH_KEY(APDA),
-	PTRAUTH_KEY(APDB),
-	PTRAUTH_KEY(APGA),
-
 	{ SYS_DESC(SYS_AFSR0_EL1), access_vm_reg, reset_unknown, AFSR0_EL1 },
 	{ SYS_DESC(SYS_AFSR1_EL1), access_vm_reg, reset_unknown, AFSR1_EL1 },
 	{ SYS_DESC(SYS_ESR_EL1), access_vm_reg, reset_unknown, ESR_EL1 },
@@ -1500,6 +1494,14 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_FPEXC32_EL2), NULL, reset_val, FPEXC32_EL2, 0x70 },
 };
 
+static const struct sys_reg_desc ptrauth_reg_descs[] = {
+	PTRAUTH_KEY(APIA),
+	PTRAUTH_KEY(APIB),
+	PTRAUTH_KEY(APDA),
+	PTRAUTH_KEY(APDB),
+	PTRAUTH_KEY(APGA),
+};
+
 static bool trap_dbgidr(struct kvm_vcpu *vcpu,
 			struct sys_reg_params *p,
 			const struct sys_reg_desc *r)
@@ -2100,6 +2102,8 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
 	r = find_reg(params, table, num);
 	if (!r)
 		r = find_reg(params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
+	if (!r && kvm_arm_vcpu_ptrauth_allowed(vcpu))
+		r = find_reg(params, ptrauth_reg_descs, ARRAY_SIZE(ptrauth_reg_descs));
 
 	if (likely(r)) {
 		perform_access(vcpu, params, r);
@@ -2213,6 +2217,8 @@ static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu *vcpu,
 	r = find_reg_by_id(id, &params, table, num);
 	if (!r)
 		r = find_reg(&params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
+	if (!r && kvm_arm_vcpu_ptrauth_allowed(vcpu))
+		r = find_reg(&params, ptrauth_reg_descs, ARRAY_SIZE(ptrauth_reg_descs));
 
 	/* Not saved in the sys_reg array and not otherwise accessible? */
 	if (r && !(r->reg || r->get_user))
@@ -2494,18 +2500,22 @@ static int walk_one_sys_reg(const struct sys_reg_desc *rd,
 }
 
 /* Assumed ordered tables, see kvm_sys_reg_table_init. */
-static int walk_sys_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
+static int walk_sys_regs(struct kvm_vcpu *vcpu, u64 __user *uind,
+			 const struct sys_reg_desc *desc, unsigned int len)
 {
 	const struct sys_reg_desc *i1, *i2, *end1, *end2;
 	unsigned int total = 0;
 	size_t num;
 	int err;
 
+	if (desc == ptrauth_reg_descs && !kvm_arm_vcpu_ptrauth_allowed(vcpu))
+		return total;
+
 	/* We check for duplicates here, to allow arch-specific overrides. */
 	i1 = get_target_table(vcpu->arch.target, true, &num);
 	end1 = i1 + num;
-	i2 = sys_reg_descs;
-	end2 = sys_reg_descs + ARRAY_SIZE(sys_reg_descs);
+	i2 = desc;
+	end2 = desc + len;
 
 	BUG_ON(i1 == end1 || i2 == end2);
 
@@ -2533,7 +2543,10 @@ unsigned long kvm_arm_num_sys_reg_descs(struct kvm_vcpu *vcpu)
 {
 	return ARRAY_SIZE(invariant_sys_regs)
 		+ num_demux_regs()
-		+ walk_sys_regs(vcpu, (u64 __user *)NULL);
+		+ walk_sys_regs(vcpu, (u64 __user *)NULL, sys_reg_descs,
+				ARRAY_SIZE(sys_reg_descs))
+		+ walk_sys_regs(vcpu, (u64 __user *)NULL, ptrauth_reg_descs,
+				ARRAY_SIZE(ptrauth_reg_descs));
 }
 
 int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
@@ -2548,7 +2561,12 @@ int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 		uindices++;
 	}
 
-	err = walk_sys_regs(vcpu, uindices);
+
[PATCH v4 6/6] arm/kvm: arm64: Add a vcpu feature for pointer authentication
This is a runtime feature and can be enabled by --ptrauth option. Signed-off-by: Amit Daniel Kachhap Cc: Mark Rutland Cc: Christoffer Dall Cc: Marc Zyngier Cc: kvm...@lists.cs.columbia.edu --- arm/aarch32/include/kvm/kvm-cpu-arch.h| 2 ++ arm/aarch64/include/asm/kvm.h | 3 +++ arm/aarch64/include/kvm/kvm-arch.h| 1 + arm/aarch64/include/kvm/kvm-config-arch.h | 4 +++- arm/aarch64/include/kvm/kvm-cpu-arch.h| 2 ++ arm/aarch64/kvm-cpu.c | 5 + arm/include/arm-common/kvm-config-arch.h | 1 + arm/kvm-cpu.c | 7 +++ include/linux/kvm.h | 1 + 9 files changed, 25 insertions(+), 1 deletion(-) diff --git a/arm/aarch32/include/kvm/kvm-cpu-arch.h b/arm/aarch32/include/kvm/kvm-cpu-arch.h index d28ea67..5779767 100644 --- a/arm/aarch32/include/kvm/kvm-cpu-arch.h +++ b/arm/aarch32/include/kvm/kvm-cpu-arch.h @@ -13,4 +13,6 @@ #define ARM_CPU_ID 0, 0, 0 #define ARM_CPU_ID_MPIDR 5 +unsigned int kvm__cpu_ptrauth_get_feature(void) {} + #endif /* KVM__KVM_CPU_ARCH_H */ diff --git a/arm/aarch64/include/asm/kvm.h b/arm/aarch64/include/asm/kvm.h index c286035..0fd183d 100644 --- a/arm/aarch64/include/asm/kvm.h +++ b/arm/aarch64/include/asm/kvm.h @@ -98,6 +98,9 @@ struct kvm_regs { #define KVM_ARM_VCPU_PSCI_0_2 2 /* CPU uses PSCI v0.2 */ #define KVM_ARM_VCPU_PMU_V33 /* Support guest PMUv3 */ +/* CPU uses address authentication and A key */ +#define KVM_ARM_VCPU_PTRAUTH 4 + struct kvm_vcpu_init { __u32 target; __u32 features[7]; diff --git a/arm/aarch64/include/kvm/kvm-arch.h b/arm/aarch64/include/kvm/kvm-arch.h index 9de623a..bd566cb 100644 --- a/arm/aarch64/include/kvm/kvm-arch.h +++ b/arm/aarch64/include/kvm/kvm-arch.h @@ -11,4 +11,5 @@ #include "arm-common/kvm-arch.h" + #endif /* KVM__KVM_ARCH_H */ diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h b/arm/aarch64/include/kvm/kvm-config-arch.h index 04be43d..2074684 100644 --- a/arm/aarch64/include/kvm/kvm-config-arch.h +++ b/arm/aarch64/include/kvm/kvm-config-arch.h @@ -8,7 +8,9 @@ "Create PMUv3 device"), \ OPT_U64('\0', "kaslr-seed", 
&(cfg)->kaslr_seed, \ "Specify random seed for Kernel Address Space " \ - "Layout Randomization (KASLR)"), + "Layout Randomization (KASLR)"),\ + OPT_BOOLEAN('\0', "ptrauth", &(cfg)->has_ptrauth, \ + "Enable address authentication"), #include "arm-common/kvm-config-arch.h" diff --git a/arm/aarch64/include/kvm/kvm-cpu-arch.h b/arm/aarch64/include/kvm/kvm-cpu-arch.h index a9d8563..f7b64b7 100644 --- a/arm/aarch64/include/kvm/kvm-cpu-arch.h +++ b/arm/aarch64/include/kvm/kvm-cpu-arch.h @@ -17,4 +17,6 @@ #define ARM_CPU_CTRL 3, 0, 1, 0 #define ARM_CPU_CTRL_SCTLR_EL1 0 +unsigned int kvm__cpu_ptrauth_get_feature(void); + #endif /* KVM__KVM_CPU_ARCH_H */ diff --git a/arm/aarch64/kvm-cpu.c b/arm/aarch64/kvm-cpu.c index 1b29374..10da2cb 100644 --- a/arm/aarch64/kvm-cpu.c +++ b/arm/aarch64/kvm-cpu.c @@ -123,6 +123,11 @@ void kvm_cpu__reset_vcpu(struct kvm_cpu *vcpu) return reset_vcpu_aarch64(vcpu); } +unsigned int kvm__cpu_ptrauth_get_feature(void) +{ + return (1UL << KVM_ARM_VCPU_PTRAUTH); +} + int kvm_cpu__get_endianness(struct kvm_cpu *vcpu) { struct kvm_one_reg reg; diff --git a/arm/include/arm-common/kvm-config-arch.h b/arm/include/arm-common/kvm-config-arch.h index 6a196f1..eb872db 100644 --- a/arm/include/arm-common/kvm-config-arch.h +++ b/arm/include/arm-common/kvm-config-arch.h @@ -10,6 +10,7 @@ struct kvm_config_arch { boolaarch32_guest; boolhas_pmuv3; u64 kaslr_seed; + boolhas_ptrauth; enum irqchip_type irqchip; }; diff --git a/arm/kvm-cpu.c b/arm/kvm-cpu.c index 7780251..5afd727 100644 --- a/arm/kvm-cpu.c +++ b/arm/kvm-cpu.c @@ -68,6 +68,13 @@ struct kvm_cpu *kvm_cpu__arch_init(struct kvm *kvm, unsigned long cpu_id) vcpu_init.features[0] |= (1UL << KVM_ARM_VCPU_PSCI_0_2); } + /* Set KVM_ARM_VCPU_PTRAUTH_I_A if available */ + if (kvm__supports_extension(kvm, KVM_CAP_ARM_PTRAUTH)) { + if (kvm->cfg.arch.has_ptrauth) + vcpu_init.features[0] |= + kvm__cpu_ptrauth_get_feature(); + } + /* * If the preferred target ioctl is successful then * use preferred target else try 
each and every target type diff --git a/include/linux/kvm.h b/include/linux/kvm.h index f51d508..ffd8f5c 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -883,6 +883,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_PPC_MMU_RADIX 134 #define KVM_CAP_PPC_MMU_HASH_V3 135 #define KVM_CAP_IMMEDIATE_
[PATCH v4 2/6] arm64/kvm: context-switch ptrauth registers
When pointer authentication is supported, a guest may wish to use it.
This patch adds the necessary KVM infrastructure for this to work, with
a semi-lazy context switch of the pointer auth state.

The pointer authentication feature is only enabled when VHE is built
into the kernel and present in the CPU implementation, so only VHE code
paths are modified.

When we schedule a vcpu, we disable guest usage of pointer
authentication instructions and accesses to the keys. While these are
disabled, we avoid context-switching the keys. When we trap the guest
trying to use pointer authentication functionality, we change to eagerly
context-switching the keys, and enable the feature. The next time the
vcpu is scheduled out/in, we start again.

Pointer authentication consists of address authentication and generic
authentication, and CPUs in a system might have varied support for
either. Where support for either feature is not uniform, it is hidden
from guests via ID register emulation, as a result of the cpufeature
framework in the host.

Unfortunately, address authentication and generic authentication cannot
be trapped separately, as the architecture provides a single EL2 trap
covering both. If we wish to expose one without the other, we cannot
prevent a (badly-written) guest from intermittently using a feature
which is not uniformly supported (when scheduled on a physical CPU which
supports the relevant feature). When the guest is scheduled on a
physical CPU lacking the feature, these attempts will result in an UNDEF
being taken by the guest.

Signed-off-by: Mark Rutland Signed-off-by: Amit Daniel Kachhap Cc: Marc Zyngier Cc: Christoffer Dall Cc: kvm...@lists.cs.columbia.edu --- arch/arm/include/asm/kvm_host.h | 1 + arch/arm64/include/asm/cpufeature.h | 6 +++ arch/arm64/include/asm/kvm_host.h | 24 arch/arm64/include/asm/kvm_hyp.h| 7 arch/arm64/kernel/traps.c | 1 + arch/arm64/kvm/handle_exit.c| 24 +++- arch/arm64/kvm/hyp/Makefile | 1 + arch/arm64/kvm/hyp/ptrauth-sr.c | 73 + arch/arm64/kvm/hyp/switch.c | 4 ++ arch/arm64/kvm/sys_regs.c | 40 virt/kvm/arm/arm.c | 2 + 11 files changed, 166 insertions(+), 17 deletions(-) create mode 100644 arch/arm64/kvm/hyp/ptrauth-sr.c diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index 0f012c8..02d9bfc 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -351,6 +351,7 @@ static inline int kvm_arm_have_ssbd(void) static inline void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu) {} static inline void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu) {} +static inline void kvm_arm_vcpu_ptrauth_config(struct kvm_vcpu *vcpu) {} #define __KVM_HAVE_ARCH_VM_ALLOC struct kvm *kvm_arch_alloc_vm(void); diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 1c8393f..ac7d496 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -526,6 +526,12 @@ static inline bool system_supports_generic_auth(void) cpus_have_const_cap(ARM64_HAS_GENERIC_AUTH); } +static inline bool system_supports_ptrauth(void) +{ + return IS_ENABLED(CONFIG_ARM64_PTR_AUTH) && + cpus_have_const_cap(ARM64_HAS_ADDRESS_AUTH); +} + #define ARM64_SSBD_UNKNOWN -1 #define ARM64_SSBD_FORCE_DISABLE 0 #define ARM64_SSBD_KERNEL 1 diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 1b9eed9..629712d 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -146,6 +146,18 @@ enum vcpu_sysreg { PMSWINC_EL0,/* Software Increment Register 
*/ PMUSERENR_EL0, /* User Enable Register */ + /* Pointer Authentication Registers */ + APIAKEYLO_EL1, + APIAKEYHI_EL1, + APIBKEYLO_EL1, + APIBKEYHI_EL1, + APDAKEYLO_EL1, + APDAKEYHI_EL1, + APDBKEYLO_EL1, + APDBKEYHI_EL1, + APGAKEYLO_EL1, + APGAKEYHI_EL1, + /* 32bit specific registers. Keep them at the end of the range */ DACR32_EL2, /* Domain Access Control Register */ IFSR32_EL2, /* Instruction Fault Status Register */ @@ -439,6 +451,18 @@ static inline bool kvm_arch_check_sve_has_vhe(void) return true; } +void kvm_arm_vcpu_ptrauth_enable(struct kvm_vcpu *vcpu); +void kvm_arm_vcpu_ptrauth_disable(struct kvm_vcpu *vcpu); + +static inline void kvm_arm_vcpu_ptrauth_config(struct kvm_vcpu *vcpu) +{ + /* Disable ptrauth and use it in a lazy context via traps */ + if (has_vhe() && system_supports_ptrauth()) + kvm_arm_vcpu_ptrauth_disable(vcpu); +} + +void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu); + static inline void kvm_arch_hardware_unsetup(void) {} static inline void kvm_arch_sync_events(st
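The semi-lazy scheme described in the commit message of this patch can be modelled in a few lines of plain C. This is only a sketch of the state transitions, under the assumption that "enabled" means "no longer trapping"; all names (`vcpu_sketch`, `sketch_vcpu_load()`, `sketch_ptrauth_trap()`, `sketch_world_switch()`) are hypothetical and none of this is the code in ptrauth-sr.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vcpu state for the model. */
struct vcpu_sketch {
	bool ptrauth_enabled;	/* false: guest ptrauth use traps to the host */
	unsigned key_switches;	/* how often the keys were context-switched */
};

/* vcpu scheduled in: start lazy, with guest ptrauth use trapping. */
static void sketch_vcpu_load(struct vcpu_sketch *v)
{
	assert(v != NULL);
	v->ptrauth_enabled = false;
}

/* First trapped ptrauth use: enable the feature, go eager from here on. */
static void sketch_ptrauth_trap(struct vcpu_sketch *v)
{
	assert(v != NULL);
	v->ptrauth_enabled = true;
}

/* Host/guest world switch: the keys are only touched once enabled. */
static void sketch_world_switch(struct vcpu_sketch *v)
{
	assert(v != NULL);
	if (v->ptrauth_enabled)
		v->key_switches++;
}
```

A guest that never executes a pointer authentication instruction therefore never pays for a key switch, which is the point of the lazy part.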
[PATCH v8 2/2] PCI: amlogic: Add the Amlogic Meson PCIe controller driver
From: Yue Wang The Amlogic Meson PCIe host controller is based on the Synopsys DesignWare PCI core. This patch adds the driver support for Meson PCIe controller. Signed-off-by: Yue Wang Signed-off-by: Hanjie Lin --- MAINTAINERS| 7 + drivers/pci/controller/dwc/Kconfig | 10 + drivers/pci/controller/dwc/Makefile| 1 + drivers/pci/controller/dwc/pci-meson.c | 595 + 4 files changed, 613 insertions(+) create mode 100644 drivers/pci/controller/dwc/pci-meson.c diff --git a/MAINTAINERS b/MAINTAINERS index 7fe120f..21ed916 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11600,6 +11600,13 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/ S: Supported F: drivers/pci/controller/ +PCIE DRIVER FOR AMLOGIC MESON +M: Yue Wang +L: linux-...@vger.kernel.org +L: linux-amlo...@lists.infradead.org +S: Maintained +F: drivers/pci/controller/dwc/pci-meson.c + PCIE DRIVER FOR AXIS ARTPEC M: Jesper Nilsson L: linux-arm-ker...@axis.com diff --git a/drivers/pci/controller/dwc/Kconfig b/drivers/pci/controller/dwc/Kconfig index 91b0194..7800322 100644 --- a/drivers/pci/controller/dwc/Kconfig +++ b/drivers/pci/controller/dwc/Kconfig @@ -193,4 +193,14 @@ config PCIE_HISI_STB help Say Y here if you want PCIe controller support on HiSilicon STB SoCs +config PCI_MESON + bool "MESON PCIe controller" + depends on PCI_MSI_IRQ_DOMAIN + select PCIE_DW_HOST + help + Say Y here if you want to enable PCI controller support on Amlogic + SoCs. The PCI controller on Amlogic is based on DesignWare hardware + and therefore the driver re-uses the DesignWare core functions to + implement the driver. 
+ endmenu diff --git a/drivers/pci/controller/dwc/Makefile b/drivers/pci/controller/dwc/Makefile index fcf91ea..e05a015 100644 --- a/drivers/pci/controller/dwc/Makefile +++ b/drivers/pci/controller/dwc/Makefile @@ -14,6 +14,7 @@ obj-$(CONFIG_PCIE_ARMADA_8K) += pcie-armada8k.o obj-$(CONFIG_PCIE_ARTPEC6) += pcie-artpec6.o obj-$(CONFIG_PCIE_KIRIN) += pcie-kirin.o obj-$(CONFIG_PCIE_HISI_STB) += pcie-histb.o +obj-$(CONFIG_PCI_MESON) += pci-meson.o # The following drivers are for devices that use the generic ACPI # pci_root.c driver but don't support standard ECAM config access. diff --git a/drivers/pci/controller/dwc/pci-meson.c b/drivers/pci/controller/dwc/pci-meson.c new file mode 100644 index 000..7993f9d --- /dev/null +++ b/drivers/pci/controller/dwc/pci-meson.c @@ -0,0 +1,595 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * PCIe host controller driver for Amlogic MESON SoCs + * + * Copyright (c) 2018 Amlogic, inc. + * Author: Yue Wang + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "pcie-designware.h" + +#define to_meson_pcie(x) dev_get_drvdata((x)->dev) + +/* External local bus interface registers */ +#define PLR_OFFSET 0x700 +#define PCIE_PORT_LINK_CTRL_OFF(PLR_OFFSET + 0x10) +#define FAST_LINK_MODE BIT(7) +#define LINK_CAPABLE_MASK GENMASK(21, 16) +#define LINK_CAPABLE_X1BIT(16) + +#define PCIE_GEN2_CTRL_OFF (PLR_OFFSET + 0x10c) +#define NUM_OF_LANES_MASK GENMASK(12, 8) +#define NUM_OF_LANES_X1BIT(8) +#define DIRECT_SPEED_CHANGEBIT(17) + +#define TYPE1_HDR_OFFSET 0x0 +#define PCIE_STATUS_COMMAND(TYPE1_HDR_OFFSET + 0x04) +#define PCI_IO_EN BIT(0) +#define PCI_MEM_SPACE_EN BIT(1) +#define PCI_BUS_MASTER_EN BIT(2) + +#define PCIE_BASE_ADDR0(TYPE1_HDR_OFFSET + 0x10) +#define PCIE_BASE_ADDR1(TYPE1_HDR_OFFSET + 0x14) + +#define PCIE_CAP_OFFSET0x70 +#define PCIE_DEV_CTRL_DEV_STUS (PCIE_CAP_OFFSET + 0x08) +#define PCIE_CAP_MAX_PAYLOAD_MASK GENMASK(7, 5) +#define PCIE_CAP_MAX_PAYLOAD_SIZE(x) ((x) << 5) +#define 
PCIE_CAP_MAX_READ_REQ_MASK GENMASK(14, 12) +#define PCIE_CAP_MAX_READ_REQ_SIZE(x) ((x) << 12) + +/* PCIe specific config registers */ +#define PCIE_CFG0 0x0 +#define APP_LTSSM_ENABLE BIT(7) + +#define PCIE_CFG_STATUS12 0x30 +#define IS_SMLH_LINK_UP(x) ((x) & (1 << 6)) +#define IS_RDLH_LINK_UP(x) ((x) & (1 << 16)) +#define IS_LTSSM_UP(x) ((((x) >> 10) & 0x1f) == 0x11) + +#define PCIE_CFG_STATUS17 0x44 +#define PM_CURRENT_STATE(x) (((x) >> 7) & 0x1) + +#define WAIT_LINKUP_TIMEOUT 4000 +#define PORT_CLK_RATE 1UL +#define MAX_PAYLOAD_SIZE 256 +#define MAX_READ_REQ_SIZE 256 +#define MESON_PCIE_PHY_POWERUP 0x1c +#define PCIE_RESET_DELAY 500 +#defin
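The register macros in this hunk follow the usual mask-and-shift pattern. A small user-space sketch of how such fields are decoded and updated is given below. `SK_BIT()` and `SK_GENMASK()` are stand-ins written for this example (the kernel's own BIT() and GENMASK() are not available outside the kernel tree), and the helper names are hypothetical. The bit positions are taken from the macros quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's BIT() and GENMASK() helpers. */
#define SK_BIT(n)        (UINT32_C(1) << (n))
#define SK_GENMASK(h, l) ((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

/* Set the LTSSM enable bit (bit 7) in a copy of the CFG0 register value. */
static uint32_t sk_enable_ltssm(uint32_t cfg0)
{
	return cfg0 | SK_BIT(7);
}

/* Bits 14:10 of the status word hold the LTSSM state; 0x11 counts as "up". */
static bool sk_ltssm_up(uint32_t status)
{
	return ((status >> 10) & 0x1f) == 0x11;
}

/* Read-modify-write of the max payload field (bits 7:5). */
static uint32_t sk_set_max_payload(uint32_t reg, uint32_t code)
{
	assert(code <= 7);
	reg &= ~SK_GENMASK(7, 5);	/* clear the old field */
	reg |= code << 5;		/* insert the new encoding */
	return reg;
}
```

With the standard PCIe encoding, where code n selects 128 << n bytes, a 256-byte payload corresponds to `sk_set_max_payload(reg, 1)`.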
[PATCH v4 0/6] Add ARMv8.3 pointer authentication for kvm guest
Hi,

This patch series adds pointer authentication support for KVM guests and is based on top of Linux 4.20-rc5 and the generic pointer authentication patch series[1]. The first two patches in this series were originally posted by Mark Rutland earlier[2,3] and contain some history of this work.

Extension Overview:
===================

The ARMv8.3 pointer authentication extension adds functionality to detect modification of pointer values, mitigating certain classes of attack such as stack smashing, and making return oriented programming attacks harder.

The extension introduces the concept of a pointer authentication code (PAC), which is stored in some upper bits of pointers. Each PAC is derived from the original pointer, another 64-bit value (e.g. the stack pointer), and a secret 128-bit key.

New instructions are added which can be used to:

* Insert a PAC into a pointer
* Strip a PAC from a pointer
* Authenticate and strip a PAC from a pointer

The detailed description of ARMv8.3 pointer authentication support in userspace/kernel can be found in Kristina's generic pointer authentication patch series[1].

KVM guest work:
===============

If pointer authentication is enabled for KVM guests, the new PAC instructions will not trap to EL2. Otherwise they may be ignored if they are in the HINT region, or trapped to EL2 as illegal instructions. Since a KVM guest vcpu runs as a thread, it has a key initialised which will be used by PAC. When a world switch happens between host and guest, this key is exchanged.

There were some review comments by Christoffer Dall in the original series[2,3,4] and this patch series tries to implement them. The original series enabled pointer authentication for both userspace and kvm userspace. However it is now bifurcated and this series contains only KVM guest support.

Changes since v3 [4]:
* Use pointer authentication only when VHE is present, as ARMv8.3 implies that ARMv8.1 features are present.
* Added lazy context handling of ptrauth instructions from the v2 version again.
* Added more details in Documentation. * Rebased to new version of generic ptrauth patches [1]. Changes since v2 [2,3]: * Allow host and guest to have different HCR_EL2 settings and not just constant value HCR_HOST_VHE_FLAGS or HCR_HOST_NVHE_FLAGS. * Optimise the reading of HCR_EL2 in host/guest switch by fetching it once during KVM initialisation state and using it later. * Context switch pointer authentication keys when switching between guest and host. Pointer authentication was enabled in a lazy context earlier[2] and is removed now to make it simple. However it can be revisited later if there is significant performance issue. * Added a userspace option to choose pointer authentication. * Based on the userspace option, ptrauth cpufeature will be visible. * Based on the userspace option, ptrauth key registers will be accessible. * A small document is added on how to enable pointer authentication from userspace KVM API. Looking for feedback and comments. Thanks, Amit [1]: https://lkml.org/lkml/2018/12/7/666 [2]: https://lore.kernel.org/lkml/20171127163806.31435-11-mark.rutl...@arm.com/ [3]: https://lore.kernel.org/lkml/20171127163806.31435-10-mark.rutl...@arm.com/ [4]: https://lkml.org/lkml/2018/10/17/594 Linux (4.20-rc5 based): Amit Daniel Kachhap (5): arm64/kvm: preserve host HCR_EL2 value arm64/kvm: context-switch ptrauth registers arm64/kvm: add a userspace option to enable pointer authentication arm64/kvm: enable pointer authentication cpufeature conditionally arm64/kvm: control accessibility of ptrauth key registers Documentation/arm64/pointer-authentication.txt | 13 ++-- Documentation/virtual/kvm/api.txt | 4 ++ arch/arm/include/asm/kvm_host.h| 7 ++ arch/arm64/include/asm/cpufeature.h| 6 ++ arch/arm64/include/asm/kvm_asm.h | 2 + arch/arm64/include/asm/kvm_host.h | 41 +++- arch/arm64/include/asm/kvm_hyp.h | 7 ++ arch/arm64/include/uapi/asm/kvm.h | 1 + arch/arm64/kernel/traps.c | 1 + arch/arm64/kvm/handle_exit.c | 24 --- arch/arm64/kvm/hyp/Makefile| 1 + 
arch/arm64/kvm/hyp/ptrauth-sr.c| 89 + arch/arm64/kvm/hyp/switch.c| 19 -- arch/arm64/kvm/hyp/sysreg-sr.c | 11 arch/arm64/kvm/hyp/tlb.c | 6 +- arch/arm64/kvm/reset.c | 3 + arch/arm64/kvm/sys_regs.c | 91 -- include/uapi/linux/kvm.h | 1 + virt/kvm/arm/arm.c | 4 ++ 19 files changed, 289 insertions(+), 42 deletions(-) create mode 100644 arch/arm64/kvm/hyp/ptrauth-sr.c kvmtool: Repo: git.kernel.org/pub/scm/linux/kernel/git/will/kvmtool.git Amit Daniel Kachhap (1):
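To make the insert/strip/authenticate operations from the cover letter concrete, here is a toy model in C. It is an illustration only: the mixing function is an arbitrary non-cryptographic hash chosen for this example, the 48-bit address width and 16-bit PAC field are assumptions, and all `toy_*` names are hypothetical. Real hardware uses a dedicated cipher and a configuration-dependent PAC width.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VA_BITS 48				/* assumed address width */
#define VA_MASK ((UINT64_C(1) << VA_BITS) - 1)

struct toy_key { uint64_t lo, hi; };		/* 128-bit secret key */

/* Toy mixing function; NOT the cipher used by the architecture. */
static uint64_t toy_mac(uint64_t addr, uint64_t modifier, struct toy_key k)
{
	uint64_t h = addr ^ k.lo;

	h *= UINT64_C(0x9E3779B97F4A7C15);
	h ^= modifier ^ k.hi;
	h *= UINT64_C(0xC2B2AE3D27D4EB4F);
	h ^= h >> 29;
	return h >> VA_BITS;			/* keep the top 16 bits */
}

/* Insert a PAC into the unused upper bits of a pointer. */
static uint64_t toy_pac_insert(uint64_t ptr, uint64_t modifier, struct toy_key k)
{
	uint64_t addr = ptr & VA_MASK;

	return addr | (toy_mac(addr, modifier, k) << VA_BITS);
}

/* Strip a PAC without checking it. */
static uint64_t toy_pac_strip(uint64_t ptr)
{
	return ptr & VA_MASK;
}

/* Authenticate and strip: succeeds only if the PAC matches. */
static bool toy_pac_auth(uint64_t signed_ptr, uint64_t modifier,
			 struct toy_key k, uint64_t *out)
{
	uint64_t addr = toy_pac_strip(signed_ptr);

	assert(out != NULL);
	if (toy_pac_insert(addr, modifier, k) != signed_ptr)
		return false;
	*out = addr;
	return true;
}
```

The model returns a boolean on authentication failure to keep the example short; it does not try to reproduce how the real instructions report a mismatch.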
Re: [-next] lots of messages due to "mm, memory_hotplug: be more verbose for memory offline failures"
On Mon, Dec 17, 2018 at 05:39:49PM +0100, Michal Hocko wrote: > On Mon 17-12-18 17:03:50, Michal Hocko wrote: > > On Mon 17-12-18 16:59:22, Heiko Carstens wrote: > > > Hi Michal, > > > > > > with linux-next as of today on s390 I see tons of messages like > > > > > > [ 20.536664] page dumped because: has_unmovable_pages > > > [ 20.536792] page:03d081ff4080 count:1 mapcount:0 > > > mapping:8ff88600 index:0x0 compound_mapcount: 0 > > > [ 20.536794] flags: 0x3fffe010200(slab|head) > > > [ 20.536795] raw: 03fffe010200 0100 0200 > > > 8ff88600 > > > [ 20.536796] raw: 00200041 0001 > > > > > > [ 20.536797] page dumped because: has_unmovable_pages > > > [ 20.536814] page:03d0823b count:1 mapcount:0 > > > mapping: index:0x0 > > > [ 20.536815] flags: 0x7fffe00() > > > [ 20.536817] raw: 07fffe00 0100 0200 > > > > > > [ 20.536818] raw: 0001 > > > > > > > > > bisect points to b323c049a999 ("mm, memory_hotplug: be more verbose for > > > memory offline failures") > > > which is the first commit with which the messages appear. > > > > I would bet this is CMA allocator. How much is tons? Maybe we want a > > rate limit or the other user is not really interested in them at all? Yes, the system in question has a 4NB CMA area. "tons" translates to several hundred. > In other words, this should silence those messages. Yes, with the patch below applied the messages don't appear anymore. 
> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h > index 4ae347cbc36d..4eb26d278046 100644 > --- a/include/linux/page-isolation.h > +++ b/include/linux/page-isolation.h > @@ -30,8 +30,11 @@ static inline bool is_migrate_isolate(int migratetype) > } > #endif > > +#define SKIP_HWPOISON0x1 > +#define REPORT_FAILURE 0x2 > + > bool has_unmovable_pages(struct zone *zone, struct page *page, int count, > - int migratetype, bool skip_hwpoisoned_pages); > + int migratetype, int flags); > void set_pageblock_migratetype(struct page *page, int migratetype); > int move_freepages_block(struct zone *zone, struct page *page, > int migratetype, int *num_movable); > @@ -44,10 +47,14 @@ int move_freepages_block(struct zone *zone, struct page > *page, > * For isolating all pages in the range finally, the caller have to > * free all pages in the range. test_page_isolated() can be used for > * test it. > + * > + * The following flags are allowed (they can be combined in a bit mask) > + * SKIP_HWPOISON - ignore hwpoison pages > + * REPORT_FAILURE - report details about the failure to isolate the range > */ > int > start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn, > - unsigned migratetype, bool skip_hwpoisoned_pages); > + unsigned migratetype, int flags); > > /* > * Changes MIGRATE_ISOLATE to MIGRATE_MOVABLE. > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index c82193db4be6..8537429d33a6 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -1226,7 +1226,7 @@ static bool is_pageblock_removable_nolock(struct page > *page) > if (!zone_spans_pfn(zone, pfn)) > return false; > > - return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, true); > + return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, > SKIP_HWPOISON); > } > > /* Checks if this range of memory is likely to be hot-removable. 
*/ > @@ -1577,7 +1577,8 @@ static int __ref __offline_pages(unsigned long > start_pfn, > > /* set above range as isolated */ > ret = start_isolate_page_range(start_pfn, end_pfn, > -MIGRATE_MOVABLE, true); > +MIGRATE_MOVABLE, > +SKIP_HWPOISON | REPORT_FAILURE); > if (ret) { > mem_hotplug_done(); > reason = "failure to isolate range"; > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index ec2c7916dc2d..ee4043419791 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -7754,8 +7754,7 @@ void *__init alloc_large_system_hash(const char > *tablename, > * race condition. So you can't expect this function should be exact. > */ > bool has_unmovable_pages(struct zone *zone, struct page *page, int count, > - int migratetype, > - bool skip_hwpoisoned_pages) > + int migratetype, int flags) > { > unsigned long pfn, iter, found; > > @@ -7818,7 +7817,7 @@ bool has_unmovable_pages(struct zone *zone, struct page > *page, int count, >* The HWPoisoned page may be not in buddy system, and >* pag
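The quoted patch replaces a single bool parameter with a flags bit mask so that callers can opt in to the verbose report separately. A stand-alone sketch of that idea follows. Only the two flag values come from the patch; `page_sketch`, `report_count` and `sketch_has_unmovable_page()` are hypothetical, and the decision logic is an assumption since the real function body is cut off above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKIP_HWPOISON	0x1	/* ignore hwpoison pages */
#define REPORT_FAILURE	0x2	/* report details about the failure */

/* Hypothetical, heavily simplified page descriptor. */
struct page_sketch {
	bool hwpoison;
	bool movable;
};

static int report_count;	/* stands in for the "page dumped" messages */

/* Returns true if this page would block isolation of the range. */
static bool sketch_has_unmovable_page(const struct page_sketch *p, int flags)
{
	assert(p != NULL);
	if ((flags & SKIP_HWPOISON) && p->hwpoison)
		return false;
	if (p->movable)
		return false;
	if (flags & REPORT_FAILURE)
		report_count++;		/* only noisy when asked to be */
	return true;
}
```

In this model a caller that passes only SKIP_HWPOISON gets the same answer as before but no report, while the offline path can pass both flags and keep the diagnostics.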
Re: [alsa-devel] [PATCH] ASoC: sdm845: set jack only for a specific backend
On Fri, Dec 14, 2018 at 6:02 PM Rohit kumar wrote: > > Headset codec is connected over PRIMARY_MI2S interface. Call > set_jack for codec associated with Primary Mi2s interface. > Also, set_jack to NULL when jack is freed. > > Signed-off-by: Rohit kumar > --- > sound/soc/qcom/sdm845.c | 31 ++- > 1 file changed, 22 insertions(+), 9 deletions(-) > > diff --git a/sound/soc/qcom/sdm845.c b/sound/soc/qcom/sdm845.c > index 1db8ef66..6f66a58 100644 > --- a/sound/soc/qcom/sdm845.c > +++ b/sound/soc/qcom/sdm845.c > @@ -158,17 +158,24 @@ static int sdm845_snd_hw_params(struct > snd_pcm_substream *substream, > return ret; > } > > +static void sdm845_jack_free(struct snd_jack *jack) > +{ > + struct snd_soc_component *component = jack->private_data; > + > + snd_soc_component_set_jack(component, NULL, NULL); > +} > + > static int sdm845_dai_init(struct snd_soc_pcm_runtime *rtd) > { > struct snd_soc_component *component; > - struct snd_soc_dai_link *dai_link = rtd->dai_link; > struct snd_soc_card *card = rtd->card; > + struct snd_soc_dai *codec_dai = rtd->codec_dai; > + struct snd_soc_dai *cpu_dai = rtd->cpu_dai; > struct sdm845_snd_data *pdata = snd_soc_card_get_drvdata(card); > - int i, rval; > + struct snd_jack *jack; > + int rval; > > if (!pdata->jack_setup) { > - struct snd_jack *jack; > - > rval = snd_soc_card_jack_new(card, "Headset Jack", > SND_JACK_HEADSET | > SND_JACK_HEADPHONE | > @@ -190,16 +197,22 @@ static int sdm845_dai_init(struct snd_soc_pcm_runtime > *rtd) > pdata->jack_setup = true; > } > > - for (i = 0 ; i < dai_link->num_codecs; i++) { > - struct snd_soc_dai *dai = rtd->codec_dais[i]; > + switch (cpu_dai->id) { > + case PRIMARY_MI2S_RX: > + jack = pdata->jack.jack; > + component = codec_dai->component; > > - component = dai->component; > - rval = snd_soc_component_set_jack( > - component, &pdata->jack, NULL); > + jack->private_data = component; > + jack->private_free = sdm845_jack_free; > + rval = snd_soc_component_set_jack(component, > + &pdata->jack, NULL); 
> if (rval != 0 && rval != -ENOTSUPP) { > dev_warn(card->dev, "Failed to set jack: %d\n", rval); > return rval; > } > + break; > + default: > + break; > } > > return 0; > -- > Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., > is a member of Code Aurora Forum, a Linux Foundation Collaborative Project. > > ___ > Alsa-devel mailing list > alsa-de...@alsa-project.org > http://mailman.alsa-project.org/mailman/listinfo/alsa-devel Thanks a lot for the fix! Reviewed-by: Cheng-Yi Chiang Tested-by: Cheng-Yi Chiang
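The private_data/private_free pairing used in the patch is a common way to undo a registration when the owning object goes away. A user-space sketch of the pattern is shown below; every name in it (`jack_sketch`, `component_sketch`, `sketch_attach()` and so on) is hypothetical and the structures are not the ALSA ones.

```c
#include <assert.h>
#include <stddef.h>

struct jack_sketch;

/* Hypothetical codec component that remembers which jack it reports to. */
struct component_sketch {
	struct jack_sketch *jack;
};

/* Hypothetical jack object with a destructor hook. */
struct jack_sketch {
	void *private_data;
	void (*private_free)(struct jack_sketch *jack);
};

static void sketch_set_jack(struct component_sketch *c, struct jack_sketch *j)
{
	assert(c != NULL);
	c->jack = j;
}

/* Destructor hook: detach the jack from the component it was given to. */
static void sketch_jack_free_cb(struct jack_sketch *jack)
{
	struct component_sketch *c = jack->private_data;

	sketch_set_jack(c, NULL);
}

/* Register the jack and arrange for the detach to happen on free. */
static void sketch_attach(struct component_sketch *c, struct jack_sketch *j)
{
	j->private_data = c;
	j->private_free = sketch_jack_free_cb;
	sketch_set_jack(c, j);
}

/* What the owner of the jack would do when tearing it down. */
static void sketch_jack_destroy(struct jack_sketch *j)
{
	if (j->private_free)
		j->private_free(j);
}
```

After `sketch_jack_destroy()` the component no longer holds a pointer to the freed jack, which is the dangling reference the patch description is concerned with.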
Re: [PATCH v1 00/12] Add some functionalities for Tegra soctherm
Sorry, please ignore this cover-letter. Wei. On 18/12/2018 3:34 PM, Wei Ni wrote: > Move the hw/sw shutdown patches into this serial. There already have > some discussion for it in https://lkml.org/lkml/2018/12/7/225. > Support GPU HW throttle, thermal IRQ, set_trips(), EDP IRQ and OC > hw throttle. > > Wei Ni (12): > of: Add bindings of thermtrip for Tegra soctherm > thermal: tegra: support hw and sw shutdown > arm64: dts: tegra210: set thermtrip > of: Add bindings of gpu hw throttle for Tegra soctherm > thermal: tegra: add support for gpu hw-throttle > arm64: dts: tegra210: set gpu hw throttle level > thermal: tegra: add support for thermal IRQ > thermal: tegra: add set_trips functionality > thermal: tegra: add support for EDP IRQ > arm64: dts: tegra210: set EDP interrupt line > of: Add bindings of OC hw throttle for Tegra soctherm > thermal: tegra: enable OC hw throttle > > .../bindings/thermal/nvidia,tegra124-soctherm.txt | 63 +- > arch/arm64/boot/dts/nvidia/tegra210.dtsi | 20 +- > drivers/thermal/tegra/soctherm.c | 955 > +++-- > drivers/thermal/tegra/soctherm.h | 16 + > drivers/thermal/tegra/tegra124-soctherm.c | 7 +- > drivers/thermal/tegra/tegra132-soctherm.c | 7 +- > drivers/thermal/tegra/tegra210-soctherm.c | 15 +- > include/dt-bindings/thermal/tegra124-soctherm.h| 22 +- > 8 files changed, 1029 insertions(+), 76 deletions(-) >
Re: [PATCH V2 00/10] unify the interface of the proportional-share policy in blkio/io
[RESENDING BECAUSE BOUNCED] > Il giorno 10 dic 2018, alle ore 14:45, Angelo Ruocco > ha scritto: > > 2018-11-30 19:53 GMT+01:00, Paolo Valente : >> >> >>> Il giorno 30 nov 2018, alle ore 19:42, Tejun Heo ha >>> scritto: >>> >>> Hello, Paolo. >>> >>> On Fri, Nov 30, 2018 at 07:23:24PM +0100, Paolo Valente wrote: > Then we understood that exactly the same happens with throttling, in > case the latter is activated on different devices w.r.t. bfq. > > In addition, the same may happen, in the near future, with the > bandwidth controller Josef is working on. If the controller can be > configured per device, as with throttling, then statistics may differ, > for the same interface files, between bfq, throttling and that > controller. >>> >>> So, regardless of how all these are implemented, what's presented to >>> user should be consistent and clear. There's no other way around it. >>> Only what's relevant should be visible to userspace. >>> have you had time to look into this? Any improvement to this interface is ok for us. We are only interested in finally solving this interface issue, as, for what concerns us directly, it has been preventing legacy code to use bfq for years. >>> >>> Unfortunately, I don't have any implementation proposal, but we can't >>> show things this way to userspace. >>> >> >> Well, this is not very helpful to move forward :) >> >> Let me try to repeat the problem, to try to help you help us unblock >> the situation. >> >> If we have multiple entities attached to the same interface output >> file, you don't find it clear that each entity shows the number it >> wants to show. But you have no idea either of how that differentiated >> information should be shown. Is this the situation, or is the problem >> somewhere 'above' this level? >> >> If the problem is as I described it, here are some proposal attempts: >> 1) Do you want file sharing to be allowed only if all entities will >> output the same number? 
(this seems excessive, but maybe it makes >> sense) >> 2) Do you want only one number to be shown, equal to the sum of the >> numbers of each entity? (in some cases, this may make sense) >> 3) Do you prefer an average? >> 4) Do you have any other idea, even if just germinal? > > To further add to what Paolo said and better expose the problem, I'd like to > say that all those proposals have issues. > If we only allow "same output" cftypes to be shared then we lose all the > flexibility of this solution, and we need a way for an entity to know other > entities internal variables beforehand, which sounds at least very hard, and > maybe is not even an acceptable thing to do. > To put the average, sum or some other mathematical function in the file only > makes sense for certain cftypes, so also doesn't sound like a good idea. In > fact I can think of scenarios where only seeing the different values of the > entities makes sense for a user. > > I understand that the problem is inconsistency: having a file that behaves > differently depending on the situation, and the only way to prevent this I can > think of is to *always* show the entity owner of a certain file (or part of > the > output), even when the output would be the same among entities or when the > file is not currently shared but could be. Can this be an acceptable solution? > > Angelo > Hi Jens, all, let me push for this interface to be fixed too. If we don't fix it in some way, then from 4.21 we will end up with a ridiculous paradox: the proportional share policy (weights) will of course be available, but unusable in practice. In fact, as Lennart--and not only Lennart--can confirm, no piece of code uses bfq.weight to set weights, or will do it. A trivial solution would be to throw away all our work to fix this issue by extending the interface, and just let bfq use the former cfq names. But then the same mess will happen as, e.g., Josef will propose his proportional-share controller.
Before making this solution, we proposed it and waited for it to be approved several months ago, so I hope that Tejun's concern can be addressed somehow. If Tejun cannot see any solution to his concern, then can we just switch to this extension, considering that - for non-shared names the interface is *identical* to the current one; - by using this new interface, and getting feedback we could understand how to better handle Tejun's concern? A lot of systems do use weights, and people don't even know that these systems don't work correctly in blk-mq. And they won't work correctly in any available configuration from 4.21, if we don't fix this problem. Thanks. Paolo >> >> Looking forward to your feedback, >> Paolo >> >> >>> Thanks. >>> >>> -- >>> tejun >>> >>> -- >>> You received this message because you are subscribed to the Google Groups >>> "bfq-iosched" group. >>> To unsubscribe from this group and stop receiving emails from it, send an >>> email to bfq-io
Re: [PATCH v11 6/7] ACPI/IORT: Stub out ACS functions when CONFIG_PCI is not set
On 2018/12/18 10:56, Sinan Kaya wrote:
> Remove PCI dependent code out of iort.c when CONFIG_PCI is not defined.
> A quick search reveals the following functions:
> 1. pci_request_acs()
> 2. pci_domain_nr()
> 3. pci_is_root_bus()
> 4. to_pci_dev()
>
> Both pci_domain_nr() and pci_is_root_bus() are defined in linux/pci.h.
> pci_domain_nr() is a stub function when CONFIG_PCI is not set and
> pci_is_root_bus() just returns a reference to a structure member which
> is still valid without CONFIG_PCI set.
>
> to_pci_dev() is a macro that expands to container_of.
>
> pci_request_acs() is the only code that gets pulled in from drivers/pci/*.c

Actually, we also have pci_for_each_dma_alias(), which comes from drivers/pci/search.c and has no stub function in linux/pci.h. I didn't get an error at link time, though; I think the compiler simply optimizes away the dead code because dev_is_pci() is obviously false when CONFIG_PCI is not set.

Thanks
Hanjun
Re: [PATCH v4 2/2] trace nvme submit queue status
On Mon, Dec 17, 2018 at 11:26 PM Sagi Grimberg wrote: > > > > @@ -899,6 +900,10 @@ static inline void nvme_handle_cqe(struct nvme_queue > > *nvmeq, u16 idx) > > } > > > > req = blk_mq_tag_to_rq(*nvmeq->tags, cqe->command_id); > > + trace_nvme_sq(req->rq_disk, > > + nvmeq->qid, > > + le16_to_cpu(cqe->sq_head), > > + nvmeq->sq_tail); > > Why the newline escapes? why not escape at the 80 char border? > Sorry, I don't quite understand your meaning. Do you mean I'd better change this: trace_nvme_sq(req->rq_disk, nvmeq->qid, le16_to_cpu(cqe->sq_head), nvmeq->sq_tail); to something like below: trace_nvme_sq(req->rq_disk, nvmeq->qid, le16_to_cpu(cqe->sq_head), nvmeq->sq_tail); Please let me know whether my understanding is correct.
RE: [PATCH 02/18] mfd: adp5520: Make it explicitly non-modular
> -Original Message- > From: Paul Gortmaker [mailto:paul.gortma...@windriver.com] > Sent: Montag, 17. Dezember 2018 21:31 > To: Lee Jones > Cc: linux-kernel@vger.kernel.org; Paul Gortmaker > ; Hennerich, Michael > > Subject: [PATCH 02/18] mfd: adp5520: Make it explicitly non-modular > > The Makefile/Kconfig currently controlling compilation of this code is: > > drivers/mfd/Makefile:obj-$(CONFIG_PMIC_ADP5520) += adp5520.o > drivers/mfd/Kconfig:config PMIC_ADP5520 > drivers/mfd/Kconfig:bool "Analog Devices ADP5520/01 MFD PMIC Core Support" > > ...meaning that it currently is not being built as a module by anyone. > > Lets remove the modular code that is essentially orphaned, so that > when reading the driver there is no doubt it is builtin-only. > > We explicitly disallow a driver unbind, since that doesn't have a > sensible use case anyway, and it allows us to drop the ".remove" > code for non-modular drivers. > > Since module_i2c_driver() uses the same init level priority as > builtin_i2c_driver() the init ordering remains unchanged with > this commit. > > Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code. > > We also delete the MODULE_LICENSE tag etc. since all that information > was (or is now) contained at the top of the file in the comments. > > Cc: Michael Hennerich > Cc: Lee Jones > Signed-off-by: Paul Gortmaker > Acked-by: Linus Walleij Acked-by: Michael Hennerich > --- > drivers/mfd/adp5520.c | 30 +++--- > 1 file changed, 7 insertions(+), 23 deletions(-) > > diff --git a/drivers/mfd/adp5520.c b/drivers/mfd/adp5520.c > index be0497b96720..2cdd39cb8a18 100644 > --- a/drivers/mfd/adp5520.c > +++ b/drivers/mfd/adp5520.c > @@ -7,6 +7,8 @@ > * > * Copyright 2009 Analog Devices Inc. > * > + * Author: Michael Hennerich > + * > * Derived from da903x: > * Copyright (C) 2008 Compulab, Ltd. 
> * Mike Rapoport > @@ -18,7 +20,7 @@ > */ > > #include > -#include > +#include > #include > #include > #include > @@ -304,18 +306,6 @@ static int adp5520_probe(struct i2c_client *client, > return ret; > } > > -static int adp5520_remove(struct i2c_client *client) > -{ > - struct adp5520_chip *chip = dev_get_drvdata(&client->dev); > - > - if (chip->irq) > - free_irq(chip->irq, chip); > - > - adp5520_remove_subdevs(chip); > - adp5520_write(chip->dev, ADP5520_MODE_STATUS, 0); > - return 0; > -} > - > #ifdef CONFIG_PM_SLEEP > static int adp5520_suspend(struct device *dev) > { > @@ -346,20 +336,14 @@ static const struct i2c_device_id adp5520_id[] = { > { "pmic-adp5501", ID_ADP5501 }, > { } > }; > -MODULE_DEVICE_TABLE(i2c, adp5520_id); > > static struct i2c_driver adp5520_driver = { > .driver = { > - .name = "adp5520", > - .pm = &adp5520_pm, > + .name = "adp5520", > + .pm = &adp5520_pm, > + .suppress_bind_attrs= true, > }, > .probe = adp5520_probe, > - .remove = adp5520_remove, > .id_table = adp5520_id, > }; > - > -module_i2c_driver(adp5520_driver); > - > -MODULE_AUTHOR("Michael Hennerich "); > -MODULE_DESCRIPTION("ADP5520(01) PMIC-MFD Driver"); > -MODULE_LICENSE("GPL"); > +builtin_i2c_driver(adp5520_driver); > -- > 2.7.4
linux-next: Tree for Dec 18
Hi all, Changes since 20181217: The nfs-anna tree lost its build failure. The vfs tree gained a conflict against the fscrypt tree. The hwmon-staging tree lost its build failure. The rdma tree still had its build failure so I used a supplied patch. The net-next tree lost its build failure. The mfd tree lost its build failure. The selinux tree gained a conflict against the vfs tree. The gpio tree lost its build failure. The y2038 tree lost its build failure. Non-merge commits (relative to Linus' tree): 9464 9665 files changed, 463878 insertions(+), 252256 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc and sparc64 defconfig. And finally, a simple boot test of the powerpc pseries_le_defconfig kernel in qemu (with and without kvm enabled). Below is a summary of the state of the merge. I am currently merging 291 trees (counting Linus' and 69 trees of bug fix patches pending for the current merge release). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . 
Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. -- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (7566ec393f41 Linux 4.20-rc7) Merging fixes/master (d8c137546ef8 powerpc: tag implicit fall throughs) Merging kbuild-current/fixes (ccda4af0f4b9 Linux 4.20-rc2) Merging arc-current/for-curr (4c567a448b30 ARC: perf: remove useless ifdefs) Merging arm-current/fixes (c2a3831df6dc ARM: 8816/1: dma-mapping: fix potential uninitialized return) Merging arm64-fixes/for-next/fixes (3238c359acee arm64: dma-mapping: Fix FORCE_CONTIGUOUS buffer clearing) Merging m68k-current/for-linus (58c116fb7dc6 m68k/sun3: Remove is_medusa and m68k_pgtable_cachemode) Merging powerpc-fixes/fixes (a225f1567405 powerpc/ptrace: replace ptrace_report_syscall() with a tracehook call) Merging sparc/master (cf76c364a1e1 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi) Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2) Merging net/master (369a094d500f Merge branch 'hns-fixes') Merging bpf/master (8203e2d844d3 net: clear skb->tstamp in forwarding paths) Merging ipsec/master (4a135e538962 xfrm_user: fix freeing of xfrm states on acquire) Merging netfilter/master (9e69efd45321 Merge branch 'vhost-fixes') Merging ipvs/master (feb9f55c33e5 netfilter: nft_dynset: allow dynamic updates of non-anonymous set) Merging wireless-drivers/master (eca1e56ceedd iwlwifi: mvm: don't send GEO_TX_POWER_LIMIT to old firmwares) Merging mac80211/master (312ca38ddda6 cfg80211: Fix busy loop regression in ieee80211_ie_split_ric()) Merging rdma-fixes/for-rc (37fbd834b4e4 IB/core: Fix oops in netdev_next_upper_dev_rcu()) Merging sound-current/for-linus (0bea4cc83835 ALSA: hda/realtek: 
Enable audio jacks of ASUS UX433FN/UX333FA with ALC294) Merging sound-asoc-fixes/for-linus (9683863aecfe Merge branch 'asoc-4.20' into asoc-linus) Merging regmap-fixes/for-linus (40e020c129cf Linux 4.20-rc6) Merging regulator-fixes/for-linus (1a365a5c7959 Merge branch 'regulator-4.20' into regulator-linus) Merging spi-fixes/for-linus (6594150c7db6 Merge branch 'spi-4.20' into spi-linus) Merging pci-current/for-linus (1063a5148ac9 PCI/AER: Queue one GHES event, not several uninitialized ones) Merging driver-core.current/driver-core-linus (2595646791c3 Linux 4.20-rc5) Merging tty.current/tty-linus (3c9dc275dba1 Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again") Merging usb.current/usb-linus (2419f30a4a4f USB: xhci: fix 'broken_suspend' placement in struct xchi_hcd) Merging usb-gadget-fixes/fixes (069caf5
Re: [PATCH v5 2/2] media: usb: pwc: Don't use coherent DMA buffers for ISO transfer
On Tue, Dec 18, 2018 at 04:22:43PM +0900, Tomasz Figa wrote:
> It kind of limits the usability of this API, since it enforces
> contiguous allocations even for big sizes even for devices behind
> IOMMU (contrary to the case when DMA_ATTR_NON_CONSISTENT is not set),
> but given that it's just a temporary solution for devices like these
> USB cameras, I guess that's fine.

The problem is that you can't have flexibility and simplicity at the
same time. Once you use kernel virtual address remapping you need to be
prepared to have multiple segments.

So as I said you can call dma_alloc_attrs with DMA_ATTR_NON_CONSISTENT
in a loop with a suitably small chunk size, then stuff the results into
a scatterlist and map that again for the device you want to share it
with if you don't want a single contiguous region. You just have to
either deal with non-contiguous access from the kernel or use vmap and
the right vmap cache flushing helpers.

> Note that in V4L2 we use the DMA API extensively, so that we don't
> need to embed any device-specific or integration-specific knowledge in
> the framework. Right now we're using dma_alloc_attrs() with
> driver-provided attrs [1], but current drivers never request
> non-consistent memory. We're however thinking about making it possible
> to allocate non-consistent memory. What would you suggest for this?
>
> [1]
> https://elixir.bootlin.com/linux/v4.20-rc7/source/drivers/media/common/videobuf2/videobuf2-dma-contig.c#L139

I would advise against new non-consistent users until this series goes
through, mostly because dma_cache_sync is such an amazingly bad API.
Otherwise things will just work on the allocation side; you'll just need
to be careful to transfer ownership between the cpu and the device(s)
using the dma_sync_* APIs.
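For readers following the "allocate in a loop with a suitably small chunk size" suggestion, here is a minimal userspace sketch of the chunking arithmetic only. It is not kernel code and does not call the DMA API; split_into_segments() and its parameters are hypothetical names for illustration. In a driver, each segment would presumably be obtained with dma_alloc_attrs() and recorded in a scatterlist entry instead of a plain size table.

```c
#include <stddef.h>

/*
 * Fill sizes[] with the length of each segment needed to cover `total`
 * bytes in pieces of at most `chunk` bytes.
 *
 * Returns the number of segments. Returns 0 if total or chunk is 0, or
 * if more than max_segs segments would be needed.
 *
 * Hypothetical helper for illustration; not part of any kernel API.
 */
static size_t split_into_segments(size_t total, size_t chunk,
                                  size_t *sizes, size_t max_segs)
{
    size_t n = 0;

    if (chunk == 0)
        return 0;

    while (total > 0) {
        /* The last segment may be shorter than the chunk size. */
        size_t len = total < chunk ? total : chunk;

        if (n == max_segs)
            return 0;
        sizes[n++] = len;
        total -= len;
    }
    return n;
}
```

With a 4096-byte chunk, a 10000-byte request splits into two full segments and one 1808-byte tail, which is the "multiple segments" case the reply says callers must be prepared for.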
Re: [PATCH v2] mm, page_alloc: Fix has_unmovable_pages for HugePages
On Mon 17-12-18 15:07:26, Andrew Morton wrote:
> On Mon, 17 Dec 2018 23:51:13 +0100 Oscar Salvador wrote:
>
> > v1 -> v2:
> > - Fix the logic for skipping pages by Michal
> >
> > ---
>
> Please be careful with the "^---$". It signifies end-of-changelog, so
> I ended up without a changelog!
>
> > >From e346b151037d3c37feb10a981a4d2a25018acf81 Mon Sep 17 00:00:00 2001
> > From: Oscar Salvador
> > Date: Mon, 17 Dec 2018 14:53:35 +0100
> > Subject: [PATCH] mm, page_alloc: Fix has_unmovable_pages for HugePages
> >
> > While playing with gigantic hugepages and memory_hotplug, I triggered
> > the following #PF when "cat memoryX/removable":
> >
> > ...
> >
> > Also, since gigantic pages span several pageblocks, re-adjust the logic
> > for skipping pages.
> >
> > Signed-off-by: Oscar Salvador

Acked-by: Michal Hocko

> cc:stable?

See http://lkml.kernel.org/r/20181217152936.gr30...@dhcp22.suse.cz. I
believe that simply nobody is using gigantic pages and hotplug at the
same time, and those pages do not seem to cross CMA regions either. At
least not since hugepage_migration_supported stopped reporting giga pages
as migratable.

That being said, I do not think we really need it in stable, but it
should be relatively easy to backport, so no objection from me to
putting it there.
--
Michal Hocko
SUSE Labs
[PATCH v1 07/12] thermal: tegra: add support for thermal IRQ
Support to generate an interrupt when the temperature crosses a programmed threshold and notify the thermal framework. Signed-off-by: Wei Ni --- drivers/thermal/tegra/soctherm.c | 136 +++ 1 file changed, 136 insertions(+) diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c index d3cef88a3f22..c66fdd546ef0 100644 --- a/drivers/thermal/tegra/soctherm.c +++ b/drivers/thermal/tegra/soctherm.c @@ -86,6 +86,20 @@ #define THERMCTL_LVL0_UP_STATS 0x10 #define THERMCTL_LVL0_DN_STATS 0x14 +#define THERMCTL_INTR_STATUS 0x84 +#define THERMCTL_INTR_ENABLE 0x88 +#define THERMCTL_INTR_DISABLE 0x8c + +#define TH_INTR_MD0_MASK BIT(25) +#define TH_INTR_MU0_MASK BIT(24) +#define TH_INTR_GD0_MASK BIT(17) +#define TH_INTR_GU0_MASK BIT(16) +#define TH_INTR_CD0_MASK BIT(9) +#define TH_INTR_CU0_MASK BIT(8) +#define TH_INTR_PD0_MASK BIT(1) +#define TH_INTR_PU0_MASK BIT(0) +#define TH_INTR_IGNORE_MASK0xFCFCFCFC + #define THERMCTL_STATS_CTL 0x94 #define STATS_CTL_CLR_DN 0x8 #define STATS_CTL_EN_DN0x4 @@ -242,6 +256,8 @@ struct tegra_soctherm { void __iomem *clk_regs; void __iomem *ccroc_regs; + int thermal_irq; + u32 *calib; struct thermal_zone_device **thermctl_tzs; struct tegra_soctherm_soc *soc; @@ -640,6 +656,98 @@ static int tegra_soctherm_set_hwtrips(struct device *dev, return 0; } +static irqreturn_t soctherm_thermal_isr(int irq, void *dev_id) +{ + struct tegra_soctherm *ts = dev_id; + u32 r; + + r = readl(ts->regs + THERMCTL_INTR_STATUS); + writel(r, ts->regs + THERMCTL_INTR_DISABLE); + + return IRQ_WAKE_THREAD; +} + +/** + * soctherm_thermal_isr_thread() - Handles a thermal interrupt request + * @irq: The interrupt number being requested; not used + * @dev_id:Opaque pointer to tegra_soctherm; + * + * Clears the interrupt status register if there are expected + * interrupt bits set. + * The interrupt(s) are then handled by updating the corresponding + * thermal zones. + * + * An error is logged if any unexpected interrupt bits are set. 
+ * + * Disabled interrupts are re-enabled. + * + * Return: %IRQ_HANDLED. Interrupt was handled and no further processing + * is needed. + */ +static irqreturn_t soctherm_thermal_isr_thread(int irq, void *dev_id) +{ + struct tegra_soctherm *ts = dev_id; + struct thermal_zone_device *tz; + u32 st, ex = 0, cp = 0, gp = 0, pl = 0, me = 0; + + st = readl(ts->regs + THERMCTL_INTR_STATUS); + + /* deliberately clear expected interrupts handled in SW */ + cp |= st & TH_INTR_CD0_MASK; + cp |= st & TH_INTR_CU0_MASK; + + gp |= st & TH_INTR_GD0_MASK; + gp |= st & TH_INTR_GU0_MASK; + + pl |= st & TH_INTR_PD0_MASK; + pl |= st & TH_INTR_PU0_MASK; + + me |= st & TH_INTR_MD0_MASK; + me |= st & TH_INTR_MU0_MASK; + + ex |= cp | gp | pl | me; + if (ex) { + writel(ex, ts->regs + THERMCTL_INTR_STATUS); + st &= ~ex; + + if (cp) { + tz = ts->thermctl_tzs[TEGRA124_SOCTHERM_SENSOR_CPU]; + thermal_zone_device_update(tz, + THERMAL_EVENT_UNSPECIFIED); + } + + if (gp) { + tz = ts->thermctl_tzs[TEGRA124_SOCTHERM_SENSOR_GPU]; + thermal_zone_device_update(tz, + THERMAL_EVENT_UNSPECIFIED); + } + + if (pl) { + tz = ts->thermctl_tzs[TEGRA124_SOCTHERM_SENSOR_PLLX]; + thermal_zone_device_update(tz, + THERMAL_EVENT_UNSPECIFIED); + } + + if (me) { + tz = ts->thermctl_tzs[TEGRA124_SOCTHERM_SENSOR_MEM]; + thermal_zone_device_update(tz, + THERMAL_EVENT_UNSPECIFIED); + } + } + + /* deliberately ignore expected interrupts NOT handled in SW */ + ex |= TH_INTR_IGNORE_MASK; + st &= ~ex; + + if (st) { + /* Whine about any other unexpected INTR bits still set */ + pr_err("soctherm: Ignored unexpected INTRs 0x%08x\n", st); + writel(st, ts->regs + THERMCTL_INTR_STATUS); + } + + return IRQ_HANDLED; +} + #ifdef CONFIG_DEBUG_FS static int regs_show(struct seq_file *s, void *data) { @@ -1312,6 +1420,32 @@ static void tegra_soctherm_throttle(struct devic
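The bit handling in soctherm_thermal_isr_thread() can be sketched outside the kernel as follows. This is a minimal model under stated assumptions: the mask values are copied from the patch, but struct thermal_intr_split and split_thermal_status() are hypothetical names, and the sketch only classifies a status word; it does not touch registers or update thermal zones.

```c
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Mask values copied from the patch. */
#define TH_INTR_MD0_MASK    BIT(25)
#define TH_INTR_MU0_MASK    BIT(24)
#define TH_INTR_GD0_MASK    BIT(17)
#define TH_INTR_GU0_MASK    BIT(16)
#define TH_INTR_CD0_MASK    BIT(9)
#define TH_INTR_CU0_MASK    BIT(8)
#define TH_INTR_PD0_MASK    BIT(1)
#define TH_INTR_PU0_MASK    BIT(0)
#define TH_INTR_IGNORE_MASK UINT32_C(0xFCFCFCFC)

/* Hypothetical container for the result; not a driver structure. */
struct thermal_intr_split {
    uint32_t cpu, gpu, pll, mem; /* per-zone level-0 up/down bits */
    uint32_t handled;            /* bits the thread would write back to clear */
    uint32_t unexpected;         /* bits neither handled nor ignored */
};

static struct thermal_intr_split split_thermal_status(uint32_t st)
{
    struct thermal_intr_split s;

    s.cpu = st & (TH_INTR_CD0_MASK | TH_INTR_CU0_MASK);
    s.gpu = st & (TH_INTR_GD0_MASK | TH_INTR_GU0_MASK);
    s.pll = st & (TH_INTR_PD0_MASK | TH_INTR_PU0_MASK);
    s.mem = st & (TH_INTR_MD0_MASK | TH_INTR_MU0_MASK);

    s.handled = s.cpu | s.gpu | s.pll | s.mem;

    /* Same effect as the two "st &= ~ex" steps in the thread function. */
    s.unexpected = st & ~(s.handled | TH_INTR_IGNORE_MASK);
    return s;
}
```

One observation that falls out of the arithmetic: with the masks as quoted, the eight handled bits (0x03030303) and TH_INTR_IGNORE_MASK (0xFCFCFCFC) together cover all 32 bits, so the "unexpected INTRs" branch would not be reached for any status value.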
[PATCH v1 03/12] arm64: dts: tegra210: set thermtrip
Set the "nvidia,thermtrips" property, which is used to set the HW shutdown temperatures. Signed-off-by: Wei Ni --- arch/arm64/boot/dts/nvidia/tegra210.dtsi | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi b/arch/arm64/boot/dts/nvidia/tegra210.dtsi index 2205d66b0443..36c7dce7fa69 100644 --- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi +++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi @@ -1332,6 +1332,9 @@ reset-names = "soctherm"; #thermal-sensor-cells = <1>; + nvidia,thermtrips = ; + throttle-cfgs { throttle_heavy: heavy { nvidia,priority = <100>; @@ -1351,8 +1354,8 @@ <&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>; trips { - cpu-shutdown-trip { - temperature = <102500>; + cpu-critical-trip { + temperature = <102000>; hysteresis = <0>; type = "critical"; }; @@ -1379,7 +1382,7 @@ <&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>; trips { - mem-shutdown-trip { + mem-critical-trip { temperature = <103000>; hysteresis = <0>; type = "critical"; @@ -1401,8 +1404,8 @@ <&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>; trips { - gpu-shutdown-trip { - temperature = <103000>; + gpu-critical-trip { + temperature = <102500>; hysteresis = <0>; type = "critical"; }; @@ -1429,7 +1432,7 @@ <&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>; trips { - pllx-shutdown-trip { + pllx-critical-trip { temperature = <103000>; hysteresis = <0>; type = "critical"; -- 2.7.4
[PATCH v1 01/12] of: Add bindings of thermtrip for Tegra soctherm
Add optional property "nvidia,thermtrips". If present, these trips will be used as HW shutdown trips, and critical trips will be used as SW shutdown trips. Signed-off-by: Wei Ni --- .../bindings/thermal/nvidia,tegra124-soctherm.txt| 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt index b6c0ae53d4dc..ab66d6feab4b 100644 --- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt +++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt @@ -55,10 +55,21 @@ Required properties : - #cooling-cells: Should be 1. This cooling device only support on/off state. See ./thermal.txt for a description of this property. +Optional properties: +- nvidia,thermtrips : When present, this property specifies the temperature at + which the soctherm hardware will assert the thermal trigger signal to the + Power Management IC, which can be configured to reset or shutdown the device. + It is an array of pairs where each pair represents a tsensor id followed by a + temperature in milli Celcius. In the absence of this property the critical + trip point will be used for thermtrip temperature. + Note: -- the "critical" type trip points will be set to SOC_THERM hardware as the -shut down temperature. Once the temperature of this thermal zone is higher -than it, the system will be shutdown or reset by hardware. +- the "critical" type trip points will be used to set the temperature at which +the SOC_THERM hardware will assert a thermal trigger if the "nvidia,thermtrips" +property is missing. When the thermtrips property is present, the breach of a +critical trip point is reported back to the thermal framework to implement +software shutdown. + - the "hot" type trip points will be set to SOC_THERM hardware as the throttle temperature. 
Once the the temperature of this thermal zone is higher than it, it will trigger the HW throttle event. @@ -79,6 +90,9 @@ Example : #thermal-sensor-cells = <1>; + nvidia,thermtrips = ; + throttle-cfgs { /* * When the "heavy" cooling device triggered, -- 2.7.4
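The pair layout described in the binding (a tsensor id followed by a temperature in millidegrees Celsius, with the critical trip used when no entry is present) can be sketched in plain C as below. This is a userspace illustration, not the driver's parser: struct thermtrip, thermtrips_decode() and thermtrip_for() are hypothetical names, and the caller supplies the fallback temperature.

```c
#include <stddef.h>

/* One decoded <id temperature> pair. Hypothetical type for illustration. */
struct thermtrip {
    int id;      /* tsensor group id */
    int temp_mc; /* temperature in millidegrees Celsius */
};

/*
 * Decode a flat cell array <id0 temp0 id1 temp1 ...> into pairs.
 * Returns the number of pairs, or -1 if the cell count is odd or the
 * output table (max_out entries) is too small.
 */
static int thermtrips_decode(const int *cells, size_t n_cells,
                             struct thermtrip *out, size_t max_out)
{
    size_t i, n_pairs = n_cells / 2;

    if (n_cells % 2 != 0 || n_pairs > max_out)
        return -1;

    for (i = 0; i < n_pairs; i++) {
        out[i].id = cells[2 * i];
        out[i].temp_mc = cells[2 * i + 1];
    }
    return (int)n_pairs;
}

/*
 * Return the thermtrip temperature for sensor `id`, or `fallback_mc`
 * (for example the zone's critical trip) when no pair names that sensor.
 */
static int thermtrip_for(const struct thermtrip *tt, size_t n,
                         int id, int fallback_mc)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (tt[i].id == id)
            return tt[i].temp_mc;
    return fallback_mc;
}
```

The fallback argument mirrors the binding text: a sensor without a thermtrips entry keeps using its critical trip point as the hardware shutdown temperature.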
[PATCH v1 02/12] thermal: tegra: support hw and sw shutdown
Currently the critical trip points in thermal framework are the only way to specify a temperature at which HW should shutdown. This is insufficient for certain platforms which would want an orderly software shutdown in addition to HW shutdown. This change support to parse "nvidia, thermtrips" property, it allows soctherm DT to specify thermtrip temperatures so that critical trip points framework can be used for doing software shutdown. Signed-off-by: Wei Ni --- drivers/thermal/tegra/soctherm.c | 99 ++- drivers/thermal/tegra/soctherm.h | 6 ++ drivers/thermal/tegra/tegra210-soctherm.c | 8 +++ 3 files changed, 98 insertions(+), 15 deletions(-) diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c index ed28110a3535..673c3ffa9001 100644 --- a/drivers/thermal/tegra/soctherm.c +++ b/drivers/thermal/tegra/soctherm.c @@ -446,6 +446,24 @@ find_throttle_cfg_by_name(struct tegra_soctherm *ts, const char *name) return NULL; } +static int tsensor_group_thermtrip_get(struct tegra_soctherm *ts, int id) +{ + int i, temp = min_low_temp; + struct tsensor_group_thermtrips *tt = ts->soc->thermtrips; + + if (id >= TEGRA124_SOCTHERM_SENSOR_NUM) + return temp; + + if (tt) { + for (i = 0; i < ts->soc->num_ttgs; i++) { + if (tt[i].id == id) + return tt[i].temp; + } + } + + return temp; +} + static int tegra_thermctl_set_trip_temp(void *data, int trip, int temp) { struct tegra_thermctl_zone *zone = data; @@ -464,7 +482,16 @@ static int tegra_thermctl_set_trip_temp(void *data, int trip, int temp) return ret; if (type == THERMAL_TRIP_CRITICAL) { - return thermtrip_program(dev, sg, temp); + /* +* If thermtrips property is set in DT, +* doesn't need to program critical type trip to HW, +* if not, program critical trip to HW. 
+*/ + if (min_low_temp == tsensor_group_thermtrip_get(ts, sg->id)) + return thermtrip_program(dev, sg, temp); + else + return 0; + } else if (type == THERMAL_TRIP_HOT) { int i; @@ -523,7 +550,8 @@ static int get_hot_temp(struct thermal_zone_device *tz, int *trip, int *temp) * @dev: struct device * of the SOC_THERM instance * * Configure the SOC_THERM HW trip points, setting "THERMTRIP" - * "THROTTLE" trip points , using "critical" or "hot" type trip_temp + * "THROTTLE" trip points , using "thermtrips", "critical" or "hot" + * type trip_temp * from thermal zone. * After they have been configured, THERMTRIP or THROTTLE will take * action when the configured SoC thermal sensor group reaches a @@ -545,28 +573,23 @@ static int tegra_soctherm_set_hwtrips(struct device *dev, { struct tegra_soctherm *ts = dev_get_drvdata(dev); struct soctherm_throt_cfg *stc; - int i, trip, temperature; - int ret; + int i, trip, temperature, ret; - ret = tz->ops->get_crit_temp(tz, &temperature); - if (ret) { - dev_warn(dev, "thermtrip: %s: missing critical temperature\n", -sg->name); - goto set_throttle; - } + /* Get thermtrips. If missing, try to get critical trips. 
*/ + temperature = tsensor_group_thermtrip_get(ts, sg->id); + if (min_low_temp == temperature) + if (tz->ops->get_crit_temp(tz, &temperature)) + temperature = max_high_temp; ret = thermtrip_program(dev, sg, temperature); if (ret) { - dev_err(dev, "thermtrip: %s: error during enable\n", - sg->name); + dev_err(dev, "thermtrip: %s: error during enable\n", sg->name); return ret; } - dev_info(dev, -"thermtrip: will shut down when %s reaches %d mC\n", + dev_info(dev, "thermtrip: will shut down when %s reaches %d mC\n", sg->name, temperature); -set_throttle: ret = get_hot_temp(tz, &trip, &temperature); if (ret) { dev_warn(dev, "throttrip: %s: missing hot temperature\n", @@ -907,6 +930,50 @@ static const struct thermal_cooling_device_ops throt_cooling_ops = { .set_cur_state = throt_set_cdev_state, }; +static int soctherm_thermtrips_parse(struct platform_device *pdev) +{ + struct device *dev = &pdev->dev; + struct tegra_soctherm *ts = dev_get_drvdata(dev); + struct tsensor_group_thermtrips *tt = ts->soc->thermtrips; + const int max_num_prop = ts->soc->num_ttgs * 2; + u32 *tlb; + int i, j, n, ret; + + if (!tt) + return -ENOMEM; + + n = of_property_count_u32_elems(dev->of_node, "nvidia,thermtrips"); + if (n <= 0) { +
[PATCH v1 09/12] thermal: tegra: add support for EDP IRQ
Add support to generate OC (over-current) interrupts to indicate the OC event and print out alarm messages. Signed-off-by: Wei Ni --- drivers/thermal/tegra/soctherm.c | 420 +++ 1 file changed, 420 insertions(+) diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c index eefbb29b3b7d..37108f2290f9 100644 --- a/drivers/thermal/tegra/soctherm.c +++ b/drivers/thermal/tegra/soctherm.c @@ -23,6 +23,8 @@ #include #include #include +#include +#include #include #include #include @@ -104,6 +106,16 @@ #define STATS_CTL_CLR_UP 0x2 #define STATS_CTL_EN_UP0x1 +#define OC_INTR_STATUS 0x39c +#define OC_INTR_ENABLE 0x3a0 +#define OC_INTR_DISABLE0x3a4 + +#define OC_INTR_OC1_MASK BIT(0) +#define OC_INTR_OC2_MASK BIT(1) +#define OC_INTR_OC3_MASK BIT(2) +#define OC_INTR_OC4_MASK BIT(3) +#define OC_INTR_OC5_MASK BIT(4) + #define THROT_GLOBAL_CFG 0x400 #define THROT_GLOBAL_ENB_MASK BIT(0) @@ -212,9 +224,23 @@ static const int max_high_temp = 127000; enum soctherm_throttle_id { THROTTLE_LIGHT = 0, THROTTLE_HEAVY, + THROTTLE_OC1, + THROTTLE_OC2, + THROTTLE_OC3, + THROTTLE_OC4, + THROTTLE_OC5, /* OC5 is reserved */ THROTTLE_SIZE, }; +enum soctherm_oc_irq_id { + TEGRA_SOC_OC_IRQ_1, + TEGRA_SOC_OC_IRQ_2, + TEGRA_SOC_OC_IRQ_3, + TEGRA_SOC_OC_IRQ_4, + TEGRA_SOC_OC_IRQ_5, + TEGRA_SOC_OC_IRQ_MAX, +}; + enum soctherm_throttle_dev_id { THROTTLE_DEV_CPU = 0, THROTTLE_DEV_GPU, @@ -224,6 +250,11 @@ enum soctherm_throttle_dev_id { static const char *const throt_names[] = { [THROTTLE_LIGHT] = "light", [THROTTLE_HEAVY] = "heavy", + [THROTTLE_OC1] = "oc1", + [THROTTLE_OC2] = "oc2", + [THROTTLE_OC3] = "oc3", + [THROTTLE_OC4] = "oc4", + [THROTTLE_OC5] = "oc5", }; struct tegra_soctherm; @@ -255,6 +286,7 @@ struct tegra_soctherm { void __iomem *ccroc_regs; int thermal_irq; + int edp_irq; u32 *calib; struct thermal_zone_device **thermctl_tzs; @@ -267,6 +299,15 @@ struct tegra_soctherm { struct mutex thermctl_lock; }; +struct soctherm_oc_irq_chip_data { + struct mutexirq_lock; /* 
serialize OC IRQs */ + struct irq_chip irq_chip; + struct irq_domain *domain; + int irq_enable; +}; + +static struct soctherm_oc_irq_chip_data soc_irq_cdata; + /** * ccroc_writel() - writes a value to a CCROC register * @ts: pointer to a struct tegra_soctherm @@ -807,6 +848,360 @@ static irqreturn_t soctherm_thermal_isr_thread(int irq, void *dev_id) return IRQ_HANDLED; } +/** + * soctherm_oc_intr_enable() - Enables the soctherm over-current interrupt + * @alarm: The soctherm throttle id + * @enable:Flag indicating enable the soctherm over-current + * interrupt or disable it + * + * Enables a specific over-current pins @alarm to raise an interrupt if the flag + * is set and the alarm corresponds to OC1, OC2, OC3, or OC4. + */ +static void soctherm_oc_intr_enable(struct tegra_soctherm *ts, + enum soctherm_throttle_id alarm, + bool enable) +{ + u32 r; + + if (!enable) + return; + + r = readl(ts->regs + OC_INTR_ENABLE); + switch (alarm) { + case THROTTLE_OC1: + r = REG_SET_MASK(r, OC_INTR_OC1_MASK, 1); + break; + case THROTTLE_OC2: + r = REG_SET_MASK(r, OC_INTR_OC2_MASK, 1); + break; + case THROTTLE_OC3: + r = REG_SET_MASK(r, OC_INTR_OC3_MASK, 1); + break; + case THROTTLE_OC4: + r = REG_SET_MASK(r, OC_INTR_OC4_MASK, 1); + break; + default: + r = 0; + break; + } + writel(r, ts->regs + OC_INTR_ENABLE); +} + +/** + * soctherm_handle_alarm() - Handles soctherm alarms + * @alarm: The soctherm throttle id + * + * "Handles" over-current alarms (OC1, OC2, OC3, and OC4) by printing + * a warning or informative message. + * + * Return: -EINVAL for @alarm = THROTTLE_OC3, otherwise 0 (success). + */ +static int soctherm_handle_alarm(enum soctherm_throttle_id alarm) +{ + int rv = -EINVAL; + + switch (alarm) { + case THROTTLE_OC1: + pr_debug("soctherm: Successfully handled OC1 alarm\n"); + rv = 0; + break; + + case THROTTLE_OC2: + pr_debug("soctherm: Successfully handled OC2 alarm\n"); + rv =
[PATCH v1 08/12] thermal: tegra: add set_trips functionality
Implement set_trips ops to set passive trip points. Signed-off-by: Wei Ni --- drivers/thermal/tegra/soctherm.c | 64 ++- drivers/thermal/tegra/soctherm.h | 10 + drivers/thermal/tegra/tegra124-soctherm.c | 7 +++- drivers/thermal/tegra/tegra132-soctherm.c | 7 +++- drivers/thermal/tegra/tegra210-soctherm.c | 7 +++- 5 files changed, 90 insertions(+), 5 deletions(-) diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c index c66fdd546ef0..eefbb29b3b7d 100644 --- a/drivers/thermal/tegra/soctherm.c +++ b/drivers/thermal/tegra/soctherm.c @@ -87,8 +87,6 @@ #define THERMCTL_LVL0_DN_STATS 0x14 #define THERMCTL_INTR_STATUS 0x84 -#define THERMCTL_INTR_ENABLE 0x88 -#define THERMCTL_INTR_DISABLE 0x8c #define TH_INTR_MD0_MASK BIT(25) #define TH_INTR_MU0_MASK BIT(24) @@ -265,6 +263,8 @@ struct tegra_soctherm { struct soctherm_throt_cfg throt_cfgs[THROTTLE_SIZE]; struct dentry *debugfs_dir; + + struct mutex thermctl_lock; }; /** @@ -542,9 +542,59 @@ static int tegra_thermctl_set_trip_temp(void *data, int trip, int temp) return 0; } +static void thermal_irq_enable(struct tegra_thermctl_zone *zn) +{ + u32 r; + + /* multiple zones could be handling and setting trips at once */ + mutex_lock(&zn->ts->thermctl_lock); + r = readl(zn->ts->regs + THERMCTL_INTR_ENABLE); + r = REG_SET_MASK(r, zn->sg->thermctl_isr_mask, TH_INTR_UP_DN_EN); + writel(r, zn->ts->regs + THERMCTL_INTR_ENABLE); + mutex_unlock(&zn->ts->thermctl_lock); +} + +static void thermal_irq_disable(struct tegra_thermctl_zone *zn) +{ + u32 r; + + /* multiple zones could be handling and setting trips at once */ + mutex_lock(&zn->ts->thermctl_lock); + r = readl(zn->ts->regs + THERMCTL_INTR_DISABLE); + r = REG_SET_MASK(r, zn->sg->thermctl_isr_mask, 0); + writel(r, zn->ts->regs + THERMCTL_INTR_DISABLE); + mutex_unlock(&zn->ts->thermctl_lock); +} + +static int tegra_thermctl_set_trips(void *data, int lo, int hi) +{ + struct tegra_thermctl_zone *zone = data; + u32 r; + + thermal_irq_disable(zone); + + r = 
readl(zone->ts->regs + zone->sg->thermctl_lvl0_offset); + r = REG_SET_MASK(r, THERMCTL_LVL0_CPU0_EN_MASK, 0); + writel(r, zone->ts->regs + zone->sg->thermctl_lvl0_offset); + + lo = enforce_temp_range(zone->dev, lo) / zone->ts->soc->thresh_grain; + hi = enforce_temp_range(zone->dev, hi) / zone->ts->soc->thresh_grain; + dev_dbg(zone->dev, "%s hi:%d, lo:%d\n", __func__, hi, lo); + + r = REG_SET_MASK(r, zone->sg->thermctl_lvl0_up_thresh_mask, hi); + r = REG_SET_MASK(r, zone->sg->thermctl_lvl0_dn_thresh_mask, lo); + r = REG_SET_MASK(r, THERMCTL_LVL0_CPU0_EN_MASK, 1); + writel(r, zone->ts->regs + zone->sg->thermctl_lvl0_offset); + + thermal_irq_enable(zone); + + return 0; +} + static const struct thermal_zone_of_device_ops tegra_of_thermal_ops = { .get_temp = tegra_thermctl_get_temp, .set_trip_temp = tegra_thermctl_set_trip_temp, + .set_trips = tegra_thermctl_set_trips, }; static int get_hot_temp(struct thermal_zone_device *tz, int *trip, int *temp) @@ -661,6 +711,15 @@ static irqreturn_t soctherm_thermal_isr(int irq, void *dev_id) struct tegra_soctherm *ts = dev_id; u32 r; + /* Case for no lock: +* Although interrupts are enabled in set_trips, there is still no need +* to lock here because the interrupts are disabled before programming +* new trip points. Hence there cant be a interrupt on the same sensor. +* An interrupt can however occur on a sensor while trips are being +* programmed on a different one. This beign a LEVEL interrupt won't +* cause a new interrupt but this is taken care of by the re-reading of +* the STATUS register in the thread function. 
+*/ r = readl(ts->regs + THERMCTL_INTR_STATUS); writel(r, ts->regs + THERMCTL_INTR_DISABLE); @@ -1523,6 +1582,7 @@ static int tegra_soctherm_probe(struct platform_device *pdev) if (!tegra) return -ENOMEM; + mutex_init(&tegra->thermctl_lock); dev_set_drvdata(&pdev->dev, tegra); tegra->soc = soc; diff --git a/drivers/thermal/tegra/soctherm.h b/drivers/thermal/tegra/soctherm.h index c05c7e37e968..70501e73d586 100644 --- a/drivers/thermal/tegra/soctherm.h +++ b/drivers/thermal/tegra/soctherm.h @@ -1,3 +1,4 @@ +/* SPDX-License-Identifier: GPL-2.0 */ /* * Copyright (c) 2014-2016, NVIDIA CORPORATION. All rights reserved. * @@ -29,6 +30,14 @@ #define THERMCTL_THERMTRIP_CTL 0x80 /* BITs are defined in device file */ +#define THERMCTL_INTR_ENABLE 0x88 +#define THERMCTL_INTR_DISABLE
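As a rough illustration of what tegra_thermctl_set_trips() computes, the sketch below clamps the two temperatures, scales them by a threshold grain and packs them into a register image. Several things here are assumptions rather than values from the patch: set_field() stands in for the driver's REG_SET_MASK helper (not shown in this excerpt) and is assumed to be a conventional masked field update; the field masks, the enable bit and the 1000 mC grain are hypothetical placeholders for the per-SoC values. Only the +/-127000 mC range comes from the driver source quoted elsewhere in this series.

```c
#include <stdint.h>

/* Range limits as defined in soctherm.c (min_low_temp / max_high_temp). */
#define MIN_LOW_TEMP_MC  (-127000)
#define MAX_HIGH_TEMP_MC 127000

/* Hypothetical per-SoC values, chosen only for this example. */
#define THRESH_GRAIN_MC  1000
#define LVL0_UP_MASK     (UINT32_C(0xff) << 17)
#define LVL0_DN_MASK     (UINT32_C(0xff) << 9)
#define LVL0_EN_MASK     (UINT32_C(1) << 8)

static int clamp_temp(int mc)
{
    if (mc < MIN_LOW_TEMP_MC)
        return MIN_LOW_TEMP_MC;
    if (mc > MAX_HIGH_TEMP_MC)
        return MAX_HIGH_TEMP_MC;
    return mc;
}

/* Replace the bits selected by `mask` with `val`, shifted into place. */
static uint32_t set_field(uint32_t reg, uint32_t mask, uint32_t val)
{
    unsigned int shift = 0;

    if (mask == 0)
        return reg;
    while (!((mask >> shift) & 1u))
        shift++;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Final register image after programming both thresholds and enabling. */
static uint32_t pack_trips(uint32_t reg, int lo_mc, int hi_mc)
{
    int lo = clamp_temp(lo_mc) / THRESH_GRAIN_MC;
    int hi = clamp_temp(hi_mc) / THRESH_GRAIN_MC;

    reg = set_field(reg, LVL0_UP_MASK, (uint32_t)hi);
    reg = set_field(reg, LVL0_DN_MASK, (uint32_t)lo);
    reg = set_field(reg, LVL0_EN_MASK, 1);
    return reg;
}
```

The sketch models only the final register value. The patch additionally disables the zone's interrupt and clears the enable bit before reprogramming, then re-enables both afterwards, which is the ordering the "Case for no lock" comment relies on.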
[PATCH v1 10/12] arm64: dts: tegra210: set EDP interrupt line
Set EDP interrupt line. Signed-off-by: Wei Ni --- arch/arm64/boot/dts/nvidia/tegra210.dtsi | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi b/arch/arm64/boot/dts/nvidia/tegra210.dtsi index 57dae9cc7b7d..b4531a8c18f6 100644 --- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi +++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi @@ -1324,7 +1324,8 @@ reg = <0x0 0x700e2000 0x0 0x600 /* SOC_THERM reg_base */ 0x0 0x60006000 0x0 0x400>; /* CAR reg_base */ reg-names = "soctherm-reg", "car-reg"; - interrupts = ; + interrupts = ; clocks = <&tegra_car TEGRA210_CLK_TSENSOR>, <&tegra_car TEGRA210_CLK_SOC_THERM>; clock-names = "tsensor", "soctherm"; -- 2.7.4
[PATCH v1 04/12] of: Add bindings of gpu hw throttle for Tegra soctherm
Add "nvidia,gpu-throt-level" property to set gpu hw throttle level. Signed-off-by: Wei Ni --- .../bindings/thermal/nvidia,tegra124-soctherm.txt | 17 +++-- include/dt-bindings/thermal/tegra124-soctherm.h| 22 ++ 2 files changed, 33 insertions(+), 6 deletions(-) diff --git a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt index ab66d6feab4b..cf6d0be56b7a 100644 --- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt +++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt @@ -52,6 +52,15 @@ Required properties : Must set as following values: TEGRA_SOCTHERM_THROT_LEVEL_LOW, TEGRA_SOCTHERM_THROT_LEVEL_MED TEGRA_SOCTHERM_THROT_LEVEL_HIGH, TEGRA_SOCTHERM_THROT_LEVEL_NONE + - nvidia,gpu-throt-level: This property is for Tegra124 and Tegra210. +It is the level of pulse skippers, which used to throttle clock +frequencies. It indicates gpu clock throttling depth and can be +programmed to any of the following values which represent a throttling +percentage: +TEGRA_SOCTHERM_THROT_LEVEL_NONE (0%) +TEGRA_SOCTHERM_THROT_LEVEL_LOW (50%), +TEGRA_SOCTHERM_THROT_LEVEL_MED (75%), +TEGRA_SOCTHERM_THROT_LEVEL_HIGH (85%). - #cooling-cells: Should be 1. This cooling device only support on/off state. See ./thermal.txt for a description of this property. 
@@ -96,22 +105,26 @@ Example : throttle-cfgs { /* * When the "heavy" cooling device triggered, -* the HW will skip cpu clock's pulse in 85% depth +* the HW will skip cpu clock's pulse in 85% depth, +* skip gpu clock's pulse in 85% level */ throttle_heavy: heavy { nvidia,priority = <100>; nvidia,cpu-throt-percent = <85>; + nvidia,gpu-throt-level = ; #cooling-cells = <1>; }; /* * When the "light" cooling device triggered, -* the HW will skip cpu clock's pulse in 50% depth +* the HW will skip cpu clock's pulse in 50% depth, +* skip gpu clock's pulse in 50% level */ throttle_light: light { nvidia,priority = <80>; nvidia,cpu-throt-percent = <50>; + nvidia,gpu-throt-level = ; #cooling-cells = <1>; }; diff --git a/include/dt-bindings/thermal/tegra124-soctherm.h b/include/dt-bindings/thermal/tegra124-soctherm.h index c15e8b709a0d..75853df1c609 100644 --- a/include/dt-bindings/thermal/tegra124-soctherm.h +++ b/include/dt-bindings/thermal/tegra124-soctherm.h @@ -1,5 +1,19 @@ /* SPDX-License-Identifier: GPL-2.0 */ /* + * Copyright (c) 2014 - 2018, NVIDIA CORPORATION. All rights reserved. + * + * Author: + * Mikko Perttunen + * + * This software is licensed under the terms of the GNU General Public + * License version 2, as published by the Free Software Foundation, and + * may be copied, distributed, and modified under those terms. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * * This header provides constants for binding nvidia,tegra124-soctherm. 
*/ @@ -12,9 +26,9 @@ #define TEGRA124_SOCTHERM_SENSOR_PLLX 3 #define TEGRA124_SOCTHERM_SENSOR_NUM 4 -#define TEGRA_SOCTHERM_THROT_LEVEL_LOW 0 -#define TEGRA_SOCTHERM_THROT_LEVEL_MED 1 -#define TEGRA_SOCTHERM_THROT_LEVEL_HIGH 2 -#define TEGRA_SOCTHERM_THROT_LEVEL_NONE -1 +#define TEGRA_SOCTHERM_THROT_LEVEL_NONE 0 +#define TEGRA_SOCTHERM_THROT_LEVEL_LOW 1 +#define TEGRA_SOCTHERM_THROT_LEVEL_MED 2 +#define TEGRA_SOCTHERM_THROT_LEVEL_HIGH 3 #endif -- 2.7.4
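The level-to-percentage table from the binding text can be restated as a small lookup. The constants are the renumbered values from the header change in this patch; throt_level_percent() is a hypothetical helper for illustration only, since the driver programs pulse-skipper settings rather than a percentage.

```c
/* Values as redefined by the header change in this patch. */
#define TEGRA_SOCTHERM_THROT_LEVEL_NONE 0
#define TEGRA_SOCTHERM_THROT_LEVEL_LOW  1
#define TEGRA_SOCTHERM_THROT_LEVEL_MED  2
#define TEGRA_SOCTHERM_THROT_LEVEL_HIGH 3

/*
 * Throttling percentage documented for each level in the binding,
 * or -1 for a value that is not one of the four defined levels.
 */
static int throt_level_percent(int level)
{
    switch (level) {
    case TEGRA_SOCTHERM_THROT_LEVEL_NONE:
        return 0;
    case TEGRA_SOCTHERM_THROT_LEVEL_LOW:
        return 50;
    case TEGRA_SOCTHERM_THROT_LEVEL_MED:
        return 75;
    case TEGRA_SOCTHERM_THROT_LEVEL_HIGH:
        return 85;
    default:
        return -1;
    }
}
```

Note that the renumbering changes the meaning of existing raw values: 0 was LOW before this patch and is NONE after it, and NONE is no longer -1.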
[PATCH v1 00/12] Add some functionalities for Tegra soctherm
Move the hw/sw shutdown patches into this series. There has already been
some discussion of them at https://lkml.org/lkml/2018/12/7/225.

Support GPU HW throttle, thermal IRQ, set_trips(), EDP IRQ and OC hw throttle.

Wei Ni (12):
  of: Add bindings of thermtrip for Tegra soctherm
  thermal: tegra: support hw and sw shutdown
  arm64: dts: tegra210: set thermtrip
  of: Add bindings of gpu hw throttle for Tegra soctherm
  thermal: tegra: add support for gpu hw-throttle
  arm64: dts: tegra210: set gpu hw throttle level
  thermal: tegra: add support for thermal IRQ
  thermal: tegra: add set_trips functionality
  thermal: tegra: add support for EDP IRQ
  arm64: dts: tegra210: set EDP interrupt line
  of: Add bindings of OC hw throttle for Tegra soctherm
  thermal: tegra: enable OC hw throttle

 .../bindings/thermal/nvidia,tegra124-soctherm.txt | 63 +-
 arch/arm64/boot/dts/nvidia/tegra210.dtsi | 20 +-
 drivers/thermal/tegra/soctherm.c | 959 +++--
 drivers/thermal/tegra/soctherm.h | 16 +
 drivers/thermal/tegra/tegra124-soctherm.c | 7 +-
 drivers/thermal/tegra/tegra132-soctherm.c | 7 +-
 drivers/thermal/tegra/tegra210-soctherm.c | 15 +-
 include/dt-bindings/thermal/tegra124-soctherm.h | 22 +-
 8 files changed, 1033 insertions(+), 76 deletions(-)

--
2.7.4
[PATCH v1 12/12] thermal: tegra: enable OC hw throttle
Parse Over Current settings from DT and program them to generate interrupts. Also enable hw throttling whenever there are OC events. Log the OC events as debug messages. Signed-off-by: Wei Ni --- drivers/thermal/tegra/soctherm.c | 128 --- 1 file changed, 118 insertions(+), 10 deletions(-) diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c index 37108f2290f9..c2a0b048a085 100644 --- a/drivers/thermal/tegra/soctherm.c +++ b/drivers/thermal/tegra/soctherm.c @@ -106,9 +106,26 @@ #define STATS_CTL_CLR_UP 0x2 #define STATS_CTL_EN_UP0x1 +#define OC1_CFG0x310 +#define OC1_CFG_LONG_LATENCY_MASK BIT(6) +#define OC1_CFG_HW_RESTORE_MASKBIT(5) +#define OC1_CFG_PWR_GOOD_MASK_MASK BIT(4) +#define OC1_CFG_THROTTLE_MODE_MASK (0x3 << 2) +#define OC1_CFG_ALARM_POLARITY_MASKBIT(1) +#define OC1_CFG_EN_THROTTLE_MASK BIT(0) + +#define OC1_CNT_THRESHOLD 0x314 +#define OC1_THROTTLE_PERIOD0x318 +#define OC1_ALARM_COUNT0x31c +#define OC1_FILTER 0x320 +#define OC1_STATS 0x3a8 + #define OC_INTR_STATUS 0x39c #define OC_INTR_ENABLE 0x3a0 #define OC_INTR_DISABLE0x3a4 +#define OC_STATS_CTL 0x3c4 +#define OC_STATS_CTL_CLR_ALL 0x2 +#define OC_STATS_CTL_EN_ALL0x1 #define OC_INTR_OC1_MASK BIT(0) #define OC_INTR_OC2_MASK BIT(1) @@ -207,6 +224,25 @@ #define THROT_DELAY_CTRL(throt)(THROT_DELAY_LITE + \ (THROT_OFFSET * throt)) +#define ALARM_OFFSET 0x14 +#define ALARM_CFG(throt) (OC1_CFG + \ + (ALARM_OFFSET * (throt - THROTTLE_OC1))) + +#define ALARM_CNT_THRESHOLD(throt) (OC1_CNT_THRESHOLD + \ + (ALARM_OFFSET * (throt - THROTTLE_OC1))) + +#define ALARM_THROTTLE_PERIOD(throt) (OC1_THROTTLE_PERIOD + \ + (ALARM_OFFSET * (throt - THROTTLE_OC1))) + +#define ALARM_ALARM_COUNT(throt) (OC1_ALARM_COUNT + \ + (ALARM_OFFSET * (throt - THROTTLE_OC1))) + +#define ALARM_FILTER(throt)(OC1_FILTER + \ + (ALARM_OFFSET * (throt - THROTTLE_OC1))) + +#define ALARM_STATS(throt) (OC1_STATS + \ + (4 * (throt - THROTTLE_OC1))) + /* get CCROC_THROT_PSKIP_xxx offset per HIGH/MED/LOW vect*/ #define 
CCROC_THROT_OFFSET 0x0c #define CCROC_THROT_PSKIP_CTRL_CPU_REG(vect)(CCROC_THROT_PSKIP_CTRL_CPU + \ @@ -218,6 +254,9 @@ #define THERMCTL_LVL_REGS_SIZE 0x20 #define THERMCTL_LVL_REG(rg, lv) ((rg) + ((lv) * THERMCTL_LVL_REGS_SIZE)) +#define OC_THROTTLE_MODE_DISABLED 0 +#define OC_THROTTLE_MODE_BRIEF 2 + static const int min_low_temp = -127000; static const int max_high_temp = 127000; @@ -266,6 +305,15 @@ struct tegra_thermctl_zone { const struct tegra_tsensor_group *sg; }; +struct soctherm_oc_cfg { + u32 active_low; + u32 throt_period; + u32 alarm_cnt_thresh; + u32 alarm_filter; + u32 mode; + bool intr_en; +}; + struct soctherm_throt_cfg { const char *name; unsigned int id; @@ -273,6 +321,7 @@ struct soctherm_throt_cfg { u8 cpu_throt_level; u32 cpu_throt_depth; u32 gpu_throt_level; + struct soctherm_oc_cfg oc_cfg; struct thermal_cooling_device *cdev; bool init; }; @@ -715,7 +764,7 @@ static int tegra_soctherm_set_hwtrips(struct device *dev, return 0; } - for (i = 0; i < THROTTLE_SIZE; i++) { + for (i = 0; i < THROTTLE_OC1; i++) { struct thermal_cooling_device *cdev; if (!ts->throt_cfgs[i].init) @@ -1547,6 +1596,30 @@ static int soctherm_thermtrips_parse(struct platform_device *pdev) return 0; } +static void soctherm_oc_cfg_parse(struct device *dev, + struct device_node *np_oc, + struct soctherm_throt_cfg *stc) +{ + u32 val; + + if (!of_property_read_u32(np_oc, "nvidia,polarity-active-low", &val)) + stc->oc_cfg.active_low = val; + + if (!of_property_read_u32(np_oc, "nvidia,count-threshold", &val)) { + stc->oc_cfg.intr_en = 1; + stc->oc_cfg.alarm_cnt_thresh = val; + } + + if (!of_property_read_u32(np_oc, "nvidia,throttle-period", &val)) + stc->oc_cfg.throt_period = val;
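The ALARM_* macros in the patch above derive each over-current (OC) alarm register from the OC1 base plus a fixed stride. A minimal sketch of that arithmetic follows, written as ordinary C functions rather than the driver's macros. The register offsets and the 0x14 stride are copied from the patch; the `throt_id` enum ordering and the function names are assumptions made for illustration, since the excerpt does not show the enum.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets copied from the patch. */
#define OC1_CFG            0x310
#define OC1_CNT_THRESHOLD  0x314
#define OC1_STATS          0x3a8
#define ALARM_OFFSET       0x14

/*
 * ASSUMPTION: the excerpt does not show the throttle id enum. This
 * ordering is hypothetical and only chosen so that THROTTLE_OC1 is the
 * first over-current id.
 */
enum throt_id {
	THROTTLE_LIGHT,
	THROTTLE_HEAVY,
	THROTTLE_OC1,
	THROTTLE_OC2,
	THROTTLE_OC3,
};

/* Same arithmetic as ALARM_CFG(): OC1 base plus one stride per OC index. */
static inline uint32_t alarm_cfg(enum throt_id throt)
{
	assert(throt >= THROTTLE_OC1);
	return OC1_CFG + ALARM_OFFSET * (uint32_t)(throt - THROTTLE_OC1);
}

/* Same arithmetic as ALARM_CNT_THRESHOLD(). */
static inline uint32_t alarm_cnt_threshold(enum throt_id throt)
{
	assert(throt >= THROTTLE_OC1);
	return OC1_CNT_THRESHOLD + ALARM_OFFSET * (uint32_t)(throt - THROTTLE_OC1);
}

/* Same arithmetic as ALARM_STATS(): the stats registers use a stride of 4. */
static inline uint32_t alarm_stats(enum throt_id throt)
{
	assert(throt >= THROTTLE_OC1);
	return OC1_STATS + 4 * (uint32_t)(throt - THROTTLE_OC1);
}
```

Under these assumptions the OC2 config register lands at 0x310 + 0x14 = 0x324, while the OC2 stats register is at 0x3ac. Using functions also sidesteps the unparenthesized `throt` argument in the macros, which would misbehave if a caller passed an expression.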
[PATCH v1 11/12] of: Add bindings of OC hw throttle for Tegra soctherm
Add OC HW throttle configuration for soctherm in DT. It is used to describe the OCx throttle events. Signed-off-by: Wei Ni --- .../bindings/thermal/nvidia,tegra124-soctherm.txt | 26 ++ 1 file changed, 26 insertions(+) diff --git a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt index cf6d0be56b7a..d112a8e59ec3 100644 --- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt +++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt @@ -64,6 +64,21 @@ Required properties : - #cooling-cells: Should be 1. This cooling device only support on/off state. See ./thermal.txt for a description of this property. + Optional properties: The following properties are T210 specific and + valid only for OCx throttle events. + - nvidia,count-threshold: Specifies the number of OC events that are +required for triggering an interrupt. Interrupts are not triggered if +the property is missing. A value of 0 will interrupt on every OC alarm. + - nvidia,polarity-active-low: Configures the polarity of the OC alaram +signal. Accepted values are 1 for assert low and 0 for assert high. +Default value is 0. + - nvidia,alarm-filter: Number of clocks to filter event. When the filter +expires (which means the OC event has not occurred for a long time), +the counter is cleared and filter is rearmed. Default value is 0. + - nvidia,throttle-period: Specifies the number of uSec for which +throttling is engaged after the OC event is deasserted. Default value +is 0. + Optional properties: - nvidia,thermtrips : When present, this property specifies the temperature at which the soctherm hardware will assert the thermal trigger signal to the @@ -134,6 +149,17 @@ Example : * arbiter will select the highest priority as the final throttle * settings to skip cpu pulse. 
*/ + + throttle_oc1: oc1 { + nvidia,priority = <50>; + nvidia,polarity-active-low = <1>; + nvidia,count-threshold = <100>; + nvidia,alarm-filter = <510>; + nvidia,throttle-period = <0>; + nvidia,cpu-throt-percent = <75>; + nvidia,gpu-throt-level = + ; +}; }; }; -- 2.7.4
[PATCH v1 06/12] arm64: dts: tegra210: set gpu hw throttle level
Set gpu hw throttle level to TEGRA_SOCTHERM_THROT_LEVEL_HIGH Signed-off-by: Wei Ni --- arch/arm64/boot/dts/nvidia/tegra210.dtsi | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi b/arch/arm64/boot/dts/nvidia/tegra210.dtsi index 36c7dce7fa69..57dae9cc7b7d 100644 --- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi +++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi @@ -1339,6 +1339,8 @@ throttle_heavy: heavy { nvidia,priority = <100>; nvidia,cpu-throt-percent = <85>; + nvidia,gpu-throt-level = + ; #cooling-cells = <2>; }; -- 2.7.4
[PATCH v1 05/12] thermal: tegra: add support for gpu hw-throttle
Add support to trigger pulse skippers on the GPU when a HOT trip point is triggered. The pulse skippers can be signalled to throttle at low, medium and high depths\levels. Signed-off-by: Wei Ni --- drivers/thermal/tegra/soctherm.c | 118 --- 1 file changed, 85 insertions(+), 33 deletions(-) diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c index 673c3ffa9001..d3cef88a3f22 100644 --- a/drivers/thermal/tegra/soctherm.c +++ b/drivers/thermal/tegra/soctherm.c @@ -1,5 +1,6 @@ +// SPDX-License-Identifier: GPL-2.0 /* - * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved. + * Copyright (c) 2014 - 2018, NVIDIA CORPORATION. All rights reserved. * * Author: * Mikko Perttunen @@ -160,6 +161,15 @@ /* get dividend from the depth */ #define THROT_DEPTH_DIVIDEND(depth)((256 * (100 - (depth)) / 100) - 1) +/* gk20a nv_therm interface N:3 Mapping. Levels defined in tegra124-sochterm.h + * level vector + * NONE3'b000 + * LOW 3'b001 + * MED 3'b011 + * HIGH3'b111 + */ +#define THROT_LEVEL_TO_DEPTH(level)((0x1 << (level)) - 1) + /* get THROT_PSKIP_xxx offset per LIGHT/HEAVY throt and CPU/GPU dev */ #define THROT_OFFSET 0x30 #define THROT_PSKIP_CTRL(throt, dev) (THROT_PSKIP_CTRL_LITE_CPU + \ @@ -219,6 +229,7 @@ struct soctherm_throt_cfg { u8 priority; u8 cpu_throt_level; u32 cpu_throt_depth; + u32 gpu_throt_level; struct thermal_cooling_device *cdev; bool init; }; @@ -974,6 +985,50 @@ static int soctherm_thermtrips_parse(struct platform_device *pdev) return 0; } +static int soctherm_throt_cfg_parse(struct device *dev, + struct device_node *np, + struct soctherm_throt_cfg *stc) +{ + struct tegra_soctherm *ts = dev_get_drvdata(dev); + int ret; + u32 val; + + ret = of_property_read_u32(np, "nvidia,priority", &val); + if (ret) { + dev_err(dev, "throttle-cfg: %s: invalid priority\n", stc->name); + return -EINVAL; + } + stc->priority = val; + + ret = of_property_read_u32(np, ts->soc->use_ccroc ? 
+ "nvidia,cpu-throt-level" : + "nvidia,cpu-throt-percent", &val); + if (!ret) { + if (ts->soc->use_ccroc && + val <= TEGRA_SOCTHERM_THROT_LEVEL_HIGH) + stc->cpu_throt_level = val; + else if (!ts->soc->use_ccroc && val <= 100) + stc->cpu_throt_depth = val; + else + goto err; + } else { + goto err; + } + + ret = of_property_read_u32(np, "nvidia,gpu-throt-level", &val); + if (!ret && val <= TEGRA_SOCTHERM_THROT_LEVEL_HIGH) + stc->gpu_throt_level = val; + else + goto err; + + return 0; + +err: + dev_err(dev, "throttle-cfg: %s: no throt prop or invalid prop\n", + stc->name); + return -EINVAL; +} + /** * soctherm_init_hw_throt_cdev() - Parse the HW throttle configurations * and register them as cooling devices. @@ -984,8 +1039,7 @@ static void soctherm_init_hw_throt_cdev(struct platform_device *pdev) struct tegra_soctherm *ts = dev_get_drvdata(dev); struct device_node *np_stc, *np_stcc; const char *name; - u32 val; - int i, r; + int i; for (i = 0; i < THROTTLE_SIZE; i++) { ts->throt_cfgs[i].name = throt_names[i]; @@ -1003,6 +1057,7 @@ static void soctherm_init_hw_throt_cdev(struct platform_device *pdev) for_each_child_of_node(np_stc, np_stcc) { struct soctherm_throt_cfg *stc; struct thermal_cooling_device *tcd; + int err; name = np_stcc->name; stc = find_throttle_cfg_by_name(ts, name); @@ -1012,37 +1067,10 @@ static void soctherm_init_hw_throt_cdev(struct platform_device *pdev) continue; } - r = of_property_read_u32(np_stcc, "nvidia,priority", &val); - if (r) { - dev_info(dev, -"throttle-cfg: %s: missing priority\n", name); + + err = soctherm_throt_cfg_parse(dev, np_stcc, stc); + if (err) continue; - } - stc->priority = val; - - if (ts->soc->use_ccroc) { - r = of_property_read_u32(np_stcc, -"nvidia,cpu-throt-level", -&val); - if (r) { - dev_info(dev, -"throttle-cfg: %s: missing cpu-throt-level
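The two depth macros in this patch are easy to check by hand. The sketch below copies them verbatim and wraps them in small functions. The LEVEL_* numbering (NONE = 0 through HIGH = 3) is an assumption inferred from the comment table in the patch, not taken from the dt-bindings header, and the function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Both macros are copied from the patch. */
#define THROT_DEPTH_DIVIDEND(depth)	((256 * (100 - (depth)) / 100) - 1)
#define THROT_LEVEL_TO_DEPTH(level)	((0x1 << (level)) - 1)

/* ASSUMPTION: level numbering inferred from the NONE/LOW/MED/HIGH table. */
enum { LEVEL_NONE, LEVEL_LOW, LEVEL_MED, LEVEL_HIGH };

/* Map a GPU throttle level to the 3-bit vector from the comment table. */
static inline uint32_t gpu_level_to_vector(unsigned int level)
{
	assert(level <= LEVEL_HIGH);
	return (uint32_t)THROT_LEVEL_TO_DEPTH(level);
}

/* Map a throttle depth in percent (0..100) to the pulse-skipper dividend. */
static inline int depth_to_dividend(int percent)
{
	assert(percent >= 0 && percent <= 100);
	return THROT_DEPTH_DIVIDEND(percent);
}
```

With that numbering, levels 0 to 3 give the vectors 3'b000, 3'b001, 3'b011 and 3'b111 listed in the comment, and a depth of 85 percent gives a dividend of 256 * 15 / 100 - 1 = 37 (integer division).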
[PATCH v1 00/12] Add some functionalities for Tegra soctherm
Move the hw/sw shutdown patches into this series. They have already been discussed in https://lkml.org/lkml/2018/12/7/225. The series also adds support for GPU HW throttle, thermal IRQ, set_trips(), EDP IRQ and OC hw throttle.

Wei Ni (12):
  of: Add bindings of thermtrip for Tegra soctherm
  thermal: tegra: support hw and sw shutdown
  arm64: dts: tegra210: set thermtrip
  of: Add bindings of gpu hw throttle for Tegra soctherm
  thermal: tegra: add support for gpu hw-throttle
  arm64: dts: tegra210: set gpu hw throttle level
  thermal: tegra: add support for thermal IRQ
  thermal: tegra: add set_trips functionality
  thermal: tegra: add support for EDP IRQ
  arm64: dts: tegra210: set EDP interrupt line
  of: Add bindings of OC hw throttle for Tegra soctherm
  thermal: tegra: enable OC hw throttle

 .../bindings/thermal/nvidia,tegra124-soctherm.txt | 63 +-
 arch/arm64/boot/dts/nvidia/tegra210.dtsi | 20 +-
 drivers/thermal/tegra/soctherm.c | 955 +++--
 drivers/thermal/tegra/soctherm.h | 16 +
 drivers/thermal/tegra/tegra124-soctherm.c | 7 +-
 drivers/thermal/tegra/tegra132-soctherm.c | 7 +-
 drivers/thermal/tegra/tegra210-soctherm.c | 15 +-
 include/dt-bindings/thermal/tegra124-soctherm.h| 22 +-
 8 files changed, 1029 insertions(+), 76 deletions(-)

-- 
2.7.4
Re: [PATCH 2/2 v3] kdump,vmcoreinfo: Export the value of sme mask to vmcoreinfo
On 2018-12-17 21:01, Borislav Petkov wrote:
> On Sun, Dec 16, 2018 at 09:16:17PM +0800, Lianbo Jiang wrote:
>> For AMD machine with SME feature, makedumpfile tools need to know
>> whether the crash kernel was encrypted or not. If SME is enabled
>> in the first kernel, the crash kernel's page table(pgd/pud/pmd/pte)
>> contains the memory encryption mask, so need to remove the sme mask
>> to obtain the true physical address.
>>
>> Signed-off-by: Lianbo Jiang
>> ---
>> arch/x86/kernel/machine_kexec_64.c | 14 ++
>> 1 file changed, 14 insertions(+)
>>
>> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
>> index 4c8acdfdc5a7..1860fe24117d 100644
>> --- a/arch/x86/kernel/machine_kexec_64.c
>> +++ b/arch/x86/kernel/machine_kexec_64.c
>> @@ -352,10 +352,24 @@ void machine_kexec(struct kimage *image)
>>
>> void arch_crash_save_vmcoreinfo(void)
>> {
>> +u64 sme_mask = sme_me_mask;
>> +
>> VMCOREINFO_NUMBER(phys_base);
>> VMCOREINFO_SYMBOL(init_top_pgt);
>> vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
>> pgtable_l5_enabled());
>> +/*
>> + * Currently, the local variable 'sme_mask' stores the value of
>> + * sme_me_mask(bit 47), and also write the value of sme_mask to
>> + * the vmcoreinfo.
>> + * If need, the bit(sme_mask) might be redefined in the future,
>> + * but the 'bit63' will be reserved.
>> + * For example:
>> + * [ misc ][ enc bit ][ other misc SME info ]
>> + * ____1000______..._
>> + * 63 59 55 51 47 43 39 35 31 27 ... 3
>> + */
>
> This text belongs into the document.
>

OK, I will move it into the VMCOREINFO document.

Thanks.
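The commit message quoted above says that tools must strip the SME mask from page-table entries to recover the true physical address. A minimal sketch of that step is below. The `PTE_ADDR_MASK` value assumes the usual x86_64 entry layout with the frame address in bits 12 to 51, and `pte_to_phys` is a hypothetical helper name, not a makedumpfile or kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: x86_64 page-table entry, frame address in bits 12..51. */
#define PTE_ADDR_MASK 0x000ffffffffff000ULL

/*
 * Recover the physical address from a page-table entry.
 * sme_mask is the value exported through vmcoreinfo (bit 47 in the
 * patch), or 0 when memory encryption is not active.
 */
static inline uint64_t pte_to_phys(uint64_t pte, uint64_t sme_mask)
{
	/* Sketch-only sanity check: the encryption bit sits in the address field. */
	assert((sme_mask & ~PTE_ADDR_MASK) == 0);

	/* Drop the encryption bit first, then the flag bits outside the frame. */
	return (pte & ~sme_mask) & PTE_ADDR_MASK;
}
```

If the mask is not removed, an entry with bit 47 set decodes to an address that is off by 2^47, which is why the tool needs the value from the first kernel.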
[PATCH] tools/power/x86/intel_pstate_tracer: Fix non root execution for post processing a trace file.
This script is supposed to be allowed to run with regular user privileges if a previously captured trace is being post processed. Commit fbe313884d7ddd73ce457473cbdf3763f5b1d3da tools/power/x86/intel_pstate_tracer: Free the trace buffer memory introduced a bug that breaks that option. Commit 35459105deb26430653a7299a86bc66fb4dd5773 tools/power/x86/intel_pstate_tracer: Add optional setting of trace buffer memory allocation moved the code but kept the bug. This patch fixes the issue. Signed-off-by: Doug Smythies --- tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py b/tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py index 84e2b64..2fa3c57 100755 --- a/tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py +++ b/tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py @@ -585,9 +585,9 @@ current_max_cpu = 0 read_trace_data(filename) -clear_trace_file() -# Free the memory if interval: +clear_trace_file() +# Free the memory free_trace_buffer() if graph_data_present == False: -- 2.7.4
[PATCH] drm/bochs: add edid present check
Check first two header bytes before trying to read the edid blob, to avoid the log being spammed in case qemu has no edid support (old qemu or edid turned off). Fixes: 01f23459cf drm/bochs: add edid support. Signed-off-by: Gerd Hoffmann --- drivers/gpu/drm/bochs/bochs_hw.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/bochs/bochs_hw.c b/drivers/gpu/drm/bochs/bochs_hw.c index c90a0d492f..f91e049625 100644 --- a/drivers/gpu/drm/bochs/bochs_hw.c +++ b/drivers/gpu/drm/bochs/bochs_hw.c @@ -89,6 +89,10 @@ int bochs_hw_load_edid(struct bochs_device *bochs) if (!bochs->mmio) return -1; + if (readb(bochs->mmio + 0) != 0x00 || + readb(bochs->mmio + 1) != 0xff) + return -1; + kfree(bochs->edid); bochs->edid = drm_do_get_edid(&bochs->connector, bochs_get_edid_block, bochs); -- 2.9.3
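The check added by this patch relies on the fact that an EDID base block starts with the fixed byte pattern 00 FF FF FF FF FF FF 00. The sketch below shows the same two-byte probe on a plain buffer, next to a stricter eight-byte comparison for contrast. Both function names are hypothetical, and the buffer stands in for the MMIO region that the driver reads with readb().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed 8-byte pattern that starts an EDID base block. */
static const uint8_t edid_header[8] = {
	0x00, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00
};

/* Cheap probe mirroring the patch: only the first two bytes are tested. */
static inline bool edid_probably_present(const uint8_t *blob, size_t len)
{
	assert(blob != NULL);
	return len >= 2 && blob[0] == 0x00 && blob[1] == 0xff;
}

/* Stricter variant (not what the patch does): compare the whole header. */
static inline bool edid_header_valid(const uint8_t *blob, size_t len)
{
	assert(blob != NULL);
	return len >= sizeof(edid_header) &&
	       memcmp(blob, edid_header, sizeof(edid_header)) == 0;
}
```

The two-byte probe is enough to tell "no EDID at all" apart from a real blob without touching the parser, which is what keeps the log quiet on hosts without EDID support.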
Re: [PATCH v11 2/7] ACPI / OSL: Stub out acpi_os_(read/write)_pci_configurations()
On Tue, Dec 18, 2018 at 02:56:01AM +, Sinan Kaya wrote:
> Getting ready to allow CONFIG_PCI to be unset with ACPI enabled. Stub out
> acpi_os_read_pci_configuration and acpi_os_write_pci_configuration
> functions when CONFIG_PCI is not defined.
>
> Signed-off-by: Sinan Kaya
> ---
> drivers/acpi/osl.c | 14 ++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index b48874b8e1ea..524fd5f33ea4 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -773,6 +773,7 @@ acpi_status
> acpi_os_read_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
> u64 *value, u32 width)
> {
> +#ifdef CONFIG_PCI
> int result, size;
> u32 value32;
>
> @@ -799,12 +800,19 @@ acpi_os_read_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
> *value = value32;
>
> return (result ? AE_ERROR : AE_OK);
> +#else
> + int rc;
> +
> + rc = pr_warn_once("PCI configuration space access is not supported\n");
> + return rc ? AE_SUPPORT : AE_OK;
> +#endif

Normally we provide a full separate stub version, and if we have enough of them, they go in a separate file.
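The "separate stub version" pattern mentioned in this review can be sketched as follows. This is only an illustration under stated assumptions: `HAVE_DEMO_BUS` stands in for a Kconfig symbol such as CONFIG_PCI, and `demo_bus_read()` and `demo_read_or_default()` are hypothetical names, not ACPI or PCI functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * ASSUMPTION: HAVE_DEMO_BUS and demo_bus_read() are placeholders for a
 * config symbol and the real accessor it guards.
 */
#ifdef HAVE_DEMO_BUS
/* Real implementation, built only when the bus support is configured in. */
int demo_bus_read(uint32_t reg, uint32_t *value);
#else
/* Separate stub: same signature, so no #ifdef is needed in function bodies. */
static inline int demo_bus_read(uint32_t reg, uint32_t *value)
{
	(void)reg;
	(void)value;
	return -ENODEV;
}
#endif

/* Callers stay free of preprocessor conditionals. */
static inline int demo_read_or_default(uint32_t reg, uint32_t fallback,
				       uint32_t *out)
{
	assert(out != NULL);
	if (demo_bus_read(reg, out) < 0) {
		*out = fallback;
		return 0;	/* fell back to the default */
	}
	return 1;		/* value came from the bus */
}
```

Compared with wrapping the body of one function in #ifdef/#else, the conditional appears once, next to the declaration, and every caller compiles unchanged in both configurations.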
Re: [PATCH 1/2 v3] kdump: add the vmcoreinfo documentation
On 2018-12-17 21:00, Borislav Petkov wrote:
> On Sun, Dec 16, 2018 at 09:16:16PM +0800, Lianbo Jiang wrote:
>> +clear_idx
>> +=
>> +The index that the next printk record to read after the last 'clear'
>> +command. It indicates the first record after the last SYSLOG_ACTION
>> +_CLEAR, like issued by 'dmesg -c'.
>
> What is that used for by the userspace tools?
>

The clear_idx is used when dumping the dmesg log.

>> +
>> +log_next_idx
>> +
>> +The index of the next record to store in the buffer 'log_buf'. It helps
>> +to compute the index of current strings position.
>> +
>> +printk_log
>> +==
>> +The size of a structure 'printk_log'. It helps to compute the size of
>> +messages, and extract dmesg log.
>
> What is the difference between that and log_buf?
>

printk_log is used to output human-readable text; it encapsulates the header information for log_buf, such as the timestamp and the syslog level.

>
>
>> +
>> +(printk_log, ts_nsec|len|text_len|dict_len)
>> +===
>> +It represents these field offsets in the structure 'printk_log'. User
>> +space tools can parse it and detect any changes to structure down the
>> +line.
>
> What does that mean? "any changes down the line"?
>

User space tools can parse it and check whether the values of printk_log's members have changed. I will improve the wording in patch v4.

>> +
>> +(free_area.free_list, MIGRATE_TYPES)
>> +
>> +The number of migrate types for pages. The free_list is divided into
>> +the array, it needs to know the number of the array.
>
> ... for?
>

makedumpfile needs to know the number of array entries when it computes the number of free pages.

>> +
>> +NR_FREE_PAGES
>> +=
>> +On linux-2.6.21 or later, the number of free_pages is in
>> +vm_stat[NR_FREE_PAGES]. It can get the number of free pages from the
>> +array.
>> +
>> +PG_lru|PG_private|PG_swapcache|PG_swapbacked|PG_slab|
>> +PG_hwpoision|PG_head_mask
>> +=
>> +It means the attribute of a page. These flags will be used to filter
>> +the free pages.
>> +
>> +PAGE_BUDDY_MAPCOUNT_VALUE or ~PG_buddy
>> +==
>> +The 'PG_buddy' flag indicates that the page is free and in the buddy
>> +system. Makedumpfile can exclude the free pages managed by a buddy.
>
> That text belongs with the one above?
>

It exports the value of (~PG_buddy), so it is listed separately here.

>> +
>> +HUGETLB_PAGE_DTOR
>> +=
>> +The 'HUGETLB_PAGE_DTOR' flag indicates the hugetlbfs pages. Makedumpfile
>> +will exclude these pages.
>> +
>> +
>> +x86_64 variables
>> +
>> +
>> +phys_base
>> +=
>> +In x86_64, the 'phys_base' is necessary to convert virtual address of
>> +exported kernel symbol to physical address.
>> +
>> +init_top_pgt
>> +
>> +The 'init_top_pgt' used to walk through the whole page table and convert
>> +virtual address to physical address.
>
> This is the same as swapper_pg_dir?
>

These two variables are somewhat similar, but they are used in different scenarios.

>> +
>> +pgtable_l5_enabled
>> +==
>> +User-space tools need to know whether the crash kernel was in 5-level
>> +paging mode or not.
>> +
>> +node_data
>> +=
>> +This is a struct 'pglist_data' array, it stores all numa nodes information.
>> +In general, Makedumpfile can get the pglist_data structure from symbol
>> +'node_data'.
>> +
>> +(node_data, MAX_NUMNODES)
>> +=
>> +The number of this 'node_data' array. It means the maximum number of the
>> +nodes in system.
>> +
>> +KERNELOFFSET
>> +
>> +Randomize the address of the kernel image. This is the offset of KASLR in
>> +VMCOREINFO ELF notes. It is used to compute the page offset in x86_64. If
>> +KASLE is disabled, this value is zero.
>> +
>> +KERNEL_IMAGE_SIZE
>> +=
>> +The size of 'KERNEL_IMAGE_SIZE', currently unused.
>
> So remove?
>

I'm not sure whether it should be removed, so I kept it.

>> +
>> +The old MODULES_VADDR need be decided by KERNEL_IMAGE_SIZE when kaslr
>> +enabled. Now MODULES_VADDR is not needed any more since Pratyush makes
>> +all VA to PA converting done by page table lookup.
>
> Also, I did clean this up considerably - please include in your next
> version:
>

Great, thanks for your help. I will post v4 later.

Regards,
Lianbo

> ---
> diff --git a/Documentation/kdump/vmcoreinfo.txt b/Documentation/kdump/vmcoreinfo.txt
> index d71260bf383a..2ce34d952bfd 100644
> --- a/Documentation/kdump/vmcoreinfo.txt
> +++ b/Documentation/kdump/vmcoreinfo.txt
> @@ -1,18 +1,19 @@
>
> - Documentation for VMCOREINFO
> + VMCOREINFO
>
>
> ===
> What is the V
Re: [PATCH v11 1/7] ACPI: Allow CONFIG_PCI to be unset for reboot
> +#ifdef CONFIG_PCI > + unsigned int devfn; > + struct pci_bus *bus0; > + > /* The reset register can only live on bus 0. */ > bus0 = pci_find_bus(0, 0); > if (!bus0) > @@ -44,8 +47,9 @@ void acpi_reboot(void) > /* Write the value that resets us. */ > pci_bus_write_config_byte(bus0, devfn, > (rr->address & 0x), reset_value); > +#endif This would be a lot cleaner if this was split into a little helper function.
Re: [PATCH v4 1/2] export trace.c helper functions to other modules
Reviewed-by: Sagi Grimberg
Re: [PATCH v4 2/2] trace nvme submit queue status
@@ -899,6 +900,10 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx)
 }
 req = blk_mq_tag_to_rq(*nvmeq->tags, cqe->command_id);
+ trace_nvme_sq(req->rq_disk,
+ nvmeq->qid,
+ le16_to_cpu(cqe->sq_head),
+ nvmeq->sq_tail);

Why the newline escapes? Why not escape at the 80 char border?

Other than that, looks fine,

Reviewed-by: Sagi Grimberg
Re: [PATCH v5 2/2] media: usb: pwc: Don't use coherent DMA buffers for ISO transfer
On Fri, Dec 14, 2018 at 9:36 PM Christoph Hellwig wrote:
>
> On Fri, Dec 14, 2018 at 12:12:38PM +0900, Tomasz Figa wrote:
> > > If the buffer always is physically contiguous, as it is in the currently
> > > posted series, we can always map it with a single dma_map_single call
> > > (if the hardware can handle that in a single segment is a different
> > > question, but out of scope here).
> >
> > Are you sure the buffer is always physically contiguous? At least the
> > ARM IOMMU dma_ops [1] and the DMA-IOMMU dma_ops [2] will simply
> > allocate pages without any continuity guarantees and remap the pages
> > into a contiguous kernel VA (unless DMA_ATTR_NO_KERNEL_MAPPING is
> > given, which makes them return an opaque cookie instead of the kernel
> > VA).
> >
> > [1]
> > http://git.infradead.org/users/hch/misc.git/blob/2dbb028e4a3017e1b71a6ae3828a3548545eba24:/arch/arm/mm/dma-mapping.c#l1291
> > [2]
> > http://git.infradead.org/users/hch/misc.git/blob/2dbb028e4a3017e1b71a6ae3828a3548545eba24:/drivers/iommu/dma-iommu.c#l450
>
> We never end up in this allocator for the new DMA_ATTR_NON_CONSISTENT
> case, and that is intentional.

That somewhat limits the usability of this API, since it enforces contiguous allocations even for large sizes and even for devices behind an IOMMU (contrary to the case when DMA_ATTR_NON_CONSISTENT is not set), but given that it's just a temporary solution for devices like these USB cameras, I guess that's fine.

Note that in V4L2 we use the DMA API extensively, so that we don't need to embed any device-specific or integration-specific knowledge in the framework. Right now we're using dma_alloc_attrs() with driver-provided attrs [1], but no current driver requests non-consistent memory. We are, however, thinking about making it possible to allocate non-consistent memory. What would you suggest for this?

[1] https://elixir.bootlin.com/linux/v4.20-rc7/source/drivers/media/common/videobuf2/videobuf2-dma-contig.c#L139

Best regards,
Tomasz
Re: [regression, bisected] Keyboard not responding after resuming from suspend/hibernate
On Sun, 2 Dec 2018 23:28:09 +0100, Pavel Machek wrote:

>On Fri 2018-11-30 15:44:55, Numan Demirdöğen wrote:
>> On Sun, 28 Oct 2018 22:06:54 +0300,
>> Numan Demirdöğen wrote:
>>
>> >On Thu, 25 Oct 2018 09:49:03 +0200,
>> >Pavel Machek wrote:
>> >
>> >> Hi!
>> >>
>> >> Here's problem bisected down to:
>> >>
>> >> commit 9d659ae14b545c4296e812c70493bfdc999b5c1c
>> >> Author: Peter Zijlstra
>> >> Date: Tue Aug 23 14:40:16 2016 +0200
>> >>
>> >> locking/mutex: Add lock handoff to avoid starvation
>> >>
>> >> Implement lock handoff to avoid lock starvation.
>> >>
>> >> Numan, I assume revert of that patch on the 4.18 kernel still
>> >> makes it work?
>> >>
>> >
>> >Unfortunately, I could not revert
>> >9d659ae14b545c4296e812c70493bfdc999b5c1c on kernels from 4.18.16 to
>> >4.10-rc1 because there were too much conflicts, which I could not
>> >solve as an "average Joe". I tried
>> >a3ea3d9b865c2a8f7fe455c7fa26db4b6fd066e3 which is parent of
>> >9d659ae14b545c4296e812c70493bfdc999b5c1c and succeeded to compile
>> >kernel.
>> >
>> >git checkout a3ea3d9b865c2a8f7fe455c7fa26db4b6fd066e3
>> >
>> >Then, I compiled kernel and rebooted with it. I tried a couples of
>> >times suspending and resuming, all of the time keyboard worked as
>> >expected.
>> >
>>
>> With this one line patch from Takashi Iwai, keyboard is working as
>> expected after resuming from suspend/hibernate.
>>
>> --- a/kernel/locking/mutex.c
>> +++ b/kernel/locking/mutex.c
>> @@ -59,7 +59,7 @@ EXPORT_SYMBOL(__mutex_init);
>> * Bit2 indicates handoff has been done and we're waiting for
>> pickup. */
>> #define MUTEX_FLAG_WAITERS 0x01
>> -#define MUTEX_FLAG_HANDOFF 0x02
>> +#define MUTEX_FLAG_HANDOFF 0x00
>> #define MUTEX_FLAG_PICKUP 0x04
>>
>> #define MUTEX_FLAGS 0x07
>>
>>
>> Thanks in advance and regards,
>
>Ok. So it is a regression, and you can ask Linus to apply this
>.. but... that's kind of heavy solution. Peter, do you have any other
>ideas?
>
> Pavel

Hi,

I did not mention the one-line patch from Takashi Iwai as a fix, only as a hint. Sorry for the misunderstanding.

Here is another hint, from another user:

I found that passing the options i8042.reset=1 i8042.dumbkbd=1 i8042.direct=1 results in the keyboard functioning after resume. However, there is a long delay before the keyboard or mouse will respond to input on the lock screen.[1]

[1] https://bugzilla.kernel.org/show_bug.cgi?id=195471#c39

--
Numan Demirdöğen
Re: [PATCH v2] net/smc: fix TCP fallback socket release
On Mon, Dec 17, 2018 at 03:58:58PM +0100, Ursula Braun wrote: > Hi Ursula, Thank you for your suggestion. I have a question on your comment. > > On 12/17/2018 06:21 AM, Myungho Jung wrote: > > clcsock can be released while kernel_accept() references it in TCP > > listen worker. Also, clcsock needs to wake up before released if TCP > > fallback is used and the clcsock is blocked by accept. Add a lock to > > safely release clcsock and call kernel_sock_shutdown() to wake up > > clcsock from accept in smc_release(). > > Thanks for your effort to solve this problem. I have some minor > improvement proposals: > > > > > Reported-by: syzbot+0bf2e01269f1274b4...@syzkaller.appspotmail.com > > Reported-by: syzbot+e3132895630f95730...@syzkaller.appspotmail.com > > Signed-off-by: Myungho Jung > > --- > > net/smc/af_smc.c | 14 -- > > net/smc/smc.h| 2 ++ > > 2 files changed, 14 insertions(+), 2 deletions(-) > > > > diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c > > index 5fbaf1901571..5d06fb1bbccf 100644 > > --- a/net/smc/af_smc.c > > +++ b/net/smc/af_smc.c > > @@ -147,8 +147,14 @@ static int smc_release(struct socket *sock) > > sk->sk_shutdown |= SHUTDOWN_MASK; > > } > > if (smc->clcsock) { > > + if (smc->use_fallback && sk->sk_state == SMC_LISTEN) { > > + /* wake up clcsock accept */ > > + rc = kernel_sock_shutdown(smc->clcsock, SHUT_RDWR); > > + } > > This part is not needed, since an SMC socket in state SMC_LISTEN is never > a use_fallback socket. In smc_sendmsg(), set use_fallback to true if SMC socket is SMC_INIT state and the message has MSG_FASTOPEN flag. After this, smc_listen() would trigger smc_tcp_listen_work(). Is this not an expected scenario? Then, what is the reason for not skipping smc_sendmsg() in SMC_INIT state? 
> > > + mutex_lock(&smc->clcsock_release_lock); > > sock_release(smc->clcsock); > > smc->clcsock = NULL; > > + mutex_unlock(&smc->clcsock_release_lock); > > } > > if (smc->use_fallback) { > > if (sk->sk_state != SMC_LISTEN && sk->sk_state != SMC_INIT) > > @@ -205,6 +211,7 @@ static struct sock *smc_sock_alloc(struct net *net, > > struct socket *sock, > > spin_lock_init(&smc->conn.send_lock); > > sk->sk_prot->hash(sk); > > sk_refcnt_debug_inc(sk); > > + mutex_init(&smc->clcsock_release_lock); > > > > return sk; > > } > > @@ -821,7 +828,7 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, > > struct smc_sock **new_smc) > > struct socket *new_clcsock = NULL; > > struct sock *lsk = &lsmc->sk; > > struct sock *new_sk; > > - int rc; > > + int rc = 0; > > Without clcsock the good path should not be executed. Thus I suggest > to initialize with something negative like -EINVAL. > > > > > release_sock(lsk); > > new_sk = smc_sock_alloc(sock_net(lsk), NULL, lsk->sk_protocol); > > @@ -834,7 +841,10 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, > > struct smc_sock **new_smc) > > } > > *new_smc = smc_sk(new_sk); > > > > - rc = kernel_accept(lsmc->clcsock, &new_clcsock, 0); > > + mutex_lock(&lsmc->clcsock_release_lock); > > + if (lsmc->clcsock) > > + rc = kernel_accept(lsmc->clcsock, &new_clcsock, 0); > > + mutex_unlock(&lsmc->clcsock_release_lock); > > lock_sock(lsk); > > if (rc < 0) > > lsk->sk_err = -rc; > > diff --git a/net/smc/smc.h b/net/smc/smc.h > > index 08786ace6010..9a2795cf5d30 100644 > > --- a/net/smc/smc.h > > +++ b/net/smc/smc.h > > @@ -219,6 +219,8 @@ struct smc_sock { /* smc > > sock container */ > > * started, waiting for unsent > > * data to be sent > > */ > > + struct mutexclcsock_release_lock; > > + /* protects clcsock */ > > I suggest to be more precise: "protects clcsock of a listen socket" > > > }; > > > > static inline struct smc_sock *smc_sk(const struct sock *sk) > > >
[PATCH -next] regulator: act8945a-regulator: make symbol act8945a_pm static
Fixes the following sparse warning: drivers/regulator/act8945a-regulator.c:340:1: warning: symbol 'act8945a_pm' was not declared. Should it be static? Fixes: 7482d6ecc68e ("regulator: act8945a-regulator: Implement PM functionalities") Signed-off-by: Wei Yongjun --- drivers/regulator/act8945a-regulator.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/regulator/act8945a-regulator.c b/drivers/regulator/act8945a-regulator.c index 90572b6..dc3942b 100644 --- a/drivers/regulator/act8945a-regulator.c +++ b/drivers/regulator/act8945a-regulator.c @@ -337,7 +337,7 @@ static int act8945a_suspend(struct device *pdev) return regmap_write(act8945a->regmap, ACT8945A_SYS_CTRL, 0x42); } -SIMPLE_DEV_PM_OPS(act8945a_pm, act8945a_suspend, NULL); +static SIMPLE_DEV_PM_OPS(act8945a_pm, act8945a_suspend, NULL); static void act8945a_pmic_shutdown(struct platform_device *pdev) {
RE: [PATCH] arm64: dts: nxp: ls208xa: add more thermal zone support
Hi, PING. BR, Andy > -----Original Message----- > From: Yuantian Tang > Sent: October 31, 2018 12:48 > To: shawn...@kernel.org > Cc: Leo Li ; robh...@kernel.org; mark.rutl...@arm.com; > linux-arm-ker...@lists.infradead.org; devicet...@vger.kernel.org; > linux-kernel@vger.kernel.org; rui.zh...@intel.com; daniel.lezc...@linaro.org; > Andy Tang > Subject: [PATCH] arm64: dts: nxp: ls208xa: add more thermal zone support > > Ls208xa has several thermal sensors. Add all the sensor id to dts to enable > them. > > To make the dts cleaner, re-organize the nodes to split out the common part so > that it can be shared with other SoCs. > > Signed-off-by: Yuantian Tang > --- > arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi |8 +- > arch/arm64/boot/dts/freescale/fsl-ls2088a.dtsi |8 +- > arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 83 +++- > arch/arm64/boot/dts/freescale/fsl-tmu-map1.dtsi | 99 + > arch/arm64/boot/dts/freescale/fsl-tmu-map2.dtsi | 99 + > arch/arm64/boot/dts/freescale/fsl-tmu-map3.dtsi | 99 + > arch/arm64/boot/dts/freescale/fsl-tmu.dtsi | 251 > +++ > 7 files changed, 591 insertions(+), 56 deletions(-) create mode 100644 > arch/arm64/boot/dts/freescale/fsl-tmu-map1.dtsi > create mode 100644 arch/arm64/boot/dts/freescale/fsl-tmu-map2.dtsi > create mode 100644 arch/arm64/boot/dts/freescale/fsl-tmu-map3.dtsi > create mode 100644 arch/arm64/boot/dts/freescale/fsl-tmu.dtsi > > diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi > b/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi > index f9c1d30..8f9788c 100644 > --- a/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi > +++ b/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi > @@ -12,7 +12,7 @@ > #include "fsl-ls208xa.dtsi" > > &cpu { > - cpu0: cpu@0 { > + cooling_map0: cpu0: cpu@0 { > device_type = "cpu"; > compatible = "arm,cortex-a57"; > reg = <0x0>; > @@ -32,7 +32,7 @@ > #cooling-cells = <2>; > }; > > - cpu2: cpu@100 { > + cooling_map1: cpu2: cpu@100 { > device_type = "cpu"; > compatible = "arm,cortex-a57"; > reg = <0x100>; > 
@@ -52,7 +52,7 @@ > #cooling-cells = <2>; > }; > > - cpu4: cpu@200 { > + cooling_map2: cpu4: cpu@200 { > device_type = "cpu"; > compatible = "arm,cortex-a57"; > reg = <0x200>; > @@ -72,7 +72,7 @@ > #cooling-cells = <2>; > }; > > - cpu6: cpu@300 { > + cooling_map3: cpu6: cpu@300 { > device_type = "cpu"; > compatible = "arm,cortex-a57"; > reg = <0x300>; > diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2088a.dtsi > b/arch/arm64/boot/dts/freescale/fsl-ls2088a.dtsi > index 7c882da..013fe16 100644 > --- a/arch/arm64/boot/dts/freescale/fsl-ls2088a.dtsi > +++ b/arch/arm64/boot/dts/freescale/fsl-ls2088a.dtsi > @@ -12,7 +12,7 @@ > #include "fsl-ls208xa.dtsi" > > &cpu { > - cpu0: cpu@0 { > + cooling_map0: cpu0: cpu@0 { > device_type = "cpu"; > compatible = "arm,cortex-a72"; > reg = <0x0>; > @@ -32,7 +32,7 @@ > #cooling-cells = <2>; > }; > > - cpu2: cpu@100 { > + cooling_map1: cpu2: cpu@100 { > device_type = "cpu"; > compatible = "arm,cortex-a72"; > reg = <0x100>; > @@ -52,7 +52,7 @@ > #cooling-cells = <2>; > }; > > - cpu4: cpu@200 { > + cooling_map2: cpu4: cpu@200 { > device_type = "cpu"; > compatible = "arm,cortex-a72"; > reg = <0x200>; > @@ -72,7 +72,7 @@ > #cooling-cells = <2>; > }; > > - cpu6: cpu@300 { > + cooling_map3: cpu6: cpu@300 { > device_type = "cpu"; > compatible = "arm,cortex-a72"; > reg = <0x300>; > diff --git a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi > b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi > index 8cb78dd..4102317 100644 > --- a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi > +++ b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi > @@ -75,54 +75,7 @@ > mask = <0x2>; > }; > > - thermal-zones { > - cpu_thermal: cpu-thermal { > - polling-delay-passive = <1000>; > - polling-delay = <5000>; > - > - thermal-sensors = <&tmu 4>; > - > - trips { > - cpu_alert: cpu-alert { > - temperature = <75000>; > - hysteresis = <2000>; > - type = "passive"; > - }; > - cpu_crit: cpu-crit { > - temperature = <85000>; > - hysteresis = <2000>; > -
Re: [RFC PATCH net v3] net: phy: Fix the issue that netif always links up after resuming
Hi Heiner, On Tue, 18 Dec 2018 07:44:33 +0100 wrote: > On 18.12.2018 07:25, Kunihiko Hayashi wrote: > > Hi Heiner, > > > > On Mon, 17 Dec 2018 19:43:31 +0100 wrote: > > > >> On 17.12.2018 19:41, Heiner Kallweit wrote: > >>> On 17.12.2018 07:41, Kunihiko Hayashi wrote: > Hi, > > Gentle ping... > Are there any comments about changes since v2? > > v2: https://www.spinics.net/lists/netdev/msg536926.html > > Thank you, > > On Mon, 3 Dec 2018 17:22:29 +0900 wrote: > > > Even though the link is down before entering hibernation, > > there is an issue that the network interface always links up after > > resuming > > from hibernation. > > > > The phydev->state is PHY_READY before enabling the network interface, so > > the link is down. After resuming from hibernation, the phydev->state is > > forcibly set to PHY_UP in mdio_bus_phy_restore(), and the link becomes > > up. > > > > This patch adds a new convenient function to check whether the PHY is in > > a started state, and expects to solve the issue by changing > > phydev->state > > to PHY_UP and calling phy_start_machine() only when the PHY is started. > > > >>> This convenience function and the related change to phy_stop() are part of > >>> the following already and don't need to be part of your patch. > >>> https://patchwork.ozlabs.org/patch/1014171/ > > > > I see. I'll follow your patch when necessary. 
> > > > Suggested-by: Heiner Kallweit > > Signed-off-by: Kunihiko Hayashi > > --- > > > > Changes since v2: > > - add mutex lock/unlock for changing phydev->state > > - check whether the mutex is locked in phy_is_started() > > > > Changes since v1: > > - introduce a new helper function phy_is_started() and use it instead > > of > >checking link status > > - replace checking phydev->state with phy_is_started() in > >phy_stop_machine() > > > > drivers/net/phy/phy.c| 2 +- > > drivers/net/phy/phy_device.c | 12 +--- > > include/linux/phy.h | 13 + > > 3 files changed, 23 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c > > index 1d73ac3..f484d03 100644 > > --- a/drivers/net/phy/phy.c > > +++ b/drivers/net/phy/phy.c > > @@ -670,7 +670,7 @@ void phy_stop_machine(struct phy_device *phydev) > > cancel_delayed_work_sync(&phydev->state_queue); > > > > mutex_lock(&phydev->lock); > > - if (phydev->state > PHY_UP && phydev->state != PHY_HALTED) > > + if (phy_is_started(phydev)) > > phydev->state = PHY_UP; > >>> > >>> I'm wondering whether we need to do this. If the PHY is attached, > >>> then mdio_bus_phy_suspend() calls phy_stop_machine() which does > >>> exactly the same. If the PHY is not attached, then we don't have > >>> to do anything. Therefore I think we just have to do the same as > >>> in mdio_bus_phy_resume(): > >>> > >>> if (phydev->attached_dev && phydev->adjust_link) > >>> phy_start_machine(phydev); > > > > Agreed. > > > > Although the original code changed phydev->state, > > it seems that it's only enough to > > - call phy_stop_machine() in mdio_bus_phy_suspend() > > - call phy_start_machine() in mdio_bus_phy_resume() and > > mdio_bus_phy_restore() > > if the PHY is attached. > > > >>> Can you test this? > > > > I tested your code instead of applying my entire patch, and I confirmed > > that the code solved the issue in my environment. > > > > Do you make new patch instead of my patch? 
> > (and I can add Reported-by: for the issue and Tested-by:) > > > Up to you. It's fine with me if you submit the patch, but I can also do it > and mention you in Reported-by and Tested-by. Just let me know. I see. I'll make and submit the patch as a fix for the issue. Thank you, > > > Thank you, > > > > > >>> > >> Sorry for the confusion, this comment is related to the next part > >> of your patch. > >> > > mutex_unlock(&phydev->lock); > > } > > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c > > index ab33d17..4897d24 100644 > > --- a/drivers/net/phy/phy_device.c > > +++ b/drivers/net/phy/phy_device.c > > @@ -309,10 +309,16 @@ static int mdio_bus_phy_restore(struct device > > *dev) > > return ret; > > > > /* The PHY needs to renegotiate. */ > > - phydev->link = 0; > > - phydev->state = PHY_UP; > > + mutex_lock(&phydev->lock); > > + if (phy_is_started(phydev)) { > > + phydev->state = PHY_UP; > > + mutex_unlock(&phydev->lock); > > + phydev->link = 0; > > + phy_start_machine(phydev); > > + } else { > > + mutex_unlock(&phydev->lock)
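The approach agreed on in this thread (restart the PHY state machine on resume/restore only when the PHY is attached and has a link callback, without forcing phydev->state) can be modelled in plain userspace C. This is only a sketch: struct phy_model, model_start_machine(), model_restore() and model_noop_adjust_link() are hypothetical stand-ins invented for illustration, not the kernel's struct phy_device or phy_start_machine(), and the sketch is not the patch that was eventually submitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct phy_device. */
struct phy_model {
	void *attached_dev;                         /* non-NULL once a netdev is attached */
	void (*adjust_link)(struct phy_model *phy); /* link-change callback, may be NULL */
	int machine_starts;                         /* counts state-machine (re)starts */
};

/* Stand-in for phy_start_machine(): only records that it was called. */
static void model_start_machine(struct phy_model *phy)
{
	phy->machine_starts++;
}

/*
 * Models the check proposed for the resume/restore path: restart the
 * state machine only for an attached PHY that has a link callback,
 * and leave the PHY state itself untouched.
 */
static bool model_restore(struct phy_model *phy)
{
	assert(phy != NULL);

	if (phy->attached_dev && phy->adjust_link) {
		model_start_machine(phy);
		return true;
	}
	return false;
}

/* Trivial callback so that a test PHY can look "attached". */
static void model_noop_adjust_link(struct phy_model *phy)
{
	(void)phy;
}
```

With this shape, a PHY that was never attached before hibernation is left alone on restore, which matches the behaviour reported as fixing the issue in the thread.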
Re: [PATCH v2 5/5] ASoC: qcom: Kconfig: select config for codec
On Wed, Nov 28, 2018 at 5:01 PM Cheng-Yi Chiang wrote: > > Select SND_SOC_RT5663 and SND_SOC_MAX98927 for SND_SOC_SDM845. > > Signed-off-by: Rohit kumar > Signed-off-by: Cheng-Yi Chiang > --- > sound/soc/qcom/Kconfig | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/sound/soc/qcom/Kconfig b/sound/soc/qcom/Kconfig > index 2a4c912d1e484..3528c4279cbae 100644 > --- a/sound/soc/qcom/Kconfig > +++ b/sound/soc/qcom/Kconfig > @@ -100,6 +100,8 @@ config SND_SOC_SDM845 > depends on QCOM_APR > select SND_SOC_QDSP6 > select SND_SOC_QCOM_COMMON > + select SND_SOC_RT5663 > + select SND_SOC_MAX98927 This line is actually not needed. We can drop this patch as there was another patch merged already: https://lkml.org/lkml/2018/12/10/875. Thanks a lot for taking a look! > help > To add support for audio on Qualcomm Technologies Inc. > SDM845 SoC-based systems. > -- > 2.20.0.rc0.387.gc7a69e6b6c-goog >
[RFC PATCH 2/2] mm: swap: add comment for swap_vma_readahead
swap_vma_readahead() is missing its kernel-doc comment, so add it. Cc: Huang Ying Cc: Tim Chen Cc: Minchan Kim Signed-off-by: Yang Shi --- mm/swap_state.c | 17 + 1 file changed, 17 insertions(+) diff --git a/mm/swap_state.c b/mm/swap_state.c index 7cc3c29..c12aedf 100644 --- a/mm/swap_state.c +++ b/mm/swap_state.c @@ -695,6 +695,23 @@ static void swap_ra_info(struct vm_fault *vmf, pte_unmap(orig_pte); } +/** + * swap_vma_readahead - swap in pages in hope we need them soon + * @fentry: swap entry of this memory + * @gfp_mask: memory allocation flags + * @vmf: fault information + * + * Returns the struct page for entry and addr, after queueing swapin. + * + * Primitive swap readahead code. We simply read in a few pages whose + * virtual addresses are around the fault address in the same vma. + * + * This has been extended to use the NUMA policies from the mm triggering + * the readahead. + * + * Caller must hold down_read on the vma->vm_mm if vmf->vma is not NULL. + * + */ static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask, struct vm_fault *vmf) { -- 1.8.3.1
[PATCH] powerpc/setup: display reason for not booting
When no machine description matches, display it clearly before looping forever. Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/setup-common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index 4fe7740917a7..ef7fb60534a8 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -634,7 +634,7 @@ void probe_machine(void) } /* What can we do if we didn't find ? */ if (machine_id >= &__machine_desc_end) { - DBG("No suitable machine found !\n"); + pr_err("No suitable machine description found !\n"); for (;;); } -- 2.13.3
[RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not
Swap readahead reads in a few pages regardless of whether the underlying device is busy. This may incur a long wait if the device is congested, and it may also exacerbate the congestion. Use inode_read_congested() to check whether the underlying device is busy, as file page readahead does. Get the inode from swap_info_struct. Although we could add inode information to swap_address_space (address_space->host), doing so may lead to unexpected side effects, e.g. it may break mapping_cap_account_dirty(). Using the inode from swap_info_struct seems simple and good enough. Do the check only in swap_cluster_readahead(), since swap_vma_readahead() is used only for non-rotational devices, which are much less likely to be congested than traditional HDDs. Although swap slots may be consecutive on a swap partition, they may still be fragmented on a swap file. This check helps to reduce excessive stalls in such cases. Cc: Huang Ying Cc: Tim Chen Cc: Minchan Kim Signed-off-by: Yang Shi --- mm/swap_state.c | 4 1 file changed, 4 insertions(+) diff --git a/mm/swap_state.c b/mm/swap_state.c index fd2f21e..7cc3c29 100644 --- a/mm/swap_state.c +++ b/mm/swap_state.c @@ -538,11 +538,15 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask, bool do_poll = true, page_allocated; struct vm_area_struct *vma = vmf->vma; unsigned long addr = vmf->address; + struct inode *inode = si->swap_file->f_mapping->host; mask = swapin_nr_pages(offset) - 1; if (!mask) goto skip; + if (inode_read_congested(inode)) + goto skip; + do_poll = false; /* Read a page_cluster sized and aligned cluster around offset. */ start_offset = offset & ~mask; -- 1.8.3.1
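The decision made in the patch above can be illustrated with a small userspace model. It is a sketch under stated assumptions: struct ra_window and model_ra_window() are hypothetical names, nr_pages stands in for the value returned by swapin_nr_pages(), and the congested flag stands in for the result of inode_read_congested(inode). Details of the real function that are not visible in the hunk (for example any clamping of the window to the size of the swap area) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type: the slot range a readahead pass would cover. */
struct ra_window {
	unsigned long start; /* first swap offset to read */
	unsigned long end;   /* last swap offset to read (inclusive) */
};

/*
 * Model of the readahead decision. nr_pages must be a power of two;
 * congested models a busy backing device.
 */
static struct ra_window model_ra_window(unsigned long offset,
					unsigned long nr_pages,
					bool congested)
{
	struct ra_window w = { offset, offset }; /* "skip": faulting slot only */
	unsigned long mask;

	assert(nr_pages != 0 && (nr_pages & (nr_pages - 1)) == 0);

	mask = nr_pages - 1;
	if (!mask || congested)
		return w;

	/* A nr_pages-sized, nr_pages-aligned cluster around offset. */
	w.start = offset & ~mask;
	w.end = offset | mask;
	return w;
}
```

For example, with offset 13 and an 8-page window the model covers slots 8 through 15 when the device is idle, and only slot 13 when it is congested, which is the stall the patch tries to avoid.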
Re: [PATCH v2 3/7] dt-bindings: remoteproc: qcom: Fixup regulator dependencies
Hi Doug, Thanks for the review :) On 2018-12-18 05:30, Doug Anderson wrote: Hi, On Mon, Dec 17, 2018 at 2:08 AM Sibi Sankar wrote: Fixup regulator supply dependencies for Q6V5 MSS on MSM8996 SoCs. Signed-off-by: Sibi Sankar --- Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt index 780adc043b37..98894e6ad456 100644 --- a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt +++ b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt @@ -76,7 +76,9 @@ on the Qualcomm Hexagon core. Usage: required Value type: Definition: reference to the regulators to be held on behalf of the - booting of the Hexagon core + booting of the Hexagon core on MSM8916 SoCs + reference to the pll-supply regulator to be held on behalf + of the booting of the Hexagon core on MSM8996 SoCs The prose gets in the way and doesn't add anything. I also don't understand what you're saying for msm8996. You're saying that "pll-supply" is required there but none of the others? That doesn't seem to be true in the code I have in front of me, but maybe I'm missing some patch. For me, I'd write: AFAIK, only the exceptions are captured. But your suggestion seems more simple/complete. Perhaps I'll replace SoCs instead of compatibles? Anyway I'll wait for Bjorn/Rob's preference. For the compatible strings below the following supplies are required: "qcom,q6v5-pil" "qcom,msm8916-mss-pil", "qcom,msm8974-mss-pil" - cx-supply: - mss-supply: - mx-supply: - pll-supply: Usage: required Value type: Definition: reference to the regulators to be held on behalf of the booting of the Hexagon core ...and if msm8996 actually needs "pll-supply", you could add in...
For the compatible strings below the following supplies are required: "qcom,msm8996-mss-pil" - pll-supply: Usage: required Value type: Definition: reference to the regulators to be held on behalf of the booting of the Hexagon core -- -- Sibi Sankar -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH][resend] drm: dw-hdmi-i2s: convert to SPDX identifiers
Hi Morimoto-san, Thank you for the patch. On Tuesday, 18 December 2018 08:00:24 EET Kuninori Morimoto wrote: > From: Kuninori Morimoto > > This patch updates license to use SPDX-License-Identifier > instead of verbose license text. > > Signed-off-by: Kuninori Morimoto Reviewed-by: Laurent Pinchart > --- > few weeks passed, nothing happen. I re-post this patch again. > I added Andrew on Cc The driver seems to be lacking a maintainer :-S > drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 5 + > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c > b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c index > 8f9c8a6..2228689 100644 > --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c > +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c > @@ -1,12 +1,9 @@ > +// SPDX-License-Identifier: GPL-2.0 > /* > * dw-hdmi-i2s-audio.c > * > * Copyright (c) 2017 Renesas Solutions Corp. > * Kuninori Morimoto > - * > - * This program is free software; you can redistribute it and/or modify > - * it under the terms of the GNU General Public License version 2 as > - * published by the Free Software Foundation. > */ > #include -- Regards, Laurent Pinchart
Re: [RFC PATCH net v3] net: phy: Fix the issue that netif always links up after resuming
On 18.12.2018 07:25, Kunihiko Hayashi wrote: > Hi Heiner, > > On Mon, 17 Dec 2018 19:43:31 +0100 wrote: > >> On 17.12.2018 19:41, Heiner Kallweit wrote: >>> On 17.12.2018 07:41, Kunihiko Hayashi wrote: Hi, Gentle ping... Are there any comments about changes since v2? v2: https://www.spinics.net/lists/netdev/msg536926.html Thank you, On Mon, 3 Dec 2018 17:22:29 +0900 wrote: > Even though the link is down before entering hibernation, > there is an issue that the network interface always links up after > resuming > from hibernation. > > The phydev->state is PHY_READY before enabling the network interface, so > the link is down. After resuming from hibernation, the phydev->state is > forcibly set to PHY_UP in mdio_bus_phy_restore(), and the link becomes up. > > This patch adds a new convenient function to check whether the PHY is in > a started state, and expects to solve the issue by changing phydev->state > to PHY_UP and calling phy_start_machine() only when the PHY is started. > >>> This convenience function and the related change to phy_stop() are part of >>> the following already and don't need to be part of your patch. >>> https://patchwork.ozlabs.org/patch/1014171/ > > I see. I'll follow your patch when necessary. 
> > Suggested-by: Heiner Kallweit > Signed-off-by: Kunihiko Hayashi > --- > > Changes since v2: > - add mutex lock/unlock for changing phydev->state > - check whether the mutex is locked in phy_is_started() > > Changes since v1: > - introduce a new helper function phy_is_started() and use it instead of >checking link status > - replace checking phydev->state with phy_is_started() in >phy_stop_machine() > > drivers/net/phy/phy.c| 2 +- > drivers/net/phy/phy_device.c | 12 +--- > include/linux/phy.h | 13 + > 3 files changed, 23 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c > index 1d73ac3..f484d03 100644 > --- a/drivers/net/phy/phy.c > +++ b/drivers/net/phy/phy.c > @@ -670,7 +670,7 @@ void phy_stop_machine(struct phy_device *phydev) > cancel_delayed_work_sync(&phydev->state_queue); > > mutex_lock(&phydev->lock); > - if (phydev->state > PHY_UP && phydev->state != PHY_HALTED) > + if (phy_is_started(phydev)) > phydev->state = PHY_UP; >>> >>> I'm wondering whether we need to do this. If the PHY is attached, >>> then mdio_bus_phy_suspend() calls phy_stop_machine() which does >>> exactly the same. If the PHY is not attached, then we don't have >>> to do anything. Therefore I think we just have to do the same as >>> in mdio_bus_phy_resume(): >>> >>> if (phydev->attached_dev && phydev->adjust_link) >>> phy_start_machine(phydev); > > Agreed. > > Although the original code changed phydev->state, > it seems that it's only enough to > - call phy_stop_machine() in mdio_bus_phy_suspend() > - call phy_start_machine() in mdio_bus_phy_resume() and mdio_bus_phy_restore() > if the PHY is attached. > >>> Can you test this? > > I tested your code instead of applying my entire patch, and I confirmed > that the code solved the issue in my environment. > > Do you make new patch instead of my patch? > (and I can add Reported-by: for the issue and Tested-by:) > Up to you. 
It's fine with me if you submit the patch, but I can also do it and mention you in Reported-by and Tested-by. Just let me know. > Thank you, > > >>> >> Sorry for the confusion, this comment is related to the next part >> of your patch. >> > mutex_unlock(&phydev->lock); > } > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c > index ab33d17..4897d24 100644 > --- a/drivers/net/phy/phy_device.c > +++ b/drivers/net/phy/phy_device.c > @@ -309,10 +309,16 @@ static int mdio_bus_phy_restore(struct device *dev) > return ret; > > /* The PHY needs to renegotiate. */ > - phydev->link = 0; > - phydev->state = PHY_UP; > + mutex_lock(&phydev->lock); > + if (phy_is_started(phydev)) { > + phydev->state = PHY_UP; > + mutex_unlock(&phydev->lock); > + phydev->link = 0; > + phy_start_machine(phydev); > + } else { > + mutex_unlock(&phydev->lock); > + } > > - phy_start_machine(phydev); > > return 0; > } > diff --git a/include/linux/phy.h b/include/linux/phy.h > index 3ea87f7..dd21537 100644 > --- a/include/linux/phy.h > +++ b/include/linux/phy.h > @@ -898,6 +898,19 @@ static inline bool phy_is_pseudo_fixed_link(struct > phy_device *phydev) > } > > /** > + * phy_is_started - Convenience function for testing whether a PHY is in > + * a started state > + * @phydev: the phy_device s
Re: [PATCH] pinctrl: xway: fix gpio-hog related boot issues
On 2018-12-17 17:45, John Crispin wrote: On 17/12/2018 15:32, Linus Walleij wrote: On Fri, Dec 14, 2018 at 8:48 AM Martin Schiller wrote: This patch is based on commit a86caa9ba5d7 ("pinctrl: msm: fix gpio-hog related boot issues"). It fixes the issue that the gpio ranges need to be defined before gpiochip_add(). Therefore, we also have to swap the order of registering the pinctrl driver and registering the gpio chip. You also have to add the "gpio-ranges" property to the pinctrl device node to get it finally working. Signed-off-by: Martin Schiller Patch applied unless John Crispin has objections, it looks good to me! Yours, Linus Walleij Sorry, I did not see the patch in my inbox. Sorry, that was my fault. I've added everyone from the get_maintainer.pl output, but forgot to also add you. Regards, Martin
Re: [PATCH v2 6/7] arm64: dts: qcom: sdm845: Add power-domain for Q6V5 MSS node
Hi Doug, Thanks for the review :) On 2018-12-18 05:32, Doug Anderson wrote: Hi, On Mon, Dec 17, 2018 at 2:08 AM Sibi Sankar wrote: Add power-domains cx, mx, mss and load_state for Q6V5 MSS node. Signed-off-by: Sibi Sankar --- This patch depends on the following bindings: https://patchwork.kernel.org/patch/10725801/ - rpmhpd dt bindings https://patchwork.kernel.org/patch/10725793/ - rpmhpd dt node https://patchwork.kernel.org/patch/10678301/ - AOP QMP dt bindings arch/arm64/boot/dts/qcom/sdm845.dtsi | 6 ++ 1 file changed, 6 insertions(+) As per my comments on patch #5, I think this patch (AKA patch #6) should be folded in there. okay diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi index 33ff8668828f..56f5f55db9e2 100644 --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi @@ -1401,6 +1401,12 @@ qcom,halt-regs = <&tcsr_mutex_regs 0x23000 0x25000 0x24000>; + power-domains = <&aoss_qmp_pd AOSS_QMP_LS_MODEM>, + <&rpmhpd SDM845_CX>, + <&rpmhpd SDM845_MX>, + <&rpmhpd SDM845_MSS>; + power-domain-names = "load_state", "cx", "mx", "mss"; I guess you changed this to "load_state" from "aop" before? Is there code that actually uses this? Bjorn said he will be posting the patch for handling power-domains for mss.. -Doug -Doug -- -- Sibi Sankar -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH v2 5/7] arm64: dts: qcom: sdm845: Add Q6V5 MSS node
Hi Doug, Thanks for the review :) On 2018-12-18 05:32, Doug Anderson wrote: Hi, On Mon, Dec 17, 2018 at 2:08 AM Sibi Sankar wrote: This patch adds Q6V5 MSS remoteproc node for SDM845 SoCs. Signed-off-by: Sibi Sankar --- v2: * Fixed style changes * Added missing clocks in the dt-bindings * Split mss remoteproc node into a number of patches I know there was some off-list suggestion to split this into a number of patches, but to actually make that useful to anyone we'd actually need to _also_ post up patches to make the driver probe / work without these power domains. ...and as per other discussions it's kinda "lucky" that it happens to work without them and Bjorn wasn't supportive of making this optional. So I'd actually fold patch 6 into patch 5 and focus on getting the "aoss_qmp_pd" landed sooner rather than later. I'll fold them in v3 Keeping the "shutdown-ack" as a separate patch makes sense though since the bindings currently list that as "optional" and I guess things work OK w/out it. Once patch #6 is folded into patch #5 feel free to add my Reviewed-by tag. okay -- -- Sibi Sankar -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH v6 4/6] mm: Shuffle initial free memory to improve memory-side-cache utilization
Hi Dan, I love your patch! Yet something to improve: [auto build test ERROR on linus/master] [also build test ERROR on v4.20-rc7 next-20181217] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Dan-Williams/mm-Randomize-free-memory/20181218-130230 config: x86_64-randconfig-x010-201850 (attached as .config) compiler: gcc-7 (Debian 7.3.0-1) 7.3.0 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 All errors (new ones prefixed by >>): mm/memblock.c: In function 'memblock_set_sidecache': >> mm/memblock.c:859:4: error: too many arguments to function >> 'page_alloc_shuffle' page_alloc_shuffle(SHUFFLE_ENABLE); ^~ In file included from mm/memblock.c:20:0: include/linux/shuffle.h:43:20: note: declared here static inline void page_alloc_shuffle(void) ^~ vim +/page_alloc_shuffle +859 mm/memblock.c 825 826 #ifdef CONFIG_HAVE_MEMBLOCK_CACHE_INFO 827 /** 828 * memblock_set_sidecache - set the system memory cache info 829 * @base: base address of the region 830 * @size: size of the region 831 * @cache_size: system side cache size in bytes 832 * @direct: true if the cache has direct mapped associativity 833 * 834 * This function isolates region [@base, @base + @size), and saves the cache 835 * information. 836 * 837 * Return: 0 on success, -errno on failure. 838 */ 839 int __init_memblock memblock_set_sidecache(phys_addr_t base, phys_addr_t size, 840 phys_addr_t cache_size, bool direct_mapped) 841 { 842 struct memblock_type *type = &memblock.memory; 843 int i, ret, start_rgn, end_rgn; 844 845 ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn); 846 if (ret) 847 return ret; 848 849 for (i = start_rgn; i < end_rgn; i++) { 850 struct memblock_region *r = &type->regions[i]; 851 852 r->cache_size = cache_size; 853 r->direct_mapped = direct_mapped; 854 /* 855 * Enable randomization for amortizing direct-mapped 856 * memory-side-cache conflicts. 
857 */ 858 if (r->size > r->cache_size && r->direct_mapped) > 859 page_alloc_shuffle(SHUFFLE_ENABLE); 860 } 861 862 return 0; 863 } 864 #endif 865 --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
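The build error above is the usual symptom of a compiled-out stub whose argument list has drifted from the real function: the caller passes a control value, but the static inline fallback is declared as taking (void). A minimal sketch of a layout that avoids this is shown below. Every name in it (MODEL_CONFIG_SHUFFLE, enum shuffle_ctl_model, model_page_alloc_shuffle(), model_set_sidecache(), model_shuffle_requests) is hypothetical and chosen for illustration; it is not the contents of include/linux/shuffle.h and not necessarily how the series was fixed.

```c
/* Hypothetical names throughout; only the shape of the fix matters. */
enum shuffle_ctl_model { MODEL_SHUFFLE_ENABLE, MODEL_SHUFFLE_FORCE_DISABLE };

static int model_shuffle_requests; /* counts enable requests, for illustration */

#ifdef MODEL_CONFIG_SHUFFLE
/* Feature built in: the implementation takes a control argument. */
static inline void model_page_alloc_shuffle(enum shuffle_ctl_model ctl)
{
	if (ctl == MODEL_SHUFFLE_ENABLE)
		model_shuffle_requests++;
}
#else
/*
 * Feature compiled out: the stub must accept the same argument list.
 * Declaring it as (void) here would reproduce the "too many arguments"
 * error for every caller that passes a control value.
 */
static inline void model_page_alloc_shuffle(enum shuffle_ctl_model ctl)
{
	(void)ctl;
}
#endif

/* A caller that compiles unchanged under either configuration. */
static int model_set_sidecache(unsigned long size, unsigned long cache_size,
			       int direct_mapped)
{
	if (size > cache_size && direct_mapped)
		model_page_alloc_shuffle(MODEL_SHUFFLE_ENABLE);
	return 0;
}
```

Building the sketch with and without -DMODEL_CONFIG_SHUFFLE exercises both branches, which is roughly what a randconfig build such as the one in this report does for the real option.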
Re: [PATCH v2 4/7] dt-bindings: remoteproc: qcom: Add power-domain bindings for Q6V5
Hi Doug, Thanks for the review :) On 2018-12-18 05:31, Doug Anderson wrote: Hi, On Mon, Dec 17, 2018 at 2:08 AM Sibi Sankar wrote: Add power-domain bindings for Q6V5 MSS on MSM8996 and SDM845 SoCs. Reviewed-by: Rob Herring Signed-off-by: Sibi Sankar --- v2: * Add load_state power-domain * List cx and mx power-domains for MSM8996 .../devicetree/bindings/remoteproc/qcom,q6v5.txt | 16 1 file changed, 16 insertions(+) diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt index 98894e6ad456..50695cd86397 100644 --- a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt +++ b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt @@ -80,6 +80,22 @@ on the Qualcomm Hexagon core. reference to the pll-supply regulator to be held on behalf of the booting of the Hexagon core on MSM8996 SoCs +- power-domains: + Usage: required + Value type: + Definition: reference to the list of 2 power-domains for the modem + sub-system on MSM8996 SoCs This is truly required for msm8996 SoCs? The code I'm looking at doesn't try to get these power domains for 8996 so presumably you're breaking backward compatibility with old device tree files by making this required now. I don't personally know how widespread msm8996 usage is w/ upstream, so I'd let Bjorn comment on whether he thinks this is OK. This is one of the reasons why the dt node for mss on 8996 has not been posted/merged upstream. Hence backward compatibility is not broken yet in mainline :) .. However it will break on official linaro integration releases (old dt + new kernel) As with the other patches in this series, I personally prefer less prose and more lists / tables of exactly what is required for which compatible string. 
+ reference to the list of 4 power-domains for the modem + sub-system on SDM845 SoCs + +- power-domain-names: + Usage: required + Value type: + Definition: must be "cx", "mx" for the modem sub-system on MSM8996 + SoCs + must be "cx", "mx", "mss", "load_state" for the modem + sub-system on SDM845 SoCs I haven't see a patch for using "load_state". Can you point at it? I guess this was "aop" in your last version? using load_state was Bjorn's suggestion and seemed more appropriate than aop -Doug -- -- Sibi Sankar -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [RFC PATCH net v3] net: phy: Fix the issue that netif always links up after resuming
Hi Heiner, On Mon, 17 Dec 2018 19:43:31 +0100 wrote: > On 17.12.2018 19:41, Heiner Kallweit wrote: > > On 17.12.2018 07:41, Kunihiko Hayashi wrote: > >> Hi, > >> > >> Gentle ping... > >> Are there any comments about changes since v2? > >> > >> v2: https://www.spinics.net/lists/netdev/msg536926.html > >> > >> Thank you, > >> > >> On Mon, 3 Dec 2018 17:22:29 +0900 wrote: > >> > >>> Even though the link is down before entering hibernation, > >>> there is an issue that the network interface always links up after > >>> resuming > >>> from hibernation. > >>> > >>> The phydev->state is PHY_READY before enabling the network interface, so > >>> the link is down. After resuming from hibernation, the phydev->state is > >>> forcibly set to PHY_UP in mdio_bus_phy_restore(), and the link becomes up. > >>> > >>> This patch adds a new convenient function to check whether the PHY is in > >>> a started state, and expects to solve the issue by changing phydev->state > >>> to PHY_UP and calling phy_start_machine() only when the PHY is started. > >>> > > This convenience function and the related change to phy_stop() are part of > > the following already and don't need to be part of your patch. > > https://patchwork.ozlabs.org/patch/1014171/ I see. I'll follow your patch when necessary. 
> >>> Suggested-by: Heiner Kallweit > >>> Signed-off-by: Kunihiko Hayashi > >>> --- > >>> > >>> Changes since v2: > >>> - add mutex lock/unlock for changing phydev->state > >>> - check whether the mutex is locked in phy_is_started() > >>> > >>> Changes since v1: > >>> - introduce a new helper function phy_is_started() and use it instead of > >>>checking link status > >>> - replace checking phydev->state with phy_is_started() in > >>>phy_stop_machine() > >>> > >>> drivers/net/phy/phy.c| 2 +- > >>> drivers/net/phy/phy_device.c | 12 +--- > >>> include/linux/phy.h | 13 + > >>> 3 files changed, 23 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c > >>> index 1d73ac3..f484d03 100644 > >>> --- a/drivers/net/phy/phy.c > >>> +++ b/drivers/net/phy/phy.c > >>> @@ -670,7 +670,7 @@ void phy_stop_machine(struct phy_device *phydev) > >>> cancel_delayed_work_sync(&phydev->state_queue); > >>> > >>> mutex_lock(&phydev->lock); > >>> - if (phydev->state > PHY_UP && phydev->state != PHY_HALTED) > >>> + if (phy_is_started(phydev)) > >>> phydev->state = PHY_UP; > > > > I'm wondering whether we need to do this. If the PHY is attached, > > then mdio_bus_phy_suspend() calls phy_stop_machine() which does > > exactly the same. If the PHY is not attached, then we don't have > > to do anything. Therefore I think we just have to do the same as > > in mdio_bus_phy_resume(): > > > > if (phydev->attached_dev && phydev->adjust_link) > > phy_start_machine(phydev); Agreed. Although the original code changed phydev->state, it seems that it's only enough to - call phy_stop_machine() in mdio_bus_phy_suspend() - call phy_start_machine() in mdio_bus_phy_resume() and mdio_bus_phy_restore() if the PHY is attached. > > Can you test this? I tested your code instead of applying my entire patch, and I confirmed that the code solved the issue in my environment. Do you make new patch instead of my patch? 
(and I can add Reported-by: for the issue and Tested-by:) Thank you, > > > Sorry for the confusion, this comment is related to the next part > of your patch. > > >>> mutex_unlock(&phydev->lock); > >>> } > >>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c > >>> index ab33d17..4897d24 100644 > >>> --- a/drivers/net/phy/phy_device.c > >>> +++ b/drivers/net/phy/phy_device.c > >>> @@ -309,10 +309,16 @@ static int mdio_bus_phy_restore(struct device *dev) > >>> return ret; > >>> > >>> /* The PHY needs to renegotiate. */ > >>> - phydev->link = 0; > >>> - phydev->state = PHY_UP; > >>> + mutex_lock(&phydev->lock); > >>> + if (phy_is_started(phydev)) { > >>> + phydev->state = PHY_UP; > >>> + mutex_unlock(&phydev->lock); > >>> + phydev->link = 0; > >>> + phy_start_machine(phydev); > >>> + } else { > >>> + mutex_unlock(&phydev->lock); > >>> + } > >>> > >>> - phy_start_machine(phydev); > >>> > >>> return 0; > >>> } > >>> diff --git a/include/linux/phy.h b/include/linux/phy.h > >>> index 3ea87f7..dd21537 100644 > >>> --- a/include/linux/phy.h > >>> +++ b/include/linux/phy.h > >>> @@ -898,6 +898,19 @@ static inline bool phy_is_pseudo_fixed_link(struct > >>> phy_device *phydev) > >>> } > >>> > >>> /** > >>> + * phy_is_started - Convenience function for testing whether a PHY is in > >>> + * a started state > >>> + * @phydev: the phy_device struct > >>> + * > >>> + * The caller must have taken the phy_device mutex lock. > >>> + */ > >>> +static inline bool phy_is_started(struct phy_device *phydev) > >>> +{ > >>> + WARN_ON(!mutex_is_locked(&phydev->lock)); > >>> + return phyd
Re: [PATCH 01/18] mfd: aat2870-core: Make it explicitly non-modular
Acked-by: Jin Park

Thanks,
Jinyoung.

On 12/18/18 5:31 AM, Paul Gortmaker wrote:

The Kconfig currently controlling compilation of this code is:

drivers/mfd/Kconfig:config MFD_AAT2870_CORE
drivers/mfd/Kconfig:bool "AnalogicTech AAT2870"

...meaning that it currently is not being built as a module by anyone.

Lets remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

We explicitly disallow a driver unbind, since that doesn't have a
sensible use case anyway, and it allows us to drop the ".remove"
code for non-modular drivers.

Since module_init was not in use by this code, the init ordering
remains unchanged with this commit.

Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code.

We also delete the MODULE_LICENSE tag etc. since all that information
is already contained at the top of the file in the comments.

Cc: Lee Jones
Cc: Jin Park
Signed-off-by: Paul Gortmaker
Acked-by: Linus Walleij
---
 drivers/mfd/aat2870-core.c | 40 +++-
 1 file changed, 3 insertions(+), 37 deletions(-)

diff --git a/drivers/mfd/aat2870-core.c b/drivers/mfd/aat2870-core.c
index 3ba19a45f199..9d3d90d386c2 100644
--- a/drivers/mfd/aat2870-core.c
+++ b/drivers/mfd/aat2870-core.c
@@ -20,7 +20,6 @@
  */
 
 #include
-#include
 #include
 #include
 #include
@@ -349,18 +348,10 @@ static void aat2870_init_debugfs(struct aat2870_data *aat2870)
 			"Failed to create debugfs register file\n");
 }
 
-static void aat2870_uninit_debugfs(struct aat2870_data *aat2870)
-{
-	debugfs_remove_recursive(aat2870->dentry_root);
-}
 #else
 static inline void aat2870_init_debugfs(struct aat2870_data *aat2870)
 {
 }
-
-static inline void aat2870_uninit_debugfs(struct aat2870_data *aat2870)
-{
-}
 #endif /* CONFIG_DEBUG_FS */
 
 static int aat2870_i2c_probe(struct i2c_client *client,
@@ -440,20 +431,6 @@ static int aat2870_i2c_probe(struct i2c_client *client,
 	return ret;
 }
 
-static int aat2870_i2c_remove(struct i2c_client *client)
-{
-	struct aat2870_data *aat2870 = i2c_get_clientdata(client);
-
-	aat2870_uninit_debugfs(aat2870);
-
-	mfd_remove_devices(aat2870->dev);
-	aat2870_disable(aat2870);
-	if (aat2870->uninit)
-		aat2870->uninit(aat2870);
-
-	return 0;
-}
-
 #ifdef CONFIG_PM_SLEEP
 static int aat2870_i2c_suspend(struct device *dev)
 {
@@ -492,15 +469,14 @@ static const struct i2c_device_id aat2870_i2c_id_table[] = {
 	{ "aat2870", 0 },
 	{ }
 };
-MODULE_DEVICE_TABLE(i2c, aat2870_i2c_id_table);
 
 static struct i2c_driver aat2870_i2c_driver = {
 	.driver = {
-		.name = "aat2870",
-		.pm = &aat2870_pm_ops,
+		.name = "aat2870",
+		.pm = &aat2870_pm_ops,
+		.suppress_bind_attrs= true,
 	},
 	.probe = aat2870_i2c_probe,
-	.remove = aat2870_i2c_remove,
 	.id_table = aat2870_i2c_id_table,
 };
 
@@ -509,13 +485,3 @@ static int __init aat2870_init(void)
 	return i2c_add_driver(&aat2870_i2c_driver);
 }
 subsys_initcall(aat2870_init);
-
-static void __exit aat2870_exit(void)
-{
-	i2c_del_driver(&aat2870_i2c_driver);
-}
-module_exit(aat2870_exit);
-
-MODULE_DESCRIPTION("Core support for the AnalogicTech AAT2870");
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Jin Park ");
Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
On Mon, Dec 17, 2018 at 10:34:43AM -0800, Matthew Wilcox wrote:
> On Mon, Dec 17, 2018 at 01:11:50PM -0500, Jerome Glisse wrote:
> > On Mon, Dec 17, 2018 at 08:58:19AM +1100, Dave Chinner wrote:
> > > Sure, that's a possibility, but that doesn't close off any race
> > > conditions because there can be DMA into the page in progress while
> > > the page is being bounced, right? AFAICT this ext3+DIF/DIX case is
> > > different in that there is no 3rd-party access to the page while it
> > > is under IO (ext3 arbitrates all access to it's metadata), and so
> > > nothing can actually race for modification of the page between
> > > submission and bouncing at the block layer.
> > >
> > > In this case, the moment the page is unlocked, anyone else can map
> > > it and start (R)DMA on it, and that can happen before the bio is
> > > bounced by the block layer. So AFAICT, block layer bouncing doesn't
> > > solve the problem of racing writeback and DMA direct to the page we
> > > are doing IO on. Yes, it reduces the race window substantially, but
> > > it doesn't get rid of it.
> >
> > So the event flow is:
> >   - userspace create object that match a range of virtual address
> >     against a given kernel sub-system (let's say infiniband) and
> >     let's assume that the range is an mmap() of a regular file
> >   - device driver do GUP on the range (let's assume it is a write
> >     GUP) so if the page is not already map with write permission
> >     in the page table than a page fault is trigger and page_mkwrite
> >     happens
> >   - Once GUP return the page to the device driver and once the
> >     device driver as updated the hardware states to allow access
> >     to this page then from that point on hardware can write to the
> >     page at _any_ time, it is fully disconnected from any fs event
> >     like write back, it fully ignore things like page_mkclean
> >
> > This is how it is to day, we allowed people to push upstream such
> > users of GUP. This is a fact we have to live with, we can not stop
> > hardware access to the page, we can not force the hardware to follow
> > page_mkclean and force a page_mkwrite once write back ends. This is
> > the situation we are inheriting (and i am personnaly not happy with
> > that).
> >
> > From my point of view we are left with 2 choices:
> >   [C1] break all drivers that do not abide by the page_mkclean and
> >        page_mkwrite
> >   [C2] mitigate as much as possible the issue
> >
> > For [C2] the idea is to keep track of GUP per page so we know if we
> > can expect the page to be written to at any time. Here is the event
> > flow:
> >   - driver GUP the page and program the hardware, page is mark as
> >     GUPed
> >   ...
> >   - write back kicks in on the dirty page, lock the page and every
> >     thing as usual , sees it is GUPed and inform the block layer to
> >     use a bounce page
>
> No. The solution John, Dan & I have been looking at is to take the
> dirty page off the LRU while it is pinned by GUP. It will never be
> found for writeback.
>
> That's not the end of the story though. Other parts of the kernel (eg
> msync) also need to be taught to stay away from pages which are pinned
> by GUP. But the idea is that no page gets written back to storage while
> it's pinned by GUP. Only when the last GUP ends is the page returned
> to the list of dirty pages.

Errr... what does fsync do in the meantime? Not write the page?
That would seem to break what fsync() is supposed to do.

--D

> >   - block layer copy the page to a bounce page effectively creating
> >     a snapshot of what is the content of the real page. This allows
> >     everything in block layer that need stable content to work on
> >     the bounce page (raid, stripping, encryption, ...)
> >   - once write back is done the page is not marked clean but stays
> >     dirty, this effectively disable things like COW for filesystem
> >     and other feature that expect page_mkwrite between write back.
> >     AFAIK it is believe that it is something acceptable
>
> So none of this is necessary.
>
Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
Hi,

On 2018/12/17 18:51, Tetsuo Handa wrote:
> On 2018/12/17 18:33, Michal Hocko wrote:
>> On Sun 16-12-18 19:51:57, Matthew Wilcox wrote:
>> [...]
>>> Ah, yes, that makes perfect sense. Thank you for the explanation.
>>>
>>> I wonder if the correct fix, however, is not to move the check for
>>> GFP_NOFS in out_of_memory() down to below the check whether to kill
>>> the current task. That would solve your problem, and I don't _think_
>>> it would cause any new ones. Michal, you touched this code last, what
>>> do you think?
>>
>> What do you mean exactly? Whether we kill a current task or something
>> else doesn't change much on the fact that NOFS is a reclaim restricted
>> context and we might kill too early. If the fs can do GFP_FS then it is
>> obviously a better thing to do because FS metadata can be reclaimed as
>> well and therefore there is potentially less memory pressure on
>> application data.
>>
>
> I interpreted "to move the check for GFP_NOFS in out_of_memory() down to
> below the check whether to kill the current task" as
>
> @@ -1077,15 +1077,6 @@ bool out_of_memory(struct oom_control *oc)
>  	}
>
>  	/*
> -	 * The OOM killer does not compensate for IO-less reclaim.
> -	 * pagefault_out_of_memory lost its gfp context so we have to
> -	 * make sure exclude 0 mask - all other users should have at least
> -	 * ___GFP_DIRECT_RECLAIM to get here.
> -	 */
> -	if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS))
> -		return true;
> -
> -	/*
>  	 * Check if there were limitations on the allocation (only relevant for
>  	 * NUMA and memcg) that may require different handling.
>  	 */
> @@ -1104,6 +1095,19 @@ bool out_of_memory(struct oom_control *oc)
>  	}
>
>  	select_bad_process(oc);
> +
> +	/*
> +	 * The OOM killer does not compensate for IO-less reclaim.
> +	 * pagefault_out_of_memory lost its gfp context so we have to
> +	 * make sure exclude 0 mask - all other users should have at least
> +	 * ___GFP_DIRECT_RECLAIM to get here.
> +	 */
> +	if ((oc->gfp_mask && !(oc->gfp_mask & __GFP_FS)) && oc->chosen &&
> +	    oc->chosen != (void *)-1UL && oc->chosen != current) {
> +		put_task_struct(oc->chosen);
> +		return true;
> +	}
> +
>  	/* Found nothing?!?! */
>  	if (!oc->chosen) {
>  		dump_header(oc, NULL);
>
> which is prefixed by "the correct fix is not".
>
> Behaving like sysctl_oom_kill_allocating_task == 1 if __GFP_FS is not used
> will not be the correct fix. But ...
>
> Hou Tao wrote:
>> There is no need to disable __GFP_FS in ->readpage:
>> * It's a read-only fs, so there will be no dirty/writeback page and
>>   there will be no deadlock against the caller's locked page
>
> is read-only filesystem sufficient for safe to use __GFP_FS?
>
> Isn't "whether it is safe to use __GFP_FS" depends on "whether fs locks
> are held or not" rather than "whether fs has dirty/writeback page or not" ?
>

In my understanding (correct me if I am wrong), there are three ways
through which reclamation will invoke fs-related code and may cause
a deadlock:

(1) write-back of dirty pages. Not possible for squashfs.
(2) the reclamation of inodes & dentries. The current file is in use, so
    it will not be reclaimed, and for other reclaimable inodes,
    squashfs_destroy_inode() will be invoked and it doesn't take any locks.
(3) a customized shrinker defined by the fs. There is no customized
    shrinker in squashfs.

So my point is that even if a page lock is already held by
squashfs_readpage() and reclamation calls back into squashfs code, there
will be no deadlock, so it's safe to use __GFP_FS.

Regards,
Tao

> .
>
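The condition that the quoted hunk moves around can be read as a small predicate on the allocation mask. The following is only a user-space sketch of that reading, not kernel code: `SKETCH_GFP_FS` is a made-up bit value standing in for `__GFP_FS`, and `struct sketch_oom_control` is a hypothetical reduction of `struct oom_control` to the single field the check reads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for __GFP_FS; the kernel's actual bit differs. */
#define SKETCH_GFP_FS 0x80u

/* Hypothetical reduction of struct oom_control to the field the check reads. */
struct sketch_oom_control {
	unsigned int gfp_mask;
};

/*
 * Returns true for a non-zero mask that lacks the FS bit, i.e. the
 * reclaim-restricted (NOFS) case. A zero mask is deliberately excluded,
 * matching the "exclude 0 mask" remark in the quoted comment.
 */
static bool sketch_is_nofs_context(const struct sketch_oom_control *oc)
{
	return oc->gfp_mask != 0 && !(oc->gfp_mask & SKETCH_GFP_FS);
}
```

With this reading, the disagreement in the thread is about where in out_of_memory() such a predicate should cause an early return, not about the predicate itself.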
[PATCH 2/2] irqchip: irq-renesas-intc-irqpin: convert to SPDX identifiers
From: Kuninori Morimoto

This patch updates license to use SPDX-License-Identifier
instead of verbose license text.

Signed-off-by: Kuninori Morimoto
Reviewed-by: Simon Horman
---
 drivers/irqchip/irq-renesas-intc-irqpin.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/irqchip/irq-renesas-intc-irqpin.c b/drivers/irqchip/irq-renesas-intc-irqpin.c
index c6e6c9e..8c03952 100644
--- a/drivers/irqchip/irq-renesas-intc-irqpin.c
+++ b/drivers/irqchip/irq-renesas-intc-irqpin.c
@@ -1,20 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * Renesas INTC External IRQ Pin Driver
  *
  * Copyright (C) 2013 Magnus Damm
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
  */
 
 #include
-- 
2.7.4
[PATCH v2] ACPI / tables: table override from built-in initrd
In some scenario, we need to build initrd with kernel in a single image.
This can simplify system deployment process by downloading the whole system
once, such as in IC verification.

This patch adds support to override ACPI tables from built-in initrd.

Cc: Joey Zheng
Signed-off-by: Shunyong Yang
---
v2: change "upgrade" to "override" as it's more accurate
---
 Documentation/acpi/initrd_table_override.txt | 4 
 drivers/acpi/Kconfig | 10 ++
 drivers/acpi/tables.c| 12 ++--
 include/linux/initrd.h | 3 +++
 4 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/Documentation/acpi/initrd_table_override.txt b/Documentation/acpi/initrd_table_override.txt
index eb651a6aa285..324d5fb90a22 100644
--- a/Documentation/acpi/initrd_table_override.txt
+++ b/Documentation/acpi/initrd_table_override.txt
@@ -14,6 +14,10 @@ upgrade the ACPI execution environment that is defined by the ACPI tables
 via upgrading the ACPI tables provided by the BIOS with an instrumented,
 modified, more recent version one, or installing brand new ACPI tables.
 
+When building initrd with kernel in a single image, option
+ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD should also be true for this
+purpose.
+
 For a full list of ACPI tables that can be upgraded/installed, take a look
 at the char *table_sigs[MAX_ACPI_SIGNATURE]; definition in
 drivers/acpi/tables.c.
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 7cea769c37df..3b362a1c7685 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -357,6 +357,16 @@ config ACPI_TABLE_UPGRADE
 	  initrd, therefore it's safe to say Y.
 	  See Documentation/acpi/initrd_table_override.txt for details
 
+config ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD
+	bool "Override ACPI tables from built-in initrd"
+	depends on ACPI_TABLE_UPGRADE
+	depends on INITRAMFS_SOURCE!="" && INITRAMFS_COMPRESSION=""
+	def_bool n
+	help
+	  This option provides functionality to override arbitrary ACPI tables
+	  from built-in uncompressed initrd.
+	  See Documentation/acpi/initrd_table_override.txt for details
+
 config ACPI_DEBUG
 	bool "Debug Statements"
 	help
diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 61203eebf3a1..f6a2c5ebabcd 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -473,14 +473,22 @@ static u8 __init acpi_table_checksum(u8 *buffer, u32 length)
 
 void __init acpi_table_upgrade(void)
 {
-	void *data = (void *)initrd_start;
-	size_t size = initrd_end - initrd_start;
+	void *data;
+	size_t size;
 	int sig, no, table_nr = 0, total_offset = 0;
 	long offset = 0;
 	struct acpi_table_header *table;
 	char cpio_path[32] = "kernel/firmware/acpi/";
 	struct cpio_data file;
 
+	if (IS_ENABLED(CONFIG_ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD)) {
+		data = __initramfs_start;
+		size = __initramfs_size;
+	} else {
+		data = (void *)initrd_start;
+		size = initrd_end - initrd_start;
+	}
+
 	if (data == NULL || size == 0)
 		return;
 
diff --git a/include/linux/initrd.h b/include/linux/initrd.h
index 84b423044088..02d94aae54c7 100644
--- a/include/linux/initrd.h
+++ b/include/linux/initrd.h
@@ -22,3 +22,6 @@ extern void free_initrd_mem(unsigned long, unsigned long);
 
 extern unsigned int real_root_dev;
+
+extern char __initramfs_start[];
+extern unsigned long __initramfs_size;
-- 
1.8.3.1
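The source selection added to acpi_table_upgrade() in the hunk above can be illustrated outside the kernel. This is a minimal user-space sketch under stated assumptions: `struct sketch_blob`, `sketch_pick_source()` and `sketch_blob_usable()` are hypothetical names, and the `use_builtin` flag stands in for the compile-time `IS_ENABLED(...)` test in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the chosen archive location. */
struct sketch_blob {
	const void *data;
	size_t size;
};

/*
 * Pick the built-in archive when requested, otherwise the region handed
 * over by the boot loader, which is described by start/end addresses.
 */
static struct sketch_blob sketch_pick_source(bool use_builtin,
					     const void *builtin_start,
					     size_t builtin_size,
					     uintptr_t loader_start,
					     uintptr_t loader_end)
{
	struct sketch_blob blob;

	if (use_builtin) {
		blob.data = builtin_start;
		blob.size = builtin_size;
	} else {
		blob.data = (const void *)loader_start;
		blob.size = (size_t)(loader_end - loader_start);
	}
	return blob;
}

/* Mirrors the early return: nothing to scan without data or with size 0. */
static bool sketch_blob_usable(const struct sketch_blob *blob)
{
	return blob->data != NULL && blob->size != 0;
}
```

In this sketch both sources end up in the same pointer-plus-size form, which is why the rest of the function in the patch needs no further changes.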
[PATCH 1/2] irqchip: irq-renesas-irqc: convert to SPDX identifiers
From: Kuninori Morimoto

This patch updates license to use SPDX-License-Identifier
instead of verbose license text.

Signed-off-by: Kuninori Morimoto
Reviewed-by: Simon Horman
---
 drivers/irqchip/irq-renesas-irqc.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/irqchip/irq-renesas-irqc.c b/drivers/irqchip/irq-renesas-irqc.c
index a4f1112..a449a7c 100644
--- a/drivers/irqchip/irq-renesas-irqc.c
+++ b/drivers/irqchip/irq-renesas-irqc.c
@@ -1,20 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * Renesas IRQC Driver
  *
  * Copyright (C) 2013 Magnus Damm
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
  */
 
 #include
-- 
2.7.4
[PATCH 0/2][resend] irqchip: irq-renesas-xxx: convert to SPDX identifiers
Hi Thomas, Jason, Marc

I posted these patches two weeks ago and nothing has happened,
so I am re-posting them.

Kuninori Morimoto (2):
  irqchip: irq-renesas-irqc: convert to SPDX identifiers
  irqchip: irq-renesas-intc-irqpin: convert to SPDX identifiers

 drivers/irqchip/irq-renesas-intc-irqpin.c | 14 +-
 drivers/irqchip/irq-renesas-irqc.c| 14 +-
 2 files changed, 2 insertions(+), 26 deletions(-)

-- 
2.7.4

Best regards
---
Kuninori Morimoto
[PATCH][resend] drm: dw-hdmi-i2s: convert to SPDX identifiers
From: Kuninori Morimoto

This patch updates license to use SPDX-License-Identifier
instead of verbose license text.

Signed-off-by: Kuninori Morimoto
---
A few weeks have passed and nothing has happened, so I am re-posting this patch.
I added Andrew on Cc.

 drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
index 8f9c8a6..2228689 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
@@ -1,12 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * dw-hdmi-i2s-audio.c
  *
  * Copyright (c) 2017 Renesas Solutions Corp.
  * Kuninori Morimoto
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
  */
 
 #include
-- 
2.7.4
Re: [PATCH v17 18/23] platform/x86: Intel SGX driver
On Thu, Nov 15, 2018 at 5:08 PM Jarkko Sakkinen wrote:
>
> Intel Software Guard eXtensions (SGX) is a set of CPU instructions that
> can be used by applications to set aside private regions of code and
> data. The code outside the enclave is disallowed to access the memory
> inside the enclave by the CPU access control.

This is a very partial review.

> +int sgx_encl_find(struct mm_struct *mm, unsigned long addr,
> +		  struct vm_area_struct **vma)
> +{
> +	struct vm_area_struct *result;
> +	struct sgx_encl *encl;
> +
> +	result = find_vma(mm, addr);
> +	if (!result || result->vm_ops != &sgx_vm_ops || addr <
> result->vm_start)
> +		return -EINVAL;
> +
> +	encl = result->vm_private_data;
> +	*vma = result;
> +
> +	return encl ? 0 : -ENOENT;
> +}

I realize that this function may go away entirely but, if you keep it:
what are the locking rules? What, if anything, prevents another
thread from destroying the enclave after sgx_encl_find() returns?

> +static int sgx_validate_secs(const struct sgx_secs *secs,
> +			     unsigned long ssaframesize)
> +{

...

> +	if (secs->attributes & SGX_ATTR_MODE64BIT) {
> +		if (secs->size > sgx_encl_size_max_64)
> +			return -EINVAL;
> +	} else {
> +		/* On 64-bit architecture allow 32-bit encls only in
> +		 * the compatibility mode.
> +		 */
> +		if (!test_thread_flag(TIF_ADDR32))
> +			return -EINVAL;
> +		if (secs->size > sgx_encl_size_max_32)
> +			return -EINVAL;
> +	}

Why do we need the 32-bit-on-64-bit check? In general, anything that
checks per-task or per-mm flags like TIF_ADDR32 is IMO likely to be
problematic. You're allowing 64-bit enclaves in 32-bit tasks, so I'm
guessing you could just delete the check.

> +
> +	if (!(secs->xfrm & XFEATURE_MASK_FP) ||
> +	    !(secs->xfrm & XFEATURE_MASK_SSE) ||
> +	    (((secs->xfrm >> XFEATURE_BNDREGS) & 1) !=
> +	     ((secs->xfrm >> XFEATURE_BNDCSR) & 1)) ||
> +	    (secs->xfrm & ~sgx_xfrm_mask))
> +		return -EINVAL;

Do we need to check that the enclave doesn't use xfeatures that the
kernel doesn't know about? Or are they all safe by design in enclave
mode?

> +static int sgx_encl_pm_notifier(struct notifier_block *nb,
> +				unsigned long action, void *data)
> +{
> +	struct sgx_encl *encl = container_of(nb, struct sgx_encl,
> pm_notifier);
> +
> +	if (action != PM_SUSPEND_PREPARE && action != PM_HIBERNATION_PREPARE)
> +		return NOTIFY_DONE;

Hmm. There's an argument to be made that omitting this would better
exercise the code that handles fully asynchronous loss of an enclave.
Also, I think you're unnecessarily killing enclaves when suspend is
attempted but fails.

> +
> +static int sgx_get_key_hash(const void *modulus, void *hash)
> +{
> +	struct crypto_shash *tfm;
> +	int ret;
> +
> +	tfm = crypto_alloc_shash("sha256", 0, CRYPTO_ALG_ASYNC);
> +	if (IS_ERR(tfm))
> +		return PTR_ERR(tfm);
> +
> +	ret = __sgx_get_key_hash(tfm, modulus, hash);
> +
> +	crypto_free_shash(tfm);
> +	return ret;
> +}
> +

I'm so sorry you had to deal with this API. Once Zinc lands, you
could clean this up :)

> +static int sgx_encl_get(unsigned long addr, struct sgx_encl **encl)
> +{
> +	struct mm_struct *mm = current->mm;
> +	struct vm_area_struct *vma;
> +	int ret;
> +
> +	if (addr & (PAGE_SIZE - 1))
> +		return -EINVAL;
> +
> +	down_read(&mm->mmap_sem);
> +
> +	ret = sgx_encl_find(mm, addr, &vma);
> +	if (!ret) {
> +		*encl = vma->vm_private_data;
> +
> +		if ((*encl)->flags & SGX_ENCL_SUSPEND)
> +			ret = SGX_POWER_LOST_ENCLAVE;
> +		else
> +			kref_get(&(*encl)->refcount);
> +	}

Hmm. This version has explicit refcounting.

> +static int sgx_mmap(struct file *file, struct vm_area_struct *vma)
> +{
> +	vma->vm_ops = &sgx_vm_ops;
> +	vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO |
> +			 VM_DONTCOPY;
> +
> +	return 0;
> +}
> +
> +static unsigned long sgx_get_unmapped_area(struct file *file,
> +					   unsigned long addr,
> +					   unsigned long len,
> +					   unsigned long pgoff,
> +					   unsigned long flags)
> +{
> +	if (len < 2 * PAGE_SIZE || (len & (len - 1)))
> +		return -EINVAL;
> +
> +	if (len > sgx_encl_size_max_64)
> +		return -EINVAL;
> +
> +	if (len > sgx_encl_size_max_32 && test_thread_flag(TIF_ADDR32))
> +		return -EINVAL;

Generally speaking, th
Re: [PATCH v2 2/7] dt-bindings: remoteproc: qcom: Add clock bindings for Q6V5
Hi Doug,
Thanks for the review :)

On 2018-12-18 05:29, Doug Anderson wrote:

Hi,

On Mon, Dec 17, 2018 at 2:07 AM Sibi Sankar wrote:

Add missing clock bindings for Q6V5 MSS on SDM845 SoCs.

Signed-off-by: Sibi Sankar
---
 .../devicetree/bindings/remoteproc/qcom,q6v5.txt | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Fixes: 9f058fa2efb1 ("remoteproc: qcom: Add support for mss remoteproc on msm8996")
Fixes: fb22022ff63d ("dt-bindings: remoteproc: Add Q6v5 Modem PIL binding for SDM845")

...it probably doesn't matter too much but if we wanted to be really
careful we could split into two patches, one for the msm8996 and one
for sdm845. I don't think people care that much about stable
backports of bindings though (someone can feel free to correct me)...

I did think of splitting this up but it doesn't actually fix
9f058fa2efb1 yet. I noticed a few missing clocks for mss on 8996
when I did a diff with the corresponding CAF tree. Hence couldn't
add bindings for it. Will add them once I validate mss on 8996 with
the necessary changes.

diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt
index 9ff5b0309417..780adc043b37 100644
--- a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt
+++ b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt
@@ -39,13 +39,17 @@ on the Qualcomm Hexagon core.
 - clocks:
 	Usage: required
 	Value type:
-	Definition: reference to the iface, bus and mem clocks to be held on
-		    behalf of the booting of the Hexagon core
+	Definition: reference to the list of 4 clocks for the modem sub-system
+		    reference to the list of 8 clocks for the modem sub-system
+		    on SDM845 SoCs

The above is confusing because you don't list the SoCs that are
supposed to use the 4 clocks. How about instead:

Definition: reference to the clocks that match clock-names

AFAIK, only the exceptions are captured. I am fine with both, I'll
wait for Bjorn/Rob's preference.

 - clock-names:
 	Usage: required
 	Value type:
-	Definition: must be "iface", "bus", "mem"
+	Definition: must be "iface", "bus", "mem", "xo" for the modem sub-system
+		    must be "iface", "bus", "mem", "gpll0_mss", "snoc_axi",
+		    "mnoc_axi", "prng", "xo" for the modem sub-system on SDM845
+		    SoCs

Same here where it's confusing. ...but also, it it correct? As far
as I can tell you're missing msm8996. It's better to just be explicit
and list each one, ideally without all the prose.

Definition: The clocks needed depend on the compatible string:

ditto

qcom,sdm845-mss-pil: "xo", "prng", "iface", "snoc_axi", "bus", "mem", "gpll0_mss", "mnoc_axi"
qcom,msm8996-mss-pil: "xo", "pnoc", "iface", "bus", "mem", "gpll0_mss_clk"

ditto

qcom,msm8974-mss-pil: "xo", "iface", "bus", "mem"
qcom,msm8916-mss-pil: "xo", "iface", "bus", "mem"
qcom,q6v5-pil: "xo", "iface", "bus", "mem"

...as far as I can tell this binding is supposed to account for
"qcom,ipq8074-wcss-pil" too but it seems that one doesn't have
clock-names.

Yeah the lack of clocks have to be documented for
ipq8074-wcss-pil.. will do it in v3

-Doug

-- 
-- Sibi Sankar --
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.
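The "list each compatible explicitly" suggestion in the review above amounts to a lookup table keyed by compatible string. The sketch below is only an illustration of that idea in user-space C: the table contents are copied from the list proposed in the review (which the reviewer qualifies with "as far as I can tell"), while `struct sketch_clock_list` and `sketch_find_clocks()` are hypothetical names and not part of any driver or binding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_CLOCKS 8

/* Hypothetical table entry: one compatible string and its clock names. */
struct sketch_clock_list {
	const char *compatible;
	size_t count;
	const char *names[SKETCH_MAX_CLOCKS];
};

/* Contents taken from the per-compatible list proposed in the review. */
static const struct sketch_clock_list sketch_clock_lists[] = {
	{ "qcom,sdm845-mss-pil", 8,
	  { "xo", "prng", "iface", "snoc_axi", "bus", "mem", "gpll0_mss", "mnoc_axi" } },
	{ "qcom,msm8996-mss-pil", 6,
	  { "xo", "pnoc", "iface", "bus", "mem", "gpll0_mss_clk" } },
	{ "qcom,msm8974-mss-pil", 4, { "xo", "iface", "bus", "mem" } },
	{ "qcom,msm8916-mss-pil", 4, { "xo", "iface", "bus", "mem" } },
	{ "qcom,q6v5-pil", 4, { "xo", "iface", "bus", "mem" } },
};

/* Returns the entry for a compatible string, or NULL when none is listed. */
static const struct sketch_clock_list *sketch_find_clocks(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_clock_lists) / sizeof(sketch_clock_lists[0]); i++) {
		if (strcmp(sketch_clock_lists[i].compatible, compatible) == 0)
			return &sketch_clock_lists[i];
	}
	return NULL;
}
```

A compatible without clock-names, such as the "qcom,ipq8074-wcss-pil" case mentioned in the thread, simply has no entry here and yields NULL.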
[v2, PATCH 2/2] net-next: stmmac: dwmac-mediatek: remove fine-tune property
1. remove fine-tune property and related setting to simplify
the timing adjustment flow.
2. set timing value according to the value from device tree,
and will not care whether PHY insert internal delay.

Signed-off-by: Biao Huang
---
 .../net/ethernet/stmicro/stmmac/dwmac-mediatek.c | 71 +++-
 1 file changed, 24 insertions(+), 47 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
index e400cbd..bf25629 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
@@ -44,7 +44,6 @@ struct mac_delay_struct {
 	u32 rx_delay;
 	bool tx_inv;
 	bool rx_inv;
-	bool fine_tune;
 };
 
 struct mediatek_dwmac_plat_data {
@@ -105,16 +104,28 @@ static int mt2712_set_interface(struct mediatek_dwmac_plat_data *plat)
 	return 0;
 }
 
-static void mt2712_delay_ps2stage(struct mac_delay_struct *mac_delay)
+static void mt2712_delay_ps2stage(struct mediatek_dwmac_plat_data *plat)
 {
-	if (mac_delay->fine_tune) {
-		/* 170ps per stage for fine tune delay macro circuit*/
-		mac_delay->tx_delay /= 170;
-		mac_delay->rx_delay /= 170;
-	} else {
-		/* 550ps per stage for coarse tune delay macro circuit*/
+	struct mac_delay_struct *mac_delay = &plat->mac_delay;
+
+	switch (plat->phy_mode) {
+	case PHY_INTERFACE_MODE_MII:
+	case PHY_INTERFACE_MODE_RMII:
+		/* 550ps per stage for MII/RMII */
 		mac_delay->tx_delay /= 550;
 		mac_delay->rx_delay /= 550;
+		break;
+	case PHY_INTERFACE_MODE_RGMII:
+	case PHY_INTERFACE_MODE_RGMII_TXID:
+	case PHY_INTERFACE_MODE_RGMII_RXID:
+	case PHY_INTERFACE_MODE_RGMII_ID:
+		/* 170ps per stage for RGMII */
+		mac_delay->tx_delay /= 170;
+		mac_delay->rx_delay /= 170;
+		break;
+	default:
+		dev_err(plat->dev, "phy interface not supported\n");
+		break;
 	}
 }
 
@@ -123,7 +134,7 @@ static int mt2712_set_delay(struct mediatek_dwmac_plat_data *plat)
 	struct mac_delay_struct *mac_delay = &plat->mac_delay;
 	u32 delay_val = 0, fine_val = 0;
 
-	mt2712_delay_ps2stage(mac_delay);
+	mt2712_delay_ps2stage(plat);
 
 	switch (plat->phy_mode) {
 	case PHY_INTERFACE_MODE_MII:
@@ -167,13 +178,10 @@ static int mt2712_set_delay(struct mediatek_dwmac_plat_data *plat)
 		fine_val = ETH_RMII_DLY_TX_INV;
 		break;
 	case PHY_INTERFACE_MODE_RGMII:
-		/* the PHY is not responsible for inserting any internal
-		 * delay by itself in PHY_INTERFACE_MODE_RGMII case,
-		 * so Ethernet MAC will insert delays for both transmit
-		 * and receive path here.
-		 */
-		if (mac_delay->fine_tune)
-			fine_val = ETH_FINE_DLY_GTXC | ETH_FINE_DLY_RXC;
+	case PHY_INTERFACE_MODE_RGMII_TXID:
+	case PHY_INTERFACE_MODE_RGMII_RXID:
+	case PHY_INTERFACE_MODE_RGMII_ID:
+		fine_val = ETH_FINE_DLY_GTXC | ETH_FINE_DLY_RXC;
 
 		delay_val |= FIELD_PREP(ETH_DLY_GTXC_ENABLE, !!mac_delay->tx_delay);
 		delay_val |= FIELD_PREP(ETH_DLY_GTXC_STAGES, mac_delay->tx_delay);
@@ -183,36 +191,6 @@ static int mt2712_set_delay(struct mediatek_dwmac_plat_data *plat)
 		delay_val |= FIELD_PREP(ETH_DLY_RXC_STAGES, mac_delay->rx_delay);
 		delay_val |= FIELD_PREP(ETH_DLY_RXC_INV, mac_delay->rx_inv);
 		break;
-	case PHY_INTERFACE_MODE_RGMII_TXID:
-		/* the PHY should insert an internal delay for the transmit
-		 * path in PHY_INTERFACE_MODE_RGMII_TXID case,
-		 * so Ethernet MAC will insert the delay for receive path here.
-		 */
-		if (mac_delay->fine_tune)
-			fine_val = ETH_FINE_DLY_RXC;
-
-		delay_val |= FIELD_PREP(ETH_DLY_RXC_ENABLE, !!mac_delay->rx_delay);
-		delay_val |= FIELD_PREP(ETH_DLY_RXC_STAGES, mac_delay->rx_delay);
-		delay_val |= FIELD_PREP(ETH_DLY_RXC_INV, mac_delay->rx_inv);
-		break;
-	case PHY_INTERFACE_MODE_RGMII_RXID:
-		/* the PHY should insert an internal delay for the receive
-		 * path in PHY_INTERFACE_MODE_RGMII_RXID case,
-		 * so Ethernet MAC will insert the delay for transmit path here.
-		 */
-		if (mac_delay->fine_tune)
-			fine_val = ETH_FINE_DLY_GTXC;
-
-		delay_val |= FIELD_PREP(ETH_DLY_GTXC_ENABLE, !!mac_delay->tx_delay);
-		delay_val |= FIELD_PREP(ETH_DLY_GTXC_STAGES, mac_delay->tx_delay);
-		delay_val |= FIELD_PREP(ETH_DLY_GTXC_INV, mac_delay-
[v2, PATCH 1/2] dt-binding: mediatek-dwmac: add binding document for MediaTek MT2712 DWMAC
The commit adds the device tree binding documentation for the MediaTek DWMAC
found on MediaTek MT2712.

Signed-off-by: Biao Huang
---
 .../devicetree/bindings/net/mediatek-dwmac.txt | 78 
 1 file changed, 78 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/mediatek-dwmac.txt

diff --git a/Documentation/devicetree/bindings/net/mediatek-dwmac.txt b/Documentation/devicetree/bindings/net/mediatek-dwmac.txt
new file mode 100644
index 000..8a08621
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/mediatek-dwmac.txt
@@ -0,0 +1,78 @@
+MediaTek DWMAC glue layer controller
+
+This file documents platform glue layer for stmmac.
+Please see stmmac.txt for the other unchanged properties.
+
+The device node has following properties.
+
+Required properties:
+- compatible: Should be "mediatek,mt2712-gmac" for MT2712 SoC
+- reg: Address and length of the register set for the device
+- interrupts: Should contain the MAC interrupts
+- interrupt-names: Should contain a list of interrupt names corresponding to
+	the interrupts in the interrupts property, if available.
+	Should be "macirq" for the main MAC IRQ
+- clocks: Must contain a phandle for each entry in clock-names.
+- clock-names: The name of the clock listed in the clocks property. These are
+	"axi", "apb", "mac_main", "ptp_ref" for MT2712 SoC
+- mac-address: See ethernet.txt in the same directory
+- phy-mode: See ethernet.txt in the same directory
+- mediatek,pericfg: A phandle to the syscon node that control ethernet
+	interface and timing delay.
+
+Optional properties:
+- mediatek,tx-delay-ps: TX clock delay macro value. Default is 0.
+	It should be defined for RGMII/MII interface.
+- mediatek,rx-delay-ps: RX clock delay macro value. Default is 0.
+	It should be defined for RGMII/MII/RMII interface.
+Both delay properties need to be a multiple of 170 for RGMII interface,
+or will round down. Range 0~31*170.
+Both delay properties need to be a multiple of 550 for MII/RMII interface,
+or will round down. Range 0~31*550.
+
+- mediatek,rmii-rxc: boolean property, if present indicates that the RMII
+	reference clock, which is from external PHYs, is connected to RXC pin
+	on MT2712 SoC.
+	Otherwise, is connected to TXC pin.
+- mediatek,txc-inverse: boolean property, if present indicates that
+	1. tx clock will be inversed in MII/RGMII case,
+	2. tx clock inside MAC will be inversed relative to reference clock
+	   which is from external PHYs in RMII case, and it rarely happen.
+- mediatek,rxc-inverse: boolean property, if present indicates that
+	1. rx clock will be inversed in MII/RGMII case.
+	2. reference clock will be inversed when arrived at MAC in RMII case.
+- assigned-clocks: mac_main and ptp_ref clocks
+- assigned-clock-parents: parent clocks of the assigned clocks
+
+Example:
+	eth: ethernet@1101c000 {
+		compatible = "mediatek,mt2712-gmac";
+		reg = <0 0x1101c000 0 0x1300>;
+		interrupts = ;
+		interrupt-names = "macirq";
+		phy-mode ="rgmii";
+		mac-address = [00 55 7b b5 7d f7];
+		clock-names = "axi",
+			      "apb",
+			      "mac_main",
+			      "ptp_ref",
+			      "ptp_top";
+		clocks = <&pericfg CLK_PERI_GMAC>,
+			 <&pericfg CLK_PERI_GMAC_PCLK>,
+			 <&topckgen CLK_TOP_ETHER_125M_SEL>,
+			 <&topckgen CLK_TOP_ETHER_50M_SEL>;
+		assigned-clocks = <&topckgen CLK_TOP_ETHER_125M_SEL>,
+				  <&topckgen CLK_TOP_ETHER_50M_SEL>;
+		assigned-clock-parents = <&topckgen CLK_TOP_ETHERPLL_125M>,
+					 <&topckgen CLK_TOP_APLL1_D3>;
+		mediatek,pericfg = <&pericfg>;
+		mediatek,tx-delay-ps = <1530>;
+		mediatek,rx-delay-ps = <1530>;
+		mediatek,rmii-rxc;
+		mediatek,txc-inverse;
+		mediatek,rxc-inverse;
+		snps,txpbl = <32>;
+		snps,rxpbl = <32>;
+		snps,reset-gpio = <&pio 87 GPIO_ACTIVE_LOW>;
+		snps,reset-active-low;
+	};
-- 
1.7.9.5
[v2, PATCH 0/2] add ethernet binding and modify ethernet driver for mt2712
changes in v2 as comments from Sean:
1. fix typo.
2. use capital letters for RMII/MII/RGMII in driver and bindings.

v1:
This new series is the result of discussion in:
	http://lkml.org/lkml/2018/12/13/1007
	http://lkml.org/lkml/2018/12/14/53
1. ethernet binding file move to this series.
2. remove fine tune property in device tree
3. remove fine tune flow in ethernet driver
4. set rgmii timing according to the value in device tree, and don't care
   whether phy insert internal delay or not.

Biao Huang (2):
  dt-binding: mediatek-dwmac: add binding document for MediaTek MT2712 DWMAC
  net-next: stmmac: dwmac-mediatek: remove fine-tune property

 .../devicetree/bindings/net/mediatek-dwmac.txt | 78 
 .../net/ethernet/stmicro/stmmac/dwmac-mediatek.c | 71 ++
 2 files changed, 102 insertions(+), 47 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/mediatek-dwmac.txt

-- 
1.7.9.5
Re: [PATCH v12 2/2] cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver
Hi Stephen, On 13-12-18, 02:12, Stephen Boyd wrote: > Quoting Viresh Kumar (2018-12-13 02:05:06) > > On 13-12-18, 01:58, Stephen Boyd wrote: > > > BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error > > > upon bringing the policy online the second time. I guess cpufreq_stats > > > aren't able to be freed from there because they take locks in different > > > order vs. the normal path? > > > > Please share the lockdep report and the steps to reproduce it. I will > > see if I can simulate the failure forcefully.. > > > > It's on a v4.19 kernel with this cpufreq hw driver backported to it. I > think all it takes is to return an error the second time the policy is > initialized when cpufreq_online() calls into the cpufreq driver. > > == > WARNING: possible circular locking dependency detected > 4.19.8 #61 Tainted: GW > -- > cpuhp/5/36 is trying to acquire lock: > 3e901e8a (kn->count#326){}, at: > kernfs_remove_by_name_ns+0x44/0x80 > > but task is already holding lock: > dd7f52c3 (&policy->rwsem){}, at: cpufreq_policy_free+0x17c/0x1cc > > which lock already depends on the new lock. > > > the existing dependency chain (in reverse order) is: > > -> #1 (&policy->rwsem){}: > down_read+0x50/0xcc > show+0x30/0x78 > sysfs_kf_seq_show+0x17c/0x25c > kernfs_seq_show+0xb4/0xf8 > seq_read+0x4a8/0x8f0 > kernfs_fop_read+0xe0/0x360 > __vfs_read+0x80/0x328 > vfs_read+0xd0/0x1d4 > ksys_read+0x88/0x118 > __arm64_sys_read+0x4c/0x5c > el0_svc_common+0x124/0x1c4 > el0_svc_compat_handler+0x64/0x8c > el0_svc_compat+0x8/0x18 I failed to reproduce it over linux/next. I had the following changes over linux/next: https://pastebin.ubuntu.com/p/zkVm77PGdY/ I also did savedefconfig to show what all I changed in it. I faked multiple clusters on my hikey960 board, which is not big little.. And here is the command list from history that I ran after boot. 501 grep . /sys/devices/system/cpu/cpufreq/*/* 502 grep . /sys/devices/system/cpu/cpufreq/*/*/* 503 grep . 
/sys/devices/system/cpu/cpufreq/*/*/* 504 grep . /sys/devices/system/cpu/cpufreq/*/*/* 505 grep . /sys/devices/system/cpu/cpufreq/*/*/* 506 grep . /sys/devices/system/cpu/cpufreq/*/* 507 grep . /sys/devices/system/cpu/cpufreq/*/* 508 echo 0 > /sys/devices/system/cpu/cpu4/online 509 echo 0 > /sys/devices/system/cpu/cpu5/online 510 echo 0 > /sys/devices/system/cpu/cpu6/online 511 echo 0 > /sys/devices/system/cpu/cpu7/online 512 grep . /sys/devices/system/cpu/cpufreq/*/* 513 grep . /sys/devices/system/cpu/cpufreq/*/*/* 514 grep . /sys/devices/system/cpu/cpufreq/*/* 515 echo 1 > /sys/devices/system/cpu/cpu4/online 516 grep . /sys/devices/system/cpu/cpufreq/*/* 517 grep . /sys/devices/system/cpu/cpufreq/*/*/* 518 dmesg -- viresh
linux-next: build warning after merge of the gpio tree
Hi Linus,

After merging the gpio tree, today's linux-next build (x86_64 allmodconfig) produced this warning:

drivers/gpio/gpiolib-acpi.c: In function 'acpi_gpio_adr_space_handler':
drivers/gpio/gpiolib-acpi.c:911:8: warning: unused variable 'err' [-Wunused-variable]
  int err;
      ^~~

Introduced by commit

  21abf103818a ("gpio: Pass a flag to gpiochip_request_own_desc()")

-- 
Cheers,
Stephen Rothwell
Re: [PATCH net-next] fou: Prevent unbounded recursion in GUE error handler
From: Stefano Brivio Date: Tue, 18 Dec 2018 00:13:17 +0100 > Handling exceptions for direct UDP encapsulation in GUE (that is, > UDP-in-UDP) leads to unbounded recursion in the GUE exception handler, > syzbot reported. > > While draft-ietf-intarea-gue-06 doesn't explicitly forbid direct > encapsulation of UDP in GUE, it probably doesn't make sense to set up GUE > this way, and it's currently not even possible to configure this. > > Skip exception handling if the GUE proto/ctype field is set to the UDP > protocol number. Should we need to handle exceptions for UDP-in-GUE one > day, we might need to either explicitly set a bound for recursion, or > implement a special iterative handling for these cases. > > Reported-and-tested-by: syzbot+43f6755d1c2e62743...@syzkaller.appspotmail.com > Fixes: b8a51b38e4d4 ("fou, fou6: ICMP error handlers for FoU and GUE") > Signed-off-by: Stefano Brivio Applied, thanks.
Re: rcu_preempt caused oom
On Tue, Dec 18, 2018 at 02:46:43AM +, Zhang, Jun wrote: > Hello, paul > > In softirq context, and current is rcu_preempt-10, rcu_gp_kthread_wake don't > wakeup rcu_preempt. > Maybe next patch could fix it. Please help review. > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > index 0b760c1..98f5b40 100644 > --- a/kernel/rcu/tree.c > +++ b/kernel/rcu/tree.c > @@ -1697,7 +1697,7 @@ static bool rcu_future_gp_cleanup(struct rcu_state > *rsp, struct rcu_node *rnp) > */ > static void rcu_gp_kthread_wake(struct rcu_state *rsp) > { > - if (current == rsp->gp_kthread || > + if (((current == rsp->gp_kthread) && !in_softirq()) || Close, but not quite. Please see below. > !READ_ONCE(rsp->gp_flags) || > !rsp->gp_kthread) > return; > > [44932.311439, 0][ rcu_preempt] rcu_preempt-10[001] .n.. > 44929.401037: rcu_grace_period: rcu_preempt 19063548 reqwait > .. > [44932.311517, 0][ rcu_preempt] rcu_preempt-10[001] d.s2 > 44929.402234: rcu_future_grace_period: rcu_preempt 19063548 19063552 0 0 3 > Startleaf > [44932.311536, 0][ rcu_preempt] rcu_preempt-10[001] d.s2 > 44929.402237: rcu_future_grace_period: rcu_preempt 19063548 19063552 0 0 3 > Startedroot Good catch! If the rcu_preempt kthread had just entered the function swait_event_idle_exclusive(), which had just called __swait_event_idle() which had just called ___swait_event(), which had just gotten done checking the "condition", then yes, the rcu_preempt kthread could sleep forever. This is a very narrow race window, but that matches your experience with its not happening often -- and my experience with it not happening at all. However, for this to happen, the wakeup must happen within a softirq handler that executes upon return from an interrupt that interrupted ___swait_event() just after the "if (condition)". For this, we don't want in_softirq() but rather in_serving_softirq(), as shown in the patch below. 
The patch you have above could result in spurious wakeups, as it is checking for bottom halves being disabled, not just executing within a softirq handler. Which might be better than not having enough wakeups, but let's please try for just the right number. ;-) So could you please instead test the patch below? And if it works, could I please have your Signed-off-by so that I can queue it? My patch is quite clearly derived from yours, after all! And you should get credit for finding the problem and arriving at an approximate fix, after all. Thanx, Paul diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e9392a9d6291..b9205b40b621 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1722,7 +1722,7 @@ static bool rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) */ static void rcu_gp_kthread_wake(struct rcu_state *rsp) { - if (current == rsp->gp_kthread || + if ((current == rsp->gp_kthread && !in_serving_softirq()) || !READ_ONCE(rsp->gp_flags) || !rsp->gp_kthread) return;
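To make the in_softirq() versus in_serving_softirq() distinction in this reply concrete, here is a small userspace model of the softirq bits in preempt_count. It is only a sketch: the constants mirror the definitions in include/linux/preempt.h as I understand them for kernels of this era, and every model_* and skip_wake_* name is invented for this note. It shows why the first patch can cause spurious wakeups (bottom halves merely disabled) while the second only changes behaviour inside a softirq handler.

```c
#include <stdbool.h>

/* Softirq field of preempt_count, modelled after include/linux/preempt.h. */
#define SOFTIRQ_SHIFT		8
#define SOFTIRQ_BITS		8
#define SOFTIRQ_OFFSET		(1UL << SOFTIRQ_SHIFT)
#define SOFTIRQ_MASK		(((1UL << SOFTIRQ_BITS) - 1) << SOFTIRQ_SHIFT)
#define SOFTIRQ_DISABLE_OFFSET	(2 * SOFTIRQ_OFFSET)

/* Stand-in for the per-task preempt counter (hypothetical, userspace only). */
unsigned long model_preempt_count;

/* True when bottom halves are disabled OR a softirq handler is running. */
bool model_in_softirq(void)
{
	return (model_preempt_count & SOFTIRQ_MASK) != 0;
}

/* True only while a softirq handler is actually running. */
bool model_in_serving_softirq(void)
{
	return (model_preempt_count & SOFTIRQ_MASK & SOFTIRQ_OFFSET) != 0;
}

/* local_bh_disable()/local_bh_enable() move the count in steps of two offsets. */
void model_local_bh_disable(void) { model_preempt_count += SOFTIRQ_DISABLE_OFFSET; }
void model_local_bh_enable(void)  { model_preempt_count -= SOFTIRQ_DISABLE_OFFSET; }

/* Entering/leaving a softirq handler moves the count by a single offset. */
void model_softirq_enter(void) { model_preempt_count += SOFTIRQ_OFFSET; }
void model_softirq_exit(void)  { model_preempt_count -= SOFTIRQ_OFFSET; }

/* "Skip the wakeup" test from the first proposed patch (uses in_softirq()). */
bool skip_wake_first_patch(bool current_is_gp_kthread)
{
	return current_is_gp_kthread && !model_in_softirq();
}

/* "Skip the wakeup" test from the revised patch (uses in_serving_softirq()). */
bool skip_wake_revised_patch(bool current_is_gp_kthread)
{
	return current_is_gp_kthread && !model_in_serving_softirq();
}
```

In this model, when the grace-period kthread has only called local_bh_disable(), the first variant no longer skips the wakeup while the revised one still does, which matches the "spurious wakeups" concern raised above.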
RE: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND Flash Controller
Hi Miquel, > -Original Message- > From: Miquel Raynal [mailto:miquel.ray...@bootlin.com] > Sent: Monday, December 17, 2018 10:11 PM > To: Naga Sureshkumar Relli > Cc: Boris Brezillon ; r...@kernel.org; > rich...@nod.at; linux- > ker...@vger.kernel.org; marek.va...@gmail.com; linux-...@lists.infradead.org; > nagasures...@gmail.com; Michal Simek ; > computersforpe...@gmail.com; dw...@infradead.org > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for > Arasan > NAND Flash Controller > > Hi Naga, > > [...] > > > Inserted biterror @ 48/7 > > Successfully corrected 25 bit errors per subpage Inserted biterror @ > > 50/7 ECC failure, invalid data despite read success > > root@xilinx-zc1751-dc2-2018_1:~# > > > > But even in this case also, driver is saying ECC failure but read success. > > That means controller is able to detect errors on read page up to 24 bit > > only. > > After that there is no way to say to the upper layers that the page is bad > > because of the > limitation in the controller. > > This is more than a "limitation", the design is broken. I am not sure how to > support such > controller, and I am not sure if we even want to. The number of errors that are correctable is limited by a parameter 't'(total number of errors), If there is a condition that the number of errors greater than 't', then the controller won't be able to detect that. I guess this concept is same for other controllers as well. In Arasan it is limited to 24-bit. Even, in case of Hamming, it is 1-bit error correction and 2-bit error detection. What will happen if there are multiple errors(greater than 2-bit)? Thanks, Naga Sureshkumar Relli > > > Could you please suggest any alternative to report the errors in that case? > > Shall we support the controller without the hw ECC engine? Boris, any > thoughts? > > > Thanks, > Miquèl
[PATCH] kbuild: fix false positive warning/error about missing libelf
For the same reason as commit 25896d073d8a ("x86/build: Fix compiler support check for CONFIG_RETPOLINE"), you cannot put this $(error ...) into the parse stage of the top Makefile. Perhaps I'd propose a more sophisticated solution later, but this is the best I can do for now. Link: https://lkml.org/lkml/2017/12/25/211 Reported-by: Paul Gortmaker Reported-by: Bernd Edlinger Reported-by: Qian Cai Cc: Josh Poimboeuf Signed-off-by: Masahiro Yamada --- Makefile | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/Makefile b/Makefile index 56d5270..d45856f 100644 --- a/Makefile +++ b/Makefile @@ -962,11 +962,6 @@ ifdef CONFIG_STACK_VALIDATION ifeq ($(has_libelf),1) objtool_target := tools/objtool FORCE else -ifdef CONFIG_UNWINDER_ORC - $(error "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel") -else - $(warning "Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel") -endif SKIP_STACK_VALIDATION := 1 export SKIP_STACK_VALIDATION endif @@ -1125,6 +1120,14 @@ uapi-asm-generic: PHONY += prepare-objtool prepare-objtool: $(objtool_target) +ifeq ($(SKIP_STACK_VALIDATION),1) +ifdef CONFIG_UNWINDER_ORC + @echo "error: Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel" >&2 + @false +else + @echo "warning: Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel" >&2 +endif +endif # Generate some files # --- -- 2.7.4
[PATCH] Fix mm->owner point to a tsk that has been free
From: guomin chen

When exit_mm() moves mm->owner to a new owner and that new owner then leaves
the mm by calling unuse_mm() directly and exits, a use-after-free results,
because unuse_mm() sets tsk->mm = NULL directly.

Under normal circumstances, mm->owner is updated from exit_mm() when do_exit()
runs. But when a kernel thread calls unuse_mm() and then exits, mm->owner
cannot be updated, and it is left pointing to a task that has been released.

The current issue flow is as follows (processes A, B and C use the same mm):

Process C                   Process A                 Process B
qemu-system-x86_64:         kernel: vhost_net         kernel: vhost_net
open /dev/vhost-net
VHOST_SET_OWNER             create kthread vhost-%d   create kthread vhost-%d
network init                use_mm()                  use_mm()
...                         ...
Abnormal exited
...
do_exit
exit_mm()
update mm->owner to A
exit_files()
 close_files()
 kthread_should_stop()      unuse_mm()
 Stop Process A             tsk->mm=NULL
                            do_exit()
                            can't update owner
                            A exit completed
                                                      vhost-%d rcv first package
                                                      vhost-%d build rcv buffer for vq
                                                      page fault
                                                      access mm & mm->owner
                                                      NOW, mm->owner still points to A
                                                      kernel UAF
 stop Process B

Although I am hitting this issue with vhost_net, it affects all users of
unuse_mm().

Cc: "Eric W. Biederman"
Cc: Andrew Morton
Cc: "Luis R. Rodriguez"
Cc: Dominik Brodowski
Cc: Arnd Bergmann
Cc: linux-kernel@vger.kernel.org
Cc: linux...@kvack.org
Cc: "Michael S. Tsirkin"
Cc: Jason Wang
Cc: Christoph Hellwig
Signed-off-by: guomin chen
---
 mm/mmu_context.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/mmu_context.c b/mm/mmu_context.c
index 3e612ae..9eb81aa 100644
--- a/mm/mmu_context.c
+++ b/mm/mmu_context.c
@@ -60,5 +60,6 @@ void unuse_mm(struct mm_struct *mm)
 	/* active_mm is still 'mm' */
 	enter_lazy_tlb(mm, tsk);
 	task_unlock(tsk);
+	mm_update_next_owner(mm);
 }
 EXPORT_SYMBOL_GPL(unuse_mm);
-- 
1.8.3.1
Re: [PATCH v2 1/7] dt-bindings: soc: qcom: Add remote-pid binding for GLINK SMEM
Hi Doug, Thanks for the review :) On 2018-12-18 05:29, Doug Anderson wrote: Hi, On Mon, Dec 17, 2018 at 2:07 AM Sibi Sankar wrote: Add missing qcom,remote-pid dt binding required for GLINK SMEM which specifies the remote endpoint of the GLINK edge. Signed-off-by: Sibi Sankar --- Documentation/devicetree/bindings/soc/qcom/qcom,glink.txt | 5 + 1 file changed, 5 insertions(+) Fixes: 2b41d6c8e696 ("dt-bindings: soc: qcom: Extend GLINK to cover SMEM") diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,glink.txt b/Documentation/devicetree/bindings/soc/qcom/qcom,glink.txt index 0b8cc533ca83..59ae603ba520 100644 --- a/Documentation/devicetree/bindings/soc/qcom/qcom,glink.txt +++ b/Documentation/devicetree/bindings/soc/qcom/qcom,glink.txt @@ -21,6 +21,11 @@ edge. Definition: should specify the IRQ used by the remote processor to signal this processor about communication related events +- qcom,remote-pid: + Usage: required for glink-smem + Value type: + Definition: specifies the identfier of the remote endpoint of this edge s/identfier/identifier/ missed this, will correct it in v3. Other than the typo this seems right to me. Feel free to add my Reviewed-by tag when that's fixed. -Doug -- -- Sibi Sankar -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH] CIFS: return correct errors when pinning memory failed for direct I/O
merged into cifs-2.6.git for-next On Sun, Dec 16, 2018 at 4:44 PM Long Li wrote: > > From: Long Li > > When pinning memory failed, we should return the correct error code and > rewind the SMB credits. > > Reported-by: Murphy Zhou > Signed-off-by: Long Li > Cc: sta...@vger.kernel.org > Cc: Murphy Zhou > --- > fs/cifs/file.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c > index c9bc56b..3467351 100644 > --- a/fs/cifs/file.c > +++ b/fs/cifs/file.c > @@ -2630,6 +2630,9 @@ cifs_write_from_iter(loff_t offset, size_t len, struct > iov_iter *from, > result, from->type, > from->iov_offset, from->count); > dump_stack(); > + > + rc = result; > + add_credits_and_wake_if(server, credits, 0); > break; > } > cur_len = (size_t)result; > @@ -3313,13 +3316,16 @@ cifs_send_async_read(loff_t offset, size_t len, > struct cifsFileInfo *open_file, > cur_len, &start); > if (result < 0) { > cifs_dbg(VFS, > - "couldn't get user pages > (cur_len=%zd)" > + "couldn't get user pages (rc=%zd)" > " iter type %d" > " iov_offset %zd count %zd\n", > result, direct_iov.type, > direct_iov.iov_offset, > direct_iov.count); > dump_stack(); > + > + rc = result; > + add_credits_and_wake_if(server, credits, 0); > break; > } > cur_len = (size_t)result; > -- > 2.7.4 > -- Thanks, Steve
Re: [PATCH] CIFS: use the correct length when pinning memory for direct I/O for write
merged into cifs-2.6.git for-next On Sun, Dec 16, 2018 at 5:18 PM Long Li wrote: > > From: Long Li > > The current code attempts to pin memory using the largest possible wsize > based on the currect SMB credits. This doesn't cause kernel oops but this is > not optimal as we may pin more pages then actually needed. > > Fix this by only pinning what are needed for doing this write I/O. > > Signed-off-by: Long Li > Cc: sta...@vger.kernel.org > --- > fs/cifs/file.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c > index 3467351..c23bf9d 100644 > --- a/fs/cifs/file.c > +++ b/fs/cifs/file.c > @@ -2617,11 +2617,13 @@ cifs_write_from_iter(loff_t offset, size_t len, > struct iov_iter *from, > if (rc) > break; > > + cur_len = min_t(const size_t, len, wsize); > + > if (ctx->direct_io) { > ssize_t result; > > result = iov_iter_get_pages_alloc( > - from, &pagevec, wsize, &start); > + from, &pagevec, cur_len, &start); > if (result < 0) { > cifs_dbg(VFS, > "direct_writev couldn't get user > pages " > -- > 2.7.4 > -- Thanks, Steve
Re: [PATCH] ima: cleanup the match_token policy code
On Tue, 2018-12-18 at 04:06 +, Al Viro wrote: > On Mon, Dec 17, 2018 at 10:00:07PM -0500, Mimi Zohar wrote: > > > Could you expand on commit 5b2ea6199614 ("selinux: switch away from > > match_token()") patch description. All that it says is "It's not a > > good fit, unfortunately, and the next step will make it even less so. > > Open-code what we need here." And there's even less for the > > equivalent Smack patch, which just says "same issue as with > > selinux...". > > match_token() would require messing around with strsep() or something > equivalent. It's not a regex; foo=%s has no idea that comma is in any > way special, etc. > > As for the next commit... Killing the Cthulhu-awful mess in > selinux_sb_eat_lsm_opts() (allocating two temproraries, concatenating > (comma-separated) non-LSM options into one, concatenating (pipe-separated) > dequoted LSM options into another, then splitting that another by '|' > instances and figuring out which option each piece is, etc.) > is a Good Thing(tm). And having to dance around the needs of > match_token() adds extra headache, for no good reason. Ok, so it is this particular combination of things, not the general usage of strsep or match_token that you're objecting to. So fixing the other match_token non-LSM instances is fine. To prevent the enumeration and the match_table from going out of sync, I was thinking about defining a macro to create the match_table_t: #define __policy_tokens_match(ENUM, str) {Opt_ ## ENUM, #str}, static const match_table_t policy_tokens = { __policy_tokens_id(__policy_tokens_match) }; and the enumeration: enum policy_tokens_id { __policy_tokens_id(__policy_tokens_enumify) }; Mimi
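For readers unfamiliar with the pattern being proposed here, a self-contained sketch of keeping an enumeration and a match table in sync from one token list (often called an X-macro) might look like the following. This is not the IMA code: the demo_* names, the token list and the local struct are all made up for illustration, and the struct merely stands in for the kernel's struct match_token.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's match table entry (hypothetical name). */
struct demo_match_token {
	int token;
	const char *pattern;
};

/*
 * Single source of truth: each X(ENUM, pattern) line defines one token.
 * The entries below are illustrative, not the real IMA policy tokens.
 */
#define DEMO_POLICY_TOKENS(X)			\
	X(measure,	"measure")		\
	X(dont_measure,	"dont_measure")		\
	X(fowner_eq,	"fowner=%s")		\
	X(err,		NULL)

/* First expansion: build the enumeration (Opt_measure, Opt_dont_measure, ...). */
#define DEMO_ENUMIFY(ENUM, pattern)	Opt_ ## ENUM,
enum demo_policy_token_id {
	DEMO_POLICY_TOKENS(DEMO_ENUMIFY)
};

/* Second expansion: build the table, in exactly the same order as the enum. */
#define DEMO_MATCHIFY(ENUM, pattern)	{ Opt_ ## ENUM, pattern },
const struct demo_match_token demo_policy_tokens[] = {
	DEMO_POLICY_TOKENS(DEMO_MATCHIFY)
};

/* Number of table entries, derived from the table itself. */
#define DEMO_NR_POLICY_TOKENS \
	(sizeof(demo_policy_tokens) / sizeof(demo_policy_tokens[0]))
```

Because both the enum and the table are generated from the same list, adding a token means touching one line, and index N of the table always carries token value N.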
Re: [PATCH v17 18/23] platform/x86: Intel SGX driver
On Mon, Dec 17, 2018 at 7:27 PM Jarkko Sakkinen wrote: > > On Tue, Dec 18, 2018 at 03:39:18AM +0200, Jarkko Sakkinen wrote: > > On Mon, Dec 17, 2018 at 02:20:48PM -0800, Sean Christopherson wrote: > > > The only potential hiccup I can see is the build flow. Currently, > > > EADD+EEXTEND is done via a work queue to avoid major performance issues > > > (10x regression) when userspace is building multiple enclaves in parallel > > > using goroutines to wrap Cgo (the issue might apply to any M:N scheduler, > > > but I've only confirmed the Golang case). The issue is that allocating > > > an EPC page acts like a blocking syscall when the EPC is under pressure, > > > i.e. an EPC page isn't immediately available. This causes Go's scheduler > > > to thrash and tank performance[1]. > > > > I don't see any major issues having that kthread. All the code that > > maps the enclave would be removed. > > > > I would only allow to map enclave to process address space after the > > enclave has been initialized i.e. SGX_IOC_ENCLAVE_ATTACH. > > Some refined thoughts. > > PTE insertion can done in the #PF handler. In fact, we can PoC this > already with the current architecture (and I will right after sending > v18). > > The backing space is a bit more nasty issue in the add pager thread. > The previous shmem swapping would have been a better fit. Maybe that > should be reconsidered? > > If shmem was used, all the commits up to "SGX Enclave Driver" could > be reworked to the new model. > > When we think about the swapping code, there uprises some difficulties. > Namely, when a page is swapped, the enclave must unmap the PTE from all > processes that have it mapped. That's what unmap_mapping_range(), etc do for you, no? IOW make a struct address_space that represents the logical enclave address space, i.e. address 0 is the start and the pages count up from there. 
You can unmap pages whenever you want, and the core mm code will take care of zapping the pages from all vmas referencing that address_space.
Re: [PATCH v17 18/23] platform/x86: Intel SGX driver
On Mon, Dec 17, 2018 at 2:20 PM Sean Christopherson wrote: > > My brain is still sorting out the details, but I generally like the idea > of allocating an anon inode when creating an enclave, and exposing the > other ioctls() via the returned fd. This is essentially the approach > used by KVM to manage multiple "layers" of ioctls across KVM itself, VMs > and vCPUS. There are even similarities to accessing physical memory via > multiple disparate domains, e.g. host kernel, host userspace and guest. > In my mind, opening /dev/sgx would give you the requisite inode. I'm not 100% sure that the chardev infrastructure allows this, but I think it does. > The only potential hiccup I can see is the build flow. Currently, > EADD+EEXTEND is done via a work queue to avoid major performance issues > (10x regression) when userspace is building multiple enclaves in parallel > using goroutines to wrap Cgo (the issue might apply to any M:N scheduler, > but I've only confirmed the Golang case). The issue is that allocating > an EPC page acts like a blocking syscall when the EPC is under pressure, > i.e. an EPC page isn't immediately available. This causes Go's scheduler > to thrash and tank performance[1]. What's the issue, and how does a workqueue help? I'm wondering if a nicer solution would be an ioctl to add lots of pages in a single call. > > Alternatively, we could change the EADD+EEXTEND flow to not insert the > added page's PFN into the owner's process space, i.e. force userspace to > fault when it runs the enclave. But that only delays the issue because > eventually we'll want to account EPC pages, i.e. add a cgroup, at which > point we'll likely need current->mm anyways. You should be able to account the backing pages to a cgroup without actually sticking them into the EPC, no? Or am I misunderstanding? I guess we'll eventually want a cgroup to limit use of the limited EPC resources.
Re: [Regression 4.15] Can't kill CONFIG_UNWINDER_ORC with fire or plague.
On Mon, Dec 17, 2018 at 6:45 AM Paul Gortmaker wrote: > > [Re: [Regression 4.15] Can't kill CONFIG_UNWINDER_ORC with fire or plague.] > On 29/12/2017 (Fri 13:18) Paul Gortmaker wrote: > > > [Re: [Regression 4.15] Can't kill CONFIG_UNWINDER_ORC with fire or plague.] > > On 29/12/2017 (Fri 10:47) Josh Poimboeuf wrote: > > > > > This seems to be related to a kconfig quirk where only silentoldconfig > > > updates the include/config/auto.conf file. The other config targets > > > (oldconfig, defconfig, etc) don't touch it. It seems intentional, but I > > > have no idea why. > > > > > > That causes the Makefile to get stale data for 'CONFIG_*' variables when > > > it includes auto.conf. So I don't think this is specific to the ORC > > > check. It seems like it could also cause bugs elsewhere. > > > > OK, good - you agree with my initial diagnosis of stale auto.conf then. > > Not sure what Randy was testing when he said he couldn't reproduce it. > > > > > The below (ugly) patch fixes it, though I'm not sure this is the best > > > way to do it. We probably need Masahiro or Michal to chime in here. > > > > Yep, that is why I intentionally put the kbuild folks on the To/Cc of > > the original report (and ran away screaming at the prospect of debugging > > Makefiles on xmas day). But with holidays and all, it might not be > > until early January before they have a chance to reply. > > It is nearly a year later and we still have this false positive. > > paul@sm:~/git/linux-head$ make -j12 > /dev/null > Makefile:966: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, > please install libelf-dev, libelf-devel or elfutils-libelf-devel". Stop. 
> paul@sm:~/git/linux-head$ grep UNWINDER_ORC .config > # CONFIG_UNWINDER_ORC is not set > > We do know a bit more now -- the auto.conf issue has been independently > confirmed and "fixed" for other subsystems/issues since, like RETPOLINE: > > - > commit 25896d073d8a0403b07e6dec56f58e6c33678207 > Author: Masahiro Yamada > Date: Wed Dec 5 15:27:19 2018 +0900 > > x86/build: Fix compiler support check for CONFIG_RETPOLINE > > It is troublesome to add a diagnostic like this to the Makefile > parse stage because the top-level Makefile could be parsed with > a stale include/config/auto.conf. > > Once you are hit by the error about non-retpoline compiler, the > compilation still breaks even after disabling CONFIG_RETPOLINE. > - > > I'm not sure if we want to treat this on a per config option each time > again and again, or undertake a more global kbuild approach, but it does > warrant a mention and a re-examination before we "solve" this again. I did not notice this thread (perhaps, it fell into my crack during the holidays) but I actually tried to fix this twice in a sophisticated way in the past. The first attempt (https://patchwork.kernel.org/patch/10516049/) was rejected by Josh Poimboeuf. The second one (https://patchwork.kernel.org/patch/10643245/) turned out not working as expected. Now, I am preparing for the third attempt, but it will take time for review. What I can do now is the similar fix-up as commit 25896d073d8. I will post a cheesy fix-up patch. -- Best Regards Masahiro Yamada
Re: [PATCH v17 18/23] platform/x86: Intel SGX driver
On Mon, Dec 17, 2018 at 5:39 PM Jarkko Sakkinen wrote: > > On Mon, Dec 17, 2018 at 02:20:48PM -0800, Sean Christopherson wrote: > > The only potential hiccup I can see is the build flow. Currently, > > EADD+EEXTEND is done via a work queue to avoid major performance issues > > (10x regression) when userspace is building multiple enclaves in parallel > > using goroutines to wrap Cgo (the issue might apply to any M:N scheduler, > > but I've only confirmed the Golang case). The issue is that allocating > > an EPC page acts like a blocking syscall when the EPC is under pressure, > > i.e. an EPC page isn't immediately available. This causes Go's scheduler > > to thrash and tank performance[1]. > > I don't see any major issues having that kthread. All the code that > maps the enclave would be removed. > > I would only allow to map enclave to process address space after the > enclave has been initialized i.e. SGX_IOC_ENCLAVE_ATTACH. > What's SGX_IOC_ENCLAVE_ATTACH? Why would it be needed at all? I would imagine that all pages would be faulted in as needed (or prefaulted as an optimization) and the enclave would just work in any process.
[PATCH v4 2/2] trace nvme submit queue status
export nvme disk name, queue id, sq_head, sq_tail to trace event usage example: go to the event directory: cd /sys/kernel/debug/tracing/events/nvme/nvme_sq filter by disk name: echo 'disk=="nvme1n1"' > filter enable the event: echo 1 > enable check results from trace_pipe: cat /sys/kernel/debug/tracing/trace_pipe In practice, this patch help me debug hardware related performant issue. Signed-off-by: yupeng --- drivers/nvme/host/pci.c | 5 + drivers/nvme/host/trace.c | 2 ++ drivers/nvme/host/trace.h | 21 + 3 files changed, 28 insertions(+) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index c33bb201b884..52df2f7fef37 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -32,6 +32,7 @@ #include #include +#include "trace.h" #include "nvme.h" #define SQ_SIZE(depth) (depth * sizeof(struct nvme_command)) @@ -899,6 +900,10 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx) } req = blk_mq_tag_to_rq(*nvmeq->tags, cqe->command_id); + trace_nvme_sq(req->rq_disk, + nvmeq->qid, + le16_to_cpu(cqe->sq_head), + nvmeq->sq_tail); nvme_end_request(req, cqe->status, cqe->result); } diff --git a/drivers/nvme/host/trace.c b/drivers/nvme/host/trace.c index 8ca7079ed2bc..7bfaace23e1e 100644 --- a/drivers/nvme/host/trace.c +++ b/drivers/nvme/host/trace.c @@ -142,3 +142,5 @@ const char *nvme_trace_disk_name(struct trace_seq *p, char *name) return ret; } EXPORT_SYMBOL_GPL(nvme_trace_disk_name); + +EXPORT_TRACEPOINT_SYMBOL(nvme_sq); diff --git a/drivers/nvme/host/trace.h b/drivers/nvme/host/trace.h index 196d5bd56718..3606cd7000f4 100644 --- a/drivers/nvme/host/trace.h +++ b/drivers/nvme/host/trace.h @@ -184,6 +184,27 @@ TRACE_EVENT(nvme_async_event, #undef aer_name +TRACE_EVENT(nvme_sq, + TP_PROTO(void *rq_disk, int qid, int sq_head, int sq_tail), + TP_ARGS(rq_disk, qid, sq_head, sq_tail), + TP_STRUCT__entry( + __array(char, disk, DISK_NAME_LEN) + __field(int, qid) + __field(int, sq_head) + __field(int, sq_tail) + ), + TP_fast_assign( + 
__assign_disk_name(__entry->disk, rq_disk); + __entry->qid = qid; + __entry->sq_head = sq_head; + __entry->sq_tail = sq_tail; + ), + TP_printk("nvme: %s qid=%d head=%d tail=%d", + __print_disk_name(__entry->disk), + __entry->qid, __entry->sq_head, __entry->sq_tail + ) +); + #endif /* _TRACE_NVME_H */ #undef TRACE_INCLUDE_PATH -- 2.17.1
[PATCH 2/2] arm64: dts: ls1088: add missing dma-coherent property in fsl-mc
Signed-off-by: Nipun Gupta --- arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi index dec0c2d..b8e31a1 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi @@ -577,6 +577,7 @@ <0x 0x0834 0 0x4>; /* MC control reg */ msi-parent = <&its>; iommu-map = <0 &smmu 0 0>; /* This is fixed-up by u-boot */ + dma-coherent; #address-cells = <3>; #size-cells = <1>; -- 1.9.1
[PATCH v4 1/2] export trace.c helper functions to other modules
Export the following three functions:

	nvme_trace_parse_admin_cmd
	nvme_trace_parse_nvm_cmd
	nvme_trace_disk_name

so that any other module which depends on nvme-core can use the trace events
in trace.h.

Signed-off-by: yupeng
---
 drivers/nvme/host/trace.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/host/trace.c b/drivers/nvme/host/trace.c
index 25b0e310f4a8..8ca7079ed2bc 100644
--- a/drivers/nvme/host/trace.c
+++ b/drivers/nvme/host/trace.c
@@ -113,6 +113,7 @@ const char *nvme_trace_parse_admin_cmd(struct trace_seq *p,
 		return nvme_trace_common(p, cdw10);
 	}
 }
+EXPORT_SYMBOL_GPL(nvme_trace_parse_admin_cmd);
 
 const char *nvme_trace_parse_nvm_cmd(struct trace_seq *p,
 				     u8 opcode, u8 *cdw10)
@@ -128,6 +129,7 @@ const char *nvme_trace_parse_nvm_cmd(struct trace_seq *p,
 		return nvme_trace_common(p, cdw10);
 	}
 }
+EXPORT_SYMBOL_GPL(nvme_trace_parse_nvm_cmd);
 
 const char *nvme_trace_disk_name(struct trace_seq *p, char *name)
 {
@@ -139,3 +141,4 @@ const char *nvme_trace_disk_name(struct trace_seq *p, char *name)
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(nvme_trace_disk_name);
-- 
2.17.1
[PATCH 1/2] arm64: dts: ls1088: add smmu device node
This patch also adds the iommu-map property in fsl-mc node, so that fsl-mc can use iommu. Signed-off-by: Nipun Gupta --- These patches are based over: git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux.git, as there are couple of changes related to fsl-mc bus in this tree: https://lore.kernel.org/patchwork/patch/1021020/ arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 92 +- 1 file changed, 91 insertions(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi index de93b42..dec0c2d 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi @@ -576,6 +576,7 @@ reg = <0x0008 0x0c00 0 0x40>,/* MC portal base */ <0x 0x0834 0 0x4>; /* MC control reg */ msi-parent = <&its>; + iommu-map = <0 &smmu 0 0>; /* This is fixed-up by u-boot */ #address-cells = <3>; #size-cells = <1>; @@ -641,6 +642,96 @@ }; }; }; + + smmu: iommu@500 { + compatible = "arm,mmu-500"; + reg = <0 0x500 0 0x80>; + #iommu-cells = <1>; + stream-match-mask = <0x7C00>; + #global-interrupts = <12>; +// global secure fault + interrupts = , +// combined secure +, +// global non-secure fault +, +// combined non-secure +, +// performance counter interrupts 0-7 +, +, +, +, +, +, +, +, +// per context interrupt, 64 interrupts +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +; + }; }; firmware { @@ -649,5 +740,4 @@ method = "smc";
Re: [PATCH v2 0/3] x86: kprobes: Show correct blacklist in debugfs
On Mon, 17 Dec 2018 16:47:13 +0100 Andrea Righi wrote: > On Mon, Dec 17, 2018 at 05:20:25PM +0900, Masami Hiramatsu wrote: > > This is v2 series for showing correct kprobe blacklist in > > debugfs. > > > > v1 is here: > > > > https://lkml.org/lkml/2018/12/7/517 > > > > I splitted the RFC v1 patch into x86 and generic parts, > > also added a patch to remove unneeded arch-specific > > blacklist check function (because those have been added > > to the generic blacklist.) > > > > If this style is good, I will make another series for the > > archs which have own arch_within_kprobe_blacklist(), and > > eventually replace that with arch_populate_kprobe_blacklist() > > so that user can get the correct kprobe blacklist in debugfs. > > > > Thank you, > > Looks good to me. Thanks! > > Tested-by: Andrea Righi Thank you for testing! > > Side question: there are certain symbols in arch/x86/xen that should be > blacklisted explicitly, because they're non-attachable. > > More exactly, all functions defined in arch/x86/xen/spinlock.c, > arch/x86/xen/time.c and arch/x86/xen/irq.c. > > The reason is that these files are compiled without -pg to allow the > usage of ftrace within a Xen domain apparently (from > arch/x86/xen/Makefile): > > ifdef CONFIG_FUNCTION_TRACER > # Do not profile debug and lowlevel utilities > CFLAGS_REMOVE_spinlock.o = -pg > CFLAGS_REMOVE_time.o = -pg > CFLAGS_REMOVE_irq.o = -pg > endif Actually, the reason you cannot probe those functions via tracing/kprobe_events is just a side effect. You can probe them if you write a kprobe module. Since kprobe_events depends on some ftrace tracing functions, probing those functions sometimes causes a recursive call problem. To avoid this issue, I introduced CONFIG_KPROBE_EVENTS_ON_NOTRACE, see commit 45408c4f9250 ("tracing: kprobes: Prohibit probing on notrace function"). If you set CONFIG_KPROBE_EVENTS_ON_NOTRACE=y, you can continue putting probes on Xen spinlock functions too. 
> Do you see a nice and clean way to blacklist all these functions > (something like arch_populate_kprobe_blacklist()), or should we just > flag all of them explicitly with NOKPROBE_SYMBOL()? As I pointed out, you can probe them via your own kprobe module. As systemtap does, you can still probe them. The blacklist is for "kprobes", not for "kprobe_events". (Those used to be the same, but since the above commit they are different.) I think the sanest solution is to identify which (combination of) functions in ftrace (kernel/trace/*) cause the problem, mark those with NOKPROBE_SYMBOL(), and remove CONFIG_KPROBE_EVENTS_ON_NOTRACE. Thank you, -- Masami Hiramatsu
Re: [PATCH] Export mm_update_next_owner function for unuse_mm.
On Tue, Dec 18, 2018 at 11:42:11AM +0800, gchen.guo...@gmail.com wrote: > From: guomin chen > > When mm->owner is modified by exit_mm, if the new owner directly calls > unuse_mm to exit, it will cause Use-After-Free. Due to the unuse_mm() > directly sets tsk->mm=NULL. > > Under normal circumstances,When do_exit exits, mm->owner will > be updated on exit_mm(). but when the kernel process calls > unuse_mm() and then exits,mm->owner cannot be updated. And it > will point to a task that has been released. > > The current issue flow is as follows: > Process C Process A Process B > qemu-system-x86_64: kernel:vhost_net kernel: vhost_net > open /dev/vhost-net > VHOST_SET_OWNER create kthread vhost-%d create kthread vhost-%d > network init use_mm() use_mm() >... ... >Abnormal exited >... > do_exit > exit_mm() > update mm->owner to A > exit_files() >close_files() >kthread_should_stop() unuse_mm() > Stop Process A tsk->mm=NULL > do_exit() > can't update owner > A exit completed vhost-%d rcv first package >vhost-%d build rcv buffer for vq >page fault >access mm & mm->owner >NOW,mm->owner still pointer A >kernel UAF > stop Process B > > Although I am having this issue on vhost_net,But it affects all users of > unuse_mm. > > Cc: "Eric W. Biederman" > Cc: Andrew Morton > Cc: "Luis R. Rodriguez" > Cc: Dominik Brodowski > Cc: Arnd Bergmann > Cc: linux-kernel@vger.kernel.org > Cc: linux...@kvack.org > Cc: "Michael S. Tsirkin" > Cc: Jason Wang > Cc: Christoph Hellwig > Signed-off-by: guomin chen > --- > kernel/exit.c| 1 + > mm/mmu_context.c | 1 + > 2 files changed, 2 insertions(+) > > diff --git a/kernel/exit.c b/kernel/exit.c > index 0e21e6d..9e046dd 100644 > --- a/kernel/exit.c > +++ b/kernel/exit.c > @@ -486,6 +486,7 @@ void mm_update_next_owner(struct mm_struct *mm) > task_unlock(c); > put_task_struct(c); > } > +EXPORT_SYMBOL(mm_update_next_owner); > #endif /* CONFIG_MEMCG */ > > /* So why export it? Is that still needed? 
> diff --git a/mm/mmu_context.c b/mm/mmu_context.c > index 3e612ae..9eb81aa 100644 > --- a/mm/mmu_context.c > +++ b/mm/mmu_context.c > @@ -60,5 +60,6 @@ void unuse_mm(struct mm_struct *mm) > /* active_mm is still 'mm' */ > enter_lazy_tlb(mm, tsk); > task_unlock(tsk); > + mm_update_next_owner(mm); > } > EXPORT_SYMBOL_GPL(unuse_mm); > -- > 1.8.3.1
[PATCH v6 4/6] mm: Shuffle initial free memory to improve memory-side-cache utilization
Randomization of the page allocator improves the average utilization of a direct-mapped memory-side-cache. Memory side caching is a platform capability that Linux has been previously exposed to in HPC (high-performance computing) environments on specialty platforms. In that instance it was a smaller pool of high-bandwidth-memory relative to higher-capacity / lower-bandwidth DRAM. Now, this capability is going to be found on general purpose server platforms where DRAM is a cache in front of higher latency persistent memory [1]. Robert offered an explanation of the state of the art of Linux interactions with memory-side-caches [2], and I copy it here: It's been a problem in the HPC space: http://www.nersc.gov/research-and-development/knl-cache-mode-performance-coe/ A kernel module called zonesort is available to try to help: https://software.intel.com/en-us/articles/xeon-phi-software and this abandoned patch series proposed that for the kernel: https://lkml.org/lkml/2017/8/23/195 Dan's patch series doesn't attempt to ensure buffers won't conflict, but also reduces the chance that the buffers will. This will make performance more consistent, albeit slower than "optimal" (which is near impossible to attain in a general-purpose kernel). That's better than forcing users to deploy remedies like: "To eliminate this gradual degradation, we have added a Stream measurement to the Node Health Check that follows each job; nodes are rebooted whenever their measured memory bandwidth falls below 300 GB/s." A replacement for zonesort was merged upstream in commit cc9aec03e58f "x86/numa_emulation: Introduce uniform split capability". With this numa_emulation capability, memory can be split into cache sized ("near-memory" sized) numa nodes. A bind operation to such a node, and disabling workloads on other nodes, enables full cache performance. However, once the workload exceeds the cache size then cache conflicts are unavoidable. 
While HPC environments might be able to tolerate time-scheduling of cache sized workloads, for general purpose server platforms, the oversubscribed cache case will be the common case. The worst case scenario is that a server system owner benchmarks a workload at boot with an un-contended cache only to see that performance degrade over time, even below the average cache performance due to excessive conflicts. Randomization clips the peaks and fills in the valleys of cache utilization to yield steady average performance. Here are some performance impact details of the patches:
1/ An Intel internal synthetic memory bandwidth measurement tool saw a 3X speedup in a contrived case that tries to force cache conflicts. The contrived case used the numa_emulation capability to force an instance of the benchmark to be run in two of the near-memory sized numa nodes. If both instances were placed on the same emulated node they would fit and cause zero conflicts. On separate emulated nodes without randomization, they underutilized the cache and conflicted unnecessarily due to the in-order allocation per node.
2/ A well known Java server application benchmark was run with a heap size that exceeded cache size by 3X. The cache conflict rate was 8% for the first run and degraded to 21% after page allocator aging. With randomization enabled the rate levelled out at 11%.
3/ A MongoDB workload did not observe a measurable difference in cache-conflict rates, but the overall throughput dropped by 7% with randomization in one case.
4/ Mel Gorman ran his suite of performance workloads with randomization enabled on platforms without a memory-side-cache and saw a mix of some improvements and some losses [3].
While there is potentially significant improvement for applications that depend on low latency access across a wide working-set, the performance impact may be negligible to negative for other workloads. 
For this reason the shuffle capability defaults to off unless a direct-mapped memory-side-cache is detected. Even then, the page_alloc.shuffle=0 parameter can be specified to disable the randomization on those systems. Outside of memory-side-cache utilization concerns there is a potential security benefit from randomization. Some data exfiltration and return-oriented-programming attacks rely on the ability to infer the location of sensitive data objects. The kernel page allocator, especially early in system boot, has predictable first-in-first-out behavior for physical pages. Pages are freed in physical address order when first onlined. Quoting Kees: "While we already have a base-address randomization (CONFIG_RANDOMIZE_MEMORY), attacks against the same hardware and memory layouts would certainly be using the predictability of allocation ordering (i.e. for attacks where the base address isn't important: only the relative positions between allocated memory). This is common in lots of heap-style attacks. They
[PATCH v6 6/6] mm: Maintain randomization of page free lists
When freeing a page with an order >= shuffle_page_order randomly select the front or back of the list for insertion. While the mm tries to defragment physical pages into huge pages this can tend to make the page allocator more predictable over time. Inject the front-back randomness to preserve the initial randomness established by shuffle_free_memory() when the kernel was booted. The overhead of this manipulation is constrained by only being applied for MAX_ORDER sized pages by default. Cc: Michal Hocko Cc: Kees Cook Cc: Dave Hansen Signed-off-by: Dan Williams --- include/linux/mmzone.h | 10 ++ include/linux/shuffle.h | 12 mm/page_alloc.c | 11 +-- mm/shuffle.c| 16 4 files changed, 47 insertions(+), 2 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 35cc33af87f2..338929647eea 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -98,6 +98,8 @@ extern int page_group_by_mobility_disabled; struct free_area { struct list_headfree_list[MIGRATE_TYPES]; unsigned long nr_free; + u64 rand; + u8 rand_bits; }; /* Used for pages not on another list */ @@ -116,6 +118,14 @@ static inline void add_to_free_area_tail(struct page *page, struct free_area *ar area->nr_free++; } +#ifdef CONFIG_SHUFFLE_PAGE_ALLOCATOR +/* Used to preserve page allocation order entropy */ +void add_to_free_area_random(struct page *page, struct free_area *area, + int migratetype); +#else +#define add_to_free_area_random add_to_free_area +#endif + /* Used for pages which are on another list */ static inline void move_to_free_area(struct page *page, struct free_area *area, int migratetype) diff --git a/include/linux/shuffle.h b/include/linux/shuffle.h index a8a168919cb5..8b3941a87c2c 100644 --- a/include/linux/shuffle.h +++ b/include/linux/shuffle.h @@ -29,6 +29,13 @@ static inline void shuffle_zone(struct zone *z, unsigned long start_pfn, return; __shuffle_zone(z, start_pfn, end_pfn); } + +static inline bool is_shuffle_order(int order) +{ + if 
(!static_branch_unlikely(&page_alloc_shuffle_key)) +return false; + return order >= CONFIG_SHUFFLE_PAGE_ORDER; +} #else static inline void shuffle_free_memory(pg_data_t *pgdat, unsigned long start_pfn, unsigned long end_pfn) @@ -43,5 +50,10 @@ static inline void shuffle_zone(struct zone *z, unsigned long start_pfn, static inline void page_alloc_shuffle(void) { } + +static inline bool is_shuffle_order(int order) +{ + return false; +} #endif #endif /* _MM_SHUFFLE_H */ diff --git a/mm/page_alloc.c b/mm/page_alloc.c index de8b5eb78d13..3a932ba23daf 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -42,6 +42,7 @@ #include #include #include +#include #include #include #include @@ -851,7 +852,8 @@ static inline void __free_one_page(struct page *page, * so it's less likely to be used soon and more likely to be merged * as a higher order page */ - if ((order < MAX_ORDER-2) && pfn_valid_within(buddy_pfn)) { + if ((order < MAX_ORDER-2) && pfn_valid_within(buddy_pfn) + && !is_shuffle_order(order)) { struct page *higher_page, *higher_buddy; combined_pfn = buddy_pfn & pfn; higher_page = page + (combined_pfn - pfn); @@ -865,7 +867,12 @@ static inline void __free_one_page(struct page *page, } } - add_to_free_area(page, &zone->free_area[order], migratetype); + if (is_shuffle_order(order)) + add_to_free_area_random(page, &zone->free_area[order], + migratetype); + else + add_to_free_area(page, &zone->free_area[order], migratetype); + } /* diff --git a/mm/shuffle.c b/mm/shuffle.c index 07961ff41a03..4cadf51c9b40 100644 --- a/mm/shuffle.c +++ b/mm/shuffle.c @@ -213,3 +213,19 @@ void __meminit __shuffle_free_memory(pg_data_t *pgdat, unsigned long start_pfn, for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++) shuffle_zone(z, start_pfn, end_pfn); } + +void add_to_free_area_random(struct page *page, struct free_area *area, + int migratetype) +{ + if (area->rand_bits == 0) { + area->rand_bits = 64; + area->rand = get_random_u64(); + } + + if (area->rand & 1) + 
add_to_free_area(page, area, migratetype); + else + add_to_free_area_tail(page, area, migratetype); + area->rand_bits--; + area->rand >>= 1; +}
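The front/back selection in add_to_free_area_random() amortizes the cost of the random number generator: one 64-bit value is fetched and then consumed one bit per freed page, with a refill only when all bits are used up. A minimal userspace sketch of that bookkeeping follows. It is not kernel code: struct bit_pool, demo_random_u64(), pool_pick_front() and pool_count_front() are hypothetical names invented for illustration, the refills counter exists only to make the refill rate visible, and rand() merely stands in for get_random_u64().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for the rand / rand_bits fields that
 * the patch adds to struct free_area.
 */
struct bit_pool {
	uint64_t rand;       /* cached random bits, consumed from bit 0 upward */
	uint8_t rand_bits;   /* number of unconsumed bits left in rand */
	unsigned refills;    /* illustration only: how often the pool was refilled */
};

/* Stand-in for get_random_u64(); rand() is not suitable for real use. */
static uint64_t demo_random_u64(void)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | (uint64_t)(rand() & 0xff);
	return v;
}

/* Returns 1 for "insert at the head of the list", 0 for "insert at the tail". */
static int pool_pick_front(struct bit_pool *p)
{
	int front;

	if (p->rand_bits == 0) {
		/* Pool exhausted: fetch 64 fresh bits in one call. */
		p->rand_bits = 64;
		p->rand = demo_random_u64();
		p->refills++;
	}

	front = (int)(p->rand & 1);
	p->rand_bits--;
	p->rand >>= 1;
	return front;
}

/* Draw n decisions and return how many of them chose the head. */
static unsigned pool_count_front(struct bit_pool *p, unsigned n)
{
	unsigned front = 0;

	assert(p != NULL);
	for (unsigned i = 0; i < n; i++)
		front += (unsigned)pool_pick_front(p);
	return front;
}
```

Starting from a zero-initialized pool, 128 calls to pool_pick_front() trigger exactly two refills, so the generator is invoked once per 64 freed pages rather than once per page.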
[PATCH v6 2/6] acpi: Add HMAT to generic parsing tables
From: Keith Busch The HMAT table header has different field lengths than the existing parsing uses. Add the HMAT type to the parsing rules so it may be generically parsed. Signed-off-by: Keith Busch Signed-off-by: Dan Williams --- drivers/acpi/tables.c |9 + include/linux/acpi.h |1 + 2 files changed, 10 insertions(+) diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c index e9643b4267c7..bc1addf715dc 100644 --- a/drivers/acpi/tables.c +++ b/drivers/acpi/tables.c @@ -51,6 +51,7 @@ static int acpi_apic_instance __initdata; enum acpi_subtable_type { ACPI_SUBTABLE_COMMON, + ACPI_SUBTABLE_HMAT, }; struct acpi_subtable_entry { @@ -232,6 +233,8 @@ acpi_get_entry_type(struct acpi_subtable_entry *entry) switch (entry->type) { case ACPI_SUBTABLE_COMMON: return entry->hdr->common.type; + case ACPI_SUBTABLE_HMAT: + return entry->hdr->hmat.type; } return 0; } @@ -242,6 +245,8 @@ acpi_get_entry_length(struct acpi_subtable_entry *entry) switch (entry->type) { case ACPI_SUBTABLE_COMMON: return entry->hdr->common.length; + case ACPI_SUBTABLE_HMAT: + return entry->hdr->hmat.length; } return 0; } @@ -252,6 +257,8 @@ acpi_get_subtable_header_length(struct acpi_subtable_entry *entry) switch (entry->type) { case ACPI_SUBTABLE_COMMON: return sizeof(entry->hdr->common); + case ACPI_SUBTABLE_HMAT: + return sizeof(entry->hdr->hmat); } return 0; } @@ -259,6 +266,8 @@ acpi_get_subtable_header_length(struct acpi_subtable_entry *entry) static enum acpi_subtable_type __init acpi_get_subtable_type(char *id) { + if (strncmp(id, ACPI_SIG_HMAT, 4) == 0) + return ACPI_SUBTABLE_HMAT; return ACPI_SUBTABLE_COMMON; } diff --git a/include/linux/acpi.h b/include/linux/acpi.h index 18805a967c70..4373f5ba0f95 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -143,6 +143,7 @@ enum acpi_address_range_id { /* Table Handlers */ union acpi_subtable_headers { struct acpi_subtable_header common; + struct acpi_hmat_structure hmat; }; typedef int (*acpi_tbl_table_handler)(struct acpi_table_header 
*table);
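The reason for a separate subtable type can be sketched in userspace: the common subtable header carries 8-bit type and length fields, while the HMAT structure header uses wider fields, so the parser has to pick the header layout from the table signature before it can read either field. The sketch below is an illustration under stated assumptions, not the kernel implementation: all demo_* names are hypothetical, and the two struct layouts are simplified stand-ins assumed to mirror the ACPICA definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the ACPICA headers (layouts are assumptions). */
struct demo_common_header {
	uint8_t type;
	uint8_t length;
};

struct demo_hmat_header {
	uint16_t type;
	uint16_t reserved;
	uint32_t length;
};

union demo_subtable_headers {
	struct demo_common_header common;
	struct demo_hmat_header hmat;
};

enum demo_subtable_type {
	DEMO_SUBTABLE_COMMON,
	DEMO_SUBTABLE_HMAT,
};

struct demo_entry {
	union demo_subtable_headers *hdr;
	enum demo_subtable_type type;
};

/* Choose the header layout from the 4-character table signature. */
static enum demo_subtable_type demo_get_subtable_type(const char *id)
{
	assert(id != NULL);
	if (strncmp(id, "HMAT", 4) == 0)
		return DEMO_SUBTABLE_HMAT;
	return DEMO_SUBTABLE_COMMON;
}

/* Read the subtable length through the layout that matches the entry. */
static unsigned long demo_entry_length(const struct demo_entry *e)
{
	switch (e->type) {
	case DEMO_SUBTABLE_COMMON:
		return e->hdr->common.length;
	case DEMO_SUBTABLE_HMAT:
		return e->hdr->hmat.length;
	}
	return 0;
}

/* Size of the header itself, which also differs between the two layouts. */
static unsigned long demo_header_length(const struct demo_entry *e)
{
	switch (e->type) {
	case DEMO_SUBTABLE_COMMON:
		return sizeof(e->hdr->common);
	case DEMO_SUBTABLE_HMAT:
		return sizeof(e->hdr->hmat);
	}
	return 0;
}
```

With these stand-in layouts, a subtable length of 300 bytes fits in the HMAT header but could not be represented in the 8-bit length field of the common header, which is why the generic parser needs the extra case.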