date:20181217

Re: [-next] strace tests fail because of "y2038: socket: Add compat_sys_recvmmsg_time64"

2018-12-17 Thread Heiko Carstens

On Mon, Dec 17, 2018 at 11:05:06PM +0100, Arnd Bergmann wrote:
> On Mon, Dec 17, 2018 at 10:40 PM Arnd Bergmann  wrote:
> >
> > On Mon, Dec 17, 2018 at 2:06 PM Heiko Carstens
> >  wrote:
> > >
> > > Hi Arnd,
> > >
> > > in linux-next as of today 16 strace self tests fail on s390. I could
> > > bisect this to b136972b063b ("y2038: socket: Add 
> > > compat_sys_recvmmsg_time64").
> > >
> > > The following tests fail:
> >
> > Hi Heiko,
> >
> > Thanks for the report and sorry I broke things. I'll have a closer look
> > tomorrow if I don't find it right away. I suppose the regression was in
> > native system calls, not the compat syscalls with 31-bit user space,
> > right?

Yes, I was talking about 64 bit native system calls.

> I found a bug in my patch by inspection. Can you try if the patch
> below makes it all work (apologies for the garbled whitespace),
> I'm considering a rewrite of that function now (to split it into two
> again), but want to make sure there isn't another problem in my
> original patch.

With your patch below applied, the tests pass again.

Thanks!

> 
> diff --git a/net/socket.c b/net/socket.c
> index 3bb2ee083f97..7f9f225d0b6c 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -2486,12 +2486,12 @@ int __sys_recvmmsg(int fd, struct mmsghdr __user *mmsg,
>  return -EFAULT;
> 
>  if (!timeout && !timeout32)
> -do_recvmmsg(fd, mmsg, vlen, flags, NULL);
> +return do_recvmmsg(fd, mmsg, vlen, flags, NULL);
> 
>  datagrams = do_recvmmsg(fd, mmsg, vlen, flags, &timeout_sys);
> 
> -if (!datagrams)
> -return 0;
> +if (datagrams <= 0)
> +return datagrams;
> 
>  if (timeout && put_timespec64(&timeout_sys, timeout))
>  datagrams = -EFAULT;
>
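
The two hunks above change control flow only, so the effect can be modelled in a few lines of userspace C. This is a sketch, not the kernel function: every model_* name is a hypothetical stand-in, the worker simply returns a scripted value, and the counters exist only to make the difference observable.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct model_timeout { long sec; };

static int model_recv_calls;   /* how many times the worker ran */
static int model_recv_result;  /* scripted return value of the worker */
static int model_writebacks;   /* how many times the timeout was copied back */

static int model_do_recv(struct model_timeout *t)
{
	(void)t;
	model_recv_calls++;
	return model_recv_result;
}

static void model_put_timeout(const struct model_timeout *t)
{
	(void)t;
	model_writebacks++;
}

/* Control flow before the fix. */
static int model_recv_buggy(struct model_timeout *user_timeout)
{
	struct model_timeout sys = { 0 };
	int datagrams;

	if (!user_timeout)
		model_do_recv(NULL);	/* result dropped, execution falls through */

	datagrams = model_do_recv(&sys);
	if (!datagrams)
		return 0;		/* a negative error does not stop here */

	if (user_timeout)
		model_put_timeout(&sys);
	return datagrams;
}

/* Control flow after the fix. */
static int model_recv_fixed(struct model_timeout *user_timeout)
{
	struct model_timeout sys = { 0 };
	int datagrams;

	if (!user_timeout)
		return model_do_recv(NULL);

	datagrams = model_do_recv(&sys);
	if (datagrams <= 0)
		return datagrams;	/* errors and "nothing received" skip the write-back */

	model_put_timeout(&sys);
	return datagrams;
}
```

In this model the old flow runs the worker twice when no timeout is given, and still writes the timeout back after an error; the new flow does neither.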

[PATCH v4 3/6] arm64/kvm: add a userspace option to enable pointer authentication

2018-12-17 Thread Amit Daniel Kachhap

This feature lets userspace decide whether a KVM guest may execute
pointer authentication instructions; when the flag is not set, the
instructions are treated as undefined. It uses the existing vcpu API
KVM_ARM_VCPU_INIT to supply this parameter instead of creating a new API.

No new register is added to pass this parameter through the
SET/GET_ONE_REG interface, because a single flag (KVM_ARM_VCPU_PTRAUTH)
is enough to select this feature.

Signed-off-by: Amit Daniel Kachhap 
Cc: Mark Rutland 
Cc: Marc Zyngier 
Cc: Christoffer Dall 
Cc: kvm...@lists.cs.columbia.edu
---
 Documentation/arm64/pointer-authentication.txt |  9 +
 Documentation/virtual/kvm/api.txt  |  4 
 arch/arm/include/asm/kvm_host.h|  4 
 arch/arm64/include/asm/kvm_host.h  |  7 ---
 arch/arm64/include/uapi/asm/kvm.h  |  1 +
 arch/arm64/kvm/handle_exit.c   |  2 +-
 arch/arm64/kvm/hyp/ptrauth-sr.c| 16 
 arch/arm64/kvm/reset.c |  3 +++
 include/uapi/linux/kvm.h   |  1 +
 9 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/Documentation/arm64/pointer-authentication.txt b/Documentation/arm64/pointer-authentication.txt
index 5baca42..8c0f338 100644
--- a/Documentation/arm64/pointer-authentication.txt
+++ b/Documentation/arm64/pointer-authentication.txt
@@ -87,7 +87,8 @@ used to get and set the keys for a thread.
 Virtualization
 --
 
-Pointer authentication is not currently supported in KVM guests. KVM
-will mask the feature bits from ID_AA64ISAR1_EL1, and attempted use of
-the feature will result in an UNDEFINED exception being injected into
-the guest.
+Pointer authentication is enabled in KVM guest when virtual machine is
+created by passing a flag (KVM_ARM_VCPU_PTRAUTH) requesting this feature
+to be enabled. Without this flag, pointer authentication is not enabled
+in KVM guests and attempted use of the feature will result in an UNDEFINED
+exception being injected into the guest.
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index cd209f7..e20583a 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2634,6 +2634,10 @@ Possible features:
  Depends on KVM_CAP_ARM_PSCI_0_2.
- KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
  Depends on KVM_CAP_ARM_PMU_V3.
+   - KVM_ARM_VCPU_PTRAUTH: Emulate Pointer authentication for the CPU.
+ Depends on KVM_CAP_ARM_PTRAUTH and only on arm64 architecture. If
+ set, then the KVM guest allows the execution of pointer authentication
+ instructions or treats them as undefined if not set.
 
 
 4.83 KVM_ARM_PREFERRED_TARGET
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 02d9bfc..62a85d9 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -352,6 +352,10 @@ static inline int kvm_arm_have_ssbd(void)
 static inline void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu) {}
 static inline void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arm_vcpu_ptrauth_config(struct kvm_vcpu *vcpu) {}
+static inline bool kvm_arm_vcpu_ptrauth_allowed(struct kvm_vcpu *vcpu)
+{
+   return false;
+}
 
 #define __KVM_HAVE_ARCH_VM_ALLOC
 struct kvm *kvm_arch_alloc_vm(void);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 629712d..f853a95 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -43,7 +43,7 @@
 
 #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
 
-#define KVM_VCPU_MAX_FEATURES 4
+#define KVM_VCPU_MAX_FEATURES 5
 
 #define KVM_REQ_SLEEP \
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
@@ -453,14 +453,15 @@ static inline bool kvm_arch_check_sve_has_vhe(void)
 
 void kvm_arm_vcpu_ptrauth_enable(struct kvm_vcpu *vcpu);
 void kvm_arm_vcpu_ptrauth_disable(struct kvm_vcpu *vcpu);
+bool kvm_arm_vcpu_ptrauth_allowed(struct kvm_vcpu *vcpu);
 
 static inline void kvm_arm_vcpu_ptrauth_config(struct kvm_vcpu *vcpu)
 {
/* Disable ptrauth and use it in a lazy context via traps */
-   if (has_vhe() && system_supports_ptrauth())
+   if (has_vhe() && system_supports_ptrauth()
+   && kvm_arm_vcpu_ptrauth_allowed(vcpu))
kvm_arm_vcpu_ptrauth_disable(vcpu);
 }
-
 void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);
 
 static inline void kvm_arch_hardware_unsetup(void) {}
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 97c3478..5f82ca1 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -102,6 +102,7 @@ struct kvm_regs {
 #define KVM_ARM_VCPU_EL1_32BIT 1 /* CPU running a 32bit VM */
 #define KVM_ARM_VCPU_PSCI_0_2  2 /* CPU uses PSCI v0.2 */
 #define KVM_ARM_VCPU_PMU_V3    3 /* Support guest PMUv3 */
+#define KVM_ARM_VCPU_PTRAUTH

[PATCH v4 1/6] arm64/kvm: preserve host HCR_EL2 value

2018-12-17 Thread Amit Daniel Kachhap

When restoring HCR_EL2 for the host, KVM uses HCR_HOST_VHE_FLAGS, which
is a constant value. This works today, as the host HCR_EL2 value is
always the same, but this will get in the way of supporting extensions
that require HCR_EL2 bits to be set conditionally for the host.

To allow such features to work without KVM having to explicitly handle
every possible host feature combination, this patch has KVM save/restore
the host HCR when switching to/from a guest HCR. The register is saved
once, during CPU hypervisor initialization, and is simply restored after
switching back from the guest.

To fetch HCR_EL2 during KVM initialisation, a hyp call is made using
kvm_call_hyp(), which helps in the nVHE case.

For the hyp TLB maintenance code, __tlb_switch_to_host_vhe() is updated
to toggle the TGE bit with a RMW sequence, as we already do in
__tlb_switch_to_guest_vhe().

Signed-off-by: Mark Rutland 
Signed-off-by: Amit Daniel Kachhap 
Cc: Marc Zyngier 
Cc: Christoffer Dall 
Cc: kvm...@lists.cs.columbia.edu
---
 arch/arm/include/asm/kvm_host.h   |  2 ++
 arch/arm64/include/asm/kvm_asm.h  |  2 ++
 arch/arm64/include/asm/kvm_host.h | 14 --
 arch/arm64/kvm/hyp/switch.c   | 15 +--
 arch/arm64/kvm/hyp/sysreg-sr.c| 11 +++
 arch/arm64/kvm/hyp/tlb.c  |  6 +-
 virt/kvm/arm/arm.c|  2 ++
 7 files changed, 43 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 5ca5d9a..0f012c8 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -273,6 +273,8 @@ static inline void __cpu_init_stage2(void)
kvm_call_hyp(__init_stage2_translation);
 }
 
+static inline void __cpu_copy_host_registers(void) {}
+
 static inline int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
return 0;
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index aea01a0..25ac9fa 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -73,6 +73,8 @@ extern void __vgic_v3_init_lrs(void);
 
 extern u32 __kvm_get_mdcr_el2(void);
 
+extern u64 __read_hyp_hcr_el2(void);
+
 /* Home-grown __this_cpu_{ptr,read} variants that always work at HYP */
 #define __hyp_this_cpu_ptr(sym) \
({  \
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 52fbc82..1b9eed9 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -196,13 +196,17 @@ enum vcpu_sysreg {
 
 #define NR_COPRO_REGS  (NR_SYS_REGS * 2)
 
+struct kvm_cpu_init_host_regs {
+   u64 hcr_el2;
+};
+
 struct kvm_cpu_context {
struct kvm_regs gp_regs;
union {
u64 sys_regs[NR_SYS_REGS];
u32 copro[NR_COPRO_REGS];
};
-
+   struct kvm_cpu_init_host_regs init_regs;
struct kvm_vcpu *__hyp_running_vcpu;
 };
 
@@ -211,7 +215,7 @@ typedef struct kvm_cpu_context kvm_cpu_context_t;
 struct kvm_vcpu_arch {
struct kvm_cpu_context ctxt;
 
-   /* HYP configuration */
+   /* Guest HYP configuration */
u64 hcr_el2;
u32 mdcr_el2;
 
@@ -455,6 +459,12 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
 
 static inline void __cpu_init_stage2(void) {}
 
+static inline void __cpu_copy_host_registers(void)
+{
+   kvm_cpu_context_t *host_cxt = this_cpu_ptr(&kvm_host_cpu_state);
+   host_cxt->init_regs.hcr_el2 = kvm_call_hyp(__read_hyp_hcr_el2);
+}
+
 /* Guest/host FPSIMD coordination helpers */
 int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index f6e02cc..85a2a5c 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -139,15 +139,15 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
__activate_traps_nvhe(vcpu);
 }
 
-static void deactivate_traps_vhe(void)
+static void deactivate_traps_vhe(struct kvm_cpu_context *host_ctxt)
 {
extern char vectors[];  /* kernel exception vectors */
-   write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
+   write_sysreg(host_ctxt->init_regs.hcr_el2, hcr_el2);
write_sysreg(CPACR_EL1_DEFAULT, cpacr_el1);
write_sysreg(vectors, vbar_el1);
 }
 
-static void __hyp_text __deactivate_traps_nvhe(void)
+static void __hyp_text __deactivate_traps_nvhe(struct kvm_cpu_context *host_ctxt)
 {
u64 mdcr_el2 = read_sysreg(mdcr_el2);
 
@@ -157,12 +157,15 @@ static void __hyp_text __deactivate_traps_nvhe(void)
mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
 
write_sysreg(mdcr_el2, mdcr_el2);
-   write_sysreg(HCR_HOST_NVHE_FLAGS, hcr_el2);
+   write_sysreg(host_ctxt->init_regs.hcr_el2, hcr_el2);
write_sysreg(CPTR_EL2_DEFAULT, cptr_el2);

[PATCH v4 4/6] arm64/kvm: enable pointer authentication cpufeature conditionally

2018-12-17 Thread Amit Daniel Kachhap

The pointer authentication cpufeature is exposed to or hidden from
guests according to the userspace setting.

Signed-off-by: Amit Daniel Kachhap 
Cc: Mark Rutland 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: kvm...@lists.cs.columbia.edu
---
 Documentation/arm64/pointer-authentication.txt |  3 +++
 arch/arm64/kvm/sys_regs.c  | 33 --
 2 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/Documentation/arm64/pointer-authentication.txt b/Documentation/arm64/pointer-authentication.txt
index 8c0f338..a65dca2 100644
--- a/Documentation/arm64/pointer-authentication.txt
+++ b/Documentation/arm64/pointer-authentication.txt
@@ -92,3 +92,6 @@ created by passing a flag (KVM_ARM_VCPU_PTRAUTH) requesting this feature
 to be enabled. Without this flag, pointer authentication is not enabled
 in KVM guests and attempted use of the feature will result in an UNDEFINED
 exception being injected into the guest.
+
+Additionally, when KVM_ARM_VCPU_PTRAUTH is not set then KVM will mask the
+feature bits from ID_AA64ISAR1_EL1.
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 6af6c7d..ce6144a 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1055,7 +1055,7 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu,
 }
 
 /* Read a sanitised cpufeature ID register by sys_reg_desc */
-static u64 read_id_reg(struct sys_reg_desc const *r, bool raz)
+static u64 read_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_desc const *r, bool raz)
 {
u32 id = sys_reg((u32)r->Op0, (u32)r->Op1,
 (u32)r->CRn, (u32)r->CRm, (u32)r->Op2);
@@ -1066,6 +1066,15 @@ static u64 read_id_reg(struct sys_reg_desc const *r, bool raz)
kvm_debug("SVE unsupported for guests, suppressing\n");
 
val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
+   } else if (id == SYS_ID_AA64ISAR1_EL1) {
+   const u64 ptrauth_mask = (0xfUL << ID_AA64ISAR1_APA_SHIFT) |
+(0xfUL << ID_AA64ISAR1_API_SHIFT) |
+(0xfUL << ID_AA64ISAR1_GPA_SHIFT) |
+(0xfUL << ID_AA64ISAR1_GPI_SHIFT);
+   if (!kvm_arm_vcpu_ptrauth_allowed(vcpu)) {
+   kvm_debug("ptrauth unsupported for guests, suppressing\n");
+   val &= ~ptrauth_mask;
+   }
} else if (id == SYS_ID_AA64MMFR1_EL1) {
if (val & (0xfUL << ID_AA64MMFR1_LOR_SHIFT))
kvm_debug("LORegions unsupported for guests, suppressing\n");
@@ -1086,7 +1095,7 @@ static bool __access_id_reg(struct kvm_vcpu *vcpu,
if (p->is_write)
return write_to_read_only(vcpu, p, r);
 
-   p->regval = read_id_reg(r, raz);
+   p->regval = read_id_reg(vcpu, r, raz);
return true;
 }
 
@@ -1115,17 +1124,17 @@ static u64 sys_reg_to_index(const struct sys_reg_desc *reg);
  * are stored, and for set_id_reg() we don't allow the effective value
  * to be changed.
  */
-static int __get_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
-   bool raz)
+static int __get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
+   void __user *uaddr, bool raz)
 {
const u64 id = sys_reg_to_index(rd);
-   const u64 val = read_id_reg(rd, raz);
+   const u64 val = read_id_reg(vcpu, rd, raz);
 
return reg_to_user(uaddr, &val, id);
 }
 
-static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
-   bool raz)
+static int __set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
+   void __user *uaddr, bool raz)
 {
const u64 id = sys_reg_to_index(rd);
int err;
@@ -1136,7 +1145,7 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
return err;
 
/* This is what we mean by invariant: you can't change it. */
-   if (val != read_id_reg(rd, raz))
+   if (val != read_id_reg(vcpu, rd, raz))
return -EINVAL;
 
return 0;
@@ -1145,25 +1154,25 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
 static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
  const struct kvm_one_reg *reg, void __user *uaddr)
 {
-   return __get_id_reg(rd, uaddr, false);
+   return __get_id_reg(vcpu, rd, uaddr, false);
 }
 
 static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
  const struct kvm_one_reg *reg, void __user *uaddr)
 {
-   return __set_id_reg(rd, uaddr, false);
+   return __set_id_reg(vcpu, rd, uaddr, false);
 }
 
 static int get_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
  const struct kvm_one_reg *reg, void __user *uaddr)
 {
-   return __get_id_reg(rd, uaddr, true);
+   ret
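
The masking added to read_id_reg() can be tried out in isolation. The sketch below is a userspace model, not the kernel code: the model_* names are hypothetical, and the numeric shifts are written out on the assumption that APA, API, GPA and GPI occupy bits [7:4], [11:8], [27:24] and [31:28] of ID_AA64ISAR1_EL1 (the patch itself only uses the symbolic ID_AA64ISAR1_*_SHIFT names).

```c
#include <stdint.h>

/* Assumed field positions in ID_AA64ISAR1_EL1; each field is 4 bits wide. */
#define MODEL_APA_SHIFT  4
#define MODEL_API_SHIFT  8
#define MODEL_GPA_SHIFT  24
#define MODEL_GPI_SHIFT  28

/* Build the mask covering all four pointer authentication fields. */
static uint64_t model_ptrauth_mask(void)
{
	return (UINT64_C(0xf) << MODEL_APA_SHIFT) |
	       (UINT64_C(0xf) << MODEL_API_SHIFT) |
	       (UINT64_C(0xf) << MODEL_GPA_SHIFT) |
	       (UINT64_C(0xf) << MODEL_GPI_SHIFT);
}

/* Value the guest sees: the fields read as zero unless ptrauth is allowed. */
static uint64_t model_read_isar1(uint64_t hw_val, int ptrauth_allowed)
{
	if (!ptrauth_allowed)
		hw_val &= ~model_ptrauth_mask();
	return hw_val;
}
```

Under these assumptions the mask is 0xff000ff0, so a guest without the flag sees the remaining ID fields unchanged and the four ptrauth fields as zero.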

[PATCH net] net: phy: Fix the issue that netif always links up after resuming

2018-12-17 Thread Kunihiko Hayashi

Even if the link is down before entering hibernation, the network
interface always reports link up after resuming from hibernation.

If the link is still down before the network interface is enabled,
then after resuming from hibernation phydev->state is forcibly set
to PHY_UP in mdio_bus_phy_restore(), and the link comes up.

In the suspend sequence, mdio_bus_phy_suspend() calls phy_stop_machine()
only if the PHY is attached, and mdio_bus_phy_resume() calls
phy_start_machine() under the same condition.
When resuming from hibernation, it is enough to do the same as
mdio_bus_phy_resume(), because the state has been preserved.

This patch fixes the issue by calling phy_start_machine() in
mdio_bus_phy_restore() in the same way as mdio_bus_phy_resume().

Suggested-by: Heiner Kallweit 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/net/phy/phy_device.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

This patch is based on the RFC patch discussion [1].
[1] https://www.spinics.net/lists/netdev/msg537326.html

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 7d5d698..3685be4 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -315,11 +315,8 @@ static int mdio_bus_phy_restore(struct device *dev)
if (ret < 0)
return ret;
 
-   /* The PHY needs to renegotiate. */
-   phydev->link = 0;
-   phydev->state = PHY_UP;
-
-   phy_start_machine(phydev);
+   if (phydev->attached_dev && phydev->adjust_link)
+   phy_start_machine(phydev);
 
return 0;
 }
-- 
2.7.4
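
The behavioural change in mdio_bus_phy_restore() can be sketched with a small userspace model. Nothing here is phylib code: struct model_phydev, the model_* functions and the three-state enum are hypothetical simplifications that keep only the fields the hunk touches.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for PHY states; the real enum has more members. */
enum model_phy_state { MODEL_PHY_UP, MODEL_PHY_NOLINK, MODEL_PHY_RUNNING };

struct model_phydev {
	enum model_phy_state state;
	int link;
	void *attached_dev;          /* non-NULL when a netdev is attached */
	void (*adjust_link)(void);   /* non-NULL when a link callback is registered */
	bool machine_running;        /* models phy_start_machine() having been called */
};

static void model_adjust_link(void)
{
}

/* Before the patch: state is overwritten, then the machine always starts. */
static void model_restore_old(struct model_phydev *p)
{
	p->link = 0;
	p->state = MODEL_PHY_UP;
	p->machine_running = true;
}

/* After the patch: state is left alone; the machine starts only when attached. */
static void model_restore_new(struct model_phydev *p)
{
	if (p->attached_dev && p->adjust_link)
		p->machine_running = true;
}
```

In this model a PHY that was in the no-link state before hibernation stays there after the new restore path, and an unattached PHY never has its state machine started.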

[PATCH v8 1/2] dt-bindings: PCI: meson: add DT bindings for Amlogic Meson PCIe controller

2018-12-17 Thread Hanjie Lin

From: Yue Wang 

The Amlogic Meson PCIe host controller is based on the Synopsys DesignWare
PCI core. This patch adds documentation for the DT bindings of the Meson PCIe
controller.

Signed-off-by: Yue Wang 
Signed-off-by: Hanjie Lin 
Reviewed-by: Rob Herring 
---
 .../devicetree/bindings/pci/amlogic,meson-pcie.txt | 70 ++
 1 file changed, 70 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt

diff --git a/Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt b/Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt
new file mode 100644
index 000..12b18f8
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt
@@ -0,0 +1,70 @@
+Amlogic Meson AXG DWC PCIE SoC controller
+
+Amlogic Meson PCIe host controller is based on the Synopsys DesignWare PCI core.
+It shares common functions with the PCIe DesignWare core driver and
+inherits common properties defined in
+Documentation/devicetree/bindings/pci/designware-pci.txt.
+
+Additional properties are described here:
+
+Required properties:
+- compatible:
+   should contain "amlogic,axg-pcie" to identify the core.
+- reg:
+   should contain the configuration address space.
+- reg-names: Must be
+   - "elbi"    External local bus interface registers
+   - "cfg"     Meson specific registers
+   - "phy"     Meson PCIE PHY registers
+   - "config"  PCIe configuration space
+- reset-gpios: The GPIO to generate PCIe PERST# assert and deassert signal.
+- clocks: Must contain an entry for each entry in clock-names.
+- clock-names: Must include the following entries:
+   - "pclk"       PCIe GEN 100M PLL clock
+   - "port"       PCIe_x(A or B) RC clock gate
+   - "general"    PCIe Phy clock
+   - "mipi"       PCIe_x(A or B) 100M ref clock gate
+- resets: phandle to the reset lines.
+- reset-names: must contain "phy" "port" and "apb"
+   - "phy"     Share PHY reset
+   - "port"    Port A or B reset
+   - "apb"     Share APB reset
+- device_type:
+   should be "pci". As specified in designware-pcie.txt
+
+
+Example configuration:
+
+   pcie: pcie@f980 {
+   compatible = "amlogic,axg-pcie", "snps,dw-pcie";
+   reg = <0x0 0xf980 0x0 0x40
+   0x0 0xff646000 0x0 0x2000
+   0x0 0xff644000 0x0 0x2000
+   0x0 0xf9f0 0x0 0x10>;
+   reg-names = "elbi", "cfg", "phy", "config";
+   reset-gpios = <&gpio GPIOX_19 GPIO_ACTIVE_HIGH>;
+   interrupts = ;
+   #interrupt-cells = <1>;
+   interrupt-map-mask = <0 0 0 0>;
+   interrupt-map = <0 0 0 0 &gic GIC_SPI 179 IRQ_TYPE_EDGE_RISING>;
+   bus-range = <0x0 0xff>;
+   #address-cells = <3>;
+   #size-cells = <2>;
+   device_type = "pci";
+   ranges = <0x8200 0 0 0x0 0xf9c0 0 0x0030>;
+
+   clocks = <&clkc CLKID_USB
+   &clkc CLKID_MIPI_ENABLE
+   &clkc CLKID_PCIE_A
+   &clkc CLKID_PCIE_CML_EN0>;
+   clock-names = "general",
+   "mipi",
+   "pclk",
+   "port";
+   resets = <&reset RESET_PCIE_PHY>,
+   <&reset RESET_PCIE_A>,
+   <&reset RESET_PCIE_APB>;
+   reset-names = "phy",
+   "port",
+   "apb";
+   };
-- 
2.7.4

[PATCH v7 0/2] add the Amlogic Meson PCIe controller driver

2018-12-17 Thread Hanjie Lin

The Amlogic Meson PCIe host controller is based on the Synopsys DesignWare
PCI core. This patchset adds the driver and dt-bindings of the controller.

Changes since v7: [6]
 - include files in alphabetical order
 - get rid of unused MACROs and variables
 - optimize meson_pcie_link_up() while loop
 
Changes since v6: [5]
 - fix bad usage of ERR_PTR(ENXIO)
 - fix meson_pcie_rd_own_conf() when read PCI_CLASS_DEVICE reg 

Changes since v5: [4]
 - update MAINTAINER file in alphabetical order
 - remove meaningless comment
 - use ERR_PTR function instead of (void *) cast
 - use is_power_of_2(size) instead of size & (size - 1)
 - add comment for PCI_CLASS_REVISION register operation
 
Changes since v4: [3]
 - fix kbuild test robot and compile warnings

Changes since v3: [2]
 - modify subject format
 - update Kconfig
 - update MAINTAINER file
 - add comment and error handle for meson_pcie_get_mem_shared()
 - drop useless initialization code
 - add comment for meson_size_to_payload()
 - optimize meson_pcie_establish_link() return code
 - optimize meson_pcie_enable_interrupts() redundant function
 - drop device_attch related code
 - drop dw_pcie_ops read_dbi and write_dbi function
 - add error handle for meson_add_pcie_port() when probe

Changes since v2: [1]
 - abandon phy driver, move reset to the controller
 - use devm_add_action_or_reset() to use clock res
 - format correcting

Changes since v1: [0]
 - use gpio lib instead open code
 - move 'apb' and 'port' reset from phy driver
 - format correcting

[0] : https://lkml.kernel.org/r/1534227522-186798-1-git-send-email-hanjie@amlogic.com
[1] : https://lkml.kernel.org/r/1535096165-45827-1-git-send-email-hanjie@amlogic.com
[2] : https://lkml.kernel.org/r/1537509820-52040-1-git-send-email-hanjie@amlogic.com
[3] : https://lkml.kernel.org/r/1538999834-156423-3-git-send-email-hanjie@amlogic.com
[4] : https://lkml.kernel.org/r/1539049990-30810-1-git-send-email-hanjie@amlogic.com
[5] : https://lkml.kernel.org/r/1542876836-191355-1-git-send-email-hanjie@amlogic.com

Yue Wang (2):
  dt-bindings: PCI: meson: add DT bindings for Amlogic Meson PCIe
controller
  PCI: amlogic: Add the Amlogic Meson PCIe controller driver

 .../devicetree/bindings/pci/amlogic,meson-pcie.txt |  70 +++
 MAINTAINERS|   7 +
 drivers/pci/controller/dwc/Kconfig |  10 +
 drivers/pci/controller/dwc/Makefile|   1 +
 drivers/pci/controller/dwc/pci-meson.c | 595 +
 5 files changed, 683 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt
 create mode 100644 drivers/pci/controller/dwc/pci-meson.c

-- 
2.7.4

[PATCH v4 5/6] arm64/kvm: control accessibility of ptrauth key registers

2018-12-17 Thread Amit Daniel Kachhap

The ptrauth key registers are present in the guest system register
list only when userspace sets the KVM_ARM_VCPU_PTRAUTH flag.

Signed-off-by: Amit Daniel Kachhap 
Cc: Mark Rutland 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: kvm...@lists.cs.columbia.edu
---
 Documentation/arm64/pointer-authentication.txt |  3 +-
 arch/arm64/kvm/sys_regs.c  | 42 +++---
 2 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/Documentation/arm64/pointer-authentication.txt b/Documentation/arm64/pointer-authentication.txt
index a65dca2..729055a 100644
--- a/Documentation/arm64/pointer-authentication.txt
+++ b/Documentation/arm64/pointer-authentication.txt
@@ -94,4 +94,5 @@ in KVM guests and attempted use of the feature will result in an UNDEFINED
 exception being injected into the guest.
 
 Additionally, when KVM_ARM_VCPU_PTRAUTH is not set then KVM will mask the
-feature bits from ID_AA64ISAR1_EL1.
+feature bits from ID_AA64ISAR1_EL1 and pointer authentication key registers
+are hidden from userspace.
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index ce6144a..09302b2 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1343,12 +1343,6 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
{ SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 },
 
-   PTRAUTH_KEY(APIA),
-   PTRAUTH_KEY(APIB),
-   PTRAUTH_KEY(APDA),
-   PTRAUTH_KEY(APDB),
-   PTRAUTH_KEY(APGA),
-
{ SYS_DESC(SYS_AFSR0_EL1), access_vm_reg, reset_unknown, AFSR0_EL1 },
{ SYS_DESC(SYS_AFSR1_EL1), access_vm_reg, reset_unknown, AFSR1_EL1 },
{ SYS_DESC(SYS_ESR_EL1), access_vm_reg, reset_unknown, ESR_EL1 },
@@ -1500,6 +1494,14 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_FPEXC32_EL2), NULL, reset_val, FPEXC32_EL2, 0x70 },
 };
 
+static const struct sys_reg_desc ptrauth_reg_descs[] = {
+   PTRAUTH_KEY(APIA),
+   PTRAUTH_KEY(APIB),
+   PTRAUTH_KEY(APDA),
+   PTRAUTH_KEY(APDB),
+   PTRAUTH_KEY(APGA),
+};
+
 static bool trap_dbgidr(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
@@ -2100,6 +2102,8 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
r = find_reg(params, table, num);
if (!r)
r = find_reg(params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
+   if (!r && kvm_arm_vcpu_ptrauth_allowed(vcpu))
+   r = find_reg(params, ptrauth_reg_descs, ARRAY_SIZE(ptrauth_reg_descs));
 
if (likely(r)) {
perform_access(vcpu, params, r);
@@ -2213,6 +2217,8 @@ static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu *vcpu,
r = find_reg_by_id(id, &params, table, num);
if (!r)
r = find_reg(&params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
+   if (!r && kvm_arm_vcpu_ptrauth_allowed(vcpu))
+   r = find_reg(&params, ptrauth_reg_descs, ARRAY_SIZE(ptrauth_reg_descs));
 
/* Not saved in the sys_reg array and not otherwise accessible? */
if (r && !(r->reg || r->get_user))
@@ -2494,18 +2500,22 @@ static int walk_one_sys_reg(const struct sys_reg_desc *rd,
 }
 
 /* Assumed ordered tables, see kvm_sys_reg_table_init. */
-static int walk_sys_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
+static int walk_sys_regs(struct kvm_vcpu *vcpu, u64 __user *uind,
+   const struct sys_reg_desc *desc, unsigned int len)
 {
const struct sys_reg_desc *i1, *i2, *end1, *end2;
unsigned int total = 0;
size_t num;
int err;
 
+   if (desc == ptrauth_reg_descs && !kvm_arm_vcpu_ptrauth_allowed(vcpu))
+   return total;
+
/* We check for duplicates here, to allow arch-specific overrides. */
i1 = get_target_table(vcpu->arch.target, true, &num);
end1 = i1 + num;
-   i2 = sys_reg_descs;
-   end2 = sys_reg_descs + ARRAY_SIZE(sys_reg_descs);
+   i2 = desc;
+   end2 = desc + len;
 
BUG_ON(i1 == end1 || i2 == end2);
 
@@ -2533,7 +2543,10 @@ unsigned long kvm_arm_num_sys_reg_descs(struct kvm_vcpu *vcpu)
 {
return ARRAY_SIZE(invariant_sys_regs)
+ num_demux_regs()
-   + walk_sys_regs(vcpu, (u64 __user *)NULL);
+   + walk_sys_regs(vcpu, (u64 __user *)NULL, sys_reg_descs,
+   ARRAY_SIZE(sys_reg_descs))
+   + walk_sys_regs(vcpu, (u64 __user *)NULL, ptrauth_reg_descs,
+   ARRAY_SIZE(ptrauth_reg_descs));
 }
 
 int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
@@ -2548,7 +2561,12 @@ int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
uindices++;
}
 
-   err = walk_sys_regs(vcpu, uindices);
+
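
The lookup order this patch sets up (main table first, ptrauth table only when the vcpu has the feature) can be modelled outside the kernel. This is a sketch with invented data: the model_* names, the numeric ids and the two-entry main table are hypothetical, and a linear search stands in for find_reg().

```c
#include <stdbool.h>
#include <stddef.h>

struct model_reg_desc {
	unsigned int id;     /* invented identifier, not a real sysreg encoding */
	const char *name;
};

#define MODEL_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const struct model_reg_desc model_sys_regs[] = {
	{ 1, "TTBR1_EL1" },
	{ 2, "TCR_EL1" },
};

static const struct model_reg_desc model_ptrauth_regs[] = {
	{ 10, "APIA" },
	{ 11, "APIB" },
	{ 12, "APDA" },
	{ 13, "APDB" },
	{ 14, "APGA" },
};

/* Linear search; returns NULL when the id is not in the table. */
static const struct model_reg_desc *
model_find(unsigned int id, const struct model_reg_desc *table, size_t num)
{
	for (size_t i = 0; i < num; i++)
		if (table[i].id == id)
			return &table[i];
	return NULL;
}

/* Main table first; the ptrauth table is consulted only when allowed. */
static const struct model_reg_desc *
model_lookup(unsigned int id, bool ptrauth_allowed)
{
	const struct model_reg_desc *r;

	r = model_find(id, model_sys_regs, MODEL_ARRAY_SIZE(model_sys_regs));
	if (!r && ptrauth_allowed)
		r = model_find(id, model_ptrauth_regs,
			       MODEL_ARRAY_SIZE(model_ptrauth_regs));
	return r;
}

/* Number of registers reported to userspace in this model. */
static size_t model_num_regs(bool ptrauth_allowed)
{
	size_t total = MODEL_ARRAY_SIZE(model_sys_regs);

	if (ptrauth_allowed)
		total += MODEL_ARRAY_SIZE(model_ptrauth_regs);
	return total;
}
```

Keeping the key registers in a separate table means that both the per-access lookup and the register count seen by userspace change with a single check on the feature flag.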

[PATCH v4 6/6] arm/kvm: arm64: Add a vcpu feature for pointer authentication

2018-12-17 Thread Amit Daniel Kachhap

This is a runtime feature and can be enabled with the --ptrauth option.

Signed-off-by: Amit Daniel Kachhap 
Cc: Mark Rutland 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: kvm...@lists.cs.columbia.edu
---
 arm/aarch32/include/kvm/kvm-cpu-arch.h| 2 ++
 arm/aarch64/include/asm/kvm.h | 3 +++
 arm/aarch64/include/kvm/kvm-arch.h| 1 +
 arm/aarch64/include/kvm/kvm-config-arch.h | 4 +++-
 arm/aarch64/include/kvm/kvm-cpu-arch.h| 2 ++
 arm/aarch64/kvm-cpu.c | 5 +
 arm/include/arm-common/kvm-config-arch.h  | 1 +
 arm/kvm-cpu.c | 7 +++
 include/linux/kvm.h   | 1 +
 9 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arm/aarch32/include/kvm/kvm-cpu-arch.h b/arm/aarch32/include/kvm/kvm-cpu-arch.h
index d28ea67..5779767 100644
--- a/arm/aarch32/include/kvm/kvm-cpu-arch.h
+++ b/arm/aarch32/include/kvm/kvm-cpu-arch.h
@@ -13,4 +13,6 @@
 #define ARM_CPU_ID 0, 0, 0
 #define ARM_CPU_ID_MPIDR   5
 
+static inline unsigned int kvm__cpu_ptrauth_get_feature(void) { return 0; }
+
 #endif /* KVM__KVM_CPU_ARCH_H */
diff --git a/arm/aarch64/include/asm/kvm.h b/arm/aarch64/include/asm/kvm.h
index c286035..0fd183d 100644
--- a/arm/aarch64/include/asm/kvm.h
+++ b/arm/aarch64/include/asm/kvm.h
@@ -98,6 +98,9 @@ struct kvm_regs {
 #define KVM_ARM_VCPU_PSCI_0_2  2 /* CPU uses PSCI v0.2 */
 #define KVM_ARM_VCPU_PMU_V3    3 /* Support guest PMUv3 */
 
+/* CPU uses address authentication and A key */
+#define KVM_ARM_VCPU_PTRAUTH   4
+
 struct kvm_vcpu_init {
__u32 target;
__u32 features[7];
diff --git a/arm/aarch64/include/kvm/kvm-arch.h b/arm/aarch64/include/kvm/kvm-arch.h
index 9de623a..bd566cb 100644
--- a/arm/aarch64/include/kvm/kvm-arch.h
+++ b/arm/aarch64/include/kvm/kvm-arch.h
@@ -11,4 +11,5 @@
 
 #include "arm-common/kvm-arch.h"
 
+
 #endif /* KVM__KVM_ARCH_H */
diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h b/arm/aarch64/include/kvm/kvm-config-arch.h
index 04be43d..2074684 100644
--- a/arm/aarch64/include/kvm/kvm-config-arch.h
+++ b/arm/aarch64/include/kvm/kvm-config-arch.h
@@ -8,7 +8,9 @@
"Create PMUv3 device"), \
OPT_U64('\0', "kaslr-seed", &(cfg)->kaslr_seed, \
"Specify random seed for Kernel Address Space " \
-   "Layout Randomization (KASLR)"),
+   "Layout Randomization (KASLR)"),\
+   OPT_BOOLEAN('\0', "ptrauth", &(cfg)->has_ptrauth,   \
+   "Enable address authentication"),
 
 #include "arm-common/kvm-config-arch.h"
 
diff --git a/arm/aarch64/include/kvm/kvm-cpu-arch.h b/arm/aarch64/include/kvm/kvm-cpu-arch.h
index a9d8563..f7b64b7 100644
--- a/arm/aarch64/include/kvm/kvm-cpu-arch.h
+++ b/arm/aarch64/include/kvm/kvm-cpu-arch.h
@@ -17,4 +17,6 @@
 #define ARM_CPU_CTRL   3, 0, 1, 0
 #define ARM_CPU_CTRL_SCTLR_EL1 0
 
+unsigned int kvm__cpu_ptrauth_get_feature(void);
+
 #endif /* KVM__KVM_CPU_ARCH_H */
diff --git a/arm/aarch64/kvm-cpu.c b/arm/aarch64/kvm-cpu.c
index 1b29374..10da2cb 100644
--- a/arm/aarch64/kvm-cpu.c
+++ b/arm/aarch64/kvm-cpu.c
@@ -123,6 +123,11 @@ void kvm_cpu__reset_vcpu(struct kvm_cpu *vcpu)
return reset_vcpu_aarch64(vcpu);
 }
 
+unsigned int kvm__cpu_ptrauth_get_feature(void)
+{
+   return (1UL << KVM_ARM_VCPU_PTRAUTH);
+}
+
 int kvm_cpu__get_endianness(struct kvm_cpu *vcpu)
 {
struct kvm_one_reg reg;
diff --git a/arm/include/arm-common/kvm-config-arch.h b/arm/include/arm-common/kvm-config-arch.h
index 6a196f1..eb872db 100644
--- a/arm/include/arm-common/kvm-config-arch.h
+++ b/arm/include/arm-common/kvm-config-arch.h
@@ -10,6 +10,7 @@ struct kvm_config_arch {
bool aarch32_guest;
bool has_pmuv3;
u64 kaslr_seed;
+   bool has_ptrauth;
enum irqchip_type irqchip;
 };
 
diff --git a/arm/kvm-cpu.c b/arm/kvm-cpu.c
index 7780251..5afd727 100644
--- a/arm/kvm-cpu.c
+++ b/arm/kvm-cpu.c
@@ -68,6 +68,13 @@ struct kvm_cpu *kvm_cpu__arch_init(struct kvm *kvm, unsigned long cpu_id)
vcpu_init.features[0] |= (1UL << KVM_ARM_VCPU_PSCI_0_2);
}
 
+   /* Set KVM_ARM_VCPU_PTRAUTH if available */
+   if (kvm__supports_extension(kvm, KVM_CAP_ARM_PTRAUTH)) {
+   if (kvm->cfg.arch.has_ptrauth)
+   vcpu_init.features[0] |=
+   kvm__cpu_ptrauth_get_feature();
+   }
+
/*
 * If the preferred target ioctl is successful then
 * use preferred target else try each and every target type
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index f51d508..ffd8f5c 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -883,6 +883,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_PPC_MMU_RADIX 134
 #define KVM_CAP_PPC_MMU_HASH_V3 135
 #define KVM_CAP_IMMEDIATE_

[PATCH v4 2/6] arm64/kvm: context-switch ptrauth registers

2018-12-17 Thread Amit Daniel Kachhap

When pointer authentication is supported, a guest may wish to use it.
This patch adds the necessary KVM infrastructure for this to work, with
a semi-lazy context switch of the pointer auth state.

The pointer authentication feature is only enabled when VHE is built
into the kernel and present in the CPU implementation, so only the VHE
code paths are modified.

When we schedule a vcpu, we disable guest usage of pointer
authentication instructions and accesses to the keys. While these are
disabled, we avoid context-switching the keys. When we trap the guest
trying to use pointer authentication functionality, we change to eagerly
context-switching the keys, and enable the feature. The next time the
vcpu is scheduled out/in, we start again.
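
The semi-lazy scheme described above can be modelled in plain C. This is
only a userspace sketch, not the code in this patch: the toy_* structures
and functions are hypothetical stand-ins for the vcpu load, trap and world
switch paths, and the "keys" are ordinary array elements rather than
system registers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Ten slots, one per APxxKEY{LO,HI}_EL1 register added to vcpu_sysreg. */
#define TOY_NR_PTRAUTH_KEYS 10

/* Hypothetical model of the keys currently live on a physical CPU. */
struct toy_cpu {
	uint64_t keys[TOY_NR_PTRAUTH_KEYS];
};

/* Hypothetical model of the per-vcpu state. */
struct toy_vcpu {
	uint64_t guest_keys[TOY_NR_PTRAUTH_KEYS];
	uint64_t host_keys[TOY_NR_PTRAUTH_KEYS];
	bool ptrauth_enabled;      /* false: guest ptrauth use would trap */
	unsigned int key_switches; /* counts how often keys were swapped */
};

/* Scheduling the vcpu in: disable guest ptrauth and go back to lazy mode. */
static void toy_vcpu_load(struct toy_vcpu *vcpu)
{
	vcpu->ptrauth_enabled = false;
}

/* First trapped ptrauth use: enable the feature and switch to eager mode. */
static void toy_handle_ptrauth_trap(struct toy_vcpu *vcpu)
{
	vcpu->ptrauth_enabled = true;
}

/* Host -> guest: keys are only swapped once the guest has trapped. */
static void toy_enter_guest(struct toy_cpu *cpu, struct toy_vcpu *vcpu)
{
	if (!vcpu->ptrauth_enabled)
		return; /* lazy: host keys stay in place */
	memcpy(vcpu->host_keys, cpu->keys, sizeof(cpu->keys));
	memcpy(cpu->keys, vcpu->guest_keys, sizeof(cpu->keys));
	vcpu->key_switches++;
}

/* Guest -> host: save the guest keys and restore the host keys. */
static void toy_exit_guest(struct toy_cpu *cpu, struct toy_vcpu *vcpu)
{
	if (!vcpu->ptrauth_enabled)
		return;
	memcpy(vcpu->guest_keys, cpu->keys, sizeof(cpu->keys));
	memcpy(cpu->keys, vcpu->host_keys, sizeof(cpu->keys));
	vcpu->key_switches++;
}
```

In this model a guest that never touches pointer authentication never
pays for a key switch, which is the point of the lazy approach.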

Pointer authentication consists of address authentication and generic
authentication, and CPUs in a system might have varied support for
either. Where support for either feature is not uniform, it is hidden
from guests via ID register emulation, as a result of the cpufeature
framework in the host.

Unfortunately, address authentication and generic authentication cannot
be trapped separately, as the architecture provides a single EL2 trap
covering both. If we wish to expose one without the other, we cannot
prevent a (badly-written) guest from intermittently using a feature
which is not uniformly supported (when scheduled on a physical CPU which
supports the relevant feature). When the guest is scheduled on a
physical CPU lacking the feature, these attempts will result in an UNDEF
being taken by the guest.

Signed-off-by: Mark Rutland 
Signed-off-by: Amit Daniel Kachhap 
Cc: Marc Zyngier 
Cc: Christoffer Dall 
Cc: kvm...@lists.cs.columbia.edu
---
 arch/arm/include/asm/kvm_host.h |  1 +
 arch/arm64/include/asm/cpufeature.h |  6 +++
 arch/arm64/include/asm/kvm_host.h   | 24 
 arch/arm64/include/asm/kvm_hyp.h|  7 
 arch/arm64/kernel/traps.c   |  1 +
 arch/arm64/kvm/handle_exit.c| 24 +++-
 arch/arm64/kvm/hyp/Makefile |  1 +
 arch/arm64/kvm/hyp/ptrauth-sr.c | 73 +
 arch/arm64/kvm/hyp/switch.c |  4 ++
 arch/arm64/kvm/sys_regs.c   | 40 
 virt/kvm/arm/arm.c  |  2 +
 11 files changed, 166 insertions(+), 17 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/ptrauth-sr.c

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 0f012c8..02d9bfc 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -351,6 +351,7 @@ static inline int kvm_arm_have_ssbd(void)
 
 static inline void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu) {}
 static inline void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu) {}
+static inline void kvm_arm_vcpu_ptrauth_config(struct kvm_vcpu *vcpu) {}
 
 #define __KVM_HAVE_ARCH_VM_ALLOC
 struct kvm *kvm_arch_alloc_vm(void);
diff --git a/arch/arm64/include/asm/cpufeature.h 
b/arch/arm64/include/asm/cpufeature.h
index 1c8393f..ac7d496 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -526,6 +526,12 @@ static inline bool system_supports_generic_auth(void)
cpus_have_const_cap(ARM64_HAS_GENERIC_AUTH);
 }
 
+static inline bool system_supports_ptrauth(void)
+{
+   return IS_ENABLED(CONFIG_ARM64_PTR_AUTH) &&
+   cpus_have_const_cap(ARM64_HAS_ADDRESS_AUTH);
+}
+
 #define ARM64_SSBD_UNKNOWN -1
 #define ARM64_SSBD_FORCE_DISABLE   0
 #define ARM64_SSBD_KERNEL  1
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 1b9eed9..629712d 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -146,6 +146,18 @@ enum vcpu_sysreg {
PMSWINC_EL0,/* Software Increment Register */
PMUSERENR_EL0,  /* User Enable Register */
 
+   /* Pointer Authentication Registers */
+   APIAKEYLO_EL1,
+   APIAKEYHI_EL1,
+   APIBKEYLO_EL1,
+   APIBKEYHI_EL1,
+   APDAKEYLO_EL1,
+   APDAKEYHI_EL1,
+   APDBKEYLO_EL1,
+   APDBKEYHI_EL1,
+   APGAKEYLO_EL1,
+   APGAKEYHI_EL1,
+
/* 32bit specific registers. Keep them at the end of the range */
DACR32_EL2, /* Domain Access Control Register */
IFSR32_EL2, /* Instruction Fault Status Register */
@@ -439,6 +451,18 @@ static inline bool kvm_arch_check_sve_has_vhe(void)
return true;
 }
 
+void kvm_arm_vcpu_ptrauth_enable(struct kvm_vcpu *vcpu);
+void kvm_arm_vcpu_ptrauth_disable(struct kvm_vcpu *vcpu);
+
+static inline void kvm_arm_vcpu_ptrauth_config(struct kvm_vcpu *vcpu)
+{
+   /* Disable ptrauth and use it in a lazy context via traps */
+   if (has_vhe() && system_supports_ptrauth())
+   kvm_arm_vcpu_ptrauth_disable(vcpu);
+}
+
+void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);
+
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(st

[PATCH v8 2/2] PCI: amlogic: Add the Amlogic Meson PCIe controller driver

2018-12-17 Thread Hanjie Lin

From: Yue Wang 

The Amlogic Meson PCIe host controller is based on the Synopsys DesignWare
PCI core. This patch adds the driver support for Meson PCIe controller.

Signed-off-by: Yue Wang 
Signed-off-by: Hanjie Lin 
---
 MAINTAINERS|   7 +
 drivers/pci/controller/dwc/Kconfig |  10 +
 drivers/pci/controller/dwc/Makefile|   1 +
 drivers/pci/controller/dwc/pci-meson.c | 595 +
 4 files changed, 613 insertions(+)
 create mode 100644 drivers/pci/controller/dwc/pci-meson.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 7fe120f..21ed916 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11600,6 +11600,13 @@ T: git 
git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/
 S: Supported
 F: drivers/pci/controller/
 
+PCIE DRIVER FOR AMLOGIC MESON
+M: Yue Wang 
+L: linux-...@vger.kernel.org
+L: linux-amlo...@lists.infradead.org
+S: Maintained
+F: drivers/pci/controller/dwc/pci-meson.c
+
 PCIE DRIVER FOR AXIS ARTPEC
 M: Jesper Nilsson 
 L: linux-arm-ker...@axis.com
diff --git a/drivers/pci/controller/dwc/Kconfig 
b/drivers/pci/controller/dwc/Kconfig
index 91b0194..7800322 100644
--- a/drivers/pci/controller/dwc/Kconfig
+++ b/drivers/pci/controller/dwc/Kconfig
@@ -193,4 +193,14 @@ config PCIE_HISI_STB
help
   Say Y here if you want PCIe controller support on HiSilicon STB SoCs
 
+config PCI_MESON
+   bool "MESON PCIe controller"
+   depends on PCI_MSI_IRQ_DOMAIN
+   select PCIE_DW_HOST
+   help
+ Say Y here if you want to enable PCI controller support on Amlogic
+ SoCs. The PCI controller on Amlogic is based on DesignWare hardware
+ and therefore the driver re-uses the DesignWare core functions to
+ implement the driver.
+
 endmenu
diff --git a/drivers/pci/controller/dwc/Makefile 
b/drivers/pci/controller/dwc/Makefile
index fcf91ea..e05a015 100644
--- a/drivers/pci/controller/dwc/Makefile
+++ b/drivers/pci/controller/dwc/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_PCIE_ARMADA_8K) += pcie-armada8k.o
 obj-$(CONFIG_PCIE_ARTPEC6) += pcie-artpec6.o
 obj-$(CONFIG_PCIE_KIRIN) += pcie-kirin.o
 obj-$(CONFIG_PCIE_HISI_STB) += pcie-histb.o
+obj-$(CONFIG_PCI_MESON) += pci-meson.o
 
 # The following drivers are for devices that use the generic ACPI
 # pci_root.c driver but don't support standard ECAM config access.
diff --git a/drivers/pci/controller/dwc/pci-meson.c 
b/drivers/pci/controller/dwc/pci-meson.c
new file mode 100644
index 000..7993f9d
--- /dev/null
+++ b/drivers/pci/controller/dwc/pci-meson.c
@@ -0,0 +1,595 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * PCIe host controller driver for Amlogic MESON SoCs
+ *
+ * Copyright (c) 2018 Amlogic, inc.
+ * Author: Yue Wang 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pcie-designware.h"
+
+#define to_meson_pcie(x) dev_get_drvdata((x)->dev)
+
+/* External local bus interface registers */
+#define PLR_OFFSET                     0x700
+#define PCIE_PORT_LINK_CTRL_OFF        (PLR_OFFSET + 0x10)
+#define FAST_LINK_MODE                 BIT(7)
+#define LINK_CAPABLE_MASK              GENMASK(21, 16)
+#define LINK_CAPABLE_X1                BIT(16)
+
+#define PCIE_GEN2_CTRL_OFF             (PLR_OFFSET + 0x10c)
+#define NUM_OF_LANES_MASK              GENMASK(12, 8)
+#define NUM_OF_LANES_X1                BIT(8)
+#define DIRECT_SPEED_CHANGE            BIT(17)
+
+#define TYPE1_HDR_OFFSET               0x0
+#define PCIE_STATUS_COMMAND            (TYPE1_HDR_OFFSET + 0x04)
+#define PCI_IO_EN                      BIT(0)
+#define PCI_MEM_SPACE_EN               BIT(1)
+#define PCI_BUS_MASTER_EN              BIT(2)
+
+#define PCIE_BASE_ADDR0                (TYPE1_HDR_OFFSET + 0x10)
+#define PCIE_BASE_ADDR1                (TYPE1_HDR_OFFSET + 0x14)
+
+#define PCIE_CAP_OFFSET                0x70
+#define PCIE_DEV_CTRL_DEV_STUS         (PCIE_CAP_OFFSET + 0x08)
+#define PCIE_CAP_MAX_PAYLOAD_MASK      GENMASK(7, 5)
+#define PCIE_CAP_MAX_PAYLOAD_SIZE(x)   ((x) << 5)
+#define PCIE_CAP_MAX_READ_REQ_MASK     GENMASK(14, 12)
+#define PCIE_CAP_MAX_READ_REQ_SIZE(x)  ((x) << 12)
+
+/* PCIe specific config registers */
+#define PCIE_CFG0                      0x0
+#define APP_LTSSM_ENABLE               BIT(7)
+
+#define PCIE_CFG_STATUS12              0x30
+#define IS_SMLH_LINK_UP(x)             ((x) & (1 << 6))
+#define IS_RDLH_LINK_UP(x)             ((x) & (1 << 16))
+#define IS_LTSSM_UP(x)                 ((((x) >> 10) & 0x1f) == 0x11)
+
+#define PCIE_CFG_STATUS17              0x44
+#define PM_CURRENT_STATE(x)            (((x) >> 7) & 0x1)
+
+#define WAIT_LINKUP_TIMEOUT            4000
+#define PORT_CLK_RATE                  1UL
+#define MAX_PAYLOAD_SIZE               256
+#define MAX_READ_REQ_SIZE              256
+#define MESON_PCIE_PHY_POWERUP         0x1c
+#define PCIE_RESET_DELAY               500
+#defin

[PATCH v4 0/6] Add ARMv8.3 pointer authentication for kvm guest

2018-12-17 Thread Amit Daniel Kachhap

Hi,

This patch series adds pointer authentication support for KVM guest and
is based on top of Linux 4.20-rc5 and generic pointer authentication patch
series[1]. The first two patches in this series were originally posted by
Mark Rutland earlier[2,3] and contain some history of this work.

Extension Overview:
=

The ARMv8.3 pointer authentication extension adds functionality to detect
modification of pointer values, mitigating certain classes of attack such as
stack smashing, and making return oriented programming attacks harder.

The extension introduces the concept of a pointer authentication code (PAC),
which is stored in some upper bits of pointers. Each PAC is derived from the
original pointer, another 64-bit value (e.g. the stack pointer), and a secret
128-bit key.

New instructions are added which can be used to:

* Insert a PAC into a pointer
* Strip a PAC from a pointer
* Authenticate and strip a PAC from a pointer
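
To make the three operations above concrete, here is a toy userspace
model in C. It is a sketch under loud assumptions: it assumes 48-bit
virtual addresses with the PAC held in bits 63:48, and it uses an
arbitrary 64-bit mixing function in place of the real PAC algorithm, so
it offers no security and does not reflect the hardware bit layout. All
toy_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48-bit virtual addresses, so bits 63:48 can hold the PAC. */
#define TOY_VA_BITS 48
#define TOY_VA_MASK ((UINT64_C(1) << TOY_VA_BITS) - 1)

/* A 128-bit key, stored as two 64-bit halves. */
struct toy_key {
	uint64_t lo;
	uint64_t hi;
};

/* Arbitrary 64-bit mixer; NOT the cipher used by real hardware. */
static uint64_t toy_mix(uint64_t x)
{
	x ^= x >> 30;
	x *= UINT64_C(0xbf58476d1ce4e5b9);
	x ^= x >> 27;
	x *= UINT64_C(0x94d049bb133111eb);
	x ^= x >> 31;
	return x;
}

/* Derive a PAC from the pointer, a 64-bit modifier and the key. */
static uint64_t toy_pac_compute(uint64_t ptr, uint64_t modifier,
				const struct toy_key *key)
{
	uint64_t h = toy_mix((ptr & TOY_VA_MASK) ^ key->lo);

	h = toy_mix(h ^ modifier ^ key->hi);
	return h & ~TOY_VA_MASK; /* keep only the upper (PAC) bits */
}

/* Insert a PAC into the unused upper bits of a pointer. */
static uint64_t toy_pac_insert(uint64_t ptr, uint64_t modifier,
			       const struct toy_key *key)
{
	return (ptr & TOY_VA_MASK) | toy_pac_compute(ptr, modifier, key);
}

/* Strip a PAC without checking it. */
static uint64_t toy_pac_strip(uint64_t ptr)
{
	return ptr & TOY_VA_MASK;
}

/* Authenticate and strip: succeeds only if the PAC matches. */
static bool toy_pac_auth(uint64_t signed_ptr, uint64_t modifier,
			 const struct toy_key *key, uint64_t *out)
{
	uint64_t ptr = toy_pac_strip(signed_ptr);

	if (toy_pac_insert(ptr, modifier, key) != signed_ptr)
		return false;
	*out = ptr;
	return true;
}
```

A caller would sign a return address with the stack pointer as the
modifier on function entry and authenticate it with the same modifier
before returning; a corrupted PAC then fails the check.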

The detailed description of ARMv8.3 pointer authentication support in
userspace/kernel can be found in Kristina's generic pointer authentication
patch series[1].

KVM guest work:
==

If pointer authentication is enabled for KVM guests, then the new PAC
instructions will not trap to EL2. If not, they may be ignored if in the
HINT region, or trapped to EL2 as illegal instructions. Since each KVM guest
vcpu runs as a thread, it has a key initialised which will be used by PAC.
When a world switch happens between host and guest, this key is exchanged.

There were some review comments by Christoffer Dall in the original
series[2,3,4] and this patch series tries to implement them. The original
series enabled pointer authentication for both userspace and KVM userspace.
However, it is now bifurcated and this series contains only KVM guest support.

Changes since v3 [4]:
* Use pointer authentication only when VHE is present, as ARMv8.3 implies
  that ARMv8.1 features are present.
* Added lazy context handling of ptrauth instructions from V2 version again. 
* Added more details in Documentation.
* Rebased to new version of generic ptrauth patches [1].

Changes since v2 [2,3]:
* Allow host and guest to have different HCR_EL2 settings and not just constant
  value HCR_HOST_VHE_FLAGS or HCR_HOST_NVHE_FLAGS.
* Optimise the reading of HCR_EL2 in host/guest switch by fetching it once
  during KVM initialisation state and using it later.
* Context switch pointer authentication keys when switching between guest
  and host. Pointer authentication was enabled in a lazy context earlier[2] and
  is removed now to keep things simple. However, it can be revisited later if
  there is a significant performance issue.
* Added a userspace option to choose pointer authentication.
* Based on the userspace option, ptrauth cpufeature will be visible.
* Based on the userspace option, ptrauth key registers will be accessible.
* A small document is added on how to enable pointer authentication from
  userspace KVM API.

Looking for feedback and comments.

Thanks,
Amit

[1]: https://lkml.org/lkml/2018/12/7/666
[2]: https://lore.kernel.org/lkml/20171127163806.31435-11-mark.rutl...@arm.com/
[3]: https://lore.kernel.org/lkml/20171127163806.31435-10-mark.rutl...@arm.com/
[4]: https://lkml.org/lkml/2018/10/17/594


Linux (4.20-rc5 based):


Amit Daniel Kachhap (5):
  arm64/kvm: preserve host HCR_EL2 value
  arm64/kvm: context-switch ptrauth registers
  arm64/kvm: add a userspace option to enable pointer authentication
  arm64/kvm: enable pointer authentication cpufeature conditionally
  arm64/kvm: control accessibility of ptrauth key registers

 Documentation/arm64/pointer-authentication.txt | 13 ++--
 Documentation/virtual/kvm/api.txt  |  4 ++
 arch/arm/include/asm/kvm_host.h|  7 ++
 arch/arm64/include/asm/cpufeature.h|  6 ++
 arch/arm64/include/asm/kvm_asm.h   |  2 +
 arch/arm64/include/asm/kvm_host.h  | 41 +++-
 arch/arm64/include/asm/kvm_hyp.h   |  7 ++
 arch/arm64/include/uapi/asm/kvm.h  |  1 +
 arch/arm64/kernel/traps.c  |  1 +
 arch/arm64/kvm/handle_exit.c   | 24 ---
 arch/arm64/kvm/hyp/Makefile|  1 +
 arch/arm64/kvm/hyp/ptrauth-sr.c| 89 +
 arch/arm64/kvm/hyp/switch.c| 19 --
 arch/arm64/kvm/hyp/sysreg-sr.c | 11 
 arch/arm64/kvm/hyp/tlb.c   |  6 +-
 arch/arm64/kvm/reset.c |  3 +
 arch/arm64/kvm/sys_regs.c  | 91 --
 include/uapi/linux/kvm.h   |  1 +
 virt/kvm/arm/arm.c |  4 ++
 19 files changed, 289 insertions(+), 42 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/ptrauth-sr.c

kvmtool:

Repo: git.kernel.org/pub/scm/linux/kernel/git/will/kvmtool.git
Amit Daniel Kachhap (1):

Re: [-next] lots of messages due to "mm, memory_hotplug: be more verbose for memory offline failures"

2018-12-17 Thread Heiko Carstens

On Mon, Dec 17, 2018 at 05:39:49PM +0100, Michal Hocko wrote:
> On Mon 17-12-18 17:03:50, Michal Hocko wrote:
> > On Mon 17-12-18 16:59:22, Heiko Carstens wrote:
> > > Hi Michal,
> > > 
> > > with linux-next as of today on s390 I see tons of messages like
> > > 
> > > [   20.536664] page dumped because: has_unmovable_pages
> > > [   20.536792] page:03d081ff4080 count:1 mapcount:0 
> > > mapping:8ff88600 index:0x0 compound_mapcount: 0
> > > [   20.536794] flags: 0x3fffe010200(slab|head)
> > > [   20.536795] raw: 03fffe010200 0100 0200 
> > > 8ff88600
> > > [   20.536796] raw:  00200041 0001 
> > > 
> > > [   20.536797] page dumped because: has_unmovable_pages
> > > [   20.536814] page:03d0823b count:1 mapcount:0 
> > > mapping: index:0x0
> > > [   20.536815] flags: 0x7fffe00()
> > > [   20.536817] raw: 07fffe00 0100 0200 
> > > 
> > > [   20.536818] raw:   0001 
> > > 
> > > 
> > > bisect points to b323c049a999 ("mm, memory_hotplug: be more verbose for 
> > > memory offline failures")
> > > which is the first commit with which the messages appear.
> > 
> > I would bet this is CMA allocator. How much is tons? Maybe we want a
> > rate limit or the other user is not really interested in them at all?

Yes, the system in question has a 4NB CMA area. "tons" translates to several 
hundred.

> In other words, this should silence those messages.

Yes, with the patch below applied the messages don't appear anymore.

> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
> index 4ae347cbc36d..4eb26d278046 100644
> --- a/include/linux/page-isolation.h
> +++ b/include/linux/page-isolation.h
> @@ -30,8 +30,11 @@ static inline bool is_migrate_isolate(int migratetype)
>  }
>  #endif
> 
> +#define SKIP_HWPOISON0x1
> +#define REPORT_FAILURE   0x2
> +
>  bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
> -  int migratetype, bool skip_hwpoisoned_pages);
> +  int migratetype, int flags);
>  void set_pageblock_migratetype(struct page *page, int migratetype);
>  int move_freepages_block(struct zone *zone, struct page *page,
>   int migratetype, int *num_movable);
> @@ -44,10 +47,14 @@ int move_freepages_block(struct zone *zone, struct page 
> *page,
>   * For isolating all pages in the range finally, the caller have to
>   * free all pages in the range. test_page_isolated() can be used for
>   * test it.
> + *
> + * The following flags are allowed (they can be combined in a bit mask)
> + * SKIP_HWPOISON - ignore hwpoison pages
> + * REPORT_FAILURE - report details about the failure to isolate the range
>   */
>  int
>  start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
> -  unsigned migratetype, bool skip_hwpoisoned_pages);
> +  unsigned migratetype, int flags);
> 
>  /*
>   * Changes MIGRATE_ISOLATE to MIGRATE_MOVABLE.
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index c82193db4be6..8537429d33a6 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1226,7 +1226,7 @@ static bool is_pageblock_removable_nolock(struct page 
> *page)
>   if (!zone_spans_pfn(zone, pfn))
>   return false;
> 
> - return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, true);
> + return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, 
> SKIP_HWPOISON);
>  }
> 
>  /* Checks if this range of memory is likely to be hot-removable. */
> @@ -1577,7 +1577,8 @@ static int __ref __offline_pages(unsigned long 
> start_pfn,
> 
>   /* set above range as isolated */
>   ret = start_isolate_page_range(start_pfn, end_pfn,
> -MIGRATE_MOVABLE, true);
> +MIGRATE_MOVABLE,
> +SKIP_HWPOISON | REPORT_FAILURE);
>   if (ret) {
>   mem_hotplug_done();
>   reason = "failure to isolate range";
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ec2c7916dc2d..ee4043419791 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7754,8 +7754,7 @@ void *__init alloc_large_system_hash(const char 
> *tablename,
>   * race condition. So you can't expect this function should be exact.
>   */
>  bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
> -  int migratetype,
> -  bool skip_hwpoisoned_pages)
> +  int migratetype, int flags)
>  {
>   unsigned long pfn, iter, found;
> 
> @@ -7818,7 +7817,7 @@ bool has_unmovable_pages(struct zone *zone, struct page 
> *page, int count,
>* The HWPoisoned page may be not in buddy system, and
>* pag

Re: [alsa-devel] [PATCH] ASoC: sdm845: set jack only for a specific backend

2018-12-17 Thread Cheng-yi Chiang

On Fri, Dec 14, 2018 at 6:02 PM Rohit kumar  wrote:
>
> Headset codec is connected over PRIMARY_MI2S interface. Call
> set_jack for codec associated with Primary Mi2s interface.
> Also, set_jack to NULL when jack is freed.
>
> Signed-off-by: Rohit kumar 
> ---
>  sound/soc/qcom/sdm845.c | 31 ++-
>  1 file changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/sound/soc/qcom/sdm845.c b/sound/soc/qcom/sdm845.c
> index 1db8ef66..6f66a58 100644
> --- a/sound/soc/qcom/sdm845.c
> +++ b/sound/soc/qcom/sdm845.c
> @@ -158,17 +158,24 @@ static int sdm845_snd_hw_params(struct 
> snd_pcm_substream *substream,
> return ret;
>  }
>
> +static void sdm845_jack_free(struct snd_jack *jack)
> +{
> +   struct snd_soc_component *component = jack->private_data;
> +
> +   snd_soc_component_set_jack(component, NULL, NULL);
> +}
> +
>  static int sdm845_dai_init(struct snd_soc_pcm_runtime *rtd)
>  {
> struct snd_soc_component *component;
> -   struct snd_soc_dai_link *dai_link = rtd->dai_link;
> struct snd_soc_card *card = rtd->card;
> +   struct snd_soc_dai *codec_dai = rtd->codec_dai;
> +   struct snd_soc_dai *cpu_dai = rtd->cpu_dai;
> struct sdm845_snd_data *pdata = snd_soc_card_get_drvdata(card);
> -   int i, rval;
> +   struct snd_jack *jack;
> +   int rval;
>
> if (!pdata->jack_setup) {
> -   struct snd_jack *jack;
> -
> rval = snd_soc_card_jack_new(card, "Headset Jack",
> SND_JACK_HEADSET |
> SND_JACK_HEADPHONE |
> @@ -190,16 +197,22 @@ static int sdm845_dai_init(struct snd_soc_pcm_runtime 
> *rtd)
> pdata->jack_setup = true;
> }
>
> -   for (i = 0 ; i < dai_link->num_codecs; i++) {
> -   struct snd_soc_dai *dai = rtd->codec_dais[i];
> +   switch (cpu_dai->id) {
> +   case PRIMARY_MI2S_RX:
> +   jack  = pdata->jack.jack;
> +   component = codec_dai->component;
>
> -   component = dai->component;
> -   rval = snd_soc_component_set_jack(
> -   component, &pdata->jack, NULL);
> +   jack->private_data = component;
> +   jack->private_free = sdm845_jack_free;
> +   rval = snd_soc_component_set_jack(component,
> + &pdata->jack, NULL);
> if (rval != 0 && rval != -ENOTSUPP) {
> dev_warn(card->dev, "Failed to set jack: %d\n", rval);
> return rval;
> }
> +   break;
> +   default:
> +   break;
> }
>
> return 0;
> --
> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.,
> is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
>
> ___
> Alsa-devel mailing list
> alsa-de...@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Thanks a lot for the fix!

Reviewed-by: Cheng-Yi Chiang 
Tested-by: Cheng-Yi Chiang

Re: [PATCH v1 00/12] Add some functionalities for Tegra soctherm

2018-12-17 Thread Wei Ni

Sorry, please ignore this cover-letter.

Wei.

On 18/12/2018 3:34 PM, Wei Ni wrote:
> Move the hw/sw shutdown patches into this series. There has already been
> some discussion of it in https://lkml.org/lkml/2018/12/7/225.
> Support GPU HW throttle, thermal IRQ, set_trips(), EDP IRQ and OC
> hw throttle.
> 
> Wei Ni (12):
>   of: Add bindings of thermtrip for Tegra soctherm
>   thermal: tegra: support hw and sw shutdown
>   arm64: dts: tegra210: set thermtrip
>   of: Add bindings of gpu hw throttle for Tegra soctherm
>   thermal: tegra: add support for gpu hw-throttle
>   arm64: dts: tegra210: set gpu hw throttle level
>   thermal: tegra: add support for thermal IRQ
>   thermal: tegra: add set_trips functionality
>   thermal: tegra: add support for EDP IRQ
>   arm64: dts: tegra210: set EDP interrupt line
>   of: Add bindings of OC hw throttle for Tegra soctherm
>   thermal: tegra: enable OC hw throttle
> 
>  .../bindings/thermal/nvidia,tegra124-soctherm.txt  |  63 +-
>  arch/arm64/boot/dts/nvidia/tegra210.dtsi   |  20 +-
>  drivers/thermal/tegra/soctherm.c   | 955 +++--
>  drivers/thermal/tegra/soctherm.h   |  16 +
>  drivers/thermal/tegra/tegra124-soctherm.c  |   7 +-
>  drivers/thermal/tegra/tegra132-soctherm.c  |   7 +-
>  drivers/thermal/tegra/tegra210-soctherm.c  |  15 +-
>  include/dt-bindings/thermal/tegra124-soctherm.h|  22 +-
>  8 files changed, 1029 insertions(+), 76 deletions(-)
>

Re: [PATCH V2 00/10] unify the interface of the proportional-share policy in blkio/io

2018-12-17 Thread Paolo Valente

[RESENDING BECAUSE BOUNCED]

> On 10 Dec 2018, at 14:45, Angelo Ruocco wrote:
> 
> 2018-11-30 19:53 GMT+01:00, Paolo Valente :
>> 
>> 
>>> On 30 Nov 2018, at 19:42, Tejun Heo wrote:
>>> 
>>> Hello, Paolo.
>>> 
>>> On Fri, Nov 30, 2018 at 07:23:24PM +0100, Paolo Valente wrote:
>>>>> Then we understood that exactly the same happens with throttling, in
>>>>> case the latter is activated on different devices w.r.t. bfq.
>>>>>
>>>>> In addition, the same may happen, in the near future, with the
>>>>> bandwidth controller Josef is working on.  If the controller can be
>>>>> configured per device, as with throttling, then statistics may differ,
>>>>> for the same interface files, between bfq, throttling and that
>>>>> controller.
>>> 
>>> So, regardless of how all these are implemented, what's presented to
>>> user should be consistent and clear.  There's no other way around it.
>>> Only what's relevant should be visible to userspace.
>>> 
>>>> have you had time to look into this?  Any improvement to this
>>>> interface is ok for us. We are only interested in finally solving this
>>>> interface issue, as, for what concerns us directly, it has been
>>>> preventing legacy code to use bfq for years.
>>> 
>>> Unfortunately, I don't have any implementation proposal, but we can't
>>> show things this way to userspace.
>>> 
>> 
>> Well, this is not very helpful to move forward :)
>> 
>> Let me try to repeat the problem, to try to help you help us unblock
>> the situation.
>> 
>> If we have multiple entities attached to the same interface output
>> file, you don't find it clear that each entity shows the number it
>> wants to show.  But you have no idea either of how that differentiated
>> information should be shown.  Is this the situation, or is the problem
>> somewhere 'above' this level?
>> 
>> If the problem is as I described it, here are some proposal attempts:
>> 1) Do you want file sharing to be allowed only if all entities will
>> output the same number?  (this seems excessive, but maybe it makes
>> sense)
>> 2) Do you want only one number to be shown, equal to the sum of the
>> numbers of each entity?  (in some cases, this may make sense)
>> 3) Do you prefer an average?
>> 4) Do you have any other idea, even if just germinal?
> 
> To further add to what Paolo said and better expose the problem, I'd like to
> say that all those proposals have issues.
> If we only allow "same output" cftypes to be shared then we lose all the
> flexibility of this solution, and we need a way for an entity to know other
> entities internal variables beforehand, which sounds at least very hard, and
> maybe is not even an acceptable thing to do.
> To put the average, sum or some other mathematical function in the file only
> makes sense for certain cftypes, so also doesn't sound like a good idea. In
> fact I can think of scenarios where only seeing the different values of the
> entities makes sense for a user.
> 
> I understand that the problem is inconsistency: having a file that behaves
> differently depending on the situation, and the only way to prevent this I can
> think of is to *always* show the entity owner of a certain file (or part of
> the output), even when the output would be the same among entities or when the
> file is not currently shared but could be. Can this be an acceptable solution?
> 
> Angelo
> 

Hi Jens, all,
let me push for this interface to be fixed too.  If we don't fix it in
some way, then from 4.21 we will end up with a ridiculous paradox: the
proportional share policy (weights) will of course be available, but
unusable in practice.  In fact, as Lennart--and not only Lennart--can
confirm, no piece of code uses bfq.weight to set weights, or will do
it.

A trivial solution would be to throw away all our work to fix this
issue by extending the interface, and just let bfq use the former cfq
names.  But then the same mess will happen as, e.g., Josef will
propose his proportional-share controller.

Before implementing this solution, we proposed it and waited for it to be
approved several months ago, so I hope that Tejun's concern can be
addressed somehow.

If Tejun cannot see any solution to his concern, then can we just
switch to this extension, considering that
- for non-shared names the interface is *identical* to the current
  one;
- by using this new interface, and getting feedback we could
  understand how to better handle Tejun's concern?

A lot of systems do use weights, and people don't even know that these
systems don't work correctly in blk-mq.  And they won't work correctly
in any available configuration from 4.21, if we don't fix this problem.

Thanks.
Paolo

>> 
>> Looking forward to your feedback,
>> Paolo
>> 
>> 
>>> Thanks.
>>> 
>>> --
>>> tejun
>>> 
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "bfq-iosched" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an
>>> email to bfq-io

Re: [PATCH v11 6/7] ACPI/IORT: Stub out ACS functions when CONFIG_PCI is not set

2018-12-17 Thread Hanjun Guo

On 2018/12/18 10:56, Sinan Kaya wrote:
> Remove PCI dependent code out of iort.c when CONFIG_PCI is not defined.
> A quick search reveals the following functions:
> 1. pci_request_acs()
> 2. pci_domain_nr()
> 3. pci_is_root_bus()
> 4. to_pci_dev()
> 
> Both pci_domain_nr() and pci_is_root_bus() are defined in linux/pci.h.
> pci_domain_nr() is a stub function when CONFIG_PCI is not set and
> pci_is_root_bus() just returns a reference to a structure member which
> is still valid without CONFIG_PCI set.
> 
> to_pci_dev() is a macro that expands to container_of.
> 
> pci_request_acs() is the only code that gets pulled in from drivers/pci/*.c

Actually we have pci_for_each_dma_alias() too, which comes from
drivers/pci/search.c and has no stub function in linux/pci.h, but I
didn't get a compile error at link time. I think the compiler
just optimizes away the dead code because dev_is_pci() is obviously
false.
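
A minimal sketch of that effect, outside the kernel: the toy_* names
below are hypothetical, and the macro only assumes that dev_is_pci()
collapses to a compile-time constant when CONFIG_PCI is not set. The
called function is declared but never defined, yet the program is
expected to link with the usual compilers because the call sits in a
branch the compiler can discard. This relies on compiler behaviour, not
on a guarantee of the C standard.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct device. */
struct toy_device {
	int id;
};

/*
 * Declared but deliberately never defined, like a function whose
 * source file is not built in the current configuration.
 */
int toy_pci_for_each_dma_alias(struct toy_device *dev);

/* Assumption: in a build without PCI the check is a constant false. */
#define toy_dev_is_pci(dev) (false)

static int toy_count_aliases(struct toy_device *dev)
{
	/* Constant-false condition: the call below is dead code. */
	if (toy_dev_is_pci(dev))
		return toy_pci_for_each_dma_alias(dev);

	return 0;
}
```

If the check were an out-of-line function instead of a constant, the
reference to the undefined symbol would survive and the link would fail.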

Thanks
Hanjun

Re: [PATCH v4 2/2] trace nvme submit queue status

2018-12-17 Thread peng yu

On Mon, Dec 17, 2018 at 11:26 PM Sagi Grimberg  wrote:
>
>
> > @@ -899,6 +900,10 @@ static inline void nvme_handle_cqe(struct nvme_queue 
> > *nvmeq, u16 idx)
> >   }
> >
> >   req = blk_mq_tag_to_rq(*nvmeq->tags, cqe->command_id);
> > + trace_nvme_sq(req->rq_disk,
> > + nvmeq->qid,
> > + le16_to_cpu(cqe->sq_head),
> > + nvmeq->sq_tail);
>
> Why the newline escapes? why not escape at the 80 char border?
>

Sorry, I don't quite understand what you mean. Do you mean I should
change this:
trace_nvme_sq(req->rq_disk,
nvmeq->qid,
le16_to_cpu(cqe->sq_head),
nvmeq->sq_tail);
to something like below:
trace_nvme_sq(req->rq_disk, nvmeq->qid, le16_to_cpu(cqe->sq_head),
nvmeq->sq_tail);
Please let me know whether my understanding is correct.

RE: [PATCH 02/18] mfd: adp5520: Make it explicitly non-modular

2018-12-17 Thread Hennerich, Michael




> -Original Message-
> From: Paul Gortmaker [mailto:paul.gortma...@windriver.com]
> Sent: Montag, 17. Dezember 2018 21:31
> To: Lee Jones 
> Cc: linux-kernel@vger.kernel.org; Paul Gortmaker 
> ; Hennerich, Michael
> 
> Subject: [PATCH 02/18] mfd: adp5520: Make it explicitly non-modular
> 
> The Makefile/Kconfig currently controlling compilation of this code is:
> 
> drivers/mfd/Makefile:obj-$(CONFIG_PMIC_ADP5520) += adp5520.o
> drivers/mfd/Kconfig:config PMIC_ADP5520
> drivers/mfd/Kconfig:bool "Analog Devices ADP5520/01 MFD PMIC Core Support"
> 
> ...meaning that it currently is not being built as a module by anyone.
> 
> Let's remove the modular code that is essentially orphaned, so that
> when reading the driver there is no doubt it is builtin-only.
> 
> We explicitly disallow a driver unbind, since that doesn't have a
> sensible use case anyway, and it allows us to drop the ".remove"
> code for non-modular drivers.
> 
> Since module_i2c_driver() uses the same init level priority as
> builtin_i2c_driver() the init ordering remains unchanged with
> this commit.
> 
> Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code.
> 
> We also delete the MODULE_LICENSE tag etc. since all that information
> was (or is now) contained at the top of the file in the comments.
> 
> Cc: Michael Hennerich 
> Cc: Lee Jones 
> Signed-off-by: Paul Gortmaker 
> Acked-by: Linus Walleij 

Acked-by: Michael Hennerich 

> ---
>  drivers/mfd/adp5520.c | 30 +++---
>  1 file changed, 7 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/mfd/adp5520.c b/drivers/mfd/adp5520.c
> index be0497b96720..2cdd39cb8a18 100644
> --- a/drivers/mfd/adp5520.c
> +++ b/drivers/mfd/adp5520.c
> @@ -7,6 +7,8 @@
>   *
>   * Copyright 2009 Analog Devices Inc.
>   *
> + * Author: Michael Hennerich 
> + *
>   * Derived from da903x:
>   * Copyright (C) 2008 Compulab, Ltd.
>   *   Mike Rapoport 
> @@ -18,7 +20,7 @@
>   */
> 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -304,18 +306,6 @@ static int adp5520_probe(struct i2c_client *client,
>   return ret;
>  }
> 
> -static int adp5520_remove(struct i2c_client *client)
> -{
> - struct adp5520_chip *chip = dev_get_drvdata(&client->dev);
> -
> - if (chip->irq)
> - free_irq(chip->irq, chip);
> -
> - adp5520_remove_subdevs(chip);
> - adp5520_write(chip->dev, ADP5520_MODE_STATUS, 0);
> - return 0;
> -}
> -
>  #ifdef CONFIG_PM_SLEEP
>  static int adp5520_suspend(struct device *dev)
>  {
> @@ -346,20 +336,14 @@ static const struct i2c_device_id adp5520_id[] = {
>   { "pmic-adp5501", ID_ADP5501 },
>   { }
>  };
> -MODULE_DEVICE_TABLE(i2c, adp5520_id);
> 
>  static struct i2c_driver adp5520_driver = {
>   .driver = {
> - .name   = "adp5520",
> - .pm = &adp5520_pm,
> + .name   = "adp5520",
> + .pm = &adp5520_pm,
> + .suppress_bind_attrs= true,
>   },
>   .probe  = adp5520_probe,
> - .remove = adp5520_remove,
>   .id_table   = adp5520_id,
>  };
> -
> -module_i2c_driver(adp5520_driver);
> -
> -MODULE_AUTHOR("Michael Hennerich ");
> -MODULE_DESCRIPTION("ADP5520(01) PMIC-MFD Driver");
> -MODULE_LICENSE("GPL");
> +builtin_i2c_driver(adp5520_driver);
> --
> 2.7.4

linux-next: Tree for Dec 18

2018-12-17 Thread Stephen Rothwell

Hi all,

Changes since 20181217:

The nfs-anna tree lost its build failure.

The vfs tree gained a conflict against the fscrypt tree.

The hwmon-staging tree lost its build failure.

The rdma tree still had its build failure so I used a supplied patch.

The net-next tree lost its build failure.

The mfd tree lost its build failure.

The selinux tree gained a conflict against the vfs tree.

The gpio tree lost its build failure.

The y2038 tree lost its build failure.

Non-merge commits (relative to Linus' tree): 9464
 9665 files changed, 463878 insertions(+), 252256 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 291 trees (counting Linus' and 69 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (7566ec393f41 Linux 4.20-rc7)
Merging fixes/master (d8c137546ef8 powerpc: tag implicit fall throughs)
Merging kbuild-current/fixes (ccda4af0f4b9 Linux 4.20-rc2)
Merging arc-current/for-curr (4c567a448b30 ARC: perf: remove useless ifdefs)
Merging arm-current/fixes (c2a3831df6dc ARM: 8816/1: dma-mapping: fix potential 
uninitialized return)
Merging arm64-fixes/for-next/fixes (3238c359acee arm64: dma-mapping: Fix 
FORCE_CONTIGUOUS buffer clearing)
Merging m68k-current/for-linus (58c116fb7dc6 m68k/sun3: Remove is_medusa and 
m68k_pgtable_cachemode)
Merging powerpc-fixes/fixes (a225f1567405 powerpc/ptrace: replace 
ptrace_report_syscall() with a tracehook call)
Merging sparc/master (cf76c364a1e1 Merge tag 'scsi-fixes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (369a094d500f Merge branch 'hns-fixes')
Merging bpf/master (8203e2d844d3 net: clear skb->tstamp in forwarding paths)
Merging ipsec/master (4a135e538962 xfrm_user: fix freeing of xfrm states on 
acquire)
Merging netfilter/master (9e69efd45321 Merge branch 'vhost-fixes')
Merging ipvs/master (feb9f55c33e5 netfilter: nft_dynset: allow dynamic updates 
of non-anonymous set)
Merging wireless-drivers/master (eca1e56ceedd iwlwifi: mvm: don't send 
GEO_TX_POWER_LIMIT to old firmwares)
Merging mac80211/master (312ca38ddda6 cfg80211: Fix busy loop regression in 
ieee80211_ie_split_ric())
Merging rdma-fixes/for-rc (37fbd834b4e4 IB/core: Fix oops in 
netdev_next_upper_dev_rcu())
Merging sound-current/for-linus (0bea4cc83835 ALSA: hda/realtek: Enable audio 
jacks of ASUS UX433FN/UX333FA with ALC294)
Merging sound-asoc-fixes/for-linus (9683863aecfe Merge branch 'asoc-4.20' into 
asoc-linus)
Merging regmap-fixes/for-linus (40e020c129cf Linux 4.20-rc6)
Merging regulator-fixes/for-linus (1a365a5c7959 Merge branch 'regulator-4.20' 
into regulator-linus)
Merging spi-fixes/for-linus (6594150c7db6 Merge branch 'spi-4.20' into 
spi-linus)
Merging pci-current/for-linus (1063a5148ac9 PCI/AER: Queue one GHES event, not 
several uninitialized ones)
Merging driver-core.current/driver-core-linus (2595646791c3 Linux 4.20-rc5)
Merging tty.current/tty-linus (3c9dc275dba1 Revert "serial: 8250: Fix clearing 
FIFOs in RS485 mode again")
Merging usb.current/usb-linus (2419f30a4a4f USB: xhci: fix 'broken_suspend' 
placement in struct xchi_hcd)
Merging usb-gadget-fixes/fixes (069caf5

Re: [PATCH v5 2/2] media: usb: pwc: Don't use coherent DMA buffers for ISO transfer

2018-12-17 Thread Christoph Hellwig

On Tue, Dec 18, 2018 at 04:22:43PM +0900, Tomasz Figa wrote:
> It kind of limits the usability of this API, since it enforces
> contiguous allocations even for big sizes even for devices behind
> IOMMU (contrary to the case when DMA_ATTR_NON_CONSISTENT is not set),
> but given that it's just a temporary solution for devices like these
> USB cameras, I guess that's fine.

The problem is that you can't have flexibility and simplicity at the
same time.  Once you use kernel virtual address remapping you need to
be prepared to have multiple segments.

So, as I said, you can call dma_alloc_attrs with DMA_ATTR_NON_CONSISTENT
in a loop with a suitably small chunk size, then stuff the results into
a scatterlist and map that again for the device you want to share it
with, if you don't want a single contiguous region.  You just have to
either deal with non-contiguous access from the kernel or use vmap and
the right vmap cache flushing helpers.
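
The chunked approach described above can be sketched in user space. This
is only an analogy under stated assumptions: malloc() stands in for
dma_alloc_attrs(), a plain array of a hypothetical struct segment stands
in for the scatterlist, and the helper names are made up for this
illustration. No mapping or cache maintenance is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct segment {
	void *addr;
	size_t len;
};

/*
 * Allocate `total` bytes as a series of chunks of at most `chunk` bytes.
 * Returns the number of segments filled in, or -1 on failure, in which
 * case every chunk allocated so far has been freed again.
 */
int alloc_chunked(size_t total, size_t chunk,
		  struct segment *segs, size_t max_segs)
{
	size_t n = 0, done = 0;

	if (chunk == 0)
		return -1;

	while (done < total) {
		size_t len = total - done < chunk ? total - done : chunk;

		if (n == max_segs)
			goto fail;
		segs[n].addr = malloc(len); /* stand-in for the real allocator */
		if (!segs[n].addr)
			goto fail;
		segs[n].len = len;
		n++;
		done += len;
	}
	return (int)n;

fail:
	while (n > 0)
		free(segs[--n].addr);
	return -1;
}

/* Sum of all segment lengths. */
size_t total_len(const struct segment *segs, int n)
{
	size_t sum = 0;
	int i;

	for (i = 0; i < n; i++)
		sum += segs[i].len;
	return sum;
}

/* Release every chunk of a segment list. */
void free_chunked(struct segment *segs, int n)
{
	int i;

	for (i = 0; i < n; i++)
		free(segs[i].addr);
}
```

The caller then has to walk the segment list instead of assuming one
linear buffer, which is the "multiple segments" cost mentioned above.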

> Note that in V4L2 we use the DMA API extensively, so that we don't
> need to embed any device-specific or integration-specific knowledge in
> the framework. Right now we're using dma_alloc_attrs() with
> driver-provided attrs [1], but current drivers never request
> non-consistent memory. We're however thinking about making it possible
> to allocate non-consistent memory. What would you suggest for this?
> 
> [1] 
> https://elixir.bootlin.com/linux/v4.20-rc7/source/drivers/media/common/videobuf2/videobuf2-dma-contig.c#L139

I would advise against new non-consistent users until this series
goes through, mostly because dma_cache_sync is such an amazingly bad
API.  Otherwise things will just work on the allocation side; you'll
just need to transfer ownership between the cpu and the device(s)
carefully using the dma_sync_* APIs.

Re: [PATCH v2] mm, page_alloc: Fix has_unmovable_pages for HugePages

2018-12-17 Thread Michal Hocko

On Mon 17-12-18 15:07:26, Andrew Morton wrote:
> On Mon, 17 Dec 2018 23:51:13 +0100 Oscar Salvador  wrote:
> 
> > v1 -> v2:
> > - Fix the logic for skipping pages by Michal
> > 
> > ---
> 
> Please be careful with the "^---$".  It signifies end-of-changelog, so
> I ended up without a changelog!
> 
> > From e346b151037d3c37feb10a981a4d2a25018acf81 Mon Sep 17 00:00:00 2001
> > From: Oscar Salvador 
> > Date: Mon, 17 Dec 2018 14:53:35 +0100
> > Subject: [PATCH] mm, page_alloc: Fix has_unmovable_pages for HugePages
> > 
> > While playing with gigantic hugepages and memory_hotplug, I triggered
> > the following #PF when "cat memoryX/removable":
> > 
> > ...
> >
> > Also, since gigantic pages span several pageblocks, re-adjust the logic
> > for skipping pages.
> > 
> > Signed-off-by: Oscar Salvador 

Acked-by: Michal Hocko 

> cc:stable?

See http://lkml.kernel.org/r/20181217152936.gr30...@dhcp22.suse.cz. I
believe that simply nobody is using gigantic pages and hotplug at the
same time, and those pages do not seem to cross cma regions either. At
least not since hugepage_migration_supported stopped reporting giga
pages as migratable.

That being said, I do not think we really need it in stable but it
should be relatively easy to backport so no objection from me to put it
there.

-- 
Michal Hocko
SUSE Labs

[PATCH v1 07/12] thermal: tegra: add support for thermal IRQ

2018-12-17 Thread Wei Ni

Add support for generating an interrupt when the temperature
crosses a programmed threshold and for notifying the thermal framework.

Signed-off-by: Wei Ni 
---
 drivers/thermal/tegra/soctherm.c | 136 +++
 1 file changed, 136 insertions(+)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index d3cef88a3f22..c66fdd546ef0 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -86,6 +86,20 @@
 #define THERMCTL_LVL0_UP_STATS 0x10
 #define THERMCTL_LVL0_DN_STATS 0x14
 
+#define THERMCTL_INTR_STATUS   0x84
+#define THERMCTL_INTR_ENABLE   0x88
+#define THERMCTL_INTR_DISABLE  0x8c
+
+#define TH_INTR_MD0_MASK   BIT(25)
+#define TH_INTR_MU0_MASK   BIT(24)
+#define TH_INTR_GD0_MASK   BIT(17)
+#define TH_INTR_GU0_MASK   BIT(16)
+#define TH_INTR_CD0_MASK   BIT(9)
+#define TH_INTR_CU0_MASK   BIT(8)
+#define TH_INTR_PD0_MASK   BIT(1)
+#define TH_INTR_PU0_MASK   BIT(0)
+#define TH_INTR_IGNORE_MASK0xFCFCFCFC
+
 #define THERMCTL_STATS_CTL 0x94
 #define STATS_CTL_CLR_DN   0x8
 #define STATS_CTL_EN_DN0x4
@@ -242,6 +256,8 @@ struct tegra_soctherm {
void __iomem *clk_regs;
void __iomem *ccroc_regs;
 
+   int thermal_irq;
+
u32 *calib;
struct thermal_zone_device **thermctl_tzs;
struct tegra_soctherm_soc *soc;
@@ -640,6 +656,98 @@ static int tegra_soctherm_set_hwtrips(struct device *dev,
return 0;
 }
 
+static irqreturn_t soctherm_thermal_isr(int irq, void *dev_id)
+{
+   struct tegra_soctherm *ts = dev_id;
+   u32 r;
+
+   r = readl(ts->regs + THERMCTL_INTR_STATUS);
+   writel(r, ts->regs + THERMCTL_INTR_DISABLE);
+
+   return IRQ_WAKE_THREAD;
+}
+
+/**
+ * soctherm_thermal_isr_thread() - Handles a thermal interrupt request
+ * @irq:   The interrupt number being requested; not used
+ * @dev_id:Opaque pointer to tegra_soctherm;
+ *
+ * Clears the interrupt status register if there are expected
+ * interrupt bits set.
+ * The interrupt(s) are then handled by updating the corresponding
+ * thermal zones.
+ *
+ * An error is logged if any unexpected interrupt bits are set.
+ *
+ * Disabled interrupts are re-enabled.
+ *
+ * Return: %IRQ_HANDLED. Interrupt was handled and no further processing
+ * is needed.
+ */
+static irqreturn_t soctherm_thermal_isr_thread(int irq, void *dev_id)
+{
+   struct tegra_soctherm *ts = dev_id;
+   struct thermal_zone_device *tz;
+   u32 st, ex = 0, cp = 0, gp = 0, pl = 0, me = 0;
+
+   st = readl(ts->regs + THERMCTL_INTR_STATUS);
+
+   /* deliberately clear expected interrupts handled in SW */
+   cp |= st & TH_INTR_CD0_MASK;
+   cp |= st & TH_INTR_CU0_MASK;
+
+   gp |= st & TH_INTR_GD0_MASK;
+   gp |= st & TH_INTR_GU0_MASK;
+
+   pl |= st & TH_INTR_PD0_MASK;
+   pl |= st & TH_INTR_PU0_MASK;
+
+   me |= st & TH_INTR_MD0_MASK;
+   me |= st & TH_INTR_MU0_MASK;
+
+   ex |= cp | gp | pl | me;
+   if (ex) {
+   writel(ex, ts->regs + THERMCTL_INTR_STATUS);
+   st &= ~ex;
+
+   if (cp) {
+   tz = ts->thermctl_tzs[TEGRA124_SOCTHERM_SENSOR_CPU];
+   thermal_zone_device_update(tz,
+  THERMAL_EVENT_UNSPECIFIED);
+   }
+
+   if (gp) {
+   tz = ts->thermctl_tzs[TEGRA124_SOCTHERM_SENSOR_GPU];
+   thermal_zone_device_update(tz,
+  THERMAL_EVENT_UNSPECIFIED);
+   }
+
+   if (pl) {
+   tz = ts->thermctl_tzs[TEGRA124_SOCTHERM_SENSOR_PLLX];
+   thermal_zone_device_update(tz,
+  THERMAL_EVENT_UNSPECIFIED);
+   }
+
+   if (me) {
+   tz = ts->thermctl_tzs[TEGRA124_SOCTHERM_SENSOR_MEM];
+   thermal_zone_device_update(tz,
+  THERMAL_EVENT_UNSPECIFIED);
+   }
+   }
+
+   /* deliberately ignore expected interrupts NOT handled in SW */
+   ex |= TH_INTR_IGNORE_MASK;
+   st &= ~ex;
+
+   if (st) {
+   /* Whine about any other unexpected INTR bits still set */
+   pr_err("soctherm: Ignored unexpected INTRs 0x%08x\n", st);
+   writel(st, ts->regs + THERMCTL_INTR_STATUS);
+   }
+
+   return IRQ_HANDLED;
+}
+
 #ifdef CONFIG_DEBUG_FS
 static int regs_show(struct seq_file *s, void *data)
 {
@@ -1312,6 +1420,32 @@ static void tegra_soctherm_throttle(struct devic
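
The bit bookkeeping in soctherm_thermal_isr_thread() above can be
modelled in user space. The sketch below copies the mask values from the
patch; struct intr_split and split_status() are hypothetical names used
only for this illustration, and no register access is modelled. One thing
the model makes visible: the eight handled bits (0x03030303) together
with TH_INTR_IGNORE_MASK (0xFCFCFCFC) cover all 32 bits, so with these
values no bit is ever left over as "unexpected".

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Mask values copied from the patch above. */
#define TH_INTR_MD0_MASK    BIT(25)
#define TH_INTR_MU0_MASK    BIT(24)
#define TH_INTR_GD0_MASK    BIT(17)
#define TH_INTR_GU0_MASK    BIT(16)
#define TH_INTR_CD0_MASK    BIT(9)
#define TH_INTR_CU0_MASK    BIT(8)
#define TH_INTR_PD0_MASK    BIT(1)
#define TH_INTR_PU0_MASK    BIT(0)
#define TH_INTR_IGNORE_MASK UINT32_C(0xFCFCFCFC)

/* Hypothetical result type, used only for this illustration. */
struct intr_split {
	uint32_t cpu, gpu, pll, mem; /* handled bits per sensor group */
	uint32_t unexpected;         /* bits neither handled nor ignored */
};

/* Split a raw status word the same way the threaded handler does. */
struct intr_split split_status(uint32_t st)
{
	struct intr_split s;
	uint32_t expected;

	s.cpu = st & (TH_INTR_CD0_MASK | TH_INTR_CU0_MASK);
	s.gpu = st & (TH_INTR_GD0_MASK | TH_INTR_GU0_MASK);
	s.pll = st & (TH_INTR_PD0_MASK | TH_INTR_PU0_MASK);
	s.mem = st & (TH_INTR_MD0_MASK | TH_INTR_MU0_MASK);

	expected = s.cpu | s.gpu | s.pll | s.mem;
	s.unexpected = st & ~(expected | TH_INTR_IGNORE_MASK);
	return s;
}
```

Each non-zero group field corresponds to one
thermal_zone_device_update() call in the handler.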

[PATCH v1 03/12] arm64: dts: tegra210: set thermtrip

2018-12-17 Thread Wei Ni

Set the "nvidia,thermtrips" property, which is used to set the
HW shutdown temperatures.

Signed-off-by: Wei Ni 
---
 arch/arm64/boot/dts/nvidia/tegra210.dtsi | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
index 2205d66b0443..36c7dce7fa69 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
@@ -1332,6 +1332,9 @@
reset-names = "soctherm";
#thermal-sensor-cells = <1>;
 
+   nvidia,thermtrips = ;
+
throttle-cfgs {
throttle_heavy: heavy {
nvidia,priority = <100>;
@@ -1351,8 +1354,8 @@
<&soctherm TEGRA124_SOCTHERM_SENSOR_CPU>;
 
trips {
-   cpu-shutdown-trip {
-   temperature = <102500>;
+   cpu-critical-trip {
+   temperature = <102000>;
hysteresis = <0>;
type = "critical";
};
@@ -1379,7 +1382,7 @@
<&soctherm TEGRA124_SOCTHERM_SENSOR_MEM>;
 
trips {
-   mem-shutdown-trip {
+   mem-critical-trip {
temperature = <103000>;
hysteresis = <0>;
type = "critical";
@@ -1401,8 +1404,8 @@
<&soctherm TEGRA124_SOCTHERM_SENSOR_GPU>;
 
trips {
-   gpu-shutdown-trip {
-   temperature = <103000>;
+   gpu-critical-trip {
+   temperature = <102500>;
hysteresis = <0>;
type = "critical";
};
@@ -1429,7 +1432,7 @@
<&soctherm TEGRA124_SOCTHERM_SENSOR_PLLX>;
 
trips {
-   pllx-shutdown-trip {
+   pllx-critical-trip {
temperature = <103000>;
hysteresis = <0>;
type = "critical";
-- 
2.7.4

[PATCH v1 01/12] of: Add bindings of thermtrip for Tegra soctherm

2018-12-17 Thread Wei Ni

Add optional property "nvidia,thermtrips".
If present, these trips will be used as HW shutdown trips,
and critical trips will be used as SW shutdown trips.

Signed-off-by: Wei Ni 
---
 .../bindings/thermal/nvidia,tegra124-soctherm.txt| 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git 
a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt 
b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
index b6c0ae53d4dc..ab66d6feab4b 100644
--- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
+++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
@@ -55,10 +55,21 @@ Required properties :
   - #cooling-cells: Should be 1. This cooling device only support on/off 
state.
 See ./thermal.txt for a description of this property.
 
+Optional properties:
+- nvidia,thermtrips : When present, this property specifies the temperature at
+  which the soctherm hardware will assert the thermal trigger signal to the
+  Power Management IC, which can be configured to reset or shutdown the device.
+  It is an array of pairs where each pair represents a tsensor id followed by a
+  temperature in milli Celsius. In the absence of this property the critical
+  trip point will be used for thermtrip temperature.
+
 Note:
-- the "critical" type trip points will be set to SOC_THERM hardware as the
-shut down temperature. Once the temperature of this thermal zone is higher
-than it, the system will be shutdown or reset by hardware.
+- the "critical" type trip points will be used to set the temperature at which
+the SOC_THERM hardware will assert a thermal trigger if the "nvidia,thermtrips"
+property is missing. When the thermtrips property is present, the breach of a
+critical trip point is reported back to the thermal framework to implement
+software shutdown.
+
 - the "hot" type trip points will be set to SOC_THERM hardware as the throttle
 temperature. Once the the temperature of this thermal zone is higher
 than it, it will trigger the HW throttle event.
@@ -79,6 +90,9 @@ Example :
 
#thermal-sensor-cells = <1>;
 
+   nvidia,thermtrips = ;
+
throttle-cfgs {
/*
 * When the "heavy" cooling device triggered,
-- 
2.7.4
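
The binding above describes the property as an array of pairs (tsensor
id, temperature) with a fallback to the critical trip point when no pair
is present. A minimal user-space sketch of that lookup follows. It is not
the driver code: struct thermtrip, parse_thermtrips(),
hw_shutdown_temp() and the example_cells values are all hypothetical and
chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of one <id temperature> pair. */
struct thermtrip {
	uint32_t id;     /* tsensor id */
	int32_t temp_mc; /* temperature in millidegrees Celsius */
};

/* Illustrative cell values only; not taken from any real board file. */
const uint32_t example_cells[] = { 0, 102500, 2, 103000 };

/*
 * Unpack a flat cell array into pairs. Returns the number of pairs,
 * or -1 if the cell count is odd or the output array is too small.
 */
int parse_thermtrips(const uint32_t *cells, size_t ncells,
		     struct thermtrip *out, size_t max_out)
{
	size_t i, n;

	if (ncells % 2 != 0)
		return -1;
	n = ncells / 2;
	if (n > max_out)
		return -1;

	for (i = 0; i < n; i++) {
		out[i].id = cells[2 * i];
		out[i].temp_mc = (int32_t)cells[2 * i + 1];
	}
	return (int)n;
}

/*
 * Pick the hardware shutdown temperature for one sensor: the thermtrip
 * entry if there is one, otherwise the critical trip temperature.
 */
int32_t hw_shutdown_temp(const struct thermtrip *tt, size_t n,
			 uint32_t id, int32_t critical_mc)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tt[i].id == id)
			return tt[i].temp_mc;
	}
	return critical_mc;
}
```

With the example cells, sensor 2 gets its own thermtrip temperature,
while a sensor without an entry falls back to whatever critical trip
value the caller passes in.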

[PATCH v1 02/12] thermal: tegra: support hw and sw shutdown

2018-12-17 Thread Wei Ni

Currently the critical trip points in the thermal framework are the only
way to specify a temperature at which HW should shut down. This is
insufficient for certain platforms which want an orderly software
shutdown in addition to the HW shutdown.

This change adds support for parsing the "nvidia,thermtrips" property.
It allows the soctherm DT to specify thermtrip temperatures so that
the critical trip point framework can be used for software
shutdown.

Signed-off-by: Wei Ni 
---
 drivers/thermal/tegra/soctherm.c  | 99 ++-
 drivers/thermal/tegra/soctherm.h  |  6 ++
 drivers/thermal/tegra/tegra210-soctherm.c |  8 +++
 3 files changed, 98 insertions(+), 15 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index ed28110a3535..673c3ffa9001 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -446,6 +446,24 @@ find_throttle_cfg_by_name(struct tegra_soctherm *ts, const 
char *name)
return NULL;
 }
 
+static int tsensor_group_thermtrip_get(struct tegra_soctherm *ts, int id)
+{
+   int i, temp = min_low_temp;
+   struct tsensor_group_thermtrips *tt = ts->soc->thermtrips;
+
+   if (id >= TEGRA124_SOCTHERM_SENSOR_NUM)
+   return temp;
+
+   if (tt) {
+   for (i = 0; i < ts->soc->num_ttgs; i++) {
+   if (tt[i].id == id)
+   return tt[i].temp;
+   }
+   }
+
+   return temp;
+}
+
 static int tegra_thermctl_set_trip_temp(void *data, int trip, int temp)
 {
struct tegra_thermctl_zone *zone = data;
@@ -464,7 +482,16 @@ static int tegra_thermctl_set_trip_temp(void *data, int 
trip, int temp)
return ret;
 
if (type == THERMAL_TRIP_CRITICAL) {
-   return thermtrip_program(dev, sg, temp);
+   /*
+* If thermtrips property is set in DT,
+* doesn't need to program critical type trip to HW,
+* if not, program critical trip to HW.
+*/
+   if (min_low_temp == tsensor_group_thermtrip_get(ts, sg->id))
+   return thermtrip_program(dev, sg, temp);
+   else
+   return 0;
+
} else if (type == THERMAL_TRIP_HOT) {
int i;
 
@@ -523,7 +550,8 @@ static int get_hot_temp(struct thermal_zone_device *tz, int 
*trip, int *temp)
  * @dev: struct device * of the SOC_THERM instance
  *
  * Configure the SOC_THERM HW trip points, setting "THERMTRIP"
- * "THROTTLE" trip points , using "critical" or "hot" type trip_temp
+ * "THROTTLE" trip points , using "thermtrips", "critical" or "hot"
+ * type trip_temp
  * from thermal zone.
  * After they have been configured, THERMTRIP or THROTTLE will take
  * action when the configured SoC thermal sensor group reaches a
@@ -545,28 +573,23 @@ static int tegra_soctherm_set_hwtrips(struct device *dev,
 {
struct tegra_soctherm *ts = dev_get_drvdata(dev);
struct soctherm_throt_cfg *stc;
-   int i, trip, temperature;
-   int ret;
+   int i, trip, temperature, ret;
 
-   ret = tz->ops->get_crit_temp(tz, &temperature);
-   if (ret) {
-   dev_warn(dev, "thermtrip: %s: missing critical temperature\n",
-sg->name);
-   goto set_throttle;
-   }
+   /* Get thermtrips. If missing, try to get critical trips. */
+   temperature = tsensor_group_thermtrip_get(ts, sg->id);
+   if (min_low_temp == temperature)
+   if (tz->ops->get_crit_temp(tz, &temperature))
+   temperature = max_high_temp;
 
ret = thermtrip_program(dev, sg, temperature);
if (ret) {
-   dev_err(dev, "thermtrip: %s: error during enable\n",
-   sg->name);
+   dev_err(dev, "thermtrip: %s: error during enable\n", sg->name);
return ret;
}
 
-   dev_info(dev,
-"thermtrip: will shut down when %s reaches %d mC\n",
+   dev_info(dev, "thermtrip: will shut down when %s reaches %d mC\n",
 sg->name, temperature);
 
-set_throttle:
ret = get_hot_temp(tz, &trip, &temperature);
if (ret) {
dev_warn(dev, "throttrip: %s: missing hot temperature\n",
@@ -907,6 +930,50 @@ static const struct thermal_cooling_device_ops 
throt_cooling_ops = {
.set_cur_state = throt_set_cdev_state,
 };
 
+static int soctherm_thermtrips_parse(struct platform_device *pdev)
+{
+   struct device *dev = &pdev->dev;
+   struct tegra_soctherm *ts = dev_get_drvdata(dev);
+   struct tsensor_group_thermtrips *tt = ts->soc->thermtrips;
+   const int max_num_prop = ts->soc->num_ttgs * 2;
+   u32 *tlb;
+   int i, j, n, ret;
+
+   if (!tt)
+   return -ENOMEM;
+
+   n = of_property_count_u32_elems(dev->of_node, "nvidia,thermtrips");
+   if (n <= 0) {
+

[PATCH v1 09/12] thermal: tegra: add support for EDP IRQ

2018-12-17 Thread Wei Ni

Add support to generate OC (over-current) interrupts to
indicate the OC event and print out alarm messages.

Signed-off-by: Wei Ni 
---
 drivers/thermal/tegra/soctherm.c | 420 +++
 1 file changed, 420 insertions(+)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index eefbb29b3b7d..37108f2290f9 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -23,6 +23,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -104,6 +106,16 @@
 #define STATS_CTL_CLR_UP   0x2
 #define STATS_CTL_EN_UP0x1
 
+#define OC_INTR_STATUS 0x39c
+#define OC_INTR_ENABLE 0x3a0
+#define OC_INTR_DISABLE0x3a4
+
+#define OC_INTR_OC1_MASK   BIT(0)
+#define OC_INTR_OC2_MASK   BIT(1)
+#define OC_INTR_OC3_MASK   BIT(2)
+#define OC_INTR_OC4_MASK   BIT(3)
+#define OC_INTR_OC5_MASK   BIT(4)
+
 #define THROT_GLOBAL_CFG   0x400
 #define THROT_GLOBAL_ENB_MASK  BIT(0)
 
@@ -212,9 +224,23 @@ static const int max_high_temp = 127000;
 enum soctherm_throttle_id {
THROTTLE_LIGHT = 0,
THROTTLE_HEAVY,
+   THROTTLE_OC1,
+   THROTTLE_OC2,
+   THROTTLE_OC3,
+   THROTTLE_OC4,
+   THROTTLE_OC5, /* OC5 is reserved */
THROTTLE_SIZE,
 };
 
+enum soctherm_oc_irq_id {
+   TEGRA_SOC_OC_IRQ_1,
+   TEGRA_SOC_OC_IRQ_2,
+   TEGRA_SOC_OC_IRQ_3,
+   TEGRA_SOC_OC_IRQ_4,
+   TEGRA_SOC_OC_IRQ_5,
+   TEGRA_SOC_OC_IRQ_MAX,
+};
+
 enum soctherm_throttle_dev_id {
THROTTLE_DEV_CPU = 0,
THROTTLE_DEV_GPU,
@@ -224,6 +250,11 @@ enum soctherm_throttle_dev_id {
 static const char *const throt_names[] = {
[THROTTLE_LIGHT] = "light",
[THROTTLE_HEAVY] = "heavy",
+   [THROTTLE_OC1]   = "oc1",
+   [THROTTLE_OC2]   = "oc2",
+   [THROTTLE_OC3]   = "oc3",
+   [THROTTLE_OC4]   = "oc4",
+   [THROTTLE_OC5]   = "oc5",
 };
 
 struct tegra_soctherm;
@@ -255,6 +286,7 @@ struct tegra_soctherm {
void __iomem *ccroc_regs;
 
int thermal_irq;
+   int edp_irq;
 
u32 *calib;
struct thermal_zone_device **thermctl_tzs;
@@ -267,6 +299,15 @@ struct tegra_soctherm {
struct mutex thermctl_lock;
 };
 
+struct soctherm_oc_irq_chip_data {
+   struct mutexirq_lock; /* serialize OC IRQs */
+   struct irq_chip irq_chip;
+   struct irq_domain   *domain;
+   int irq_enable;
+};
+
+static struct soctherm_oc_irq_chip_data soc_irq_cdata;
+
 /**
  * ccroc_writel() - writes a value to a CCROC register
  * @ts: pointer to a struct tegra_soctherm
@@ -807,6 +848,360 @@ static irqreturn_t soctherm_thermal_isr_thread(int irq, 
void *dev_id)
return IRQ_HANDLED;
 }
 
+/**
+ * soctherm_oc_intr_enable() - Enables the soctherm over-current interrupt
+ * @alarm: The soctherm throttle id
+ * @enable:Flag indicating enable the soctherm over-current
+ * interrupt or disable it
+ *
+ * Enables a specific over-current pins @alarm to raise an interrupt if the 
flag
+ * is set and the alarm corresponds to OC1, OC2, OC3, or OC4.
+ */
+static void soctherm_oc_intr_enable(struct tegra_soctherm *ts,
+   enum soctherm_throttle_id alarm,
+   bool enable)
+{
+   u32 r;
+
+   if (!enable)
+   return;
+
+   r = readl(ts->regs + OC_INTR_ENABLE);
+   switch (alarm) {
+   case THROTTLE_OC1:
+   r = REG_SET_MASK(r, OC_INTR_OC1_MASK, 1);
+   break;
+   case THROTTLE_OC2:
+   r = REG_SET_MASK(r, OC_INTR_OC2_MASK, 1);
+   break;
+   case THROTTLE_OC3:
+   r = REG_SET_MASK(r, OC_INTR_OC3_MASK, 1);
+   break;
+   case THROTTLE_OC4:
+   r = REG_SET_MASK(r, OC_INTR_OC4_MASK, 1);
+   break;
+   default:
+   r = 0;
+   break;
+   }
+   writel(r, ts->regs + OC_INTR_ENABLE);
+}
+
+/**
+ * soctherm_handle_alarm() - Handles soctherm alarms
+ * @alarm: The soctherm throttle id
+ *
+ * "Handles" over-current alarms (OC1, OC2, OC3, and OC4) by printing
+ * a warning or informative message.
+ *
+ * Return: -EINVAL for @alarm = THROTTLE_OC3, otherwise 0 (success).
+ */
+static int soctherm_handle_alarm(enum soctherm_throttle_id alarm)
+{
+   int rv = -EINVAL;
+
+   switch (alarm) {
+   case THROTTLE_OC1:
+   pr_debug("soctherm: Successfully handled OC1 alarm\n");
+   rv = 0;
+   break;
+
+   case THROTTLE_OC2:
+   pr_debug("soctherm: Successfully handled OC2 alarm\n");
+   rv =

[PATCH v1 08/12] thermal: tegra: add set_trips functionality

2018-12-17 Thread Wei Ni

Implement set_trips ops to set passive trip points.

Signed-off-by: Wei Ni 
---
 drivers/thermal/tegra/soctherm.c  | 64 ++-
 drivers/thermal/tegra/soctherm.h  | 10 +
 drivers/thermal/tegra/tegra124-soctherm.c |  7 +++-
 drivers/thermal/tegra/tegra132-soctherm.c |  7 +++-
 drivers/thermal/tegra/tegra210-soctherm.c |  7 +++-
 5 files changed, 90 insertions(+), 5 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index c66fdd546ef0..eefbb29b3b7d 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -87,8 +87,6 @@
 #define THERMCTL_LVL0_DN_STATS 0x14
 
 #define THERMCTL_INTR_STATUS   0x84
-#define THERMCTL_INTR_ENABLE   0x88
-#define THERMCTL_INTR_DISABLE  0x8c
 
 #define TH_INTR_MD0_MASK   BIT(25)
 #define TH_INTR_MU0_MASK   BIT(24)
@@ -265,6 +263,8 @@ struct tegra_soctherm {
struct soctherm_throt_cfg throt_cfgs[THROTTLE_SIZE];
 
struct dentry *debugfs_dir;
+
+   struct mutex thermctl_lock;
 };
 
 /**
@@ -542,9 +542,59 @@ static int tegra_thermctl_set_trip_temp(void *data, int 
trip, int temp)
return 0;
 }
 
+static void thermal_irq_enable(struct tegra_thermctl_zone *zn)
+{
+   u32 r;
+
+   /* multiple zones could be handling and setting trips at once */
+   mutex_lock(&zn->ts->thermctl_lock);
+   r = readl(zn->ts->regs + THERMCTL_INTR_ENABLE);
+   r = REG_SET_MASK(r, zn->sg->thermctl_isr_mask, TH_INTR_UP_DN_EN);
+   writel(r, zn->ts->regs + THERMCTL_INTR_ENABLE);
+   mutex_unlock(&zn->ts->thermctl_lock);
+}
+
+static void thermal_irq_disable(struct tegra_thermctl_zone *zn)
+{
+   u32 r;
+
+   /* multiple zones could be handling and setting trips at once */
+   mutex_lock(&zn->ts->thermctl_lock);
+   r = readl(zn->ts->regs + THERMCTL_INTR_DISABLE);
+   r = REG_SET_MASK(r, zn->sg->thermctl_isr_mask, 0);
+   writel(r, zn->ts->regs + THERMCTL_INTR_DISABLE);
+   mutex_unlock(&zn->ts->thermctl_lock);
+}
+
+static int tegra_thermctl_set_trips(void *data, int lo, int hi)
+{
+   struct tegra_thermctl_zone *zone = data;
+   u32 r;
+
+   thermal_irq_disable(zone);
+
+   r = readl(zone->ts->regs + zone->sg->thermctl_lvl0_offset);
+   r = REG_SET_MASK(r, THERMCTL_LVL0_CPU0_EN_MASK, 0);
+   writel(r, zone->ts->regs + zone->sg->thermctl_lvl0_offset);
+
+   lo = enforce_temp_range(zone->dev, lo) / zone->ts->soc->thresh_grain;
+   hi = enforce_temp_range(zone->dev, hi) / zone->ts->soc->thresh_grain;
+   dev_dbg(zone->dev, "%s hi:%d, lo:%d\n", __func__, hi, lo);
+
+   r = REG_SET_MASK(r, zone->sg->thermctl_lvl0_up_thresh_mask, hi);
+   r = REG_SET_MASK(r, zone->sg->thermctl_lvl0_dn_thresh_mask, lo);
+   r = REG_SET_MASK(r, THERMCTL_LVL0_CPU0_EN_MASK, 1);
+   writel(r, zone->ts->regs + zone->sg->thermctl_lvl0_offset);
+
+   thermal_irq_enable(zone);
+
+   return 0;
+}
+
 static const struct thermal_zone_of_device_ops tegra_of_thermal_ops = {
.get_temp = tegra_thermctl_get_temp,
.set_trip_temp = tegra_thermctl_set_trip_temp,
+   .set_trips = tegra_thermctl_set_trips,
 };
 
 static int get_hot_temp(struct thermal_zone_device *tz, int *trip, int *temp)
@@ -661,6 +711,15 @@ static irqreturn_t soctherm_thermal_isr(int irq, void 
*dev_id)
struct tegra_soctherm *ts = dev_id;
u32 r;
 
+   /* Case for no lock:
+* Although interrupts are enabled in set_trips, there is still no need
+* to lock here because the interrupts are disabled before programming
+* new trip points. Hence there can't be an interrupt on the same sensor.
+* An interrupt can however occur on a sensor while trips are being
+* programmed on a different one. This being a LEVEL interrupt won't
+* cause a new interrupt but this is taken care of by the re-reading of
+* the STATUS register in the thread function.
+*/
r = readl(ts->regs + THERMCTL_INTR_STATUS);
writel(r, ts->regs + THERMCTL_INTR_DISABLE);
 
@@ -1523,6 +1582,7 @@ static int tegra_soctherm_probe(struct platform_device 
*pdev)
if (!tegra)
return -ENOMEM;
 
+   mutex_init(&tegra->thermctl_lock);
dev_set_drvdata(&pdev->dev, tegra);
 
tegra->soc = soc;
diff --git a/drivers/thermal/tegra/soctherm.h b/drivers/thermal/tegra/soctherm.h
index c05c7e37e968..70501e73d586 100644
--- a/drivers/thermal/tegra/soctherm.h
+++ b/drivers/thermal/tegra/soctherm.h
@@ -1,3 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
  * Copyright (c) 2014-2016, NVIDIA CORPORATION.  All rights reserved.
  *
@@ -29,6 +30,14 @@
 #define THERMCTL_THERMTRIP_CTL 0x80
 /* BITs are defined in device file */
 
+#define THERMCTL_INTR_ENABLE   0x88
+#define THERMCTL_INTR_DISABLE
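
The threshold arithmetic in tegra_thermctl_set_trips() above, where each
limit is passed through enforce_temp_range() and then divided by
thresh_grain, can be sketched in isolation. The sketch assumes that
enforce_temp_range() clamps to a [min, max] window, which this patch
does not show; the function names and the bounds and grain values in the
usage note are hypothetical.

```c
#include <assert.h>

/* Clamp a temperature in millidegrees Celsius to [min_mc, max_mc]. */
int clamp_temp_mc(int temp_mc, int min_mc, int max_mc)
{
	if (temp_mc < min_mc)
		return min_mc;
	if (temp_mc > max_mc)
		return max_mc;
	return temp_mc;
}

/*
 * Convert a trip temperature to a threshold field value: clamp first,
 * then divide by the threshold granularity. grain_mc is a hypothetical
 * stand-in for the per-SoC thresh_grain value.
 */
int temp_to_threshold(int temp_mc, int min_mc, int max_mc, int grain_mc)
{
	assert(grain_mc > 0);
	return clamp_temp_mc(temp_mc, min_mc, max_mc) / grain_mc;
}
```

For instance, with an assumed window of -127000..127000 mC and an
assumed grain of 1000 mC, a requested limit of 85500 mC becomes a field
value of 85, and an out-of-range request of 200000 mC is clamped to 127.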

[PATCH v1 10/12] arm64: dts: tegra210: set EDP interrupt line

2018-12-17 Thread Wei Ni

Set EDP interrupt line.

Signed-off-by: Wei Ni 
---
 arch/arm64/boot/dts/nvidia/tegra210.dtsi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
index 57dae9cc7b7d..b4531a8c18f6 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
@@ -1324,7 +1324,8 @@
reg = <0x0 0x700e2000 0x0 0x600 /* SOC_THERM reg_base */
0x0 0x60006000 0x0 0x400>; /* CAR reg_base */
reg-names = "soctherm-reg", "car-reg";
-   interrupts = ;
+   interrupts = ;
clocks = <&tegra_car TEGRA210_CLK_TSENSOR>,
<&tegra_car TEGRA210_CLK_SOC_THERM>;
clock-names = "tsensor", "soctherm";
-- 
2.7.4

[PATCH v1 04/12] of: Add bindings of gpu hw throttle for Tegra soctherm

2018-12-17 Thread Wei Ni

Add "nvidia,gpu-throt-level" property to set gpu hw
throttle level.

Signed-off-by: Wei Ni 
---
 .../bindings/thermal/nvidia,tegra124-soctherm.txt  | 17 +++--
 include/dt-bindings/thermal/tegra124-soctherm.h| 22 ++
 2 files changed, 33 insertions(+), 6 deletions(-)

diff --git 
a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt 
b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
index ab66d6feab4b..cf6d0be56b7a 100644
--- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
+++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
@@ -52,6 +52,15 @@ Required properties :
 Must set as following values:
 TEGRA_SOCTHERM_THROT_LEVEL_LOW, TEGRA_SOCTHERM_THROT_LEVEL_MED
 TEGRA_SOCTHERM_THROT_LEVEL_HIGH, TEGRA_SOCTHERM_THROT_LEVEL_NONE
+  - nvidia,gpu-throt-level: This property is for Tegra124 and Tegra210.
+It is the level of pulse skippers, which are used to throttle clock
+frequencies. It indicates gpu clock throttling depth and can be
+programmed to any of the following values which represent a throttling
+percentage:
+TEGRA_SOCTHERM_THROT_LEVEL_NONE (0%)
+TEGRA_SOCTHERM_THROT_LEVEL_LOW (50%),
+TEGRA_SOCTHERM_THROT_LEVEL_MED (75%),
+TEGRA_SOCTHERM_THROT_LEVEL_HIGH (85%).
   - #cooling-cells: Should be 1. This cooling device only support on/off 
state.
 See ./thermal.txt for a description of this property.
 
@@ -96,22 +105,26 @@ Example :
throttle-cfgs {
/*
 * When the "heavy" cooling device triggered,
-* the HW will skip cpu clock's pulse in 85% depth
+* the HW will skip cpu clock's pulse in 85% depth,
+* skip gpu clock's pulse in 85% level
 */
throttle_heavy: heavy {
nvidia,priority = <100>;
nvidia,cpu-throt-percent = <85>;
+   nvidia,gpu-throt-level = 
;
 
#cooling-cells = <1>;
};
 
/*
 * When the "light" cooling device triggered,
-* the HW will skip cpu clock's pulse in 50% depth
+* the HW will skip cpu clock's pulse in 50% depth,
+* skip gpu clock's pulse in 50% level
 */
throttle_light: light {
nvidia,priority = <80>;
nvidia,cpu-throt-percent = <50>;
+   nvidia,gpu-throt-level = 
;
 
#cooling-cells = <1>;
};
diff --git a/include/dt-bindings/thermal/tegra124-soctherm.h 
b/include/dt-bindings/thermal/tegra124-soctherm.h
index c15e8b709a0d..75853df1c609 100644
--- a/include/dt-bindings/thermal/tegra124-soctherm.h
+++ b/include/dt-bindings/thermal/tegra124-soctherm.h
@@ -1,5 +1,19 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /*
+ * Copyright (c) 2014 - 2018, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author:
+ *  Mikko Perttunen 
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
  * This header provides constants for binding nvidia,tegra124-soctherm.
  */
 
@@ -12,9 +26,9 @@
 #define TEGRA124_SOCTHERM_SENSOR_PLLX 3
 #define TEGRA124_SOCTHERM_SENSOR_NUM 4
 
-#define TEGRA_SOCTHERM_THROT_LEVEL_LOW  0
-#define TEGRA_SOCTHERM_THROT_LEVEL_MED  1
-#define TEGRA_SOCTHERM_THROT_LEVEL_HIGH 2
-#define TEGRA_SOCTHERM_THROT_LEVEL_NONE -1
+#define TEGRA_SOCTHERM_THROT_LEVEL_NONE 0
+#define TEGRA_SOCTHERM_THROT_LEVEL_LOW  1
+#define TEGRA_SOCTHERM_THROT_LEVEL_MED  2
+#define TEGRA_SOCTHERM_THROT_LEVEL_HIGH 3
 
 #endif
-- 
2.7.4
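
For illustration only: the renumbered levels can be read as indices into
the percentages listed in the binding text above. The sketch below is not
driver code. The enum and both function names are hypothetical, and the
bit-vector helper merely mirrors the THROT_LEVEL_TO_DEPTH macro from
patch 05/12 of this series.

```c
#include <stdint.h>

/* Level values as renumbered by the patch above (names shortened here). */
enum throt_level {
	THROT_LEVEL_NONE = 0,
	THROT_LEVEL_LOW  = 1,
	THROT_LEVEL_MED  = 2,
	THROT_LEVEL_HIGH = 3,
};

/* Throttling percentage per level, as listed in the binding text. */
static inline int throt_level_to_percent(unsigned int level)
{
	static const int percent[] = { 0, 50, 75, 85 };

	if (level > THROT_LEVEL_HIGH)
		return -1;	/* out of range */
	return percent[level];
}

/* N:3 vector: NONE -> 000, LOW -> 001, MED -> 011, HIGH -> 111. */
static inline uint32_t throt_level_to_vector(unsigned int level)
{
	return (1u << level) - 1;
}
```

With NONE at 0 and the other levels counting up from 1, the vector can be
derived with a single shift, which appears to be one practical benefit of
the renumbering.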

[PATCH v1 00/12] Add some functionalities for Tegra soctherm

2018-12-17 Thread Wei Ni

Move the hw/sw shutdown patches into this series. There has already been
some discussion of them at https://lkml.org/lkml/2018/12/7/225.
Support GPU HW throttle, thermal IRQ, set_trips(), EDP IRQ and OC
hw throttle.

Wei Ni (12):
  of: Add bindings of thermtrip for Tegra soctherm
  thermal: tegra: support hw and sw shutdown
  arm64: dts: tegra210: set thermtrip
  of: Add bindings of gpu hw throttle for Tegra soctherm
  thermal: tegra: add support for gpu hw-throttle
  arm64: dts: tegra210: set gpu hw throttle level
  thermal: tegra: add support for thermal IRQ
  thermal: tegra: add set_trips functionality
  thermal: tegra: add support for EDP IRQ
  arm64: dts: tegra210: set EDP interrupt line
  of: Add bindings of OC hw throttle for Tegra soctherm
  thermal: tegra: enable OC hw throttle

 .../bindings/thermal/nvidia,tegra124-soctherm.txt  |  63 +-
 arch/arm64/boot/dts/nvidia/tegra210.dtsi   |  20 +-
 drivers/thermal/tegra/soctherm.c   | 959 +++--
 drivers/thermal/tegra/soctherm.h   |  16 +
 drivers/thermal/tegra/tegra124-soctherm.c  |   7 +-
 drivers/thermal/tegra/tegra132-soctherm.c  |   7 +-
 drivers/thermal/tegra/tegra210-soctherm.c  |  15 +-
 include/dt-bindings/thermal/tegra124-soctherm.h|  22 +-
 8 files changed, 1033 insertions(+), 76 deletions(-)

-- 
2.7.4

[PATCH v1 12/12] thermal: tegra: enable OC hw throttle

2018-12-17 Thread Wei Ni

Parse Over Current settings from DT and program them to
generate interrupts. Also enable hw throttling whenever
there are OC events. Log the OC events as debug messages.

Signed-off-by: Wei Ni 
---
 drivers/thermal/tegra/soctherm.c | 128 ---
 1 file changed, 118 insertions(+), 10 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index 37108f2290f9..c2a0b048a085 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -106,9 +106,26 @@
 #define STATS_CTL_CLR_UP   0x2
 #define STATS_CTL_EN_UP0x1
 
+#define OC1_CFG0x310
+#define OC1_CFG_LONG_LATENCY_MASK  BIT(6)
+#define OC1_CFG_HW_RESTORE_MASKBIT(5)
+#define OC1_CFG_PWR_GOOD_MASK_MASK BIT(4)
+#define OC1_CFG_THROTTLE_MODE_MASK (0x3 << 2)
+#define OC1_CFG_ALARM_POLARITY_MASKBIT(1)
+#define OC1_CFG_EN_THROTTLE_MASK   BIT(0)
+
+#define OC1_CNT_THRESHOLD  0x314
+#define OC1_THROTTLE_PERIOD0x318
+#define OC1_ALARM_COUNT0x31c
+#define OC1_FILTER 0x320
+#define OC1_STATS  0x3a8
+
 #define OC_INTR_STATUS 0x39c
 #define OC_INTR_ENABLE 0x3a0
 #define OC_INTR_DISABLE0x3a4
+#define OC_STATS_CTL   0x3c4
+#define OC_STATS_CTL_CLR_ALL   0x2
+#define OC_STATS_CTL_EN_ALL0x1
 
 #define OC_INTR_OC1_MASK   BIT(0)
 #define OC_INTR_OC2_MASK   BIT(1)
@@ -207,6 +224,25 @@
 #define THROT_DELAY_CTRL(throt)(THROT_DELAY_LITE + \
(THROT_OFFSET * throt))
 
+#define ALARM_OFFSET   0x14
+#define ALARM_CFG(throt)   (OC1_CFG + \
+   (ALARM_OFFSET * (throt - THROTTLE_OC1)))
+
+#define ALARM_CNT_THRESHOLD(throt) (OC1_CNT_THRESHOLD + \
+   (ALARM_OFFSET * (throt - THROTTLE_OC1)))
+
+#define ALARM_THROTTLE_PERIOD(throt)   (OC1_THROTTLE_PERIOD + \
+   (ALARM_OFFSET * (throt - THROTTLE_OC1)))
+
+#define ALARM_ALARM_COUNT(throt)   (OC1_ALARM_COUNT + \
+   (ALARM_OFFSET * (throt - THROTTLE_OC1)))
+
+#define ALARM_FILTER(throt)(OC1_FILTER + \
+   (ALARM_OFFSET * (throt - THROTTLE_OC1)))
+
+#define ALARM_STATS(throt) (OC1_STATS + \
+   (4 * (throt - THROTTLE_OC1)))
+
 /* get CCROC_THROT_PSKIP_xxx offset per HIGH/MED/LOW vect*/
 #define CCROC_THROT_OFFSET 0x0c
 #define CCROC_THROT_PSKIP_CTRL_CPU_REG(vect)(CCROC_THROT_PSKIP_CTRL_CPU + \
@@ -218,6 +254,9 @@
 #define THERMCTL_LVL_REGS_SIZE 0x20
 #define THERMCTL_LVL_REG(rg, lv)   ((rg) + ((lv) * THERMCTL_LVL_REGS_SIZE))
 
+#define OC_THROTTLE_MODE_DISABLED  0
+#define OC_THROTTLE_MODE_BRIEF 2
+
 static const int min_low_temp = -127000;
 static const int max_high_temp = 127000;
 
@@ -266,6 +305,15 @@ struct tegra_thermctl_zone {
const struct tegra_tsensor_group *sg;
 };
 
+struct soctherm_oc_cfg {
+   u32 active_low;
+   u32 throt_period;
+   u32 alarm_cnt_thresh;
+   u32 alarm_filter;
+   u32 mode;
+   bool intr_en;
+};
+
 struct soctherm_throt_cfg {
const char *name;
unsigned int id;
@@ -273,6 +321,7 @@ struct soctherm_throt_cfg {
u8 cpu_throt_level;
u32 cpu_throt_depth;
u32 gpu_throt_level;
+   struct soctherm_oc_cfg oc_cfg;
struct thermal_cooling_device *cdev;
bool init;
 };
@@ -715,7 +764,7 @@ static int tegra_soctherm_set_hwtrips(struct device *dev,
return 0;
}
 
-   for (i = 0; i < THROTTLE_SIZE; i++) {
+   for (i = 0; i < THROTTLE_OC1; i++) {
struct thermal_cooling_device *cdev;
 
if (!ts->throt_cfgs[i].init)
@@ -1547,6 +1596,30 @@ static int soctherm_thermtrips_parse(struct 
platform_device *pdev)
return 0;
 }
 
+static void soctherm_oc_cfg_parse(struct device *dev,
+   struct device_node *np_oc,
+   struct soctherm_throt_cfg *stc)
+{
+   u32 val;
+
+   if (!of_property_read_u32(np_oc, "nvidia,polarity-active-low", &val))
+   stc->oc_cfg.active_low = val;
+
+   if (!of_property_read_u32(np_oc, "nvidia,count-threshold", &val)) {
+   stc->oc_cfg.intr_en = 1;
+   stc->oc_cfg.alarm_cnt_thresh = val;
+   }
+
+   if (!of_property_read_u32(np_oc, "nvidia,throttle-period", &val))
+   stc->oc_cfg.throt_period = val;

[PATCH v1 11/12] of: Add bindings of OC hw throttle for Tegra soctherm

2018-12-17 Thread Wei Ni

Add OC HW throttle configuration for soctherm in DT.
It is used to describe the OCx throttle events.

Signed-off-by: Wei Ni 
---
 .../bindings/thermal/nvidia,tegra124-soctherm.txt  | 26 ++
 1 file changed, 26 insertions(+)

diff --git 
a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt 
b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
index cf6d0be56b7a..d112a8e59ec3 100644
--- a/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
+++ b/Documentation/devicetree/bindings/thermal/nvidia,tegra124-soctherm.txt
@@ -64,6 +64,21 @@ Required properties :
   - #cooling-cells: Should be 1. This cooling device only support on/off 
state.
 See ./thermal.txt for a description of this property.
 
+  Optional properties: The following properties are T210 specific and
+  valid only for OCx throttle events.
+  - nvidia,count-threshold: Specifies the number of OC events that are
+required for triggering an interrupt. Interrupts are not triggered if
+the property is missing. A value of 0 will interrupt on every OC alarm.
+  - nvidia,polarity-active-low: Configures the polarity of the OC alarm
+signal. Accepted values are 1 for assert low and 0 for assert high.
+Default value is 0.
+  - nvidia,alarm-filter: Number of clocks to filter event. When the filter
+expires (which means the OC event has not occurred for a long time),
+the counter is cleared and filter is rearmed. Default value is 0.
+  - nvidia,throttle-period: Specifies the number of uSec for which
+throttling is engaged after the OC event is deasserted. Default value
+is 0.
+
 Optional properties:
 - nvidia,thermtrips : When present, this property specifies the temperature at
   which the soctherm hardware will assert the thermal trigger signal to the
@@ -134,6 +149,17 @@ Example :
 * arbiter will select the highest priority as the 
final throttle
 * settings to skip cpu pulse.
 */
+
+   throttle_oc1: oc1 {
+   nvidia,priority = <50>;
+   nvidia,polarity-active-low = <1>;
+   nvidia,count-threshold = <100>;
+   nvidia,alarm-filter = <510>;
+   nvidia,throttle-period = <0>;
+   nvidia,cpu-throt-percent = <75>;
+   nvidia,gpu-throt-level =
+   
;
+};
};
};
 
-- 
2.7.4

[PATCH v1 06/12] arm64: dts: tegra210: set gpu hw throttle level

2018-12-17 Thread Wei Ni

Set gpu hw throttle level to TEGRA_SOCTHERM_THROT_LEVEL_HIGH

Signed-off-by: Wei Ni 
---
 arch/arm64/boot/dts/nvidia/tegra210.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
index 36c7dce7fa69..57dae9cc7b7d 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
@@ -1339,6 +1339,8 @@
throttle_heavy: heavy {
nvidia,priority = <100>;
nvidia,cpu-throt-percent = <85>;
+   nvidia,gpu-throt-level =
+   ;
 
#cooling-cells = <2>;
};
-- 
2.7.4

[PATCH v1 05/12] thermal: tegra: add support for gpu hw-throttle

2018-12-17 Thread Wei Ni

Add support to trigger pulse skippers on the GPU
when a HOT trip point is triggered. The pulse skippers
can be signalled to throttle at low, medium and high
depths/levels.

Signed-off-by: Wei Ni 
---
 drivers/thermal/tegra/soctherm.c | 118 ---
 1 file changed, 85 insertions(+), 33 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index 673c3ffa9001..d3cef88a3f22 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -1,5 +1,6 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
- * Copyright (c) 2014, NVIDIA CORPORATION.  All rights reserved.
+ * Copyright (c) 2014 - 2018, NVIDIA CORPORATION.  All rights reserved.
  *
  * Author:
  * Mikko Perttunen 
@@ -160,6 +161,15 @@
 /* get dividend from the depth */
 #define THROT_DEPTH_DIVIDEND(depth)((256 * (100 - (depth)) / 100) - 1)
 
+/* gk20a nv_therm interface N:3 Mapping. Levels defined in tegra124-soctherm.h
+ * level   vector
+ * NONE3'b000
+ * LOW 3'b001
+ * MED 3'b011
+ * HIGH3'b111
+ */
+#define THROT_LEVEL_TO_DEPTH(level)((0x1 << (level)) - 1)
+
 /* get THROT_PSKIP_xxx offset per LIGHT/HEAVY throt and CPU/GPU dev */
 #define THROT_OFFSET   0x30
 #define THROT_PSKIP_CTRL(throt, dev)   (THROT_PSKIP_CTRL_LITE_CPU + \
@@ -219,6 +229,7 @@ struct soctherm_throt_cfg {
u8 priority;
u8 cpu_throt_level;
u32 cpu_throt_depth;
+   u32 gpu_throt_level;
struct thermal_cooling_device *cdev;
bool init;
 };
@@ -974,6 +985,50 @@ static int soctherm_thermtrips_parse(struct 
platform_device *pdev)
return 0;
 }
 
+static int soctherm_throt_cfg_parse(struct device *dev,
+   struct device_node *np,
+   struct soctherm_throt_cfg *stc)
+{
+   struct tegra_soctherm *ts = dev_get_drvdata(dev);
+   int ret;
+   u32 val;
+
+   ret = of_property_read_u32(np, "nvidia,priority", &val);
+   if (ret) {
+   dev_err(dev, "throttle-cfg: %s: invalid priority\n", stc->name);
+   return -EINVAL;
+   }
+   stc->priority = val;
+
+   ret = of_property_read_u32(np, ts->soc->use_ccroc ?
+  "nvidia,cpu-throt-level" :
+  "nvidia,cpu-throt-percent", &val);
+   if (!ret) {
+   if (ts->soc->use_ccroc &&
+   val <= TEGRA_SOCTHERM_THROT_LEVEL_HIGH)
+   stc->cpu_throt_level = val;
+   else if (!ts->soc->use_ccroc && val <= 100)
+   stc->cpu_throt_depth = val;
+   else
+   goto err;
+   } else {
+   goto err;
+   }
+
+   ret = of_property_read_u32(np, "nvidia,gpu-throt-level", &val);
+   if (!ret && val <= TEGRA_SOCTHERM_THROT_LEVEL_HIGH)
+   stc->gpu_throt_level = val;
+   else
+   goto err;
+
+   return 0;
+
+err:
+   dev_err(dev, "throttle-cfg: %s: no throt prop or invalid prop\n",
+   stc->name);
+   return -EINVAL;
+}
+
 /**
  * soctherm_init_hw_throt_cdev() - Parse the HW throttle configurations
  * and register them as cooling devices.
@@ -984,8 +1039,7 @@ static void soctherm_init_hw_throt_cdev(struct 
platform_device *pdev)
struct tegra_soctherm *ts = dev_get_drvdata(dev);
struct device_node *np_stc, *np_stcc;
const char *name;
-   u32 val;
-   int i, r;
+   int i;
 
for (i = 0; i < THROTTLE_SIZE; i++) {
ts->throt_cfgs[i].name = throt_names[i];
@@ -1003,6 +1057,7 @@ static void soctherm_init_hw_throt_cdev(struct 
platform_device *pdev)
for_each_child_of_node(np_stc, np_stcc) {
struct soctherm_throt_cfg *stc;
struct thermal_cooling_device *tcd;
+   int err;
 
name = np_stcc->name;
stc = find_throttle_cfg_by_name(ts, name);
@@ -1012,37 +1067,10 @@ static void soctherm_init_hw_throt_cdev(struct 
platform_device *pdev)
continue;
}
 
-   r = of_property_read_u32(np_stcc, "nvidia,priority", &val);
-   if (r) {
-   dev_info(dev,
-"throttle-cfg: %s: missing priority\n", name);
+
+   err = soctherm_throt_cfg_parse(dev, np_stcc, stc);
+   if (err)
continue;
-   }
-   stc->priority = val;
-
-   if (ts->soc->use_ccroc) {
-   r = of_property_read_u32(np_stcc,
-"nvidia,cpu-throt-level",
-&val);
-   if (r) {
-   dev_info(dev,
-"throttle-cfg: %s: missing 
cpu-throt-level

[PATCH v1 00/12] Add some functionalities for Tegra soctherm

2018-12-17 Thread Wei Ni

Move the hw/sw shutdown patches into this series. There has already been
some discussion of them at https://lkml.org/lkml/2018/12/7/225.
Support GPU HW throttle, thermal IRQ, set_trips(), EDP IRQ and OC
hw throttle.

Wei Ni (12):
  of: Add bindings of thermtrip for Tegra soctherm
  thermal: tegra: support hw and sw shutdown
  arm64: dts: tegra210: set thermtrip
  of: Add bindings of gpu hw throttle for Tegra soctherm
  thermal: tegra: add support for gpu hw-throttle
  arm64: dts: tegra210: set gpu hw throttle level
  thermal: tegra: add support for thermal IRQ
  thermal: tegra: add set_trips functionality
  thermal: tegra: add support for EDP IRQ
  arm64: dts: tegra210: set EDP interrupt line
  of: Add bindings of OC hw throttle for Tegra soctherm
  thermal: tegra: enable OC hw throttle

 .../bindings/thermal/nvidia,tegra124-soctherm.txt  |  63 +-
 arch/arm64/boot/dts/nvidia/tegra210.dtsi   |  20 +-
 drivers/thermal/tegra/soctherm.c   | 955 +++--
 drivers/thermal/tegra/soctherm.h   |  16 +
 drivers/thermal/tegra/tegra124-soctherm.c  |   7 +-
 drivers/thermal/tegra/tegra132-soctherm.c  |   7 +-
 drivers/thermal/tegra/tegra210-soctherm.c  |  15 +-
 include/dt-bindings/thermal/tegra124-soctherm.h|  22 +-
 8 files changed, 1029 insertions(+), 76 deletions(-)

-- 
2.7.4

Re: [PATCH 2/2 v3] kdump,vmcoreinfo: Export the value of sme mask to vmcoreinfo

2018-12-17 Thread lijiang

On 2018-12-17 21:01, Borislav Petkov wrote:
> On Sun, Dec 16, 2018 at 09:16:17PM +0800, Lianbo Jiang wrote:
>> For AMD machine with SME feature, makedumpfile tools need to know
>> whether the crash kernel was encrypted or not. If SME is enabled
>> in the first kernel, the crash kernel's page table(pgd/pud/pmd/pte)
>> contains the memory encryption mask, so need to remove the sme mask
>> to obtain the true physical address.
>>
>> Signed-off-by: Lianbo Jiang 
>> ---
>>  arch/x86/kernel/machine_kexec_64.c | 14 ++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/arch/x86/kernel/machine_kexec_64.c 
>> b/arch/x86/kernel/machine_kexec_64.c
>> index 4c8acdfdc5a7..1860fe24117d 100644
>> --- a/arch/x86/kernel/machine_kexec_64.c
>> +++ b/arch/x86/kernel/machine_kexec_64.c
>> @@ -352,10 +352,24 @@ void machine_kexec(struct kimage *image)
>>  
>>  void arch_crash_save_vmcoreinfo(void)
>>  {
>> +u64 sme_mask = sme_me_mask;
>> +
>>  VMCOREINFO_NUMBER(phys_base);
>>  VMCOREINFO_SYMBOL(init_top_pgt);
>>  vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
>>  pgtable_l5_enabled());
>> +/*
>> + * Currently, the local variable 'sme_mask' stores the value of
>> + * sme_me_mask(bit 47), and also write the value of sme_mask to
>> + * the vmcoreinfo.
>> + * If need, the bit(sme_mask) might be redefined in the future,
>> + * but the 'bit63' will be reserved.
>> + * For example:
>> + * [ misc  ][ enc bit  ][ other misc SME info   ]
>> + * ____1000______..._
>> + * 63   59   55   51   47   43   39   35   31   27   ... 3
>> + */
> 
> This text belongs into the document.
> 
OK, I will move it into the VMCOREINFO document.

Thanks.
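
As a rough sketch of why a dump tool wants this value: once the mask is
known from vmcoreinfo, a page-table entry can be cleaned before it is
treated as a physical address. The helper name below is hypothetical and
this is not makedumpfile's actual code. Bit 47 is used only because the
quoted comment mentions it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: clear the memory-encryption bit(s) from a raw
 * page-table entry so the remaining bits can be used as an address.
 * A mask of 0 (SME disabled) leaves the entry unchanged.
 */
static inline uint64_t strip_sme_mask(uint64_t entry, uint64_t sme_mask)
{
	return entry & ~sme_mask;
}
```

For example, with sme_mask = 1ULL << 47, an entry of
(1ULL << 47) | 0x12345000 yields 0x12345000.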

[PATCH] tools/power/x86/intel_pstate_tracer: Fix non root execution for post processing a trace file.

2018-12-17 Thread Doug Smythies

This script is supposed to be allowed to run with regular user privileges
if a previously captured trace is being post processed.

Commit fbe313884d7ddd73ce457473cbdf3763f5b1d3da
tools/power/x86/intel_pstate_tracer: Free the trace buffer memory
introduced a bug that breaks that option.

Commit 35459105deb26430653a7299a86bc66fb4dd5773
tools/power/x86/intel_pstate_tracer: Add optional setting of trace buffer 
memory allocation
moved the code but kept the bug.

This patch fixes the issue.

Signed-off-by: Doug Smythies 
---
 tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py 
b/tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py
index 84e2b64..2fa3c57 100755
--- a/tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py
+++ b/tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py
@@ -585,9 +585,9 @@ current_max_cpu = 0
 
 read_trace_data(filename)
 
-clear_trace_file()
-# Free the memory
 if interval:
+clear_trace_file()
+# Free the memory
 free_trace_buffer()
 
 if graph_data_present == False:
-- 
2.7.4

[PATCH] drm/bochs: add edid present check

2018-12-17 Thread Gerd Hoffmann

Check first two header bytes before trying to read the edid blob,
to avoid the log being spammed in case qemu has no edid support (old
qemu or edid turned off).

Fixes: 01f23459cf drm/bochs: add edid support.
Signed-off-by: Gerd Hoffmann 
---
 drivers/gpu/drm/bochs/bochs_hw.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/bochs/bochs_hw.c b/drivers/gpu/drm/bochs/bochs_hw.c
index c90a0d492f..f91e049625 100644
--- a/drivers/gpu/drm/bochs/bochs_hw.c
+++ b/drivers/gpu/drm/bochs/bochs_hw.c
@@ -89,6 +89,10 @@ int bochs_hw_load_edid(struct bochs_device *bochs)
if (!bochs->mmio)
return -1;
 
+   if (readb(bochs->mmio + 0) != 0x00 ||
+   readb(bochs->mmio + 1) != 0xff)
+   return -1;
+
kfree(bochs->edid);
bochs->edid = drm_do_get_edid(&bochs->connector,
  bochs_get_edid_block, bochs);
-- 
2.9.3
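
The check added above can be pictured in isolation as follows. This is a
userspace sketch under assumptions, not the driver code: the function
name is hypothetical and a plain byte buffer stands in for the MMIO
region that the patch reads with readb(). An EDID block begins with the
bytes 00 ff ff ff ff ff ff 00, and the patch tests only the first two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return true if the buffer starts with the first
 * two bytes of the EDID header (0x00, 0xff), which is the same cheap
 * test the patch performs before attempting a full EDID read.
 */
static bool edid_header_probably_present(const uint8_t *buf, size_t len)
{
	return len >= 2 && buf[0] == 0x00 && buf[1] == 0xff;
}
```

Testing two bytes is a cheap presence check rather than a validation. A
full check would compare all eight header bytes and verify the block
checksum.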

Re: [PATCH v11 2/7] ACPI / OSL: Stub out acpi_os_(read/write)_pci_configurations()

2018-12-17 Thread Christoph Hellwig

On Tue, Dec 18, 2018 at 02:56:01AM +, Sinan Kaya wrote:
> Getting ready to allow CONFIG_PCI to be unset with ACPI enabled. Stub out
> acpi_os_read_pci_configuration and acpi_os_write_pci_configuration
> functions when CONFIG_PCI is not defined.
> 
> Signed-off-by: Sinan Kaya 
> ---
>  drivers/acpi/osl.c | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index b48874b8e1ea..524fd5f33ea4 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -773,6 +773,7 @@ acpi_status
>  acpi_os_read_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
>  u64 *value, u32 width)
>  {
> +#ifdef CONFIG_PCI
>   int result, size;
>   u32 value32;
>  
> @@ -799,12 +800,19 @@ acpi_os_read_pci_configuration(struct acpi_pci_id * 
> pci_id, u32 reg,
>   *value = value32;
>  
>   return (result ? AE_ERROR : AE_OK);
> +#else
> + int rc;
> +
> + rc = pr_warn_once("PCI configuration space access is not supported\n");
> + return rc ? AE_SUPPORT : AE_OK;
> +#endif

Normally we provide a full separate stub version, and if we have enough
of them, we put them in a separate file.
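
A minimal sketch of the "separate stub" pattern being suggested, written
as standalone C so it compiles outside the kernel. Every name here
(DEMO_HAVE_PCI, demo_status_t, demo_read_pci_config) is a hypothetical
stand-in and not an ACPI or PCI API.

```c
#include <stdint.h>

/* Hypothetical status codes standing in for AE_OK / AE_SUPPORT. */
typedef int demo_status_t;
#define DEMO_STATUS_OK          0
#define DEMO_STATUS_UNSUPPORTED 1

#ifdef DEMO_HAVE_PCI
/* Real implementation lives in its own source file. */
demo_status_t demo_read_pci_config(uint32_t reg, uint64_t *value);
#else
/* Stub: the whole function is replaced, so callers need no #ifdef. */
static inline demo_status_t demo_read_pci_config(uint32_t reg, uint64_t *value)
{
	(void)reg;
	(void)value;
	return DEMO_STATUS_UNSUPPORTED;
}
#endif
```

Compared with an #ifdef inside the function body, this keeps the real
function free of preprocessor branches and lets several stubs be grouped
together.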

Re: [PATCH 1/2 v3] kdump: add the vmcoreinfo documentation

2018-12-17 Thread lijiang

On 2018-12-17 21:00, Borislav Petkov wrote:
> On Sun, Dec 16, 2018 at 09:16:16PM +0800, Lianbo Jiang wrote:
>> +clear_idx
>> +=
>> +The index that the next printk record to read after the last 'clear'
>> +command. It indicates the first record after the last SYSLOG_ACTION
>> +_CLEAR, like issued by 'dmesg -c'.
> 
> What is that used for by the userspace tools?
> 

The clear_idx is used when dumping the dmesg log.

>> +
>> +log_next_idx
>> +
>> +The index of the next record to store in the buffer 'log_buf'. It helps
>> +to compute the index of current strings position.
>> +
>> +printk_log
>> +==
>> +The size of a structure 'printk_log'. It helps to compute the size of
>> +messages, and extract dmesg log.
> 
> What is the difference between that and log_buf?
> 

The printk_log is used to output human-readable text. It encapsulates header
information for log_buf, such as the timestamp and syslog level.

> 
> 
>> +
>> +(printk_log, ts_nsec|len|text_len|dict_len)
>> +===
>> +It represents these field offsets in the structure 'printk_log'. User
>> +space tools can parse it and detect any changes to structure down the
>> +line.
> 
> What does that mean? "any changes down the line"?
> 

User space tools can parse it and check whether printk_log's members
have changed.

I will improve it in patch v4.
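
To make the idea concrete, here is a sketch of how a field offset could
be published as a text line that a user space tool later parses. The
struct layout and the function name are hypothetical. Only the
"OFFSET(type.field)=value" line shape is taken from how vmcoreinfo
entries are commonly written, and that too should be read as an
assumption here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a kernel structure whose layout may change. */
struct demo_log {
	uint64_t ts_nsec;
	uint16_t len;
	uint16_t text_len;
	uint16_t dict_len;
};

/* Format one offset entry; the caller supplies offsetof() for the field. */
static int format_offset_line(char *buf, size_t size, const char *type,
			      const char *field, size_t offset)
{
	return snprintf(buf, size, "OFFSET(%s.%s)=%zu\n", type, field, offset);
}
```

A tool that reads such a line at dump time does not need the layout
compiled in, so a field that moves between kernel versions shows up as a
different number instead of silently breaking the parser.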


>> +
>> +(free_area.free_list, MIGRATE_TYPES)
>> +
>> +The number of migrate types for pages. The free_list is divided into
>> +the array, it needs to know the number of the array.
> 
> ... for?
> 

makedumpfile needs to know the size of the array when it computes the
number of free pages.

>> +
>> +NR_FREE_PAGES
>> +=
>> +On linux-2.6.21 or later, the number of free_pages is in
>> +vm_stat[NR_FREE_PAGES]. It can get the number of free pages from the
>> +array.
>> +
>> +PG_lru|PG_private|PG_swapcache|PG_swapbacked|PG_slab|
>> +PG_hwpoision|PG_head_mask
>> +=
>> +It means the attribute of a page. These flags will be used to filter
>> +the free pages.
>> +
>> +PAGE_BUDDY_MAPCOUNT_VALUE or ~PG_buddy
>> +==
>> +The 'PG_buddy' flag indicates that the page is free and in the buddy
>> +system. Makedumpfile can exclude the free pages managed by a buddy.
> 
> That text belongs with the one above?
> 

It exports the value of (~PG_buddy), so it is documented separately here.

>> +
>> +HUGETLB_PAGE_DTOR
>> +=
>> +The 'HUGETLB_PAGE_DTOR' flag indicates the hugetlbfs pages. Makedumpfile
>> +will exclude these pages.
>> +
>> +
>> +x86_64 variables
>> +
>> +
>> +phys_base
>> +=
>> +In x86_64, the 'phys_base' is necessary to convert virtual address of
>> +exported kernel symbol to physical address.
>> +
>> +init_top_pgt
>> +
>> +The 'init_top_pgt' used to walk through the whole page table and convert
>> +virtual address to physical address.
> 
> This is the same as swapper_pg_dir?
> 

These two variables are somewhat similar, but they are used in different
scenarios.

>> +
>> +pgtable_l5_enabled
>> +==
>> +User-space tools need to know whether the crash kernel was in 5-level
>> +paging mode or not.
>> +
>> +node_data
>> +=
>> +This is a struct 'pglist_data' array, it stores all numa nodes information.
>> +In general, Makedumpfile can get the pglist_data structure from symbol
>> +'node_data'.
>> +
>> +(node_data, MAX_NUMNODES)
>> +=
>> +The number of this 'node_data' array. It means the maximum number of the
>> +nodes in system.
>> +
>> +KERNELOFFSET
>> +
>> +Randomize the address of the kernel image. This is the offset of KASLR in
>> +VMCOREINFO ELF notes. It is used to compute the page offset in x86_64. If
>> +KASLE is disabled, this value is zero.
>> +
>> +KERNEL_IMAGE_SIZE
>> +=
>> +The size of 'KERNEL_IMAGE_SIZE', currently unused.
> 
> So remove?
> 

I'm not sure whether it should be removed, so I kept it.

>> +
>> +The old MODULES_VADDR need be decided by KERNEL_IMAGE_SIZE when kaslr
>> +enabled. Now MODULES_VADDR is not needed any more since Pratyush makes
>> +all VA to PA converting done by page table lookup.
> 
> Also, I did clean this up considerably - please include in your next
> version:
> 

Great, thanks for your help. I will post v4 later.

Regards,
Lianbo

> ---
> diff --git a/Documentation/kdump/vmcoreinfo.txt 
> b/Documentation/kdump/vmcoreinfo.txt
> index d71260bf383a..2ce34d952bfd 100644
> --- a/Documentation/kdump/vmcoreinfo.txt
> +++ b/Documentation/kdump/vmcoreinfo.txt
> @@ -1,18 +1,19 @@
>  
> - Documentation for VMCOREINFO
> + VMCOREINFO
>  
>  
>  ===
>  What is the V

Re: [PATCH v11 1/7] ACPI: Allow CONFIG_PCI to be unset for reboot

2018-12-17 Thread Christoph Hellwig

> +#ifdef CONFIG_PCI
> + unsigned int devfn;
> + struct pci_bus *bus0;
> +
>   /* The reset register can only live on bus 0. */
>   bus0 = pci_find_bus(0, 0);
>   if (!bus0)
> @@ -44,8 +47,9 @@ void acpi_reboot(void)
>   /* Write the value that resets us. */
>   pci_bus_write_config_byte(bus0, devfn,
>   (rr->address & 0x), reset_value);
> +#endif

This would be a lot cleaner if this was split into a little helper
function.

Re: [PATCH v4 1/2] export trace.c helper functions to other modules

2018-12-17 Thread Sagi Grimberg


Reviewed-by: Sagi Grimberg

Re: [PATCH v4 2/2] trace nvme submit queue status

2018-12-17 Thread Sagi Grimberg





@@ -899,6 +900,10 @@ static inline void nvme_handle_cqe(struct nvme_queue 
*nvmeq, u16 idx)
}
  
  	req = blk_mq_tag_to_rq(*nvmeq->tags, cqe->command_id);

+   trace_nvme_sq(req->rq_disk,
+   nvmeq->qid,
+   le16_to_cpu(cqe->sq_head),
+   nvmeq->sq_tail);


Why the newline escapes? Why not escape at the 80 char border?

Other than that, looks fine,

Reviewed-by: Sagi Grimberg

Re: [PATCH v5 2/2] media: usb: pwc: Don't use coherent DMA buffers for ISO transfer

2018-12-17 Thread Tomasz Figa

On Fri, Dec 14, 2018 at 9:36 PM Christoph Hellwig  wrote:
>
> On Fri, Dec 14, 2018 at 12:12:38PM +0900, Tomasz Figa wrote:
> > > If the buffer always is physically contiguous, as it is in the currently
> > > posted series, we can always map it with a single dma_map_single call
> > > (if the hardware can handle that in a single segment is a different
> > > question, but out of scope here).
> >
> > Are you sure the buffer is always physically contiguous? At least the
> > ARM IOMMU dma_ops [1] and the DMA-IOMMU dma_ops [2] will simply
> > allocate pages without any continuity guarantees and remap the pages
> > into a contiguous kernel VA (unless DMA_ATTR_NO_KERNEL_MAPPING is
> > given, which makes them return an opaque cookie instead of the kernel
> > VA).
> >
> > [1] 
> > http://git.infradead.org/users/hch/misc.git/blob/2dbb028e4a3017e1b71a6ae3828a3548545eba24:/arch/arm/mm/dma-mapping.c#l1291
> > [2] 
> > http://git.infradead.org/users/hch/misc.git/blob/2dbb028e4a3017e1b71a6ae3828a3548545eba24:/drivers/iommu/dma-iommu.c#l450
>
> We never end up in this allocator for the new DMA_ATTR_NON_CONSISTENT
> case, and that is intentional.

This somewhat limits the usability of the API, since it enforces
contiguous allocations even for big sizes and even for devices behind
an IOMMU (contrary to the case when DMA_ATTR_NON_CONSISTENT is not set),
but given that it's just a temporary solution for devices like these
USB cameras, I guess that's fine.

Note that in V4L2 we use the DMA API extensively, so that we don't
need to embed any device-specific or integration-specific knowledge in
the framework. Right now we're using dma_alloc_attrs() with
driver-provided attrs [1], but current drivers never request
non-consistent memory. We're however thinking about making it possible
to allocate non-consistent memory. What would you suggest for this?

[1] 
https://elixir.bootlin.com/linux/v4.20-rc7/source/drivers/media/common/videobuf2/videobuf2-dma-contig.c#L139

Best regards,
Tomasz

Re: [regression, bisected] Keyboard not responding after resuming from suspend/hibernate

2018-12-17 Thread Numan Demirdöğen

On Sun, 2 Dec 2018 23:28:09 +0100,
Pavel Machek wrote:

>On Fri 2018-11-30 15:44:55, Numan Demirdöğen wrote:
>> On Sun, 28 Oct 2018 22:06:54 +0300,
>> Numan Demirdöğen wrote:
>>   
>> >On Thu, 25 Oct 2018 09:49:03 +0200,
>> >Pavel Machek wrote:
>> >  
>> >> Hi!
>> >> 
>> >> Here's problem bisected down to:
>> >> 
>> >> commit 9d659ae14b545c4296e812c70493bfdc999b5c1c
>> >> Author: Peter Zijlstra 
>> >> Date:   Tue Aug 23 14:40:16 2016 +0200
>> >> 
>> >> locking/mutex: Add lock handoff to avoid starvation
>> >> 
>> >> Implement lock handoff to avoid lock starvation.
>> >> 
>> >> Numan, I assume revert of that patch on the 4.18 kernel still
>> >> makes it work?
>> >> 
>> >
>> >Unfortunately, I could not revert
>> >9d659ae14b545c4296e812c70493bfdc999b5c1c on kernels from 4.18.16 to
>> >4.10-rc1 because there were too much conflicts, which I could not
>> >solve as an "average Joe". I tried
>> >a3ea3d9b865c2a8f7fe455c7fa26db4b6fd066e3 which is parent of
>> >9d659ae14b545c4296e812c70493bfdc999b5c1c and succeeded to compile
>> >kernel.
>> >
>> >git checkout a3ea3d9b865c2a8f7fe455c7fa26db4b6fd066e3
>> >
>> >Then, I compiled kernel and rebooted with it. I tried a couples of
>> >times suspending and resuming, all of the time keyboard worked as
>> >expected.
>> >  
>> 
>> With this one line patch from Takashi Iwai, keyboard is working as
>> expected after resuming from suspend/hibernate.
>> 
>> --- a/kernel/locking/mutex.c
>> +++ b/kernel/locking/mutex.c
>> @@ -59,7 +59,7 @@ EXPORT_SYMBOL(__mutex_init);
>>   * Bit2 indicates handoff has been done and we're waiting for
>> pickup. */
>>  #define MUTEX_FLAG_WAITERS  0x01
>> -#define MUTEX_FLAG_HANDOFF  0x02
>> +#define MUTEX_FLAG_HANDOFF  0x00
>>  #define MUTEX_FLAG_PICKUP   0x04
>>  
>>  #define MUTEX_FLAGS 0x07
>> 
>> 
>> Thanks in advance and regards,  
>
>Ok. So it is a regression, and you can ask Linus to apply this
>.. but... that's kind of heavy solution. Peter, do you have any other
>ideas?
>
>   Pavel

Hi,

I mentioned the one-line patch from Takashi Iwai not as a fix but as a
hint. Sorry for the misunderstanding.
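
For readers wondering why that one-line change works as a hint, here is
a standalone sketch. It assumes, as the quoted comment suggests, that the
flags are kept in the low bits of the owner word. The helper names are
hypothetical and this is not the kernel's mutex code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for flags packed into the low bits of a word. */
static inline uintptr_t set_flag(uintptr_t owner, uintptr_t flag)
{
	return owner | flag;
}

static inline bool test_flag(uintptr_t owner, uintptr_t flag)
{
	return (owner & flag) != 0;
}
```

With the flag defined as 0x02, set_flag() changes the word and
test_flag() sees it. With the flag defined as 0x00, setting it is a no-op
and testing it is always false, so any code path guarded by the flag is
effectively switched off.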

Here is another hint from another user:

I found that passing the options i8042.reset=1 i8042.dumbkbd=1 i8042.direct=1
results in the keyboard functioning after resume. However, there is a
long delay before the keyboard or mouse will respond to input on the
lock screen.[1]

[1] https://bugzilla.kernel.org/show_bug.cgi?id=195471#c39

-- 
Numan Demirdöğen



Re: [PATCH v2] net/smc: fix TCP fallback socket release

2018-12-17 Thread Myungho Jung

On Mon, Dec 17, 2018 at 03:58:58PM +0100, Ursula Braun wrote:
> 

Hi Ursula,

Thank you for your suggestion. I have a question on your comment.

> 
> On 12/17/2018 06:21 AM, Myungho Jung wrote:
> > clcsock can be released while kernel_accept() references it in TCP
> > listen worker. Also, clcsock needs to wake up before released if TCP
> > fallback is used and the clcsock is blocked by accept. Add a lock to
> > safely release clcsock and call kernel_sock_shutdown() to wake up
> > clcsock from accept in smc_release().
> 
> Thanks for your effort to solve this problem. I have some minor
> improvement proposals:
> 
> > 
> > Reported-by: syzbot+0bf2e01269f1274b4...@syzkaller.appspotmail.com
> > Reported-by: syzbot+e3132895630f95730...@syzkaller.appspotmail.com
> > Signed-off-by: Myungho Jung 
> > ---
> >  net/smc/af_smc.c | 14 --
> >  net/smc/smc.h|  2 ++
> >  2 files changed, 14 insertions(+), 2 deletions(-)
> > 
> > diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> > index 5fbaf1901571..5d06fb1bbccf 100644
> > --- a/net/smc/af_smc.c
> > +++ b/net/smc/af_smc.c
> > @@ -147,8 +147,14 @@ static int smc_release(struct socket *sock)
> > sk->sk_shutdown |= SHUTDOWN_MASK;
> > }
> > if (smc->clcsock) {
> > +   if (smc->use_fallback && sk->sk_state == SMC_LISTEN) {
> > +   /* wake up clcsock accept */
> > +   rc = kernel_sock_shutdown(smc->clcsock, SHUT_RDWR);
> > +   }
> 
> This part is not needed, since an SMC socket in state SMC_LISTEN is never
> a use_fallback socket.

smc_sendmsg() sets use_fallback to true if the SMC socket is in the
SMC_INIT state and the message has the MSG_FASTOPEN flag. A later
smc_listen() would then trigger smc_tcp_listen_work() on a fallback
socket. Is this not an expected scenario? If it is not, what is the
reason for not skipping smc_sendmsg() in the SMC_INIT state?

> 
> > +   mutex_lock(&smc->clcsock_release_lock);
> > sock_release(smc->clcsock);
> > smc->clcsock = NULL;
> > +   mutex_unlock(&smc->clcsock_release_lock);
> > }
> > if (smc->use_fallback) {
> > if (sk->sk_state != SMC_LISTEN && sk->sk_state != SMC_INIT)
> > @@ -205,6 +211,7 @@ static struct sock *smc_sock_alloc(struct net *net, 
> > struct socket *sock,
> > spin_lock_init(&smc->conn.send_lock);
> > sk->sk_prot->hash(sk);
> > sk_refcnt_debug_inc(sk);
> > +   mutex_init(&smc->clcsock_release_lock);
> >  
> > return sk;
> >  }
> > @@ -821,7 +828,7 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, 
> > struct smc_sock **new_smc)
> > struct socket *new_clcsock = NULL;
> > struct sock *lsk = &lsmc->sk;
> > struct sock *new_sk;
> > -   int rc;
> > +   int rc = 0;
> 
> Without clcsock the good path should not be executed. Thus I suggest
> to initialize with something negative like -EINVAL.
> 
> >  
> > release_sock(lsk);
> > new_sk = smc_sock_alloc(sock_net(lsk), NULL, lsk->sk_protocol);
> > @@ -834,7 +841,10 @@ static int smc_clcsock_accept(struct smc_sock *lsmc, 
> > struct smc_sock **new_smc)
> > }
> > *new_smc = smc_sk(new_sk);
> >  
> > -   rc = kernel_accept(lsmc->clcsock, &new_clcsock, 0);
> > +   mutex_lock(&lsmc->clcsock_release_lock);
> > +   if (lsmc->clcsock)
> > +   rc = kernel_accept(lsmc->clcsock, &new_clcsock, 0);
> > +   mutex_unlock(&lsmc->clcsock_release_lock);
> > lock_sock(lsk);
> > if  (rc < 0)
> > lsk->sk_err = -rc;
> > diff --git a/net/smc/smc.h b/net/smc/smc.h
> > index 08786ace6010..9a2795cf5d30 100644
> > --- a/net/smc/smc.h
> > +++ b/net/smc/smc.h
> > @@ -219,6 +219,8 @@ struct smc_sock {   /* smc 
> > sock container */
> >  * started, waiting for unsent
> >  * data to be sent
> >  */
> > +   struct mutexclcsock_release_lock;
> > +   /* protects clcsock */
> 
> I suggest to be more precise: "protects clcsock of a listen socket" 
> 
> >  };
> >  
> >  static inline struct smc_sock *smc_sk(const struct sock *sk)
> > 
>
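
The locking scheme discussed in this thread can be sketched in userspace C
as follows. This is only an illustration under assumptions: a pthread
mutex stands in for the kernel mutex, and demo_sock, demo_listener and
demo_do_accept() are hypothetical stand-ins, not SMC code. It shows the
accept path checking the socket pointer under the lock and, following the
review suggestion, starting from -EINVAL so that a missing socket is
reported as an error.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the internal TCP socket (clcsock). */
struct demo_sock {
	int id;
};

struct demo_listener {
	pthread_mutex_t release_lock;   /* protects clcsock of a listen socket */
	struct demo_sock *clcsock;
};

/* Hypothetical accept: always succeeds on an existing socket. */
static int demo_do_accept(struct demo_sock *sock)
{
	(void)sock;
	return 0;
}

/* Accept path: only touch clcsock while holding the lock. */
int demo_accept(struct demo_listener *l)
{
	int rc = -EINVAL;       /* without clcsock the good path must not run */

	assert(l != NULL);
	pthread_mutex_lock(&l->release_lock);
	if (l->clcsock)
		rc = demo_do_accept(l->clcsock);
	pthread_mutex_unlock(&l->release_lock);
	return rc;
}

/* Release path: clear the pointer under the same lock. */
void demo_release(struct demo_listener *l)
{
	assert(l != NULL);
	pthread_mutex_lock(&l->release_lock);
	l->clcsock = NULL;
	pthread_mutex_unlock(&l->release_lock);
}
```

Because the lock is held across the accept call, a release has to wait for
a blocked accept to return, which is presumably why the patch description
also wakes the socket up before releasing it.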

[PATCH -next] regulator: act8945a-regulator: make symbol act8945a_pm static

2018-12-17 Thread Wei Yongjun

Fixes the following sparse warning:

drivers/regulator/act8945a-regulator.c:340:1: warning:
 symbol 'act8945a_pm' was not declared. Should it be static?

Fixes: 7482d6ecc68e ("regulator: act8945a-regulator: Implement PM 
functionalities")
Signed-off-by: Wei Yongjun 
---
 drivers/regulator/act8945a-regulator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/regulator/act8945a-regulator.c 
b/drivers/regulator/act8945a-regulator.c
index 90572b6..dc3942b 100644
--- a/drivers/regulator/act8945a-regulator.c
+++ b/drivers/regulator/act8945a-regulator.c
@@ -337,7 +337,7 @@ static int act8945a_suspend(struct device *pdev)
return regmap_write(act8945a->regmap, ACT8945A_SYS_CTRL, 0x42);
 }
 
-SIMPLE_DEV_PM_OPS(act8945a_pm, act8945a_suspend, NULL);
+static SIMPLE_DEV_PM_OPS(act8945a_pm, act8945a_suspend, NULL);
 
 static void act8945a_pmic_shutdown(struct platform_device *pdev)
 {

RE: [PATCH] arm64: dts: nxp: ls208xa: add more thermal zone support

2018-12-17 Thread Andy Tang

Hi,

PING.

BR,
Andy

> -Original Message-
> From: Yuantian Tang 
> Sent: October 31, 2018 12:48
> To: shawn...@kernel.org
> Cc: Leo Li ; robh...@kernel.org; mark.rutl...@arm.com;
> linux-arm-ker...@lists.infradead.org; devicet...@vger.kernel.org;
> linux-kernel@vger.kernel.org; rui.zh...@intel.com; daniel.lezc...@linaro.org;
> Andy Tang 
> Subject: [PATCH] arm64: dts: nxp: ls208xa: add more thermal zone support
> 
> LS208xA has several thermal sensors. Add all the sensor IDs to the dts to
> enable them.
> 
> To make the dts cleaner, re-organize the nodes to split out the common part so
> that it can be shared with other SoCs.
> 
> Signed-off-by: Yuantian Tang 
> ---
>  arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi  |8 +-
>  arch/arm64/boot/dts/freescale/fsl-ls2088a.dtsi  |8 +-
>  arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi  |   83 +++-
>  arch/arm64/boot/dts/freescale/fsl-tmu-map1.dtsi |   99 +
>  arch/arm64/boot/dts/freescale/fsl-tmu-map2.dtsi |   99 +
>  arch/arm64/boot/dts/freescale/fsl-tmu-map3.dtsi |   99 +
>  arch/arm64/boot/dts/freescale/fsl-tmu.dtsi  |  251 +++
>  7 files changed, 591 insertions(+), 56 deletions(-)
>  create mode 100644 arch/arm64/boot/dts/freescale/fsl-tmu-map1.dtsi
>  create mode 100644 arch/arm64/boot/dts/freescale/fsl-tmu-map2.dtsi
>  create mode 100644 arch/arm64/boot/dts/freescale/fsl-tmu-map3.dtsi
>  create mode 100644 arch/arm64/boot/dts/freescale/fsl-tmu.dtsi
> 
> diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
> b/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
> index f9c1d30..8f9788c 100644
> --- a/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
> +++ b/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
> @@ -12,7 +12,7 @@
>  #include "fsl-ls208xa.dtsi"
> 
>  &cpu {
> - cpu0: cpu@0 {
> + cooling_map0: cpu0: cpu@0 {
>   device_type = "cpu";
>   compatible = "arm,cortex-a57";
>   reg = <0x0>;
> @@ -32,7 +32,7 @@
>   #cooling-cells = <2>;
>   };
> 
> - cpu2: cpu@100 {
> + cooling_map1: cpu2: cpu@100 {
>   device_type = "cpu";
>   compatible = "arm,cortex-a57";
>   reg = <0x100>;
> @@ -52,7 +52,7 @@
>   #cooling-cells = <2>;
>   };
> 
> - cpu4: cpu@200 {
> + cooling_map2: cpu4: cpu@200 {
>   device_type = "cpu";
>   compatible = "arm,cortex-a57";
>   reg = <0x200>;
> @@ -72,7 +72,7 @@
>   #cooling-cells = <2>;
>   };
> 
> - cpu6: cpu@300 {
> + cooling_map3: cpu6: cpu@300 {
>   device_type = "cpu";
>   compatible = "arm,cortex-a57";
>   reg = <0x300>;
> diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2088a.dtsi
> b/arch/arm64/boot/dts/freescale/fsl-ls2088a.dtsi
> index 7c882da..013fe16 100644
> --- a/arch/arm64/boot/dts/freescale/fsl-ls2088a.dtsi
> +++ b/arch/arm64/boot/dts/freescale/fsl-ls2088a.dtsi
> @@ -12,7 +12,7 @@
>  #include "fsl-ls208xa.dtsi"
> 
>  &cpu {
> - cpu0: cpu@0 {
> + cooling_map0: cpu0: cpu@0 {
>   device_type = "cpu";
>   compatible = "arm,cortex-a72";
>   reg = <0x0>;
> @@ -32,7 +32,7 @@
>   #cooling-cells = <2>;
>   };
> 
> - cpu2: cpu@100 {
> + cooling_map1: cpu2: cpu@100 {
>   device_type = "cpu";
>   compatible = "arm,cortex-a72";
>   reg = <0x100>;
> @@ -52,7 +52,7 @@
>   #cooling-cells = <2>;
>   };
> 
> - cpu4: cpu@200 {
> + cooling_map2: cpu4: cpu@200 {
>   device_type = "cpu";
>   compatible = "arm,cortex-a72";
>   reg = <0x200>;
> @@ -72,7 +72,7 @@
>   #cooling-cells = <2>;
>   };
> 
> - cpu6: cpu@300 {
> + cooling_map3: cpu6: cpu@300 {
>   device_type = "cpu";
>   compatible = "arm,cortex-a72";
>   reg = <0x300>;
> diff --git a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
> b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
> index 8cb78dd..4102317 100644
> --- a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
> +++ b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
> @@ -75,54 +75,7 @@
>   mask = <0x2>;
>   };
> 
> - thermal-zones {
> - cpu_thermal: cpu-thermal {
> - polling-delay-passive = <1000>;
> - polling-delay = <5000>;
> -
> - thermal-sensors = <&tmu 4>;
> -
> - trips {
> - cpu_alert: cpu-alert {
> - temperature = <75000>;
> - hysteresis = <2000>;
> - type = "passive";
> - };
> - cpu_crit: cpu-crit {
> - temperature = <85000>;
> - hysteresis = <2000>;
> -

Re: [RFC PATCH net v3] net: phy: Fix the issue that netif always links up after resuming

2018-12-17 Thread Kunihiko Hayashi

Hi Heiner,

On Tue, 18 Dec 2018 07:44:33 +0100  wrote:

> On 18.12.2018 07:25, Kunihiko Hayashi wrote:
> > Hi Heiner,
> > 
> > On Mon, 17 Dec 2018 19:43:31 +0100  wrote:
> > 
> >> On 17.12.2018 19:41, Heiner Kallweit wrote:
> >>> On 17.12.2018 07:41, Kunihiko Hayashi wrote:
>  Hi,
> 
>  Gentle ping...
>  Are there any comments about changes since v2?
> 
>  v2: https://www.spinics.net/lists/netdev/msg536926.html
> 
>  Thank you,
> 
>  On Mon, 3 Dec 2018 17:22:29 +0900  wrote:
> 
> > Even though the link is down before entering hibernation,
> > there is an issue that the network interface always links up after 
> > resuming
> > from hibernation.
> >
> > The phydev->state is PHY_READY before enabling the network interface, so
> > the link is down. After resuming from hibernation, the phydev->state is
> > forcibly set to PHY_UP in mdio_bus_phy_restore(), and the link becomes 
> > up.
> >
> > This patch adds a new convenient function to check whether the PHY is in
> > a started state, and expects to solve the issue by changing 
> > phydev->state
> > to PHY_UP and calling phy_start_machine() only when the PHY is started.
> >
> >>> This convenience function and the related change to phy_stop() are part of
> >>> the following already and don't need to be part of your patch.
> >>> https://patchwork.ozlabs.org/patch/1014171/
> > 
> > I see. I'll follow your patch when necessary.
> > 
> > Suggested-by: Heiner Kallweit 
> > Signed-off-by: Kunihiko Hayashi 
> > ---
> >
> > Changes since v2:
> >  - add mutex lock/unlock for changing phydev->state
> >  - check whether the mutex is locked in phy_is_started()
> >  
> > Changes since v1:
> >  - introduce a new helper function phy_is_started() and use it instead 
> > of
> >checking link status
> >  - replace checking phydev->state with phy_is_started() in
> >phy_stop_machine()
> >
> > >  drivers/net/phy/phy.c        |  2 +-
> > >  drivers/net/phy/phy_device.c | 12 +++++++++---
> > >  include/linux/phy.h          | 13 +++++++++++++
> >  3 files changed, 23 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> > index 1d73ac3..f484d03 100644
> > --- a/drivers/net/phy/phy.c
> > +++ b/drivers/net/phy/phy.c
> > @@ -670,7 +670,7 @@ void phy_stop_machine(struct phy_device *phydev)
> > cancel_delayed_work_sync(&phydev->state_queue);
> >  
> > mutex_lock(&phydev->lock);
> > -   if (phydev->state > PHY_UP && phydev->state != PHY_HALTED)
> > +   if (phy_is_started(phydev))
> > phydev->state = PHY_UP;
> >>>
> >>> I'm wondering whether we need to do this. If the PHY is attached,
> >>> then mdio_bus_phy_suspend() calls phy_stop_machine() which does
> >>> exactly the same. If the PHY is not attached, then we don't have
> >>> to do anything. Therefore I think we just have to do the same as
> >>> in mdio_bus_phy_resume():
> >>>
> >>> if (phydev->attached_dev && phydev->adjust_link)
> >>>   phy_start_machine(phydev);
> > 
> > Agreed.
> > 
> > Although the original code changed phydev->state,
> > it seems that it's only enough to
> > - call phy_stop_machine() in mdio_bus_phy_suspend()
> > - call phy_start_machine() in mdio_bus_phy_resume() and 
> > mdio_bus_phy_restore()
> > if the PHY is attached.
> > 
> >>> Can you test this?
> > 
> > I tested your code instead of applying my entire patch, and I confirmed
> > that the code solved the issue in my environment.
> > 
> > Do you make new patch instead of my patch?
> > (and I can add Reported-by: for the issue and Tested-by:)
> > 
> Up to you. It's fine with me if you submit the patch, but I can also do it
> and mention you in Reported-by and Tested-by. Just let me know.

I see. I'll make and submit the patch as a fix for the issue.
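
As a rough model of the approach agreed on in this thread: suspend stops
the state machine for an attached PHY, and resume/restore restart it only
if the PHY is attached and has an adjust_link callback. All names below
(demo_phy and the demo_* functions) are hypothetical. The real functions
are phy_stop_machine() and phy_start_machine(), and their internals are
not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_phy {
	void *attached_dev;              /* non-NULL once a net device is attached */
	void (*adjust_link)(void *dev);  /* link-change callback, may be NULL */
	bool machine_running;            /* models whether the state machine runs */
};

/* Same condition as the one quoted from mdio_bus_phy_resume(). */
static bool demo_phy_in_use(const struct demo_phy *phy)
{
	return phy->attached_dev != NULL && phy->adjust_link != NULL;
}

/* Suspend: stop the state machine only for an attached PHY. */
void demo_suspend(struct demo_phy *phy)
{
	assert(phy != NULL);
	if (phy->attached_dev)
		phy->machine_running = false;
}

/* Resume/restore: restart the state machine only for a PHY in use. */
void demo_resume(struct demo_phy *phy)
{
	assert(phy != NULL);
	if (demo_phy_in_use(phy))
		phy->machine_running = true;
}

/* No-op callback, used only to mark a PHY as having adjust_link. */
void demo_adjust_link(void *dev)
{
	(void)dev;
}
```

In this model a PHY that was never attached is left untouched across a
suspend/resume cycle, so its link state is not forced up.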

Thank you,

> 
> > Thank you,
> > 
> > 
> >>>
> >> Sorry for the confusion, this comment is related to the next part
> >> of your patch.
> >>
> > mutex_unlock(&phydev->lock);
> >  }
> > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> > index ab33d17..4897d24 100644
> > --- a/drivers/net/phy/phy_device.c
> > +++ b/drivers/net/phy/phy_device.c
> > @@ -309,10 +309,16 @@ static int mdio_bus_phy_restore(struct device 
> > *dev)
> > return ret;
> >  
> > /* The PHY needs to renegotiate. */
> > -   phydev->link = 0;
> > -   phydev->state = PHY_UP;
> > +   mutex_lock(&phydev->lock);
> > +   if (phy_is_started(phydev)) {
> > +   phydev->state = PHY_UP;
> > +   mutex_unlock(&phydev->lock);
> > +   phydev->link = 0;
> > +   phy_start_machine(phydev);
> > +   } else {
> > +   mutex_unlock(&phydev->lock)

Re: [PATCH v2 5/5] ASoC: qcom: Kconfig: select config for codec

2018-12-17 Thread Cheng-yi Chiang

On Wed, Nov 28, 2018 at 5:01 PM Cheng-Yi Chiang  wrote:
>
> Select SND_SOC_RT5663 and SND_SOC_MAX98927 for SND_SOC_SDM845.
>
> Signed-off-by: Rohit kumar 
> Signed-off-by: Cheng-Yi Chiang 
> ---
>  sound/soc/qcom/Kconfig | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/sound/soc/qcom/Kconfig b/sound/soc/qcom/Kconfig
> index 2a4c912d1e484..3528c4279cbae 100644
> --- a/sound/soc/qcom/Kconfig
> +++ b/sound/soc/qcom/Kconfig
> @@ -100,6 +100,8 @@ config SND_SOC_SDM845
> depends on QCOM_APR
> select SND_SOC_QDSP6
> select SND_SOC_QCOM_COMMON
> +   select SND_SOC_RT5663
> +   select SND_SOC_MAX98927
This line is actually not needed.

We can drop this patch as there was another patch merged already:
https://lkml.org/lkml/2018/12/10/875.

Thanks a lot for taking a look!


> help
>   To add support for audio on Qualcomm Technologies Inc.
>   SDM845 SoC-based systems.
> --
> 2.20.0.rc0.387.gc7a69e6b6c-goog
>

[RFC PATCH 2/2] mm: swap: add comment for swap_vma_readahead

2018-12-17 Thread Yang Shi

swap_vma_readahead() has no kernel-doc comment, so add one.

Cc: Huang Ying 
Cc: Tim Chen 
Cc: Minchan Kim 
Signed-off-by: Yang Shi 
---
 mm/swap_state.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index 7cc3c29..c12aedf 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -695,6 +695,23 @@ static void swap_ra_info(struct vm_fault *vmf,
pte_unmap(orig_pte);
 }
 
+/**
+ * swap_vma_readahead - swap in pages in hope we need them soon
+ * @fentry: swap entry of this memory
+ * @gfp_mask: memory allocation flags
+ * @vmf: fault information
+ *
+ * Returns the struct page for fentry and addr, after queueing swapin.
+ *
+ * Primitive swap readahead code. We simply read in a few pages whose
+ * virtual addresses are around the fault address in the same vma.
+ *
+ * This has been extended to use the NUMA policies from the mm triggering
+ * the readahead.
+ *
+ * Caller must hold down_read on the vma->vm_mm if vmf->vma is not NULL.
+ *
+ */
 static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
   struct vm_fault *vmf)
 {
-- 
1.8.3.1

[PATCH] powerpc/setup: display reason for not booting

2018-12-17 Thread Christophe Leroy

When no machine description matches, display it clearly
before looping forever.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/setup-common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index 4fe7740917a7..ef7fb60534a8 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -634,7 +634,7 @@ void probe_machine(void)
}
/* What can we do if we didn't find ? */
if (machine_id >= &__machine_desc_end) {
-   DBG("No suitable machine found !\n");
+   pr_err("No suitable machine description found !\n");
for (;;);
}
 
-- 
2.13.3

[RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-17 Thread Yang Shi

Swap readahead reads in a few pages regardless of whether the underlying
device is busy.  This may incur a long wait if the device is congested,
and it may also exacerbate the congestion.

Use inode_read_congested() to check whether the underlying device is
busy, as file page readahead does.  Get the inode from swap_info_struct.
Although we could add inode information to swap_address_space
(address_space->host), that may lead to unexpected side effects, for
example it may break mapping_cap_account_dirty().  Using the inode from
swap_info_struct seems simple and good enough.

Do the check only in swap_cluster_readahead(), since
swap_vma_readahead() is used only for non-rotational devices, which
are much less likely to be congested than traditional HDDs.

Although swap slots may be consecutive on a swap partition, they may
still be fragmented on a swap file. This check helps to reduce excessive
stalls in such a case.

Cc: Huang Ying 
Cc: Tim Chen 
Cc: Minchan Kim 
Signed-off-by: Yang Shi 
---
 mm/swap_state.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index fd2f21e..7cc3c29 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -538,11 +538,15 @@ struct page *swap_cluster_readahead(swp_entry_t entry, 
gfp_t gfp_mask,
bool do_poll = true, page_allocated;
struct vm_area_struct *vma = vmf->vma;
unsigned long addr = vmf->address;
+   struct inode *inode = si->swap_file->f_mapping->host;
 
mask = swapin_nr_pages(offset) - 1;
if (!mask)
goto skip;
 
+   if (inode_read_congested(inode))
+   goto skip;
+
do_poll = false;
/* Read a page_cluster sized and aligned cluster around offset. */
start_offset = offset & ~mask;
-- 
1.8.3.1
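
For illustration, the readahead window logic touched by this patch can be
modelled in plain C. This is a sketch under assumptions, not the kernel
function: demo_ra_window and demo_cluster_window() are hypothetical names,
the congested flag stands in for the result of inode_read_congested(), and
the end of the window is assumed to be offset | mask. Special cases such
as the swap header page and the end of the swap area are left out.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_ra_window {
	unsigned long start;   /* first offset to read */
	unsigned long end;     /* last offset to read (inclusive) */
	bool readahead;        /* false when only the faulting page is read */
};

struct demo_ra_window demo_cluster_window(unsigned long offset,
					  unsigned long nr_pages,
					  bool congested)
{
	struct demo_ra_window w = { offset, offset, false };
	unsigned long mask;

	/* The window size must be a non-zero power of two. */
	assert(nr_pages != 0 && (nr_pages & (nr_pages - 1)) == 0);

	mask = nr_pages - 1;
	if (!mask)              /* window of one page: nothing to read ahead */
		return w;
	if (congested)          /* device busy: skip readahead entirely */
		return w;

	/* Read an aligned, nr_pages-sized cluster around offset. */
	w.start = offset & ~mask;
	w.end = offset | mask;
	w.readahead = true;
	return w;
}
```

The congestion check sits before the window is computed, so a busy device
costs one page read instead of a whole cluster.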

Re: [PATCH v2 3/7] dt-bindings: remoteproc: qcom: Fixup regulator dependencies

2018-12-17 Thread Sibi Sankar


Hi Doug,
Thanks for the review :)

On 2018-12-18 05:30, Doug Anderson wrote:

Hi,

On Mon, Dec 17, 2018 at 2:08 AM Sibi Sankar  
wrote:


Fixup regulator supply dependencies for Q6V5 MSS on MSM8996 SoCs.

Signed-off-by: Sibi Sankar 
---
 Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt
index 780adc043b37..98894e6ad456 100644
--- a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt
+++ b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt
@@ -76,7 +76,9 @@ on the Qualcomm Hexagon core.
Usage: required
Value type: <phandle>
Definition: reference to the regulators to be held on behalf of the
-   booting of the Hexagon core
+   booting of the Hexagon core on MSM8916 SoCs
+   reference to the pll-supply regulator to be held on behalf
+   of the booting of the Hexagon core on MSM8996 SoCs


The prose gets in the way and doesn't add anything.  I also don't
understand what you're saying for msm8996.  You're saying that
"pll-supply" is required there but none of the others?  That doesn't
seem to be true in the code I have in front of me, but maybe I'm
missing some patch.  For me, I'd write:



AFAIK, only the exceptions are captured. But your
suggestion seems simpler and more complete. Perhaps I'll
list SoCs instead of compatible strings? Anyway,
I'll wait for Bjorn's/Rob's preference.


For the compatible strings below the following supplies are required:
  "qcom,q6v5-pil"
  "qcom,msm8916-mss-pil",
  "qcom,msm8974-mss-pil"
- cx-supply:
- mss-supply:
- mx-supply:
- pll-supply:
Usage: required
Value type: <phandle>
Definition: reference to the regulators to be held on behalf of the
booting of the Hexagon core


...and if msm8996 actually needs "pll-supply", you could add in...

For the compatible strings below the following supplies are required:
  "qcom,msm8996-mss-pil"
- pll-supply:
Usage: required
Value type: <phandle>
Definition: reference to the regulators to be held on behalf of the
booting of the Hexagon core


--
-- Sibi Sankar --
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.

Re: [PATCH][resend] drm: dw-hdmi-i2s: convert to SPDX identifiers

2018-12-17 Thread Laurent Pinchart

Hi Morimoto-san,

Thank you for the patch.

On Tuesday, 18 December 2018 08:00:24 EET Kuninori Morimoto wrote:
> From: Kuninori Morimoto 
> 
> This patch updates license to use SPDX-License-Identifier
> instead of verbose license text.
> 
> Signed-off-by: Kuninori Morimoto 

Reviewed-by: Laurent Pinchart 

> ---
> A few weeks have passed and nothing has happened, so I am re-posting this patch.
> I added Andrew on Cc

The driver seems to be lacking a maintainer :-S

>  drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> index 8f9c8a6..2228689 100644
> --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
> @@ -1,12 +1,9 @@
> +// SPDX-License-Identifier: GPL-2.0
>  /*
>   * dw-hdmi-i2s-audio.c
>   *
>   * Copyright (c) 2017 Renesas Solutions Corp.
>   * Kuninori Morimoto 
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License version 2 as
> - * published by the Free Software Foundation.
>   */
>  #include 


-- 
Regards,

Laurent Pinchart

Re: [RFC PATCH net v3] net: phy: Fix the issue that netif always links up after resuming

2018-12-17 Thread Heiner Kallweit

On 18.12.2018 07:25, Kunihiko Hayashi wrote:
> Hi Heiner,
> 
> On Mon, 17 Dec 2018 19:43:31 +0100  wrote:
> 
>> On 17.12.2018 19:41, Heiner Kallweit wrote:
>>> On 17.12.2018 07:41, Kunihiko Hayashi wrote:
 Hi,

 Gentle ping...
 Are there any comments about changes since v2?

 v2: https://www.spinics.net/lists/netdev/msg536926.html

 Thank you,

 On Mon, 3 Dec 2018 17:22:29 +0900  wrote:

> Even though the link is down before entering hibernation,
> there is an issue that the network interface always links up after 
> resuming
> from hibernation.
>
> The phydev->state is PHY_READY before enabling the network interface, so
> the link is down. After resuming from hibernation, the phydev->state is
> forcibly set to PHY_UP in mdio_bus_phy_restore(), and the link becomes up.
>
> This patch adds a new convenient function to check whether the PHY is in
> a started state, and expects to solve the issue by changing phydev->state
> to PHY_UP and calling phy_start_machine() only when the PHY is started.
>
>>> This convenience function and the related change to phy_stop() are part of
>>> the following already and don't need to be part of your patch.
>>> https://patchwork.ozlabs.org/patch/1014171/
> 
> I see. I'll follow your patch when necessary.
> 
> Suggested-by: Heiner Kallweit 
> Signed-off-by: Kunihiko Hayashi 
> ---
>
> Changes since v2:
>  - add mutex lock/unlock for changing phydev->state
>  - check whether the mutex is locked in phy_is_started()
>  
> Changes since v1:
>  - introduce a new helper function phy_is_started() and use it instead of
>checking link status
>  - replace checking phydev->state with phy_is_started() in
>phy_stop_machine()
>
>  drivers/net/phy/phy.c        |  2 +-
>  drivers/net/phy/phy_device.c | 12 +++++++++---
>  include/linux/phy.h          | 13 +++++++++++++
>  3 files changed, 23 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 1d73ac3..f484d03 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -670,7 +670,7 @@ void phy_stop_machine(struct phy_device *phydev)
>   cancel_delayed_work_sync(&phydev->state_queue);
>  
>   mutex_lock(&phydev->lock);
> - if (phydev->state > PHY_UP && phydev->state != PHY_HALTED)
> + if (phy_is_started(phydev))
>   phydev->state = PHY_UP;
>>>
>>> I'm wondering whether we need to do this. If the PHY is attached,
>>> then mdio_bus_phy_suspend() calls phy_stop_machine() which does
>>> exactly the same. If the PHY is not attached, then we don't have
>>> to do anything. Therefore I think we just have to do the same as
>>> in mdio_bus_phy_resume():
>>>
>>> if (phydev->attached_dev && phydev->adjust_link)
>>> phy_start_machine(phydev);
> 
> Agreed.
> 
> Although the original code changed phydev->state,
> it seems that it's only enough to
> - call phy_stop_machine() in mdio_bus_phy_suspend()
> - call phy_start_machine() in mdio_bus_phy_resume() and mdio_bus_phy_restore()
> if the PHY is attached.
> 
>>> Can you test this?
> 
> I tested your code instead of applying my entire patch, and I confirmed
> that the code solved the issue in my environment.
> 
> Do you make new patch instead of my patch?
> (and I can add Reported-by: for the issue and Tested-by:)
> 
Up to you. It's fine with me if you submit the patch, but I can also do it
and mention you in Reported-by and Tested-by. Just let me know.

> Thank you,
> 
> 
>>>
>> Sorry for the confusion, this comment is related to the next part
>> of your patch.
>>
>   mutex_unlock(&phydev->lock);
>  }
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index ab33d17..4897d24 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -309,10 +309,16 @@ static int mdio_bus_phy_restore(struct device *dev)
>   return ret;
>  
>   /* The PHY needs to renegotiate. */
> - phydev->link = 0;
> - phydev->state = PHY_UP;
> + mutex_lock(&phydev->lock);
> + if (phy_is_started(phydev)) {
> + phydev->state = PHY_UP;
> + mutex_unlock(&phydev->lock);
> + phydev->link = 0;
> + phy_start_machine(phydev);
> + } else {
> + mutex_unlock(&phydev->lock);
> + }
>  
> - phy_start_machine(phydev);
>  
>   return 0;
>  }
> diff --git a/include/linux/phy.h b/include/linux/phy.h
> index 3ea87f7..dd21537 100644
> --- a/include/linux/phy.h
> +++ b/include/linux/phy.h
> @@ -898,6 +898,19 @@ static inline bool phy_is_pseudo_fixed_link(struct 
> phy_device *phydev)
>  }
>  
>  /**
> + * phy_is_started - Convenience function for testing whether a PHY is in
> + * a started state
> + * @phydev: the phy_device s

Re: [PATCH] pinctrl: xway: fix gpio-hog related boot issues

2018-12-17 Thread Martin Schiller


On 2018-12-17 17:45, John Crispin wrote:

On 17/12/2018 15:32, Linus Walleij wrote:

On Fri, Dec 14, 2018 at 8:48 AM Martin Schiller  wrote:

This patch is based on commit a86caa9ba5d7 ("pinctrl: msm: fix gpio-hog
related boot issues").

It fixes the issue that the gpio ranges need to be defined before
gpiochip_add().

Therefore, we also have to swap the order of registering the pinctrl
driver and registering the gpio chip.

You also have to add the "gpio-ranges" property to the pinctrl device
node to get it finally working.

Signed-off-by: Martin Schiller 

Patch applied unless John Crispin has objections, it looks
good to me!

Yours,
Linus Walleij



sorry did not see the patch in my inbox



Sorry, that was my fault.
I've added everyone from the get_maintainer.pl output, but forgot to also
add you.


Regards,
Martin

Re: [PATCH v2 6/7] arm64: dts: qcom: sdm845: Add power-domain for Q6V5 MSS node

2018-12-17 Thread Sibi Sankar


Hi Doug,
Thanks for the review :)

On 2018-12-18 05:32, Doug Anderson wrote:

Hi,

On Mon, Dec 17, 2018 at 2:08 AM Sibi Sankar  
wrote:


Add power-domains cx, mx, mss and load_state for Q6V5 MSS node.

Signed-off-by: Sibi Sankar 
---

This patch depends on the following bindings:
https://patchwork.kernel.org/patch/10725801/ - rpmhpd dt bindings
https://patchwork.kernel.org/patch/10725793/ - rpmhpd dt node
https://patchwork.kernel.org/patch/10678301/ - AOP QMP dt bindings

 arch/arm64/boot/dts/qcom/sdm845.dtsi | 6 ++++++
 1 file changed, 6 insertions(+)


As per my comments on patch #5, I think this patch (AKA patch #6)
should be folded in there.



okay



diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 33ff8668828f..56f5f55db9e2 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -1401,6 +1401,12 @@
qcom,halt-regs = <&tcsr_mutex_regs 0x23000 0x25000 0x24000>;

+   power-domains = <&aoss_qmp_pd AOSS_QMP_LS_MODEM>,
+   <&rpmhpd SDM845_CX>,
+   <&rpmhpd SDM845_MX>,
+   <&rpmhpd SDM845_MSS>;
+   power-domain-names = "load_state", "cx", "mx", "mss";


I guess you changed this to "load_state" from "aop" before?  Is there
code that actually uses this?



Bjorn said he will be posting the patch
for handling power-domains for mss..



-Doug


--
-- Sibi Sankar --
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.

Re: [PATCH v2 5/7] arm64: dts: qcom: sdm845: Add Q6V5 MSS node

2018-12-17 Thread Sibi Sankar


Hi Doug,
Thanks for the review :)

On 2018-12-18 05:32, Doug Anderson wrote:

Hi,

On Mon, Dec 17, 2018 at 2:08 AM Sibi Sankar  
wrote:


This patch adds Q6V5 MSS remoteproc node for SDM845 SoCs.

Signed-off-by: Sibi Sankar 
---

v2:
  * Fixed style changes
  * Added missing clocks in the dt-bindings
  * Split mss remoteproc node into a number of patches


I know there was some off-list suggestion to split this into a number
of patches, but to actually make that useful to anyone we'd actually
need to _also_ post up patches to make the driver probe / work without
these power domains.  ...and as per other discussions it's kinda
"lucky" that it happens to work without them and Bjorn wasn't
supportive of making this optional.

So I'd actually fold patch 6 into patch 5 and focus on getting the
"aoss_qmp_pd" landed sooner rather than later.



I'll fold them in v3



Keeping the "shutdown-ack" as a separate patch makes sense though
since the bindings currently list that as "optional" and I guess
things work OK w/out it.


Once patch #6 is folded into patch #5 feel free to add my Reviewed-by 
tag.


okay

--
-- Sibi Sankar --
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.

Re: [PATCH v6 4/6] mm: Shuffle initial free memory to improve memory-side-cache utilization

2018-12-17 Thread kbuild test robot

Hi Dan,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.20-rc7 next-20181217]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Dan-Williams/mm-Randomize-free-memory/20181218-130230
config: x86_64-randconfig-x010-201850 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

   mm/memblock.c: In function 'memblock_set_sidecache':
>> mm/memblock.c:859:4: error: too many arguments to function 
>> 'page_alloc_shuffle'
   page_alloc_shuffle(SHUFFLE_ENABLE);
   ^~
   In file included from mm/memblock.c:20:0:
   include/linux/shuffle.h:43:20: note: declared here
static inline void page_alloc_shuffle(void)
   ^~

vim +/page_alloc_shuffle +859 mm/memblock.c

   825  
   826  #ifdef CONFIG_HAVE_MEMBLOCK_CACHE_INFO
   827  /**
   828   * memblock_set_sidecache - set the system memory cache info
   829   * @base: base address of the region
   830   * @size: size of the region
   831   * @cache_size: system side cache size in bytes
   832   * @direct: true if the cache has direct mapped associativity
   833   *
   834   * This function isolates region [@base, @base + @size), and saves the cache
   835   * information.
   836   *
   837   * Return: 0 on success, -errno on failure.
   838   */
   839  int __init_memblock memblock_set_sidecache(phys_addr_t base, phys_addr_t size,
   840 phys_addr_t cache_size, bool direct_mapped)
   841  {
   842  struct memblock_type *type = &memblock.memory;
   843  int i, ret, start_rgn, end_rgn;
   844  
   845  ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn);
   846  if (ret)
   847  return ret;
   848  
   849  for (i = start_rgn; i < end_rgn; i++) {
   850  struct memblock_region *r = &type->regions[i];
   851  
   852  r->cache_size = cache_size;
   853  r->direct_mapped = direct_mapped;
   854  /*
   855   * Enable randomization for amortizing direct-mapped
   856   * memory-side-cache conflicts.
   857   */
   858  if (r->size > r->cache_size && r->direct_mapped)
 > 859  page_alloc_shuffle(SHUFFLE_ENABLE);
   860  }
   861  
   862  return 0;
   863  }
   864  #endif
   865  
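
The error above is the usual symptom of a configured-out stub whose parameter list no longer matches the real declaration. A minimal sketch in plain C of keeping the two in step, assuming a made-up CONFIG_SHUFFLE_DEMO switch, a made-up enum shuffle_ctl and hypothetical helpers is_shuffle_enabled() and request_shuffle(); none of these are taken from the patch under test:

```c
#include <stdbool.h>

/* Hypothetical control values; the real patch's enum may differ. */
enum shuffle_ctl {
	SHUFFLE_ENABLE,
	SHUFFLE_FORCE_DISABLE,
};

#ifdef CONFIG_SHUFFLE_DEMO
static bool shuffle_enabled;

/* Real implementation: records whether shuffling was requested. */
static inline void page_alloc_shuffle(enum shuffle_ctl ctl)
{
	shuffle_enabled = (ctl == SHUFFLE_ENABLE);
}

static inline bool is_shuffle_enabled(void)
{
	return shuffle_enabled;
}
#else
/*
 * Stub: same parameter list as the real function, so callers compile
 * unchanged when the feature is configured out.
 */
static inline void page_alloc_shuffle(enum shuffle_ctl ctl)
{
	(void)ctl;
}

static inline bool is_shuffle_enabled(void)
{
	return false;
}
#endif

/* Caller modeled on the failing call site; builds in both configurations. */
static inline bool request_shuffle(bool direct_mapped,
				   unsigned long size,
				   unsigned long cache_size)
{
	if (size > cache_size && direct_mapped)
		page_alloc_shuffle(SHUFFLE_ENABLE);
	return is_shuffle_enabled();
}
```

With the switch undefined, request_shuffle() still compiles and simply reports that shuffling is off, which is the behaviour a randconfig build without the feature would expect.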

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip

Re: [PATCH v2 4/7] dt-bindings: remoteproc: qcom: Add power-domain bindings for Q6V5

2018-12-17 Thread Sibi Sankar


Hi Doug,
Thanks for the review :)

On 2018-12-18 05:31, Doug Anderson wrote:

Hi,

On Mon, Dec 17, 2018 at 2:08 AM Sibi Sankar wrote:


Add power-domain bindings for Q6V5 MSS on MSM8996 and SDM845 SoCs.

Reviewed-by: Rob Herring 
Signed-off-by: Sibi Sankar 
---

v2:
  * Add load_state power-domain
  * List cx and mx power-domains for MSM8996

 .../devicetree/bindings/remoteproc/qcom,q6v5.txt | 16 


 1 file changed, 16 insertions(+)

diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt
index 98894e6ad456..50695cd86397 100644
--- a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt
+++ b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt
@@ -80,6 +80,22 @@ on the Qualcomm Hexagon core.
reference to the pll-supply regulator to be held on behalf
of the booting of the Hexagon core on MSM8996 SoCs

+- power-domains:
+   Usage: required
+   Value type: 
+   Definition: reference to the list of 2 power-domains for the modem
+   sub-system on MSM8996 SoCs


This is truly required for msm8996 SoCs?  The code I'm looking at
doesn't try to get these power domains for 8996 so presumably you're
breaking backward compatibility with old device tree files by making
this required now.  I don't personally know how widespread msm8996
usage is w/ upstream, so I'd let Bjorn comment on whether he thinks
this is OK.



This is one of the reasons why the dt node
for mss on 8996 has not been posted/merged upstream.
Hence backward compatibility is not broken yet
in mainline :) .. However it will break on official
linaro integration releases (old dt + new kernel)


As with the other patches in this series, I personally prefer less
prose and more lists / tables of exactly what is required for which
compatible string.


+   reference to the list of 4 power-domains for the modem
+   sub-system on SDM845 SoCs
+
+- power-domain-names:
+   Usage: required
+   Value type: 
+   Definition: must be "cx", "mx" for the modem sub-system on MSM8996
+   SoCs
+   must be "cx", "mx", "mss", "load_state" for the modem
+   sub-system on SDM845 SoCs


I haven't seen a patch for using "load_state".  Can you point at it?  I
guess this was "aop" in your last version?



using load_state was Bjorn's suggestion and seemed
more appropriate than aop



-Doug


--
-- Sibi Sankar --
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.

Re: [RFC PATCH net v3] net: phy: Fix the issue that netif always links up after resuming

2018-12-17 Thread Kunihiko Hayashi

Hi Heiner,

On Mon, 17 Dec 2018 19:43:31 +0100  wrote:

> On 17.12.2018 19:41, Heiner Kallweit wrote:
> > On 17.12.2018 07:41, Kunihiko Hayashi wrote:
> >> Hi,
> >>
> >> Gentle ping...
> >> Are there any comments about changes since v2?
> >>
> >> v2: https://www.spinics.net/lists/netdev/msg536926.html
> >>
> >> Thank you,
> >>
> >> On Mon, 3 Dec 2018 17:22:29 +0900  wrote:
> >>
> >>> Even though the link is down before entering hibernation,
> >>> there is an issue that the network interface always links up after 
> >>> resuming
> >>> from hibernation.
> >>>
> >>> The phydev->state is PHY_READY before enabling the network interface, so
> >>> the link is down. After resuming from hibernation, the phydev->state is
> >>> forcibly set to PHY_UP in mdio_bus_phy_restore(), and the link becomes up.
> >>>
> >>> This patch adds a new convenient function to check whether the PHY is in
> >>> a started state, and expects to solve the issue by changing phydev->state
> >>> to PHY_UP and calling phy_start_machine() only when the PHY is started.
> >>>
> > This convenience function and the related change to phy_stop() are part of
> > the following already and don't need to be part of your patch.
> > https://patchwork.ozlabs.org/patch/1014171/

I see. I'll follow your patch when necessary.

> >>> Suggested-by: Heiner Kallweit 
> >>> Signed-off-by: Kunihiko Hayashi 
> >>> ---
> >>>
> >>> Changes since v2:
> >>>  - add mutex lock/unlock for changing phydev->state
> >>>  - check whether the mutex is locked in phy_is_started()
> >>>  
> >>> Changes since v1:
> >>>  - introduce a new helper function phy_is_started() and use it instead of
> >>>checking link status
> >>>  - replace checking phydev->state with phy_is_started() in
> >>>phy_stop_machine()
> >>>
> >>> drivers/net/phy/phy.c|  2 +-
> >>>  drivers/net/phy/phy_device.c | 12 +---
> >>>  include/linux/phy.h  | 13 +
> >>>  3 files changed, 23 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> >>> index 1d73ac3..f484d03 100644
> >>> --- a/drivers/net/phy/phy.c
> >>> +++ b/drivers/net/phy/phy.c
> >>> @@ -670,7 +670,7 @@ void phy_stop_machine(struct phy_device *phydev)
> >>>   cancel_delayed_work_sync(&phydev->state_queue);
> >>>  
> >>>   mutex_lock(&phydev->lock);
> >>> - if (phydev->state > PHY_UP && phydev->state != PHY_HALTED)
> >>> + if (phy_is_started(phydev))
> >>>   phydev->state = PHY_UP;
> > 
> > I'm wondering whether we need to do this. If the PHY is attached,
> > then mdio_bus_phy_suspend() calls phy_stop_machine() which does
> > exactly the same. If the PHY is not attached, then we don't have
> > to do anything. Therefore I think we just have to do the same as
> > in mdio_bus_phy_resume():
> > 
> > if (phydev->attached_dev && phydev->adjust_link)
> > phy_start_machine(phydev);

Agreed.

Although the original code changed phydev->state,
it seems that it's enough to
- call phy_stop_machine() in mdio_bus_phy_suspend()
- call phy_start_machine() in mdio_bus_phy_resume() and mdio_bus_phy_restore()
if the PHY is attached.
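
As a rough model of the rule above (not the kernel code itself), the sketch below uses a made-up struct phy_model and model_*() helpers. The only point it illustrates is that suspend and resume both act on the state machine solely when the PHY is attached and has an adjust_link callback, and that neither forces a state:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct phy_device; only the fields discussed. */
struct phy_model {
	void *attached_dev;                      /* non-NULL once attached */
	void (*adjust_link)(struct phy_model *); /* non-NULL if the state machine is used */
	bool machine_running;
};

/* Placeholder callback so a model PHY can count as "attached". */
static void model_noop_adjust_link(struct phy_model *phy)
{
	(void)phy;
}

static void model_start_machine(struct phy_model *phy)
{
	phy->machine_running = true;
}

static void model_stop_machine(struct phy_model *phy)
{
	phy->machine_running = false;
}

/* Suspend: stop the state machine only for an attached PHY. */
static void model_suspend(struct phy_model *phy)
{
	if (phy->attached_dev && phy->adjust_link)
		model_stop_machine(phy);
}

/* Resume/restore: restart only when attached; never force a state. */
static void model_resume(struct phy_model *phy)
{
	if (phy->attached_dev && phy->adjust_link)
		model_start_machine(phy);
}
```

A PHY that was never attached keeps machine_running false across a suspend/resume cycle, which corresponds to the link staying down after hibernation.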

> > Can you test this?

I tested your code instead of applying my entire patch, and I confirmed
that the code solved the issue in my environment.

Will you make a new patch to replace mine?
(If so, I can add Reported-by: for the issue and Tested-by:)

Thank you,


> > 
> Sorry for the confusion, this comment is related to the next part
> of your patch.
> 
> >>>   mutex_unlock(&phydev->lock);
> >>>  }
> >>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> >>> index ab33d17..4897d24 100644
> >>> --- a/drivers/net/phy/phy_device.c
> >>> +++ b/drivers/net/phy/phy_device.c
> >>> @@ -309,10 +309,16 @@ static int mdio_bus_phy_restore(struct device *dev)
> >>>   return ret;
> >>>  
> >>>   /* The PHY needs to renegotiate. */
> >>> - phydev->link = 0;
> >>> - phydev->state = PHY_UP;
> >>> + mutex_lock(&phydev->lock);
> >>> + if (phy_is_started(phydev)) {
> >>> + phydev->state = PHY_UP;
> >>> + mutex_unlock(&phydev->lock);
> >>> + phydev->link = 0;
> >>> + phy_start_machine(phydev);
> >>> + } else {
> >>> + mutex_unlock(&phydev->lock);
> >>> + }
> >>>  
> >>> - phy_start_machine(phydev);
> >>>  
> >>>   return 0;
> >>>  }
> >>> diff --git a/include/linux/phy.h b/include/linux/phy.h
> >>> index 3ea87f7..dd21537 100644
> >>> --- a/include/linux/phy.h
> >>> +++ b/include/linux/phy.h
> >>> @@ -898,6 +898,19 @@ static inline bool phy_is_pseudo_fixed_link(struct 
> >>> phy_device *phydev)
> >>>  }
> >>>  
> >>>  /**
> >>> + * phy_is_started - Convenience function for testing whether a PHY is in
> >>> + * a started state
> >>> + * @phydev: the phy_device struct
> >>> + *
> >>> + * The caller must have taken the phy_device mutex lock.
> >>> + */
> >>> +static inline bool phy_is_started(struct phy_device *phydev)
> >>> +{
> >>> + WARN_ON(!mutex_is_locked(&phydev->lock));
> >>> + return phyd

Re: [PATCH 01/18] mfd: aat2870-core: Make it explicitly non-modular

2018-12-17 Thread jinyoungp


Acked-by: Jin Park 


Thanks,

Jinyoung.


On 12/18/18 5:31 AM, Paul Gortmaker wrote:

The Kconfig currently controlling compilation of this code is:

drivers/mfd/Kconfig:config MFD_AAT2870_CORE
drivers/mfd/Kconfig:bool "AnalogicTech AAT2870"

...meaning that it currently is not being built as a module by anyone.

Let's remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

We explicitly disallow a driver unbind, since that doesn't have a
sensible use case anyway, and it allows us to drop the ".remove"
code for non-modular drivers.

Since module_init was not in use by this code, the init ordering
remains unchanged with this commit.

Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code.

We also delete the MODULE_LICENSE tag etc. since all that information
is already contained at the top of the file in the comments.

Cc: Lee Jones 
Cc: Jin Park 
Signed-off-by: Paul Gortmaker 
Acked-by: Linus Walleij 
---
  drivers/mfd/aat2870-core.c | 40 +++-
  1 file changed, 3 insertions(+), 37 deletions(-)

diff --git a/drivers/mfd/aat2870-core.c b/drivers/mfd/aat2870-core.c
index 3ba19a45f199..9d3d90d386c2 100644
--- a/drivers/mfd/aat2870-core.c
+++ b/drivers/mfd/aat2870-core.c
@@ -20,7 +20,6 @@
   */
  
  #include 

-#include 
  #include 
  #include 
  #include 
@@ -349,18 +348,10 @@ static void aat2870_init_debugfs(struct aat2870_data *aat2870)
 "Failed to create debugfs register file\n");
  }
  
-static void aat2870_uninit_debugfs(struct aat2870_data *aat2870)

-{
-   debugfs_remove_recursive(aat2870->dentry_root);
-}
  #else
  static inline void aat2870_init_debugfs(struct aat2870_data *aat2870)
  {
  }
-
-static inline void aat2870_uninit_debugfs(struct aat2870_data *aat2870)
-{
-}
  #endif /* CONFIG_DEBUG_FS */
  
  static int aat2870_i2c_probe(struct i2c_client *client,

@@ -440,20 +431,6 @@ static int aat2870_i2c_probe(struct i2c_client *client,
return ret;
  }
  
-static int aat2870_i2c_remove(struct i2c_client *client)

-{
-   struct aat2870_data *aat2870 = i2c_get_clientdata(client);
-
-   aat2870_uninit_debugfs(aat2870);
-
-   mfd_remove_devices(aat2870->dev);
-   aat2870_disable(aat2870);
-   if (aat2870->uninit)
-   aat2870->uninit(aat2870);
-
-   return 0;
-}
-
  #ifdef CONFIG_PM_SLEEP
  static int aat2870_i2c_suspend(struct device *dev)
  {
@@ -492,15 +469,14 @@ static const struct i2c_device_id aat2870_i2c_id_table[] = {
{ "aat2870", 0 },
{ }
  };
-MODULE_DEVICE_TABLE(i2c, aat2870_i2c_id_table);
  
  static struct i2c_driver aat2870_i2c_driver = {

.driver = {
-   .name   = "aat2870",
-   .pm = &aat2870_pm_ops,
+   .name   = "aat2870",
+   .pm = &aat2870_pm_ops,
+   .suppress_bind_attrs= true,
},
.probe  = aat2870_i2c_probe,
-   .remove = aat2870_i2c_remove,
.id_table   = aat2870_i2c_id_table,
  };
  
@@ -509,13 +485,3 @@ static int __init aat2870_init(void)

return i2c_add_driver(&aat2870_i2c_driver);
  }
  subsys_initcall(aat2870_init);
-
-static void __exit aat2870_exit(void)
-{
-   i2c_del_driver(&aat2870_i2c_driver);
-}
-module_exit(aat2870_exit);
-
-MODULE_DESCRIPTION("Core support for the AnalogicTech AAT2870");
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Jin Park ");

Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions

2018-12-17 Thread Darrick J. Wong

On Mon, Dec 17, 2018 at 10:34:43AM -0800, Matthew Wilcox wrote:
> On Mon, Dec 17, 2018 at 01:11:50PM -0500, Jerome Glisse wrote:
> > On Mon, Dec 17, 2018 at 08:58:19AM +1100, Dave Chinner wrote:
> > > Sure, that's a possibility, but that doesn't close off any race
> > > conditions because there can be DMA into the page in progress while
> > > the page is being bounced, right? AFAICT this ext3+DIF/DIX case is
> > > different in that there is no 3rd-party access to the page while it
> > > is under IO (ext3 arbitrates all access to its metadata), and so
> > > nothing can actually race for modification of the page between
> > > submission and bouncing at the block layer.
> > > 
> > > In this case, the moment the page is unlocked, anyone else can map
> > > it and start (R)DMA on it, and that can happen before the bio is
> > > bounced by the block layer. So AFAICT, block layer bouncing doesn't
> > > solve the problem of racing writeback and DMA direct to the page we
> > > are doing IO on. Yes, it reduces the race window substantially, but
> > > it doesn't get rid of it.
> > 
> > So the event flow is:
> > - userspace creates an object that matches a range of virtual addresses
> >   against a given kernel sub-system (let's say infiniband) and
> >   let's assume that the range is an mmap() of a regular file
> > - the device driver does GUP on the range (let's assume it is a write
> >   GUP) so if the page is not already mapped with write permission
> >   in the page table then a page fault is triggered and page_mkwrite
> >   happens
> > - Once GUP returns the page to the device driver and once the
> >   device driver has updated the hardware state to allow access
> >   to this page then from that point on hardware can write to the
> >   page at _any_ time, it is fully disconnected from any fs event
> >   like write back, it fully ignores things like page_mkclean
> > 
> > This is how it is today, we allowed people to push upstream such
> > users of GUP. This is a fact we have to live with, we can not stop
> > hardware access to the page, we can not force the hardware to follow
> > page_mkclean and force a page_mkwrite once write back ends. This is
> > the situation we are inheriting (and i am personally not happy with
> > that).
> > 
> > From my point of view we are left with 2 choices:
> > [C1] break all drivers that do not abide by the page_mkclean and
> >  page_mkwrite
> > [C2] mitigate as much as possible the issue
> > 
> > For [C2] the idea is to keep track of GUP per page so we know if we
> > can expect the page to be written to at any time. Here is the event
> > flow:
> > - driver GUPs the page and programs the hardware, page is marked as
> >   GUPed
> > ...
> > - write back kicks in on the dirty page, locks the page and everything
> >   as usual, sees it is GUPed and informs the block layer to
> >   use a bounce page
> 
> No.  The solution John, Dan & I have been looking at is to take the
> dirty page off the LRU while it is pinned by GUP.  It will never be
> found for writeback.
> 
> That's not the end of the story though.  Other parts of the kernel (eg
> msync) also need to be taught to stay away from pages which are pinned
> by GUP.  But the idea is that no page gets written back to storage while
> it's pinned by GUP.  Only when the last GUP ends is the page returned
> to the list of dirty pages.

Errr... what does fsync do in the meantime?  Not write the page?
That would seem to break what fsync() is supposed to do.

--D

> > - block layer copies the page to a bounce page effectively creating
> >   a snapshot of what is the content of the real page. This allows
> >   everything in block layer that needs stable content to work on
> >   the bounce page (raid, striping, encryption, ...)
> > - once write back is done the page is not marked clean but stays
> >   dirty, this effectively disables things like COW for filesystem
> >   and other features that expect page_mkwrite between write back.
> >   AFAIK it is believed that this is acceptable
> 
> So none of this is necessary.
>

Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc

2018-12-17 Thread Hou Tao

Hi,

On 2018/12/17 18:51, Tetsuo Handa wrote:
> On 2018/12/17 18:33, Michal Hocko wrote:
>> On Sun 16-12-18 19:51:57, Matthew Wilcox wrote:
>> [...]
>>> Ah, yes, that makes perfect sense.  Thank you for the explanation.
>>>
>>> I wonder if the correct fix, however, is not to move the check for
>>> GFP_NOFS in out_of_memory() down to below the check whether to kill
>>> the current task.  That would solve your problem, and I don't _think_
>>> it would cause any new ones.  Michal, you touched this code last, what
>>> do you think?
>>
>> What do you mean exactly? Whether we kill a current task or something
>> else doesn't change much on the fact that NOFS is a reclaim restricted
>> context and we might kill too early. If the fs can do GFP_FS then it is
>> obviously a better thing to do because FS metadata can be reclaimed as
>> well and therefore there is potentially less memory pressure on
>> application data.
>>
> 
> I interpreted "to move the check for GFP_NOFS in out_of_memory() down to
> below the check whether to kill the current task" as
> 
> @@ -1077,15 +1077,6 @@ bool out_of_memory(struct oom_control *oc)
>   }
>  
>   /*
> -  * The OOM killer does not compensate for IO-less reclaim.
> -  * pagefault_out_of_memory lost its gfp context so we have to
> -  * make sure exclude 0 mask - all other users should have at least
> -  * ___GFP_DIRECT_RECLAIM to get here.
> -  */
> - if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS))
> - return true;
> -
> - /*
>* Check if there were limitations on the allocation (only relevant for
>* NUMA and memcg) that may require different handling.
>*/
> @@ -1104,6 +1095,19 @@ bool out_of_memory(struct oom_control *oc)
>   }
>  
>   select_bad_process(oc);
> +
> + /*
> +  * The OOM killer does not compensate for IO-less reclaim.
> +  * pagefault_out_of_memory lost its gfp context so we have to
> +  * make sure exclude 0 mask - all other users should have at least
> +  * ___GFP_DIRECT_RECLAIM to get here.
> +  */
> + if ((oc->gfp_mask && !(oc->gfp_mask & __GFP_FS)) && oc->chosen &&
> + oc->chosen != (void *)-1UL && oc->chosen != current) {
> + put_task_struct(oc->chosen);
> + return true;
> + }
> +
>   /* Found nothing?!?! */
>   if (!oc->chosen) {
>   dump_header(oc, NULL);
> 
> which is prefixed by "the correct fix is not".
> 
> Behaving like sysctl_oom_kill_allocating_task == 1 if __GFP_FS is not used
> will not be the correct fix. But ...
> 
> Hou Tao wrote:
>> There is no need to disable __GFP_FS in ->readpage:
>> * It's a read-only fs, so there will be no dirty/writeback page and
>>   there will be no deadlock against the caller's locked page
> 
> Is a read-only filesystem sufficient to make __GFP_FS safe to use?
> 
> Doesn't "whether it is safe to use __GFP_FS" depend on "whether fs locks
> are held or not" rather than "whether fs has dirty/writeback page or not" ?
> 
In my understanding (correct me if I am wrong), there are three ways through which
reclamation will invoke fs-related code and may cause a deadlock:

(1) write-back of dirty pages. Not possible for squashfs.
(2) the reclamation of inodes & dentries. The current file is in use, so it will not be
reclaimed, and for other reclaimable inodes, squashfs_destroy_inode() will
be invoked and it doesn't take any locks.
(3) customized shrinker defined by fs. No customized shrinker in squashfs.

So my point is that even if a page lock is already held by squashfs_readpage() and
reclamation calls back into squashfs code, there will be no deadlock, so it's
safe to use __GFP_FS.
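
For reference, the early return in out_of_memory() quoted above can be modeled as a small predicate. This is only a sketch: DEMO_GFP_FS and oom_skips_kill() are made-up names and the bit value is arbitrary, not the kernel's.

```c
#include <stdbool.h>

/* Hypothetical bit value for illustration; the kernel defines its own. */
#define DEMO_GFP_FS 0x80u

/*
 * Models the quoted check: a non-zero mask without the FS bit makes
 * out_of_memory() return early without selecting a victim. A zero mask
 * (the page-fault path, which lost its gfp context) is excluded.
 */
static bool oom_skips_kill(unsigned int gfp_mask)
{
	return gfp_mask && !(gfp_mask & DEMO_GFP_FS);
}
```

Under this model an allocation that keeps the FS bit set never takes the early return, which is why enabling __GFP_FS in ->readpage changes the behaviour under memory pressure.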

Regards,
Tao

> .
>

[PATCH 2/2] irqchip: irq-renesas-intc-irqpin: convert to SPDX identifiers

2018-12-17 Thread Kuninori Morimoto

From: Kuninori Morimoto 

This patch updates license to use SPDX-License-Identifier
instead of verbose license text.

Signed-off-by: Kuninori Morimoto 
Reviewed-by: Simon Horman 
---
 drivers/irqchip/irq-renesas-intc-irqpin.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/irqchip/irq-renesas-intc-irqpin.c b/drivers/irqchip/irq-renesas-intc-irqpin.c
index c6e6c9e..8c03952 100644
--- a/drivers/irqchip/irq-renesas-intc-irqpin.c
+++ b/drivers/irqchip/irq-renesas-intc-irqpin.c
@@ -1,20 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * Renesas INTC External IRQ Pin Driver
  *
  *  Copyright (C) 2013 Magnus Damm
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
  */
 
 #include 
-- 
2.7.4

[PATCH v2] ACPI / tables: table override from built-in initrd

2018-12-17 Thread Shunyong Yang

In some scenarios, we need to build the initrd with the kernel in a single image.
This can simplify the system deployment process by downloading the whole system
at once, such as in IC verification.

This patch adds support to override ACPI tables from built-in initrd.

Cc: Joey Zheng 
Signed-off-by: Shunyong Yang 
---
v2: change "upgrade" to "override" as it's more accurate
---
 Documentation/acpi/initrd_table_override.txt |  4 
 drivers/acpi/Kconfig | 10 ++
 drivers/acpi/tables.c| 12 ++--
 include/linux/initrd.h   |  3 +++
 4 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/Documentation/acpi/initrd_table_override.txt b/Documentation/acpi/initrd_table_override.txt
index eb651a6aa285..324d5fb90a22 100644
--- a/Documentation/acpi/initrd_table_override.txt
+++ b/Documentation/acpi/initrd_table_override.txt
@@ -14,6 +14,10 @@ upgrade the ACPI execution environment that is defined by the ACPI tables
 via upgrading the ACPI tables provided by the BIOS with an instrumented,
 modified, more recent version one, or installing brand new ACPI tables.
 
+When building initrd with kernel in a single image, option
+ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD should also be true for this
+purpose.
+
 For a full list of ACPI tables that can be upgraded/installed, take a look
 at the char *table_sigs[MAX_ACPI_SIGNATURE]; definition in
 drivers/acpi/tables.c.
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 7cea769c37df..3b362a1c7685 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -357,6 +357,16 @@ config ACPI_TABLE_UPGRADE
  initrd, therefore it's safe to say Y.
  See Documentation/acpi/initrd_table_override.txt for details
 
+config ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD
+   bool "Override ACPI tables from built-in initrd"
+   depends on ACPI_TABLE_UPGRADE
+   depends on INITRAMFS_SOURCE!="" && INITRAMFS_COMPRESSION=""
+   def_bool n
+   help
+ This option provides functionality to override arbitrary ACPI tables
+ from built-in uncompressed initrd.
+ See Documentation/acpi/initrd_table_override.txt for details
+
 config ACPI_DEBUG
bool "Debug Statements"
help
diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 61203eebf3a1..f6a2c5ebabcd 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -473,14 +473,22 @@ static u8 __init acpi_table_checksum(u8 *buffer, u32 length)
 
 void __init acpi_table_upgrade(void)
 {
-   void *data = (void *)initrd_start;
-   size_t size = initrd_end - initrd_start;
+   void *data;
+   size_t size;
int sig, no, table_nr = 0, total_offset = 0;
long offset = 0;
struct acpi_table_header *table;
char cpio_path[32] = "kernel/firmware/acpi/";
struct cpio_data file;
 
+   if (IS_ENABLED(CONFIG_ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD)) {
+   data = __initramfs_start;
+   size = __initramfs_size;
+   } else {
+   data = (void *)initrd_start;
+   size = initrd_end - initrd_start;
+   }
+
if (data == NULL || size == 0)
return;
 
diff --git a/include/linux/initrd.h b/include/linux/initrd.h
index 84b423044088..02d94aae54c7 100644
--- a/include/linux/initrd.h
+++ b/include/linux/initrd.h
@@ -22,3 +22,6 @@
 extern void free_initrd_mem(unsigned long, unsigned long);
 
 extern unsigned int real_root_dev;
+
+extern char __initramfs_start[];
+extern unsigned long __initramfs_size;
-- 
1.8.3.1

[PATCH 1/2] irqchip: irq-renesas-irqc: convert to SPDX identifiers

2018-12-17 Thread Kuninori Morimoto



From: Kuninori Morimoto 

This patch updates license to use SPDX-License-Identifier
instead of verbose license text.

Signed-off-by: Kuninori Morimoto 
Reviewed-by: Simon Horman 
---
 drivers/irqchip/irq-renesas-irqc.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/irqchip/irq-renesas-irqc.c b/drivers/irqchip/irq-renesas-irqc.c
index a4f1112..a449a7c 100644
--- a/drivers/irqchip/irq-renesas-irqc.c
+++ b/drivers/irqchip/irq-renesas-irqc.c
@@ -1,20 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * Renesas IRQC Driver
  *
  *  Copyright (C) 2013 Magnus Damm
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
  */
 
 #include 
-- 
2.7.4

[PATCH 0/2][resend] irqchip: irq-renesas-xxx: convert to SPDX identifiers

2018-12-17 Thread Kuninori Morimoto



Hi Thomas, Jason, Marc

I posted these two weeks ago, but nothing has happened,
so I am re-posting them.

Kuninori Morimoto (2):
  irqchip: irq-renesas-irqc: convert to SPDX identifiers
  irqchip: irq-renesas-intc-irqpin: convert to SPDX identifiers

 drivers/irqchip/irq-renesas-intc-irqpin.c | 14 +-
 drivers/irqchip/irq-renesas-irqc.c| 14 +-
 2 files changed, 2 insertions(+), 26 deletions(-)

-- 
2.7.4



Best regards
---
Kuninori Morimoto

[PATCH][resend] drm: dw-hdmi-i2s: convert to SPDX identifiers

2018-12-17 Thread Kuninori Morimoto



From: Kuninori Morimoto 

This patch updates license to use SPDX-License-Identifier
instead of verbose license text.

Signed-off-by: Kuninori Morimoto 
---
A few weeks have passed and nothing has happened, so I am re-posting this patch.
I added Andrew on Cc

 drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
index 8f9c8a6..2228689 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
@@ -1,12 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * dw-hdmi-i2s-audio.c
  *
  * Copyright (c) 2017 Renesas Solutions Corp.
  * Kuninori Morimoto 
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
  */
 #include 
 
-- 
2.7.4

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Andy Lutomirski

On Thu, Nov 15, 2018 at 5:08 PM Jarkko Sakkinen
 wrote:
>
> Intel Software Guard eXtensions (SGX) is a set of CPU instructions that
> can be used by applications to set aside private regions of code and
> data. The code outside the enclave is disallowed to access the memory
> inside the enclave by the CPU access control.

This is a very partial review.

> +int sgx_encl_find(struct mm_struct *mm, unsigned long addr,
> + struct vm_area_struct **vma)
> +{
> +   struct vm_area_struct *result;
> +   struct sgx_encl *encl;
> +
> +   result = find_vma(mm, addr);
> +   if (!result || result->vm_ops != &sgx_vm_ops || addr < result->vm_start)
> +   return -EINVAL;
> +
> +   encl = result->vm_private_data;
> +   *vma = result;
> +
> +   return encl ? 0 : -ENOENT;
> +}

I realize that this function may go away entirely but, if you keep it:
what are the locking rules?  What, if anything, prevents another
thread from destroying the enclave after sgx_encl_find() returns?

> +static int sgx_validate_secs(const struct sgx_secs *secs,
> +unsigned long ssaframesize)
> +{

...

> +   if (secs->attributes & SGX_ATTR_MODE64BIT) {
> +   if (secs->size > sgx_encl_size_max_64)
> +   return -EINVAL;
> +   } else {
> +   /* On 64-bit architecture allow 32-bit encls only in
> +* the compatibility mode.
> +*/
> +   if (!test_thread_flag(TIF_ADDR32))
> +   return -EINVAL;
> +   if (secs->size > sgx_encl_size_max_32)
> +   return -EINVAL;
> +   }

Why do we need the 32-bit-on-64-bit check?  In general, anything that
checks per-task or per-mm flags like TIF_ADDR32 is IMO likely to be
problematic.  You're allowing 64-bit enclaves in 32-bit tasks, so I'm
guessing you could just delete the check.

> +
> +   if (!(secs->xfrm & XFEATURE_MASK_FP) ||
> +   !(secs->xfrm & XFEATURE_MASK_SSE) ||
> +   (((secs->xfrm >> XFEATURE_BNDREGS) & 1) !=
> +((secs->xfrm >> XFEATURE_BNDCSR) & 1)) ||
> +   (secs->xfrm & ~sgx_xfrm_mask))
> +   return -EINVAL;

Do we need to check that the enclave doesn't use xfeatures that the
kernel doesn't know about?  Or are they all safe by design in enclave
mode?
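
To make the four conditions in the quoted hunk easier to discuss, here is a standalone sketch of the same test. The DEMO_* bit numbers and demo_xfrm_valid() are assumptions made for illustration, and allowed_mask stands in for sgx_xfrm_mask; the sketch says nothing about how that mask is derived, which is the open question.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed xfeature bit numbers for this sketch. */
#define DEMO_XFEATURE_FP       0
#define DEMO_XFEATURE_SSE      1
#define DEMO_XFEATURE_BNDREGS  3
#define DEMO_XFEATURE_BNDCSR   4

#define DEMO_BIT(n) (UINT64_C(1) << (n))

/*
 * Returns true when xfrm passes the four conditions in the quoted hunk:
 * FP and SSE set, BNDREGS and BNDCSR equal, nothing outside allowed_mask.
 */
static bool demo_xfrm_valid(uint64_t xfrm, uint64_t allowed_mask)
{
	bool bndregs = (xfrm >> DEMO_XFEATURE_BNDREGS) & 1;
	bool bndcsr = (xfrm >> DEMO_XFEATURE_BNDCSR) & 1;

	if (!(xfrm & DEMO_BIT(DEMO_XFEATURE_FP)))
		return false;
	if (!(xfrm & DEMO_BIT(DEMO_XFEATURE_SSE)))
		return false;
	if (bndregs != bndcsr)
		return false;
	if (xfrm & ~allowed_mask)
		return false;
	return true;
}
```

In this model, any bit the mask does not cover is rejected by the last test, so whether unknown xfeatures get through depends entirely on what the mask is allowed to contain.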

> +static int sgx_encl_pm_notifier(struct notifier_block *nb,
> +   unsigned long action, void *data)
> +{
> +   struct sgx_encl *encl = container_of(nb, struct sgx_encl, pm_notifier);
> +
> +   if (action != PM_SUSPEND_PREPARE && action != PM_HIBERNATION_PREPARE)
> +   return NOTIFY_DONE;

Hmm.  There's an argument to be made that omitting this would better
exercise the code that handles fully asynchronous loss of an enclave.
Also, I think you're unnecessarily killing enclaves when suspend is
attempted but fails.

> +
> +static int sgx_get_key_hash(const void *modulus, void *hash)
> +{
> +   struct crypto_shash *tfm;
> +   int ret;
> +
> +   tfm = crypto_alloc_shash("sha256", 0, CRYPTO_ALG_ASYNC);
> +   if (IS_ERR(tfm))
> +   return PTR_ERR(tfm);
> +
> +   ret = __sgx_get_key_hash(tfm, modulus, hash);
> +
> +   crypto_free_shash(tfm);
> +   return ret;
> +}
> +

I'm so sorry you had to deal with this API.  Once Zinc lands, you
could clean this up :)


> +static int sgx_encl_get(unsigned long addr, struct sgx_encl **encl)
> +{
> +   struct mm_struct *mm = current->mm;
> +   struct vm_area_struct *vma;
> +   int ret;
> +
> +   if (addr & (PAGE_SIZE - 1))
> +   return -EINVAL;
> +
> +   down_read(&mm->mmap_sem);
> +
> +   ret = sgx_encl_find(mm, addr, &vma);
> +   if (!ret) {
> +   *encl = vma->vm_private_data;
> +
> +   if ((*encl)->flags & SGX_ENCL_SUSPEND)
> +   ret = SGX_POWER_LOST_ENCLAVE;
> +   else
> +   kref_get(&(*encl)->refcount);
> +   }

Hmm.  This version has explicit refcounting.

> +static int sgx_mmap(struct file *file, struct vm_area_struct *vma)
> +{
> +   vma->vm_ops = &sgx_vm_ops;
> +   vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO |
> +VM_DONTCOPY;
> +
> +   return 0;
> +}
> +
> +static unsigned long sgx_get_unmapped_area(struct file *file,
> +  unsigned long addr,
> +  unsigned long len,
> +  unsigned long pgoff,
> +  unsigned long flags)
> +{
> +   if (len < 2 * PAGE_SIZE || (len & (len - 1)))
> +   return -EINVAL;
> +
> +   if (len > sgx_encl_size_max_64)
> +   return -EINVAL;
> +
> +   if (len > sgx_encl_size_max_32 && test_thread_flag(TIF_ADDR32))
> +   return -EINVAL;

Generally speaking, th

Re: [PATCH v2 2/7] dt-bindings: remoteproc: qcom: Add clock bindings for Q6V5

2018-12-17 Thread Sibi Sankar


Hi Doug,
Thanks for the review :)

On 2018-12-18 05:29, Doug Anderson wrote:
> Hi,
>
> On Mon, Dec 17, 2018 at 2:07 AM Sibi Sankar wrote:
>>
>> Add missing clock bindings for Q6V5 MSS on SDM845 SoCs.
>>
>> Signed-off-by: Sibi Sankar
>> ---
>>  .../devicetree/bindings/remoteproc/qcom,q6v5.txt   | 10 +++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> Fixes: 9f058fa2efb1 ("remoteproc: qcom: Add support for mss remoteproc on msm8996")
> Fixes: fb22022ff63d ("dt-bindings: remoteproc: Add Q6v5 Modem PIL binding for SDM845")
>
> ...it probably doesn't matter too much but if we wanted to be really
> careful we could split into two patches, one for the msm8996 and one
> for sdm845.  I don't think people care that much about stable
> backports of bindings though (someone can feel free to correct me)...

I did think of splitting this up but it doesn't
actually fix 9f058fa2efb1 yet. I noticed a few missing
clocks for mss on 8996 when I did a diff with the
corresponding CAF tree. Hence I couldn't add bindings for
it. Will add them once I validate mss on 8996 with the
necessary changes.

>> diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt
>> index 9ff5b0309417..780adc043b37 100644
>> --- a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt
>> +++ b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt
>> @@ -39,13 +39,17 @@ on the Qualcomm Hexagon core.
>>  - clocks:
>> Usage: required
>> Value type: 
>> -   Definition: reference to the iface, bus and mem clocks to be held on
>> -   behalf of the booting of the Hexagon core
>> +   Definition: reference to the list of 4 clocks for the modem sub-system
>> +   reference to the list of 8 clocks for the modem sub-system
>> +   on SDM845 SoCs
>
> The above is confusing because you don't list the SoCs that are
> supposed to use the 4 clocks.  How about instead:
>
> Definition: reference to the clocks that match clock-names

AFAIK, only the exceptions are captured. I am fine
with both, I'll wait for Bjorn/Rob's preference.

>>  - clock-names:
>> Usage: required
>> Value type: 
>> -   Definition: must be "iface", "bus", "mem"
>> +   Definition: must be "iface", "bus", "mem", "xo" for the modem sub-system
>> +   must be "iface", "bus", "mem", "gpll0_mss", "snoc_axi",
>> +   "mnoc_axi", "prng", "xo" for the modem sub-system on SDM845
>> +   SoCs
>
> Same here where it's confusing.  ...but also, is it correct?  As far
> as I can tell you're missing msm8996.  It's better to just be explicit
> and list each one, ideally without all the prose.
>
> Definition: The clocks needed depend on the compatible string:

ditto

> qcom,sdm845-mss-pil: "xo", "prng", "iface", "snoc_axi", "bus", "mem",
> "gpll0_mss", "mnoc_axi"
> qcom,msm8996-mss-pil: "xo", "pnoc", "iface", "bus", "mem", "gpll0_mss_clk"

ditto

> qcom,msm8974-mss-pil: "xo", "iface", "bus", "mem"
> qcom,msm8916-mss-pil: "xo", "iface", "bus", "mem"
> qcom,q6v5-pil: "xo", "iface", "bus", "mem"
>
> ...as far as I can tell this binding is supposed to account for
> "qcom,ipq8074-wcss-pil" too but it seems that one doesn't have
> clock-names.

Yeah, the lack of clocks has to be documented
for ipq8074-wcss-pil. Will do it in v3.

> -Doug


--
-- Sibi Sankar --
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.

[v2, PATCH 2/2] net-next: stmmac: dwmac-mediatek: remove fine-tune property

2018-12-17 Thread Biao Huang

1. Remove the fine-tune property and the related settings to simplify
the timing adjustment flow.
2. Set the timing value according to the value from the device tree,
regardless of whether the PHY inserts an internal delay.

Signed-off-by: Biao Huang 
---
 .../net/ethernet/stmicro/stmmac/dwmac-mediatek.c   |   71 +++-
 1 file changed, 24 insertions(+), 47 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
index e400cbd..bf25629 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
@@ -44,7 +44,6 @@ struct mac_delay_struct {
u32 rx_delay;
bool tx_inv;
bool rx_inv;
-   bool fine_tune;
 };
 
 struct mediatek_dwmac_plat_data {
@@ -105,16 +104,28 @@ static int mt2712_set_interface(struct mediatek_dwmac_plat_data *plat)
return 0;
 }
 
-static void mt2712_delay_ps2stage(struct mac_delay_struct *mac_delay)
+static void mt2712_delay_ps2stage(struct mediatek_dwmac_plat_data *plat)
 {
-   if (mac_delay->fine_tune) {
-   /* 170ps per stage for fine tune delay macro circuit*/
-   mac_delay->tx_delay /= 170;
-   mac_delay->rx_delay /= 170;
-   } else {
-   /* 550ps per stage for coarse tune delay macro circuit*/
+   struct mac_delay_struct *mac_delay = &plat->mac_delay;
+
+   switch (plat->phy_mode) {
+   case PHY_INTERFACE_MODE_MII:
+   case PHY_INTERFACE_MODE_RMII:
+   /* 550ps per stage for MII/RMII */
mac_delay->tx_delay /= 550;
mac_delay->rx_delay /= 550;
+   break;
+   case PHY_INTERFACE_MODE_RGMII:
+   case PHY_INTERFACE_MODE_RGMII_TXID:
+   case PHY_INTERFACE_MODE_RGMII_RXID:
+   case PHY_INTERFACE_MODE_RGMII_ID:
+   /* 170ps per stage for RGMII */
+   mac_delay->tx_delay /= 170;
+   mac_delay->rx_delay /= 170;
+   break;
+   default:
+   dev_err(plat->dev, "phy interface not supported\n");
+   break;
}
 }
 
@@ -123,7 +134,7 @@ static int mt2712_set_delay(struct mediatek_dwmac_plat_data *plat)
struct mac_delay_struct *mac_delay = &plat->mac_delay;
u32 delay_val = 0, fine_val = 0;
 
-   mt2712_delay_ps2stage(mac_delay);
+   mt2712_delay_ps2stage(plat);
 
switch (plat->phy_mode) {
case PHY_INTERFACE_MODE_MII:
@@ -167,13 +178,10 @@ static int mt2712_set_delay(struct mediatek_dwmac_plat_data *plat)
fine_val = ETH_RMII_DLY_TX_INV;
break;
case PHY_INTERFACE_MODE_RGMII:
-   /* the PHY is not responsible for inserting any internal
-* delay by itself in PHY_INTERFACE_MODE_RGMII case,
-* so Ethernet MAC will insert delays for both transmit
-* and receive path here.
-*/
-   if (mac_delay->fine_tune)
-   fine_val = ETH_FINE_DLY_GTXC | ETH_FINE_DLY_RXC;
+   case PHY_INTERFACE_MODE_RGMII_TXID:
+   case PHY_INTERFACE_MODE_RGMII_RXID:
+   case PHY_INTERFACE_MODE_RGMII_ID:
+   fine_val = ETH_FINE_DLY_GTXC | ETH_FINE_DLY_RXC;
 
delay_val |= FIELD_PREP(ETH_DLY_GTXC_ENABLE, !!mac_delay->tx_delay);
delay_val |= FIELD_PREP(ETH_DLY_GTXC_STAGES, mac_delay->tx_delay);
@@ -183,36 +191,6 @@ static int mt2712_set_delay(struct mediatek_dwmac_plat_data *plat)
delay_val |= FIELD_PREP(ETH_DLY_RXC_STAGES, mac_delay->rx_delay);
delay_val |= FIELD_PREP(ETH_DLY_RXC_INV, mac_delay->rx_inv);
break;
-   case PHY_INTERFACE_MODE_RGMII_TXID:
-   /* the PHY should insert an internal delay for the transmit
-* path in PHY_INTERFACE_MODE_RGMII_TXID case,
-* so Ethernet MAC will insert the delay for receive path here.
-*/
-   if (mac_delay->fine_tune)
-   fine_val = ETH_FINE_DLY_RXC;
-
-   delay_val |= FIELD_PREP(ETH_DLY_RXC_ENABLE, !!mac_delay->rx_delay);
-   delay_val |= FIELD_PREP(ETH_DLY_RXC_STAGES, mac_delay->rx_delay);
-   delay_val |= FIELD_PREP(ETH_DLY_RXC_INV, mac_delay->rx_inv);
-   break;
-   case PHY_INTERFACE_MODE_RGMII_RXID:
-   /* the PHY should insert an internal delay for the receive
-* path in PHY_INTERFACE_MODE_RGMII_RXID case,
-* so Ethernet MAC will insert the delay for transmit path here.
-*/
-   if (mac_delay->fine_tune)
-   fine_val = ETH_FINE_DLY_GTXC;
-
-   delay_val |= FIELD_PREP(ETH_DLY_GTXC_ENABLE, !!mac_delay->tx_delay);
-   delay_val |= FIELD_PREP(ETH_DLY_GTXC_STAGES, mac_delay->tx_delay);
-   delay_val |= FIELD_PREP(ETH_DLY_GTXC_INV, mac_delay-

[v2, PATCH 1/2] dt-binding: mediatek-dwmac: add binding document for MediaTek MT2712 DWMAC

2018-12-17 Thread Biao Huang

The commit adds the device tree binding documentation for the MediaTek DWMAC
found on MediaTek MT2712.

Signed-off-by: Biao Huang 
---
 .../devicetree/bindings/net/mediatek-dwmac.txt |   78 
 1 file changed, 78 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/mediatek-dwmac.txt

diff --git a/Documentation/devicetree/bindings/net/mediatek-dwmac.txt 
b/Documentation/devicetree/bindings/net/mediatek-dwmac.txt
new file mode 100644
index 000..8a08621
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/mediatek-dwmac.txt
@@ -0,0 +1,78 @@
+MediaTek DWMAC glue layer controller
+
+This file documents platform glue layer for stmmac.
+Please see stmmac.txt for the other unchanged properties.
+
+The device node has following properties.
+
+Required properties:
+- compatible:  Should be "mediatek,mt2712-gmac" for MT2712 SoC
+- reg:  Address and length of the register set for the device
+- interrupts:  Should contain the MAC interrupts
+- interrupt-names: Should contain a list of interrupt names corresponding to
+   the interrupts in the interrupts property, if available.
+   Should be "macirq" for the main MAC IRQ
+- clocks: Must contain a phandle for each entry in clock-names.
+- clock-names: The name of the clock listed in the clocks property. These are
+   "axi", "apb", "mac_main", "ptp_ref" for MT2712 SoC
+- mac-address: See ethernet.txt in the same directory
+- phy-mode: See ethernet.txt in the same directory
+- mediatek,pericfg: A phandle to the syscon node that control ethernet
+   interface and timing delay.
+
+Optional properties:
+- mediatek,tx-delay-ps: TX clock delay macro value. Default is 0.
+   It should be defined for RGMII/MII interface.
+- mediatek,rx-delay-ps: RX clock delay macro value. Default is 0.
+   It should be defined for RGMII/MII/RMII interface.
+Both delay properties need to be a multiple of 170 for RGMII interface,
+or will round down. Range 0~31*170.
+Both delay properties need to be a multiple of 550 for MII/RMII interface,
+or will round down. Range 0~31*550.
+
+- mediatek,rmii-rxc: boolean property, if present indicates that the RMII
+   reference clock, which is from external PHYs, is connected to RXC pin
+   on MT2712 SoC.
+   Otherwise, is connected to TXC pin.
+- mediatek,txc-inverse: boolean property, if present indicates that
+   1. tx clock will be inversed in MII/RGMII case,
+   2. tx clock inside MAC will be inversed relative to reference clock
+  which is from external PHYs in RMII case, and it rarely happen.
+- mediatek,rxc-inverse: boolean property, if present indicates that
+   1. rx clock will be inversed in MII/RGMII case.
+   2. reference clock will be inversed when arrived at MAC in RMII case.
+- assigned-clocks: mac_main and ptp_ref clocks
+- assigned-clock-parents: parent clocks of the assigned clocks
+
+Example:
+   eth: ethernet@1101c000 {
+   compatible = "mediatek,mt2712-gmac";
+   reg = <0 0x1101c000 0 0x1300>;
+   interrupts = ;
+   interrupt-names = "macirq";
+   phy-mode ="rgmii";
+   mac-address = [00 55 7b b5 7d f7];
+   clock-names = "axi",
+ "apb",
+ "mac_main",
+ "ptp_ref",
+ "ptp_top";
+   clocks = <&pericfg CLK_PERI_GMAC>,
+<&pericfg CLK_PERI_GMAC_PCLK>,
+<&topckgen CLK_TOP_ETHER_125M_SEL>,
+<&topckgen CLK_TOP_ETHER_50M_SEL>;
+   assigned-clocks = <&topckgen CLK_TOP_ETHER_125M_SEL>,
+ <&topckgen CLK_TOP_ETHER_50M_SEL>;
+   assigned-clock-parents = <&topckgen CLK_TOP_ETHERPLL_125M>,
+<&topckgen CLK_TOP_APLL1_D3>;
+   mediatek,pericfg = <&pericfg>;
+   mediatek,tx-delay-ps = <1530>;
+   mediatek,rx-delay-ps = <1530>;
+   mediatek,rmii-rxc;
+   mediatek,txc-inverse;
+   mediatek,rxc-inverse;
+   snps,txpbl = <32>;
+   snps,rxpbl = <32>;
+   snps,reset-gpio = <&pio 87 GPIO_ACTIVE_LOW>;
+   snps,reset-active-low;
+   };
-- 
1.7.9.5

[v2, PATCH 0/2] add ethernet binding and modify ethernet driver for mt2712

2018-12-17 Thread Biao Huang

changes in v2 as comments from Sean:
1. fix typo.
2. use capital letters for RMII/MII/RGMII in driver and bindings.

v1:
This new series is the result of discussion in:
http://lkml.org/lkml/2018/12/13/1007
http://lkml.org/lkml/2018/12/14/53

1. move the ethernet binding file into this series.
2. remove the fine-tune property from the device tree.
3. remove the fine-tune flow from the ethernet driver.
4. set RGMII timing according to the value in the device tree,
regardless of whether the PHY inserts an internal delay.

Biao Huang (2):
  dt-binding: mediatek-dwmac: add binding document for MediaTek MT2712
DWMAC
  net-next: stmmac: dwmac-mediatek: remove fine-tune property

 .../devicetree/bindings/net/mediatek-dwmac.txt |   78 
 .../net/ethernet/stmicro/stmmac/dwmac-mediatek.c   |   71 ++
 2 files changed, 102 insertions(+), 47 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/mediatek-dwmac.txt

-- 
1.7.9.5
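
The delay handling described in this series (170 ps per stage for RGMII, 550 ps per stage for MII/RMII, values rounded down, at most 31 stages according to the binding text) can be sketched in plain userspace C. This is only an illustration: the enum, the function names and the rejection of out-of-range values are assumptions made for the sketch, not code taken from the driver.

```c
#include <assert.h>

/* Hypothetical stand-ins for the PHY_INTERFACE_MODE_* values used by the driver. */
enum sketch_phy_mode { SKETCH_MII, SKETCH_RMII, SKETCH_RGMII };

/* "Range 0~31*170" and "Range 0~31*550" in the binding text. */
#define SKETCH_MAX_STAGE 31

/*
 * Convert a delay in picoseconds to a stage count, rounding down.
 * Returns -1 when the result does not fit in 0..31; that rejection
 * is an assumption of this sketch.
 */
int sketch_delay_ps_to_stage(enum sketch_phy_mode mode, unsigned int delay_ps)
{
	unsigned int ps_per_stage = (mode == SKETCH_RGMII) ? 170 : 550;
	unsigned int stage = delay_ps / ps_per_stage;

	return stage > SKETCH_MAX_STAGE ? -1 : (int)stage;
}

/* Usage: 1530 ps is the value used in the binding's example node. */
void sketch_delay_demo(void)
{
	assert(sketch_delay_ps_to_stage(SKETCH_RGMII, 1530) == 9);
	/* Not a multiple of 550: rounds down to 2 stages. */
	assert(sketch_delay_ps_to_stage(SKETCH_RMII, 1500) == 2);
}
```

With this reading, a value that is not a multiple of the step size silently loses the remainder, which is why the binding asks for multiples of 170 or 550.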

Re: [PATCH v12 2/2] cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver

2018-12-17 Thread Viresh Kumar

Hi Stephen,

On 13-12-18, 02:12, Stephen Boyd wrote:
> Quoting Viresh Kumar (2018-12-13 02:05:06)
> > On 13-12-18, 01:58, Stephen Boyd wrote:
> > > BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error
> > > upon bringing the policy online the second time. I guess cpufreq_stats
> > > aren't able to be freed from there because they take locks in different
> > > order vs. the normal path?
> > 
> > Please share the lockdep report and the steps to reproduce it. I will
> > see if I can simulate the failure forcefully..
> > 
> 
> It's on a v4.19 kernel with this cpufreq hw driver backported to it. I
> think all it takes is to return an error the second time the policy is
> initialized when cpufreq_online() calls into the cpufreq driver.
> 
>  ==
>  WARNING: possible circular locking dependency detected
>  4.19.8 #61 Tainted: GW
>  --
>  cpuhp/5/36 is trying to acquire lock:
>  3e901e8a (kn->count#326){}, at: kernfs_remove_by_name_ns+0x44/0x80
> 
>  but task is already holding lock:
>  dd7f52c3 (&policy->rwsem){}, at: cpufreq_policy_free+0x17c/0x1cc
> 
>  which lock already depends on the new lock.
> 
> 
>  the existing dependency chain (in reverse order) is:
> 
>  -> #1 (&policy->rwsem){}:
> down_read+0x50/0xcc
> show+0x30/0x78
> sysfs_kf_seq_show+0x17c/0x25c
> kernfs_seq_show+0xb4/0xf8
> seq_read+0x4a8/0x8f0
> kernfs_fop_read+0xe0/0x360
> __vfs_read+0x80/0x328
> vfs_read+0xd0/0x1d4
> ksys_read+0x88/0x118
> __arm64_sys_read+0x4c/0x5c
> el0_svc_common+0x124/0x1c4
> el0_svc_compat_handler+0x64/0x8c
> el0_svc_compat+0x8/0x18

I failed to reproduce it over linux/next.

I had the following changes over linux/next:
https://pastebin.ubuntu.com/p/zkVm77PGdY/

I also did savedefconfig to show what all I changed in it. I faked multiple
clusters on my hikey960 board, which is not big little..

And here is the command list from history that I ran after boot.

  501  grep . /sys/devices/system/cpu/cpufreq/*/*
  502  grep . /sys/devices/system/cpu/cpufreq/*/*/*
  503  grep . /sys/devices/system/cpu/cpufreq/*/*/*
  504  grep . /sys/devices/system/cpu/cpufreq/*/*/*
  505  grep . /sys/devices/system/cpu/cpufreq/*/*/*
  506  grep . /sys/devices/system/cpu/cpufreq/*/*
  507  grep . /sys/devices/system/cpu/cpufreq/*/*
  508  echo 0 > /sys/devices/system/cpu/cpu4/online 
  509  echo 0 > /sys/devices/system/cpu/cpu5/online 
  510  echo 0 > /sys/devices/system/cpu/cpu6/online 
  511  echo 0 > /sys/devices/system/cpu/cpu7/online 
  512  grep . /sys/devices/system/cpu/cpufreq/*/*
  513  grep . /sys/devices/system/cpu/cpufreq/*/*/*
  514  grep . /sys/devices/system/cpu/cpufreq/*/*
  515  echo 1 > /sys/devices/system/cpu/cpu4/online 
  516  grep . /sys/devices/system/cpu/cpufreq/*/*
  517  grep . /sys/devices/system/cpu/cpufreq/*/*/*
  518  dmesg 

-- 
viresh

linux-next: build warning after merge of the gpio tree

2018-12-17 Thread Stephen Rothwell

Hi Linus,

After merging the gpio tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

drivers/gpio/gpiolib-acpi.c: In function 'acpi_gpio_adr_space_handler':
drivers/gpio/gpiolib-acpi.c:911:8: warning: unused variable 'err' [-Wunused-variable]
int err;
^~~

Introduced by commit

  21abf103818a ("gpio: Pass a flag to gpiochip_request_own_desc()")

-- 
Cheers,
Stephen Rothwell



Re: [PATCH net-next] fou: Prevent unbounded recursion in GUE error handler

2018-12-17 Thread David Miller

From: Stefano Brivio 
Date: Tue, 18 Dec 2018 00:13:17 +0100

> Handling exceptions for direct UDP encapsulation in GUE (that is,
> UDP-in-UDP) leads to unbounded recursion in the GUE exception handler,
> syzbot reported.
> 
> While draft-ietf-intarea-gue-06 doesn't explicitly forbid direct
> encapsulation of UDP in GUE, it probably doesn't make sense to set up GUE
> this way, and it's currently not even possible to configure this.
> 
> Skip exception handling if the GUE proto/ctype field is set to the UDP
> protocol number. Should we need to handle exceptions for UDP-in-GUE one
> day, we might need to either explicitly set a bound for recursion, or
> implement a special iterative handling for these cases.
> 
> Reported-and-tested-by: syzbot+43f6755d1c2e62743...@syzkaller.appspotmail.com
> Fixes: b8a51b38e4d4 ("fou, fou6: ICMP error handlers for FoU and GUE")
> Signed-off-by: Stefano Brivio 

Applied, thanks.

Re: rcu_preempt caused oom

2018-12-17 Thread Paul E. McKenney

On Tue, Dec 18, 2018 at 02:46:43AM +, Zhang, Jun wrote:
> Hello, Paul
> 
> In softirq context, when current is rcu_preempt-10, rcu_gp_kthread_wake()
> doesn't wake up rcu_preempt.
> Maybe the patch below could fix it. Please help review.
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 0b760c1..98f5b40 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1697,7 +1697,7 @@ static bool rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
>   */
>  static void rcu_gp_kthread_wake(struct rcu_state *rsp)
>  {
> -   if (current == rsp->gp_kthread ||
> +   if (((current == rsp->gp_kthread) && !in_softirq()) ||

Close, but not quite.  Please see below.

> !READ_ONCE(rsp->gp_flags) ||
> !rsp->gp_kthread)
> return;
> 
> [44932.311439, 0][ rcu_preempt]  rcu_preempt-10[001] .n.. 44929.401037: rcu_grace_period: rcu_preempt 19063548 reqwait
> ..
> [44932.311517, 0][ rcu_preempt]  rcu_preempt-10[001] d.s2 44929.402234: rcu_future_grace_period: rcu_preempt 19063548 19063552 0 0 3 Startleaf
> [44932.311536, 0][ rcu_preempt]  rcu_preempt-10[001] d.s2 44929.402237: rcu_future_grace_period: rcu_preempt 19063548 19063552 0 0 3 Startedroot

Good catch!  If the rcu_preempt kthread had just entered the function
swait_event_idle_exclusive(), which had just called __swait_event_idle()
which had just called ___swait_event(), which had just gotten done
checking the "condition", then yes, the rcu_preempt kthread could
sleep forever.  This is a very narrow race window, but that matches
your experience with its not happening often -- and my experience with
it not happening at all.

However, for this to happen, the wakeup must happen within a softirq
handler that executes upon return from an interrupt that interrupted
___swait_event() just after the "if (condition)".  For this, we don't want
in_softirq() but rather in_serving_softirq(), as shown in the patch below.
The patch you have above could result in spurious wakeups, as it is
checking for bottom halves being disabled, not just executing within a
softirq handler.  Which might be better than not having enough wakeups,
but let's please try for just the right number.  ;-)

So could you please instead test the patch below?

And if it works, could I please have your Signed-off-by so that I can
queue it?  My patch is quite clearly derived from yours, after all!
And you should get credit for finding the problem and arriving at an
approximate fix, after all.

Thanx, Paul

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e9392a9d6291..b9205b40b621 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1722,7 +1722,7 @@ static bool rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
  */
 static void rcu_gp_kthread_wake(struct rcu_state *rsp)
 {
-   if (current == rsp->gp_kthread ||
+   if ((current == rsp->gp_kthread && !in_serving_softirq()) ||
!READ_ONCE(rsp->gp_flags) ||
!rsp->gp_kthread)
return;
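
To make the difference between in_softirq() and in_serving_softirq() concrete, here is a minimal userspace model of the softirq part of preempt_count. The bit layout is assumed to mirror include/linux/preempt.h of that era (softirq count in bits 8..15, with local_bh_disable() adding twice the softirq offset); every model_* name is made up for this sketch and none of it is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout: bits 8..15 of preempt_count hold the softirq count. */
#define SOFTIRQ_SHIFT		8
#define SOFTIRQ_OFFSET		(1u << SOFTIRQ_SHIFT)
#define SOFTIRQ_MASK		(0xffu << SOFTIRQ_SHIFT)
#define SOFTIRQ_DISABLE_OFFSET	(2 * SOFTIRQ_OFFSET)

static unsigned int model_preempt_count;	/* stand-in for preempt_count() */

/* True while a softirq handler runs OR while bottom halves are merely disabled. */
bool model_in_softirq(void)
{
	return (model_preempt_count & SOFTIRQ_MASK) != 0;
}

/* True only while a softirq handler is actually running. */
bool model_in_serving_softirq(void)
{
	return (model_preempt_count & SOFTIRQ_OFFSET) != 0;
}

void model_local_bh_disable(void) { model_preempt_count += SOFTIRQ_DISABLE_OFFSET; }
void model_local_bh_enable(void)  { model_preempt_count -= SOFTIRQ_DISABLE_OFFSET; }
void model_softirq_enter(void)    { model_preempt_count += SOFTIRQ_OFFSET; }
void model_softirq_exit(void)     { model_preempt_count -= SOFTIRQ_OFFSET; }

void model_softirq_demo(void)
{
	/* Task context with bottom halves disabled: no handler is running. */
	model_local_bh_disable();
	assert(model_in_softirq());		/* this test would allow a spurious self-wakeup */
	assert(!model_in_serving_softirq());	/* this test still skips the wakeup */
	model_local_bh_enable();

	/* A softirq handler running on return from an interrupt. */
	model_softirq_enter();
	assert(model_in_softirq());
	assert(model_in_serving_softirq());
	model_softirq_exit();

	assert(!model_in_softirq() && !model_in_serving_softirq());
}
```

Under these assumptions, the in_softirq() test fires in both situations, while in_serving_softirq() fires only in the handler case that the race needs.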

RE: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND Flash Controller

2018-12-17 Thread Naga Sureshkumar Relli

Hi Miquel,

> -Original Message-
> From: Miquel Raynal [mailto:miquel.ray...@bootlin.com]
> Sent: Monday, December 17, 2018 10:11 PM
> To: Naga Sureshkumar Relli 
> Cc: Boris Brezillon ; r...@kernel.org; 
> rich...@nod.at; linux-
> ker...@vger.kernel.org; marek.va...@gmail.com; linux-...@lists.infradead.org;
> nagasures...@gmail.com; Michal Simek ;
> computersforpe...@gmail.com; dw...@infradead.org
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for 
> Arasan
> NAND Flash Controller
> 
> Hi Naga,
> 
> [...]
> 
> > Inserted biterror @ 48/7
> > Successfully corrected 25 bit errors per subpage
> > Inserted biterror @ 50/7
> > ECC failure, invalid data despite read success
> > root@xilinx-zc1751-dc2-2018_1:~#
> >
> > But even in this case also, driver is saying ECC failure but read success.
> > That means controller is able to detect errors on read page up to 24 bit only.
> > After that there is no way to say to the upper layers that the page is bad
> > because of the limitation in the controller.
> 
> This is more than a "limitation", the design is broken. I am not sure how to
> support such controller, and I am not sure if we even want to.

The number of errors that are correctable is limited by a parameter 't' (the
total number of errors). If the number of errors is greater than 't', the
controller won't be able to detect that.
I guess this concept is the same for other controllers as well.
In Arasan it is limited to 24 bits.

Even in the case of Hamming, it is 1-bit error correction and 2-bit error
detection.
What will happen if there are multiple errors (more than 2 bits)?

Thanks,
Naga Sureshkumar Relli
> 
> > Could you please suggest any alternative to report the errors in that case?
> 
> Shall we support the controller without the hw ECC engine? Boris, any thoughts?
> 
> 
> Thanks,
> Miquèl
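
The question of what a Hamming code does with more than two bit errors can be illustrated with a toy extended Hamming(8,4) code (single-error correction, double-error detection). This is a minimal sketch for illustration only: it is not the Arasan controller's ECC engine and not code from the NAND core, and all names are invented here. It shows that three flipped bits can look exactly like one, so the decoder reports a correction while returning wrong data.

```c
#include <assert.h>
#include <stdint.h>

enum ecc_status { ECC_OK, ECC_CORRECTED, ECC_UNCORRECTABLE };

/* XOR of the positions (1..7) of all set bits; 0 for a valid codeword. */
static unsigned int hamming_syndrome(uint8_t cw)
{
	unsigned int pos, s = 0;

	for (pos = 1; pos < 8; pos++)
		if ((cw >> pos) & 1)
			s ^= pos;
	return s;
}

/* Parity over all 8 bits; 0 for a valid codeword. */
static unsigned int overall_parity(uint8_t cw)
{
	cw ^= cw >> 4;
	cw ^= cw >> 2;
	cw ^= cw >> 1;
	return cw & 1;
}

uint8_t secded_encode(uint8_t nibble)
{
	unsigned int cw = 0, s;

	/* Data bits live at positions 3, 5, 6 and 7. */
	cw |= ((nibble >> 0) & 1u) << 3;
	cw |= ((nibble >> 1) & 1u) << 5;
	cw |= ((nibble >> 2) & 1u) << 6;
	cw |= ((nibble >> 3) & 1u) << 7;

	/* Check bits at positions 1, 2 and 4 cancel the data syndrome. */
	s = hamming_syndrome((uint8_t)cw);
	cw |= ((s >> 0) & 1u) << 1;
	cw |= ((s >> 1) & 1u) << 2;
	cw |= ((s >> 2) & 1u) << 4;

	/* Bit 0 makes the overall parity even. */
	cw |= overall_parity((uint8_t)cw);
	return (uint8_t)cw;
}

enum ecc_status secded_decode(uint8_t cw, uint8_t *nibble)
{
	unsigned int s = hamming_syndrome(cw);
	enum ecc_status status = ECC_OK;

	if (overall_parity(cw)) {
		/* Odd number of flips: assumed to be exactly one, at position s
		 * (s == 0 means the overall parity bit itself). */
		cw ^= (uint8_t)(1u << s);
		status = ECC_CORRECTED;
	} else if (s) {
		/* Even number of flips with a nonzero syndrome: detected only. */
		return ECC_UNCORRECTABLE;
	}

	*nibble = (uint8_t)(((cw >> 3) & 1) | (((cw >> 5) & 1) << 1) |
			    (((cw >> 6) & 1) << 2) | (((cw >> 7) & 1) << 3));
	return status;
}

void secded_demo(void)
{
	uint8_t out = 0, cw = secded_encode(0xA);

	assert(secded_decode(cw, &out) == ECC_OK && out == 0xA);
	/* One flipped bit is corrected. */
	assert(secded_decode(cw ^ 0x20, &out) == ECC_CORRECTED && out == 0xA);
	/* Two flipped bits are detected but not corrected. */
	assert(secded_decode(cw ^ 0x22, &out) == ECC_UNCORRECTABLE);
	/* Three flipped bits (positions 1, 2, 4) look like a single error:
	 * a correction is reported, yet the data is wrong. */
	assert(secded_decode(cw ^ 0x16, &out) == ECC_CORRECTED && out != 0xA);
}
```

In this toy code, error counts beyond the design strength are not reliably reported as failures, which is the behaviour the question above is asking about.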

[PATCH] kbuild: fix false positive warning/error about missing libelf

2018-12-17 Thread Masahiro Yamada

For the same reason as commit 25896d073d8a ("x86/build: Fix compiler
support check for CONFIG_RETPOLINE"), you cannot put this $(error ...)
into the parse stage of the top Makefile.

Perhaps I'd propose a more sophisticated solution later, but this is
the best I can do for now.

Link: https://lkml.org/lkml/2017/12/25/211
Reported-by: Paul Gortmaker 
Reported-by: Bernd Edlinger 
Reported-by: Qian Cai 
Cc: Josh Poimboeuf 
Signed-off-by: Masahiro Yamada 
---

 Makefile | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index 56d5270..d45856f 100644
--- a/Makefile
+++ b/Makefile
@@ -962,11 +962,6 @@ ifdef CONFIG_STACK_VALIDATION
   ifeq ($(has_libelf),1)
 objtool_target := tools/objtool FORCE
   else
-ifdef CONFIG_UNWINDER_ORC
-  $(error "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel")
-else
-  $(warning "Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel")
-endif
 SKIP_STACK_VALIDATION := 1
 export SKIP_STACK_VALIDATION
   endif
@@ -1125,6 +1120,14 @@ uapi-asm-generic:
 
 PHONY += prepare-objtool
 prepare-objtool: $(objtool_target)
+ifeq ($(SKIP_STACK_VALIDATION),1)
+ifdef CONFIG_UNWINDER_ORC
+   @echo "error: Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel" >&2
+   @false
+else
+   @echo "warning: Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel" >&2
+endif
+endif
 
 # Generate some files
 # ---
-- 
2.7.4

[PATCH] Fix mm->owner point to a tsk that has been free

2018-12-17 Thread gchen . guomin

From: guomin chen 

When mm->owner is modified by exit_mm(), if the new owner directly calls
unuse_mm() and then exits, a use-after-free results, because unuse_mm()
directly sets tsk->mm = NULL.

 Under normal circumstances, when do_exit() runs, mm->owner is
 updated in exit_mm(). But when a kernel process calls
 unuse_mm() and then exits, mm->owner cannot be updated, and it
 will point to a task that has been released.

The current issue flow is as follows: (Process A,B,C use the same mm)
Process C                 Process A                 Process B
qemu-system-x86_64:       kernel:vhost_net          kernel: vhost_net
open /dev/vhost-net
  VHOST_SET_OWNER         create kthread vhost-%d   create kthread vhost-%d
  network init            use_mm()                  use_mm()
   ...                    ...
   Abnormal exited
   ...
  do_exit
  exit_mm()
  update mm->owner to A
  exit_files()
   close_files()
   kthread_should_stop()  unuse_mm()
    Stop Process A        tsk->mm=NULL
                          do_exit()
                          can't update owner
                          A exit completed          vhost-%d  rcv first package
                                                    vhost-%d build rcv buffer for vq
                                                    page fault
                                                    access mm & mm->owner
                                                    NOW,mm->owner still pointer A
                                                    kernel UAF
    stop Process B

Although I hit this issue on vhost_net, it affects all users of
unuse_mm().

Cc: "Eric W. Biederman" 
Cc: Andrew Morton 
Cc: "Luis R. Rodriguez" 
Cc: Dominik Brodowski 
Cc: Arnd Bergmann 
Cc: linux-kernel@vger.kernel.org
Cc: linux...@kvack.org
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Christoph Hellwig 
Signed-off-by: guomin chen 
---
 mm/mmu_context.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/mmu_context.c b/mm/mmu_context.c
index 3e612ae..9eb81aa 100644
--- a/mm/mmu_context.c
+++ b/mm/mmu_context.c
@@ -60,5 +60,6 @@ void unuse_mm(struct mm_struct *mm)
/* active_mm is still 'mm' */
enter_lazy_tlb(mm, tsk);
task_unlock(tsk);
+   mm_update_next_owner(mm);
 }
 EXPORT_SYMBOL_GPL(unuse_mm);
-- 
1.8.3.1

Re: [PATCH v2 1/7] dt-bindings: soc: qcom: Add remote-pid binding for GLINK SMEM

2018-12-17 Thread Sibi Sankar


Hi Doug,
Thanks for the review :)

On 2018-12-18 05:29, Doug Anderson wrote:
> Hi,
>
> On Mon, Dec 17, 2018 at 2:07 AM Sibi Sankar wrote:
>>
>> Add missing qcom,remote-pid dt binding required for GLINK SMEM
>> which specifies the remote endpoint of the GLINK edge.
>>
>> Signed-off-by: Sibi Sankar
>> ---
>>  Documentation/devicetree/bindings/soc/qcom/qcom,glink.txt | 5 +
>>  1 file changed, 5 insertions(+)
>
> Fixes: 2b41d6c8e696 ("dt-bindings: soc: qcom: Extend GLINK to cover SMEM")
>
>> diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,glink.txt b/Documentation/devicetree/bindings/soc/qcom/qcom,glink.txt
>> index 0b8cc533ca83..59ae603ba520 100644
>> --- a/Documentation/devicetree/bindings/soc/qcom/qcom,glink.txt
>> +++ b/Documentation/devicetree/bindings/soc/qcom/qcom,glink.txt
>> @@ -21,6 +21,11 @@ edge.
>> Definition: should specify the IRQ used by the remote processor to
>> signal this processor about communication related events
>>
>> +- qcom,remote-pid:
>> +   Usage: required for glink-smem
>> +   Value type: 
>> +   Definition: specifies the identfier of the remote endpoint of this edge
>
> s/identfier/identifier/

missed this, will correct it in v3.

> Other than the typo this seems right to me.  Feel free to add my
> Reviewed-by tag when that's fixed.
>
> -Doug

--
-- Sibi Sankar --
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.

Re: [PATCH] CIFS: return correct errors when pinning memory failed for direct I/O

2018-12-17 Thread Steve French

merged into cifs-2.6.git for-next

On Sun, Dec 16, 2018 at 4:44 PM Long Li  wrote:
>
> From: Long Li 
>
> When pinning memory failed, we should return the correct error code and
> rewind the SMB credits.
>
> Reported-by: Murphy Zhou 
> Signed-off-by: Long Li 
> Cc: sta...@vger.kernel.org
> Cc: Murphy Zhou 
> ---
>  fs/cifs/file.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index c9bc56b..3467351 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -2630,6 +2630,9 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
> result, from->type,
> from->iov_offset, from->count);
> dump_stack();
> +
> +   rc = result;
> +   add_credits_and_wake_if(server, credits, 0);
> break;
> }
> cur_len = (size_t)result;
> @@ -3313,13 +3316,16 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
> cur_len, &start);
> if (result < 0) {
> cifs_dbg(VFS,
> -   "couldn't get user pages (cur_len=%zd)"
> +   "couldn't get user pages (rc=%zd)"
> " iter type %d"
> " iov_offset %zd count %zd\n",
> result, direct_iov.type,
> direct_iov.iov_offset,
> direct_iov.count);
> dump_stack();
> +
> +   rc = result;
> +   add_credits_and_wake_if(server, credits, 0);
> break;
> }
> cur_len = (size_t)result;
> --
> 2.7.4
>


-- 
Thanks,

Steve

Re: [PATCH] CIFS: use the correct length when pinning memory for direct I/O for write

2018-12-17 Thread Steve French

merged into cifs-2.6.git for-next

On Sun, Dec 16, 2018 at 5:18 PM Long Li  wrote:
>
> From: Long Li 
>
> The current code attempts to pin memory using the largest possible wsize
> based on the current SMB credits. This doesn't cause a kernel oops, but it is
> not optimal as we may pin more pages than actually needed.
>
> Fix this by only pinning what is needed for doing this write I/O.
>
> Signed-off-by: Long Li 
> Cc: sta...@vger.kernel.org
> ---
>  fs/cifs/file.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index 3467351..c23bf9d 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -2617,11 +2617,13 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
> if (rc)
> break;
>
> +   cur_len = min_t(const size_t, len, wsize);
> +
> if (ctx->direct_io) {
> ssize_t result;
>
> result = iov_iter_get_pages_alloc(
> -   from, &pagevec, wsize, &start);
> +   from, &pagevec, cur_len, &start);
> if (result < 0) {
> cifs_dbg(VFS,
> "direct_writev couldn't get user 
> pages "
> --
> 2.7.4
>


-- 
Thanks,

Steve
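
The effect of clamping the request to min(len, wsize) before pinning can be sketched with simple page arithmetic. This is a userspace illustration under stated assumptions: a 4096-byte page, invented helper names, and an upper-bound view of how many pages a request of a given size may touch; it does not call or reimplement iov_iter_get_pages_alloc().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* Bytes to handle in this iteration: never more than the negotiated wsize. */
size_t sketch_chunk_len(size_t remaining, size_t wsize)
{
	return remaining < wsize ? remaining : wsize;
}

/* Pages touched by nbytes that begin at byte offset start within the first page. */
size_t sketch_pages_spanned(size_t start, size_t nbytes)
{
	if (nbytes == 0)
		return 0;
	return (start + nbytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

void sketch_pin_demo(void)
{
	size_t wsize = 1024 * 1024, remaining = 6000, start = 100;

	/* Asking for wsize bytes may touch up to 257 pages... */
	assert(sketch_pages_spanned(start, wsize) == 257);
	/* ...while the 6000 bytes actually being written need only 2. */
	assert(sketch_pages_spanned(start, sketch_chunk_len(remaining, wsize)) == 2);
}
```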

Re: [PATCH] ima: cleanup the match_token policy code

2018-12-17 Thread Mimi Zohar

On Tue, 2018-12-18 at 04:06 +, Al Viro wrote:
> On Mon, Dec 17, 2018 at 10:00:07PM -0500, Mimi Zohar wrote:
> 
> > Could you expand on commit 5b2ea6199614 ("selinux: switch away from
> > match_token()") patch description.  All that it says is "It's not a
> > good fit, unfortunately, and the next step will make it even less so.
> > Open-code what we need here."  And there's even less for the
> > equivalent Smack patch, which just says "same issue as with
> > selinux...".
> 
> match_token() would require messing around with strsep() or something
> equivalent.  It's not a regex; foo=%s has no idea that comma is in any
> way special, etc.
> 
> As for the next commit...  Killing the Cthulhu-awful mess in
> selinux_sb_eat_lsm_opts() (allocating two temporaries, concatenating
> (comma-separated) non-LSM options into one, concatenating (pipe-separated)
> dequoted LSM options into another, then splitting that another by '|'
> instances and figuring out which option each piece is, etc.)
> is a Good Thing(tm).  And having to dance around the needs of
> match_token() adds extra headache, for no good reason.

Ok, so it is this particular combination of things, not the general
usage of strsep or match_token that you're objecting to.  So fixing
the other match_token non-LSM instances is fine.

To prevent the enumeration and the match_table from going out of sync,
I was thinking about defining a macro to create the match_table_t:

#define __policy_tokens_match(ENUM, str) {Opt_ ## ENUM, #str},

static const match_table_t policy_tokens = {
    __policy_tokens_id(__policy_tokens_match)
};

and the enumeration:

enum policy_tokens_id {
    __policy_tokens_id(__policy_tokens_enumify)
};

Mimi
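
The idea of generating both the enumeration and the match table from a single list is the usual X-macro pattern. Below is a self-contained userspace sketch of it. The token list, the struct (a stand-in for the kernel's struct match_token) and the lookup helper are assumptions made for illustration and are not the proposed IMA code; unlike the snippet above, the pattern is passed as a string literal instead of being stringified.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token list; each X(name, pattern) entry is written exactly once. */
#define POLICY_TOKENS(X)			\
	X(measure,	"measure")		\
	X(dont_measure,	"dont_measure")		\
	X(appraise,	"appraise")		\
	X(audit,	"audit")

#define TOKEN_ENUMIFY(name, pattern)	Opt_##name,
#define TOKEN_MATCH(name, pattern)	{ Opt_##name, pattern },

enum policy_token_id {
	POLICY_TOKENS(TOKEN_ENUMIFY)
	Opt_err			/* always last: doubles as the entry count */
};

/* Userspace stand-in for the kernel's struct match_token. */
struct sketch_match_token {
	int token;
	const char *pattern;
};

static const struct sketch_match_token policy_tokens[] = {
	POLICY_TOKENS(TOKEN_MATCH)
	{ Opt_err, NULL }
};

/* Exact-string lookup only; the kernel's match_token() also handles %s and %d. */
int sketch_policy_lookup(const char *s)
{
	size_t i;

	for (i = 0; policy_tokens[i].pattern; i++)
		if (strcmp(policy_tokens[i].pattern, s) == 0)
			return policy_tokens[i].token;
	return Opt_err;
}

void sketch_policy_demo(void)
{
	assert(sketch_policy_lookup("audit") == Opt_audit);
	assert(sketch_policy_lookup("bogus") == Opt_err);
	/* One table row per enumerator, plus the terminator. */
	assert(sizeof(policy_tokens) / sizeof(policy_tokens[0]) == (size_t)Opt_err + 1);
}
```

Because both expansions come from POLICY_TOKENS, adding or removing an entry changes the enum and the table together, which is the property being asked for.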

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Andy Lutomirski

On Mon, Dec 17, 2018 at 7:27 PM Jarkko Sakkinen wrote:
>
> On Tue, Dec 18, 2018 at 03:39:18AM +0200, Jarkko Sakkinen wrote:
> > On Mon, Dec 17, 2018 at 02:20:48PM -0800, Sean Christopherson wrote:
> > > The only potential hiccup I can see is the build flow.  Currently,
> > > EADD+EEXTEND is done via a work queue to avoid major performance issues
> > > (10x regression) when userspace is building multiple enclaves in parallel
> > > using goroutines to wrap Cgo (the issue might apply to any M:N scheduler,
> > > but I've only confirmed the Golang case).  The issue is that allocating
> > > an EPC page acts like a blocking syscall when the EPC is under pressure,
> > > i.e. an EPC page isn't immediately available.  This causes Go's scheduler
> > > to thrash and tank performance[1].
> >
> > I don't see any major issues having that kthread. All the code that
> > maps the enclave would be removed.
> >
> > I would only allow to map enclave to process address space after the
> > enclave has been initialized i.e. SGX_IOC_ENCLAVE_ATTACH.
>
> Some refined thoughts.
>
> PTE insertion can be done in the #PF handler. In fact, we can PoC this
> already with the current architecture (and I will right after sending
> v18).
>
> The backing space is a bit more nasty issue in the add pager thread.
> The previous shmem swapping would have been a better fit. Maybe that
> should be reconsidered?
>
> If shmem was used, all the commits up to "SGX Enclave Driver" could
> be reworked to the new model.
>
> When we think about the swapping code, some difficulties arise.
> Namely, when a page is swapped, the enclave must unmap the PTE from all
> processes that have it mapped.

That's what unmap_mapping_range(), etc do for you, no?  IOW make a
struct address_space that represents the logical enclave address
space, i.e. address 0 is the start and the pages count up from there.
You can unmap pages whenever you want, and the core mm code will take
care of zapping the pages from all vmas referencing that
address_space.

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Andy Lutomirski

On Mon, Dec 17, 2018 at 2:20 PM Sean Christopherson
 wrote:
>

> My brain is still sorting out the details, but I generally like the idea
> of allocating an anon inode when creating an enclave, and exposing the
> other ioctls() via the returned fd.  This is essentially the approach
> used by KVM to manage multiple "layers" of ioctls across KVM itself, VMs
> and vCPUS.  There are even similarities to accessing physical memory via
> multiple disparate domains, e.g. host kernel, host userspace and guest.
>

In my mind, opening /dev/sgx would give you the requisite inode.  I'm
not 100% sure that the chardev infrastructure allows this, but I think
it does.

> The only potential hiccup I can see is the build flow.  Currently,
> EADD+EEXTEND is done via a work queue to avoid major performance issues
> (10x regression) when userspace is building multiple enclaves in parallel
> using goroutines to wrap Cgo (the issue might apply to any M:N scheduler,
> but I've only confirmed the Golang case).  The issue is that allocating
> an EPC page acts like a blocking syscall when the EPC is under pressure,
> i.e. an EPC page isn't immediately available.  This causes Go's scheduler
> to thrash and tank performance[1].

What's the issue, and how does a workqueue help?  I'm wondering if a
nicer solution would be an ioctl to add lots of pages in a single
call.

>
> Alternatively, we could change the EADD+EEXTEND flow to not insert the
> added page's PFN into the owner's process space, i.e. force userspace to
> fault when it runs the enclave.  But that only delays the issue because
> eventually we'll want to account EPC pages, i.e. add a cgroup, at which
> point we'll likely need current->mm anyways.

You should be able to account the backing pages to a cgroup without
actually sticking them into the EPC, no?  Or am I misunderstanding?  I
guess we'll eventually want a cgroup to limit use of the limited EPC
resources.

Re: [Regression 4.15] Can't kill CONFIG_UNWINDER_ORC with fire or plague.

2018-12-17 Thread Masahiro Yamada

On Mon, Dec 17, 2018 at 6:45 AM Paul Gortmaker
 wrote:
>
> [Re: [Regression 4.15] Can't kill CONFIG_UNWINDER_ORC with fire or plague.] 
> On 29/12/2017 (Fri 13:18) Paul Gortmaker wrote:
>
> > [Re: [Regression 4.15] Can't kill CONFIG_UNWINDER_ORC with fire or plague.] 
> > On 29/12/2017 (Fri 10:47) Josh Poimboeuf wrote:
> >
> > > This seems to be related to a kconfig quirk where only silentoldconfig
> > > updates the include/config/auto.conf file.  The other config targets
> > > (oldconfig, defconfig, etc) don't touch it.  It seems intentional, but I
> > > have no idea why.
> > >
> > > That causes the Makefile to get stale data for 'CONFIG_*' variables when
> > > it includes auto.conf.  So I don't think this is specific to the ORC
> > > check.  It seems like it could also cause bugs elsewhere.
> >
> > OK, good - you agree with my initial diagnosis of stale auto.conf then.
> > Not sure what Randy was testing when he said he couldn't reproduce it.
> >
> > > The below (ugly) patch fixes it, though I'm not sure this is the best
> > > way to do it.  We probably need Masahiro or Michal to chime in here.
> >
> > Yep, that is why I intentionally put the kbuild folks on the To/Cc of
> > the original report (and ran away screaming at the prospect of debugging
> > Makefiles on xmas day).  But with holidays and all, it might not be
> > until early January before they have a chance to reply.
>
> It is nearly a year later and we still have this false positive.
>
> paul@sm:~/git/linux-head$ make -j12 > /dev/null
> Makefile:966: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y,
>  please install libelf-dev, libelf-devel or elfutils-libelf-devel".  Stop.
> paul@sm:~/git/linux-head$ grep UNWINDER_ORC .config
> # CONFIG_UNWINDER_ORC is not set
>
> We do know a bit more now -- the auto.conf issue has been independently
> confirmed and "fixed" for other subsystems/issues since, like RETPOLINE:
>
>  -
>   commit 25896d073d8a0403b07e6dec56f58e6c33678207
>   Author: Masahiro Yamada 
>   Date:   Wed Dec 5 15:27:19 2018 +0900
>
> x86/build: Fix compiler support check for CONFIG_RETPOLINE
>
> It is troublesome to add a diagnostic like this to the Makefile
> parse stage because the top-level Makefile could be parsed with
> a stale include/config/auto.conf.
>
> Once you are hit by the error about non-retpoline compiler, the
> compilation still breaks even after disabling CONFIG_RETPOLINE.
>  -
>
> I'm not sure if we want to treat this on a per config option each time
> again and again, or undertake a more global kbuild approach, but it does
> warrant a mention and a re-examination before we "solve" this again.


I did not notice this thread
(perhaps it fell through the cracks during the holidays),
but I actually tried to fix this twice in a sophisticated way
in the past.


The first attempt (https://patchwork.kernel.org/patch/10516049/)
was rejected by Josh Poimboeuf.

The second one (https://patchwork.kernel.org/patch/10643245/)
turned out not to work as expected.


Now I am preparing a third attempt,
but it will take time to review.

What I can do now is a fix-up similar to commit 25896d073d8.

I will post a cheesy fix-up patch.


--
Best Regards
Masahiro Yamada

Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-17 Thread Andy Lutomirski

On Mon, Dec 17, 2018 at 5:39 PM Jarkko Sakkinen
 wrote:
>
> On Mon, Dec 17, 2018 at 02:20:48PM -0800, Sean Christopherson wrote:
> > The only potential hiccup I can see is the build flow.  Currently,
> > EADD+EEXTEND is done via a work queue to avoid major performance issues
> > (10x regression) when userspace is building multiple enclaves in parallel
> > using goroutines to wrap Cgo (the issue might apply to any M:N scheduler,
> > but I've only confirmed the Golang case).  The issue is that allocating
> > an EPC page acts like a blocking syscall when the EPC is under pressure,
> > i.e. an EPC page isn't immediately available.  This causes Go's scheduler
> > to thrash and tank performance[1].
>
> I don't see any major issues having that kthread. All the code that
> maps the enclave would be removed.
>
> I would only allow to map enclave to process address space after the
> enclave has been initialized i.e. SGX_IOC_ENCLAVE_ATTACH.
>

What's SGX_IOC_ENCLAVE_ATTACH?  Why would it be needed at all?  I
would imagine that all pages would be faulted in as needed (or
prefaulted as an optimization) and the enclave would just work in any
process.

[PATCH v4 2/2] trace nvme submit queue status

2018-12-17 Thread yupeng

Export the nvme disk name, queue id, sq_head and sq_tail to a trace event.
Usage example:
go to the event directory:
cd /sys/kernel/debug/tracing/events/nvme/nvme_sq
filter by disk name:
echo 'disk=="nvme1n1"' > filter
enable the event:
echo 1 > enable
check results from trace_pipe:
cat /sys/kernel/debug/tracing/trace_pipe
In practice, this patch helped me debug a hardware-related
performance issue.

Signed-off-by: yupeng 
---
 drivers/nvme/host/pci.c   |  5 +
 drivers/nvme/host/trace.c |  2 ++
 drivers/nvme/host/trace.h | 21 +
 3 files changed, 28 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index c33bb201b884..52df2f7fef37 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 
+#include "trace.h"
 #include "nvme.h"
 
 #define SQ_SIZE(depth) (depth * sizeof(struct nvme_command))
@@ -899,6 +900,10 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx)
}
 
req = blk_mq_tag_to_rq(*nvmeq->tags, cqe->command_id);
+   trace_nvme_sq(req->rq_disk,
+   nvmeq->qid,
+   le16_to_cpu(cqe->sq_head),
+   nvmeq->sq_tail);
nvme_end_request(req, cqe->status, cqe->result);
 }
 
diff --git a/drivers/nvme/host/trace.c b/drivers/nvme/host/trace.c
index 8ca7079ed2bc..7bfaace23e1e 100644
--- a/drivers/nvme/host/trace.c
+++ b/drivers/nvme/host/trace.c
@@ -142,3 +142,5 @@ const char *nvme_trace_disk_name(struct trace_seq *p, char *name)
return ret;
 }
 EXPORT_SYMBOL_GPL(nvme_trace_disk_name);
+
+EXPORT_TRACEPOINT_SYMBOL(nvme_sq);
diff --git a/drivers/nvme/host/trace.h b/drivers/nvme/host/trace.h
index 196d5bd56718..3606cd7000f4 100644
--- a/drivers/nvme/host/trace.h
+++ b/drivers/nvme/host/trace.h
@@ -184,6 +184,27 @@ TRACE_EVENT(nvme_async_event,
 
 #undef aer_name
 
+TRACE_EVENT(nvme_sq,
+   TP_PROTO(void *rq_disk, int qid, int sq_head, int sq_tail),
+   TP_ARGS(rq_disk, qid, sq_head, sq_tail),
+   TP_STRUCT__entry(
+   __array(char, disk, DISK_NAME_LEN)
+   __field(int, qid)
+   __field(int, sq_head)
+   __field(int, sq_tail)
+   ),
+   TP_fast_assign(
+   __assign_disk_name(__entry->disk, rq_disk);
+   __entry->qid = qid;
+   __entry->sq_head = sq_head;
+   __entry->sq_tail = sq_tail;
+   ),
+   TP_printk("nvme: %s qid=%d head=%d tail=%d",
+   __print_disk_name(__entry->disk),
+   __entry->qid, __entry->sq_head, __entry->sq_tail
+   )
+);
+
 #endif /* _TRACE_NVME_H */
 
 #undef TRACE_INCLUDE_PATH
-- 
2.17.1

[PATCH 2/2] arm64: dts: ls1088: add missing dma-coherent property in fsl-mc

2018-12-17 Thread Nipun Gupta

Signed-off-by: Nipun Gupta 
---
 arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
index dec0c2d..b8e31a1 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
@@ -577,6 +577,7 @@
  <0x 0x0834 0 0x4>; /* MC control 
reg */
msi-parent = <&its>;
iommu-map = <0 &smmu 0 0>;  /* This is fixed-up by 
u-boot */
+   dma-coherent;
#address-cells = <3>;
#size-cells = <1>;
 
-- 
1.9.1

[PATCH v4 1/2] export trace.c helper functions to other modules

2018-12-17 Thread yupeng

Export the following three functions:
nvme_trace_parse_admin_cmd
nvme_trace_parse_nvm_cmd
nvme_trace_disk_name
so that any other module which depends on nvme-core can use the trace
events in trace.h.

Signed-off-by: yupeng 
---
 drivers/nvme/host/trace.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/host/trace.c b/drivers/nvme/host/trace.c
index 25b0e310f4a8..8ca7079ed2bc 100644
--- a/drivers/nvme/host/trace.c
+++ b/drivers/nvme/host/trace.c
@@ -113,6 +113,7 @@ const char *nvme_trace_parse_admin_cmd(struct trace_seq *p,
return nvme_trace_common(p, cdw10);
}
 }
+EXPORT_SYMBOL_GPL(nvme_trace_parse_admin_cmd);
 
 const char *nvme_trace_parse_nvm_cmd(struct trace_seq *p,
 u8 opcode, u8 *cdw10)
@@ -128,6 +129,7 @@ const char *nvme_trace_parse_nvm_cmd(struct trace_seq *p,
return nvme_trace_common(p, cdw10);
}
 }
+EXPORT_SYMBOL_GPL(nvme_trace_parse_nvm_cmd);
 
 const char *nvme_trace_disk_name(struct trace_seq *p, char *name)
 {
@@ -139,3 +141,4 @@ const char *nvme_trace_disk_name(struct trace_seq *p, char *name)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(nvme_trace_disk_name);
-- 
2.17.1

[PATCH 1/2] arm64: dts: ls1088: add smmu device node

2018-12-17 Thread Nipun Gupta

This patch also adds the iommu-map property in fsl-mc node, so
that fsl-mc can use iommu.

Signed-off-by: Nipun Gupta 
---
These patches are based over:
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux.git,
as there are couple of changes related to fsl-mc bus in this tree:
https://lore.kernel.org/patchwork/patch/1021020/

 arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 92 +-
 1 file changed, 91 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
index de93b42..dec0c2d 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
@@ -576,6 +576,7 @@
reg = <0x0008 0x0c00 0 0x40>,/* MC portal 
base */
  <0x 0x0834 0 0x4>; /* MC control 
reg */
msi-parent = <&its>;
+   iommu-map = <0 &smmu 0 0>;  /* This is fixed-up by 
u-boot */
#address-cells = <3>;
#size-cells = <1>;
 
@@ -641,6 +642,96 @@
};
};
};
+
+   smmu: iommu@500 {
+   compatible = "arm,mmu-500";
+   reg = <0 0x500 0 0x80>;
+   #iommu-cells = <1>;
+   stream-match-mask = <0x7C00>;
+   #global-interrupts = <12>;
+// global secure fault
+   interrupts = ,
+// combined secure
+,
+// global non-secure fault
+,
+// combined non-secure
+,
+// performance counter interrupts 0-7
+,
+,
+,
+,
+,
+,
+,
+,
+// per context interrupt, 64 interrupts
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+;
+   };
};
 
firmware {
@@ -649,5 +740,4 @@
method = "smc";

Re: [PATCH v2 0/3] x86: kprobes: Show correct blaclkist in debugfs

2018-12-17 Thread Masami Hiramatsu

On Mon, 17 Dec 2018 16:47:13 +0100
Andrea Righi  wrote:

> On Mon, Dec 17, 2018 at 05:20:25PM +0900, Masami Hiramatsu wrote:
> > This is v2 series for showing correct kprobe blacklist in
> > debugfs.
> > 
> > v1 is here:
> > 
> >  https://lkml.org/lkml/2018/12/7/517
> > 
> > I split the RFC v1 patch into x86 and generic parts,
> > also added a patch to remove unneeded arch-specific
> > blacklist check function (because those have been added
> > to the generic blacklist.)
> > 
> > If this style is good, I will make another series for the
> > archs which have own arch_within_kprobe_blacklist(), and
> > eventually replace that with arch_populate_kprobe_blacklist()
> > so that user can get the correct kprobe blacklist in debugfs.
> > 
> > Thank you,
> 
> Looks good to me. Thanks!
> 
> Tested-by: Andrea Righi 

Thank you for testing!

> 
> Side question: there are certain symbols in arch/x86/xen that should be
> blacklisted explicitly, because they're non-attachable.
> 
> More exactly, all functions defined in arch/x86/xen/spinlock.c,
> arch/x86/xen/time.c and arch/x86/xen/irq.c.
> 
> The reason is that these files are compiled without -pg to allow the
> usage of ftrace within a Xen domain apparently (from
> arch/x86/xen/Makefile):
> 
>  ifdef CONFIG_FUNCTION_TRACER
>  # Do not profile debug and lowlevel utilities
>  CFLAGS_REMOVE_spinlock.o = -pg
>  CFLAGS_REMOVE_time.o = -pg
>  CFLAGS_REMOVE_irq.o = -pg
>  endif

Actually, the reason why you cannot probe those functions via
tracing/kprobe_events is just a side effect. You can probe them if you
write a kprobe module. Since kprobe_events depends on some ftrace
tracing functions, it sometimes causes a recursive call problem. To avoid
this issue, I introduced CONFIG_KPROBE_EVENTS_ON_NOTRACE, see
commit 45408c4f9250 ("tracing: kprobes: Prohibit probing on notrace function").

If you set CONFIG_KPROBE_EVENTS_ON_NOTRACE=y, you can continue putting probes
on Xen spinlock functions too.

> Do you see a nice and clean way to blacklist all these functions
> (something like arch_populate_kprobe_blacklist()), or should we just
> flag all of them explicitly with NOKPROBE_SYMBOL()?

As I pointed out, you can probe them via your own kprobe module. Like systemtap,
you can still probe them. The blacklist is for "kprobes", not for "kprobe_events".
(Those used to be the same, but since the above commit they are different.)

I think the sanest solution is to identify which (combination of) functions
in ftrace (kernel/trace/*) cause the problem, mark those with NOKPROBE_SYMBOL(),
and remove CONFIG_KPROBE_EVENTS_ON_NOTRACE.

Thank you,

-- 
Masami Hiramatsu

Re: [PATCH] Export mm_update_next_owner function for unuse_mm.

2018-12-17 Thread Michael S. Tsirkin

On Tue, Dec 18, 2018 at 11:42:11AM +0800, gchen.guo...@gmail.com wrote:
> From: guomin chen 
> 
> When mm->owner is modified by exit_mm() and the new owner then calls
> unuse_mm() directly before exiting, a use-after-free results, because
> unuse_mm() directly sets tsk->mm=NULL.
> 
> Under normal circumstances, when do_exit() runs, mm->owner is
> updated in exit_mm(). But when a kernel thread calls
> unuse_mm() and then exits, mm->owner cannot be updated, and it
> is left pointing to a task that has been released.
> The current issue flow is as follows:
> Process C  Process A Process B
> qemu-system-x86_64: kernel:vhost_net  kernel: vhost_net
> open /dev/vhost-net
>   VHOST_SET_OWNER   create kthread vhost-%d  create kthread vhost-%d
>   network init   use_mm()  use_mm()
>...   ...
>Abnormal exited
>...
>   do_exit
>   exit_mm()
>   update mm->owner to A
>   exit_files()
>close_files()
>kthread_should_stop() unuse_mm()
> Stop Process A   tsk->mm=NULL
>  do_exit()
>  can't update owner
>  A exit completed  vhost-%d  rcv first package
>vhost-%d build rcv buffer for vq
>page fault
>access mm & mm->owner
>NOW,mm->owner still pointer A
>kernel UAF
> stop Process B
> 
> Although I hit this issue with vhost_net, it affects all users of
> unuse_mm.
> 
> Cc: "Eric W. Biederman" 
> Cc: Andrew Morton 
> Cc: "Luis R. Rodriguez" 
> Cc: Dominik Brodowski 
> Cc: Arnd Bergmann 
> Cc: linux-kernel@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> Cc: Christoph Hellwig 
> Signed-off-by: guomin chen 
> ---
>  kernel/exit.c| 1 +
>  mm/mmu_context.c | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 0e21e6d..9e046dd 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -486,6 +486,7 @@ void mm_update_next_owner(struct mm_struct *mm)
>   task_unlock(c);
>   put_task_struct(c);
>  }
> +EXPORT_SYMBOL(mm_update_next_owner);
>  #endif /* CONFIG_MEMCG */
>  
>  /*

So why export it? Is that still needed?

> diff --git a/mm/mmu_context.c b/mm/mmu_context.c
> index 3e612ae..9eb81aa 100644
> --- a/mm/mmu_context.c
> +++ b/mm/mmu_context.c
> @@ -60,5 +60,6 @@ void unuse_mm(struct mm_struct *mm)
>   /* active_mm is still 'mm' */
>   enter_lazy_tlb(mm, tsk);
>   task_unlock(tsk);
> + mm_update_next_owner(mm);
>  }
>  EXPORT_SYMBOL_GPL(unuse_mm);
> -- 
> 1.8.3.1

[PATCH v6 4/6] mm: Shuffle initial free memory to improve memory-side-cache utilization

2018-12-17 Thread Dan Williams

Randomization of the page allocator improves the average utilization of
a direct-mapped memory-side-cache. Memory side caching is a platform
capability that Linux has been previously exposed to in HPC
(high-performance computing) environments on specialty platforms. In
that instance it was a smaller pool of high-bandwidth-memory relative to
higher-capacity / lower-bandwidth DRAM. Now, this capability is going to
be found on general purpose server platforms where DRAM is a cache in
front of higher latency persistent memory [1].

Robert offered an explanation of the state of the art of Linux
interactions with memory-side-caches [2], and I copy it here:

It's been a problem in the HPC space:

http://www.nersc.gov/research-and-development/knl-cache-mode-performance-coe/

A kernel module called zonesort is available to try to help:
https://software.intel.com/en-us/articles/xeon-phi-software

and this abandoned patch series proposed that for the kernel:
https://lkml.org/lkml/2017/8/23/195

Dan's patch series doesn't attempt to ensure buffers won't conflict, but
also reduces the chance that the buffers will. This will make performance
more consistent, albeit slower than "optimal" (which is near impossible
to attain in a general-purpose kernel).  That's better than forcing
users to deploy remedies like:
"To eliminate this gradual degradation, we have added a Stream
 measurement to the Node Health Check that follows each job;
 nodes are rebooted whenever their measured memory bandwidth
 falls below 300 GB/s."

A replacement for zonesort was merged upstream in commit cc9aec03e58f
"x86/numa_emulation: Introduce uniform split capability". With this
numa_emulation capability, memory can be split into cache sized
("near-memory" sized) numa nodes. A bind operation to such a node, and
disabling workloads on other nodes, enables full cache performance.
However, once the workload exceeds the cache size then cache conflicts
are unavoidable. While HPC environments might be able to tolerate
time-scheduling of cache sized workloads, for general purpose server
platforms, the oversubscribed cache case will be the common case.

The worst case scenario is that a server system owner benchmarks a
workload at boot with an un-contended cache only to see that performance
degrade over time, even below the average cache performance due to
excessive conflicts. Randomization clips the peaks and fills in the
valleys of cache utilization to yield steady average performance.

Here are some performance impact details of the patches:

1/ An Intel internal synthetic memory bandwidth measurement tool saw a
3X speedup in a contrived case that tries to force cache conflicts. The
contrived case used the numa_emulation capability to force an instance
of the benchmark to be run in two of the near-memory sized numa nodes.
If both instances were placed on the same emulated node they would fit and
cause zero conflicts.  While on separate emulated nodes without
randomization they underutilized the cache and conflicted unnecessarily
due to the in-order allocation per node.

2/ A well known Java server application benchmark was run with a heap
size that exceeded cache size by 3X. The cache conflict rate was 8% for
the first run and degraded to 21% after page allocator aging. With
randomization enabled the rate levelled out at 11%.

3/ A MongoDB workload did not show a measurable difference in
cache-conflict rates, but the overall throughput dropped by 7% with
randomization in one case.

4/ Mel Gorman ran his suite of performance workloads with randomization
enabled on platforms without a memory-side-cache and saw a mix of some
improvements and some losses [3].

While there is potentially significant improvement for applications that
depend on low latency access across a wide working-set, the performance
may be negligible to negative for other workloads. For this reason the
shuffle capability defaults to off unless a direct-mapped
memory-side-cache is detected. Even then, the page_alloc.shuffle=0
parameter can be specified to disable the randomization on those
systems.

Outside of memory-side-cache utilization concerns there is potentially a
security benefit from randomization. Some data exfiltration and
return-oriented-programming attacks rely on the ability to infer the
location of sensitive data objects. The kernel page allocator,
especially early in system boot, has predictable first-in-first-out
behavior for physical pages. Pages are freed in physical address order
when first onlined.

Quoting Kees:
"While we already have a base-address randomization
 (CONFIG_RANDOMIZE_MEMORY), attacks against the same hardware and
 memory layouts would certainly be using the predictability of
 allocation ordering (i.e. for attacks where the base address isn't
 important: only the relative positions between allocated memory).
 This is common in lots of heap-style attacks. They

[PATCH v6 6/6] mm: Maintain randomization of page free lists

2018-12-17 Thread Dan Williams

When freeing a page with an order >= shuffle_page_order randomly select
the front or back of the list for insertion.

While the mm tries to defragment physical pages into huge pages this can
tend to make the page allocator more predictable over time. Inject the
front-back randomness to preserve the initial randomness established by
shuffle_free_memory() when the kernel was booted.

The overhead of this manipulation is constrained by only being applied
for MAX_ORDER sized pages by default.

Cc: Michal Hocko 
Cc: Kees Cook 
Cc: Dave Hansen 
Signed-off-by: Dan Williams 
---
 include/linux/mmzone.h  |   10 ++
 include/linux/shuffle.h |   12 
 mm/page_alloc.c |   11 +--
 mm/shuffle.c|   16 
 4 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 35cc33af87f2..338929647eea 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -98,6 +98,8 @@ extern int page_group_by_mobility_disabled;
 struct free_area {
struct list_headfree_list[MIGRATE_TYPES];
unsigned long   nr_free;
+   u64 rand;
+   u8  rand_bits;
 };
 
 /* Used for pages not on another list */
@@ -116,6 +118,14 @@ static inline void add_to_free_area_tail(struct page *page, struct free_area *ar
area->nr_free++;
 }
 
+#ifdef CONFIG_SHUFFLE_PAGE_ALLOCATOR
+/* Used to preserve page allocation order entropy */
+void add_to_free_area_random(struct page *page, struct free_area *area,
+   int migratetype);
+#else
+#define add_to_free_area_random add_to_free_area
+#endif
+
 /* Used for pages which are on another list */
 static inline void move_to_free_area(struct page *page, struct free_area *area,
 int migratetype)
diff --git a/include/linux/shuffle.h b/include/linux/shuffle.h
index a8a168919cb5..8b3941a87c2c 100644
--- a/include/linux/shuffle.h
+++ b/include/linux/shuffle.h
@@ -29,6 +29,13 @@ static inline void shuffle_zone(struct zone *z, unsigned long start_pfn,
return;
__shuffle_zone(z, start_pfn, end_pfn);
 }
+
+static inline bool is_shuffle_order(int order)
+{
+	if (!static_branch_unlikely(&page_alloc_shuffle_key))
+		return false;
+	return order >= CONFIG_SHUFFLE_PAGE_ORDER;
+}
 #else
 static inline void shuffle_free_memory(pg_data_t *pgdat, unsigned long start_pfn,
unsigned long end_pfn)
@@ -43,5 +50,10 @@ static inline void shuffle_zone(struct zone *z, unsigned long start_pfn,
 static inline void page_alloc_shuffle(void)
 {
 }
+
+static inline bool is_shuffle_order(int order)
+{
+   return false;
+}
 #endif
 #endif /* _MM_SHUFFLE_H */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index de8b5eb78d13..3a932ba23daf 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -851,7 +852,8 @@ static inline void __free_one_page(struct page *page,
 * so it's less likely to be used soon and more likely to be merged
 * as a higher order page
 */
-   if ((order < MAX_ORDER-2) && pfn_valid_within(buddy_pfn)) {
+   if ((order < MAX_ORDER-2) && pfn_valid_within(buddy_pfn)
+   && !is_shuffle_order(order)) {
struct page *higher_page, *higher_buddy;
combined_pfn = buddy_pfn & pfn;
higher_page = page + (combined_pfn - pfn);
@@ -865,7 +867,12 @@ static inline void __free_one_page(struct page *page,
}
}
 
-   add_to_free_area(page, &zone->free_area[order], migratetype);
+   if (is_shuffle_order(order))
+   add_to_free_area_random(page, &zone->free_area[order],
+   migratetype);
+   else
+   add_to_free_area(page, &zone->free_area[order], migratetype);
+
 }
 
 /*
diff --git a/mm/shuffle.c b/mm/shuffle.c
index 07961ff41a03..4cadf51c9b40 100644
--- a/mm/shuffle.c
+++ b/mm/shuffle.c
@@ -213,3 +213,19 @@ void __meminit __shuffle_free_memory(pg_data_t *pgdat, unsigned long start_pfn,
for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++)
shuffle_zone(z, start_pfn, end_pfn);
 }
+
+void add_to_free_area_random(struct page *page, struct free_area *area,
+		int migratetype)
+{
+	if (area->rand_bits == 0) {
+		area->rand_bits = 64;
+		area->rand = get_random_u64();
+	}
+
+	if (area->rand & 1)
+		add_to_free_area(page, area, migratetype);
+	else
+		add_to_free_area_tail(page, area, migratetype);
+	area->rand_bits--;
+	area->rand >>= 1;
+}
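
The front-or-back choice made by add_to_free_area_random() in the patch above
can be modeled in userspace roughly as follows. This is only a sketch: struct
rand_pool, model_random_u64() and pool_next_bit() are hypothetical names, and
rand() stands in for the kernel's get_random_u64().

```c
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-in for the rand/rand_bits pair added to struct free_area. */
struct rand_pool {
	uint64_t rand;		/* cached random bits, consumed from bit 0 up */
	uint8_t rand_bits;	/* number of unused bits left in rand */
};

/* Hypothetical entropy source; the kernel patch calls get_random_u64(). */
static uint64_t model_random_u64(void)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | (uint64_t)(rand() & 0xff);
	return v;
}

/*
 * Returns 1 for "insert at the head of the free list" and 0 for
 * "insert at the tail", refilling the pool once every 64 calls.
 */
static int pool_next_bit(struct rand_pool *p)
{
	int bit;

	if (p->rand_bits == 0) {
		p->rand_bits = 64;
		p->rand = model_random_u64();
	}

	bit = (int)(p->rand & 1);
	p->rand_bits--;
	p->rand >>= 1;
	return bit;
}
```

Caching 64 bits per free_area means the random number generator is consulted
once per 64 frees at that order rather than on every free.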

[PATCH v6 2/6] acpi: Add HMAT to generic parsing tables

2018-12-17 Thread Dan Williams

From: Keith Busch 

The HMAT table header has different field lengths than the existing
parsing uses. Add the HMAT type to the parsing rules so it may be
generically parsed.

Signed-off-by: Keith Busch 
Signed-off-by: Dan Williams 
---
 drivers/acpi/tables.c |9 +
 include/linux/acpi.h  |1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index e9643b4267c7..bc1addf715dc 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -51,6 +51,7 @@ static int acpi_apic_instance __initdata;
 
 enum acpi_subtable_type {
ACPI_SUBTABLE_COMMON,
+   ACPI_SUBTABLE_HMAT,
 };
 
 struct acpi_subtable_entry {
@@ -232,6 +233,8 @@ acpi_get_entry_type(struct acpi_subtable_entry *entry)
switch (entry->type) {
case ACPI_SUBTABLE_COMMON:
return entry->hdr->common.type;
+   case ACPI_SUBTABLE_HMAT:
+   return entry->hdr->hmat.type;
}
return 0;
 }
@@ -242,6 +245,8 @@ acpi_get_entry_length(struct acpi_subtable_entry *entry)
switch (entry->type) {
case ACPI_SUBTABLE_COMMON:
return entry->hdr->common.length;
+   case ACPI_SUBTABLE_HMAT:
+   return entry->hdr->hmat.length;
}
return 0;
 }
@@ -252,6 +257,8 @@ acpi_get_subtable_header_length(struct acpi_subtable_entry *entry)
switch (entry->type) {
case ACPI_SUBTABLE_COMMON:
return sizeof(entry->hdr->common);
+   case ACPI_SUBTABLE_HMAT:
+   return sizeof(entry->hdr->hmat);
}
return 0;
 }
@@ -259,6 +266,8 @@ acpi_get_subtable_header_length(struct acpi_subtable_entry *entry)
 static enum acpi_subtable_type __init
 acpi_get_subtable_type(char *id)
 {
+   if (strncmp(id, ACPI_SIG_HMAT, 4) == 0)
+   return ACPI_SUBTABLE_HMAT;
return ACPI_SUBTABLE_COMMON;
 }
 
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 18805a967c70..4373f5ba0f95 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -143,6 +143,7 @@ enum acpi_address_range_id {
 /* Table Handlers */
 union acpi_subtable_headers {
struct acpi_subtable_header common;
+   struct acpi_hmat_structure hmat;
 };
 
 typedef int (*acpi_tbl_table_handler)(struct acpi_table_header *table);
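
Why the HMAT needs its own parsing rule can be seen from a small userspace
model of the two header layouts. The struct and function names below are
stand-ins rather than the kernel's; the field widths are assumed to mirror the
ACPICA definitions of struct acpi_subtable_header (8-bit type and length) and
struct acpi_hmat_structure (16-bit type, 16-bit reserved, 32-bit length).

```c
#include <stdint.h>
#include <string.h>

/* Common subtable header: 1-byte type, 1-byte length. */
struct common_hdr {
	uint8_t type;
	uint8_t length;
};

/* HMAT structure header: wider type and length fields. */
struct hmat_hdr {
	uint16_t type;
	uint16_t reserved;
	uint32_t length;
};

enum subtable_type {
	SUBTABLE_COMMON,
	SUBTABLE_HMAT,
};

/* Pick the parsing rule from the 4-character table signature. */
static enum subtable_type subtable_type_for(const char *sig)
{
	if (strncmp(sig, "HMAT", 4) == 0)
		return SUBTABLE_HMAT;
	return SUBTABLE_COMMON;
}

/* Size of the header that precedes each entry's payload. */
static size_t header_length(enum subtable_type t)
{
	return t == SUBTABLE_HMAT ? sizeof(struct hmat_hdr)
				  : sizeof(struct common_hdr);
}

/* Read the entry length using the layout that matches the table type. */
static uint32_t entry_length(enum subtable_type t, const unsigned char *buf)
{
	if (t == SUBTABLE_HMAT) {
		struct hmat_hdr h;

		memcpy(&h, buf, sizeof(h));
		return h.length;
	} else {
		struct common_hdr c;

		memcpy(&c, buf, sizeof(c));
		return c.length;
	}
}
```

Reading an HMAT entry with the common layout would take the length from the
second byte of the 16-bit type field, which is why the generic parser has to
dispatch on the table signature first.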
