[PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

2018-01-05 Thread Dongjiu Geng
ARMv8.2 requires implementation of the RAS extension, which adds
the SEI (SError Interrupt) notification type. This patch adds
handling functions for the new GHES SEI error source. Parsing
and handling of this error source are similar to SEA.

Expose ghes_notify_sei() to external users. External modules
can call this API to parse the APEI table and handle the SEI
notification.

Signed-off-by: Dongjiu Geng 
---
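Reader's note: the return-value rule of ghes_notify_sei() in the hunk
below can be modelled in plain user-space C. This is only a sketch under
stated assumptions: struct sei_source, model_proc() and model_notify_sei()
are hypothetical stand-ins for struct ghes, ghes_proc() and
ghes_notify_sei(), and the RCU list walk is replaced by a simple array
loop.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ghes: one firmware-first error source. */
struct sei_source {
	int (*proc)(struct sei_source *src);	/* returns 0 when a record was handled */
	int pending;				/* model only: is a record waiting? */
};

/* Model of ghes_proc(): succeeds only if the source has a pending record. */
static int model_proc(struct sei_source *src)
{
	if (!src->pending)
		return -ENOENT;
	src->pending = 0;
	return 0;
}

/*
 * Same aggregation rule as the patch: poll every source, return 0 if at
 * least one of them handled a record, and -ENOENT otherwise.
 */
static int model_notify_sei(struct sei_source *srcs, size_t n)
{
	int ret = -ENOENT;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!srcs[i].proc(&srcs[i]))
			ret = 0;
	}
	return ret;
}
```

Every source is polled even after one has succeeded, so a single
notification can drain records from several sources.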
 drivers/acpi/apei/Kconfig | 15 ++
 drivers/acpi/apei/ghes.c  | 53 +++
 include/acpi/ghes.h       |  1 +
 3 files changed, 69 insertions(+)

diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
index 52ae543..ff4afc3 100644
--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -55,6 +55,21 @@ config ACPI_APEI_SEA
  option allows the OS to look for such hardware error record, and
  take appropriate action.
 
+config ACPI_APEI_SEI
+   bool "APEI SError(System Error) Interrupt logging/recovering support"
+   depends on ARM64 && ACPI_APEI_GHES
+   default y
+   help
+ This option should be enabled if the system supports
+ firmware first handling of SEI (SError interrupt).
+
+ SEI happens with asynchronous external abort for errors on device
+ memory reads on ARMv8 systems. If a system supports firmware first
+ handling of SEI, the platform analyzes and handles hardware error
+ notifications from SEI, and it may then form a hardware error record for
+ the OS to parse and handle. This option allows the OS to look for
+ such hardware error record, and take appropriate action.
+
 config ACPI_APEI_MEMORY_FAILURE
bool "APEI memory error recovering support"
depends on ACPI_APEI && MEMORY_FAILURE
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 6a3f824..67cd3a7 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -855,6 +855,46 @@ static inline void ghes_sea_add(struct ghes *ghes) { }
 static inline void ghes_sea_remove(struct ghes *ghes) { }
 #endif /* CONFIG_ACPI_APEI_SEA */
 
+#ifdef CONFIG_ACPI_APEI_SEI
+static LIST_HEAD(ghes_sei);
+
+/*
+ * Return 0 only if one of the SEI error sources successfully reported an error
+ * record sent from the firmware.
+ */
+int ghes_notify_sei(void)
+{
+	struct ghes *ghes;
+	int ret = -ENOENT;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ghes, &ghes_sei, list) {
+		if (!ghes_proc(ghes))
+			ret = 0;
+	}
+	rcu_read_unlock();
+	return ret;
+}
+
+static void ghes_sei_add(struct ghes *ghes)
+{
+	mutex_lock(&ghes_list_mutex);
+	list_add_rcu(&ghes->list, &ghes_sei);
+	mutex_unlock(&ghes_list_mutex);
+}
+
+static void ghes_sei_remove(struct ghes *ghes)
+{
+	mutex_lock(&ghes_list_mutex);
+	list_del_rcu(&ghes->list);
+	mutex_unlock(&ghes_list_mutex);
+	synchronize_rcu();
+}
+#else /* CONFIG_ACPI_APEI_SEI */
+static inline void ghes_sei_add(struct ghes *ghes) { }
+static inline void ghes_sei_remove(struct ghes *ghes) { }
+#endif /* CONFIG_ACPI_APEI_SEI */
+
 #ifdef CONFIG_HAVE_ACPI_APEI_NMI
 /*
  * printk is not safe in NMI context.  So in NMI handler, we allocate
@@ -1086,6 +1126,13 @@ static int ghes_probe(struct platform_device *ghes_dev)
 			goto err;
 		}
 		break;
+	case ACPI_HEST_NOTIFY_SEI:
+		if (!IS_ENABLED(CONFIG_ACPI_APEI_SEI)) {
+			pr_warn(GHES_PFX "Generic hardware error source: %d notified via SEI is not supported!\n",
+				generic->header.source_id);
+			goto err;
+		}
+		break;
 	case ACPI_HEST_NOTIFY_NMI:
 		if (!IS_ENABLED(CONFIG_HAVE_ACPI_APEI_NMI)) {
 			pr_warn(GHES_PFX "Generic hardware error source: %d notified via NMI interrupt is not supported!\n",
@@ -1158,6 +1205,9 @@ static int ghes_probe(struct platform_device *ghes_dev)
 	case ACPI_HEST_NOTIFY_SEA:
 		ghes_sea_add(ghes);
 		break;
+	case ACPI_HEST_NOTIFY_SEI:
+		ghes_sei_add(ghes);
+		break;
 	case ACPI_HEST_NOTIFY_NMI:
 		ghes_nmi_add(ghes);
 		break;
@@ -1211,6 +1261,9 @@ static int ghes_remove(struct platform_device *ghes_dev)
 	case ACPI_HEST_NOTIFY_SEA:
 		ghes_sea_remove(ghes);
 		break;
+	case ACPI_HEST_NOTIFY_SEI:
+		ghes_sei_remove(ghes);
+		break;
 	case ACPI_HEST_NOTIFY_NMI:
 		ghes_nmi_remove(ghes);
 		break;
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 8feb0c8..9ba59e2 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -120,5 +120,6 @@ static inline void *acpi_hest_get_next(struct acpi_hest_generic_data *gdata)
 section = acpi_hest_get_next(section))
 
 int 

[PATCH v9 1/7] arm64: cpufeature: Detect CPU RAS Extentions

2018-01-05 Thread Dongjiu Geng
From: Xie XiuQi 

ARM's v8.2 Extensions add support for Reliability, Availability and
Serviceability (RAS). On CPUs with these extensions, system software
can use additional barriers to isolate errors and determine if faults
are pending.

Add cpufeature detection and a barrier in the context-switch code.
There is no need to use alternatives for this as CPUs that don't
support this feature will treat the instruction as a nop.

Platform level RAS support may require additional firmware support.

Signed-off-by: Xie XiuQi 
[Rebased, added config option, reworded commit message]
Signed-off-by: James Morse 
Signed-off-by: Dongjiu Geng 
Reviewed-by: Catalin Marinas 
---
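Reader's note: the capability entry added to cpufeature.c below matches
on an unsigned 4-bit field of ID_AA64PFR0_EL1. A minimal user-space
sketch of that check follows. The two macro values are taken from the
sysreg.h hunk; the helper names extract_unsigned_field() and
has_ras_extn() are hypothetical, and the "field >= minimum" comparison
is my understanding of how has_cpuid_feature behaves for FTR_UNSIGNED
fields, not something shown in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define ID_AA64PFR0_RAS_SHIFT	28	/* from the sysreg.h hunk */
#define ID_AA64PFR0_RAS_V1	0x1	/* from the sysreg.h hunk */

/* Extract the unsigned 4-bit ID register field that starts at @shift. */
static unsigned int extract_unsigned_field(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}

/* Assumed rule: the feature is present when the field is >= the minimum. */
static bool has_ras_extn(uint64_t id_aa64pfr0)
{
	return extract_unsigned_field(id_aa64pfr0, ID_AA64PFR0_RAS_SHIFT) >=
	       ID_AA64PFR0_RAS_V1;
}
```

With this rule a register value whose RAS field is 0 reports no RAS
support, while 1 or any larger value reports support.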
 arch/arm64/Kconfig               | 16 
 arch/arm64/include/asm/cpucaps.h |  3 ++-
 arch/arm64/include/asm/sysreg.h  |  2 ++
 arch/arm64/kernel/cpufeature.c   | 13 +
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0df64a6..cc00d10 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -973,6 +973,22 @@ config ARM64_PMEM
  operations if DC CVAP is not supported (following the behaviour of
  DC CVAP itself if the system does not define a point of persistence).
 
+config ARM64_RAS_EXTN
+   bool "Enable support for RAS CPU Extensions"
+   default y
+   help
+ CPUs that support the Reliability, Availability and Serviceability
+ (RAS) Extensions, part of ARMv8.2 are able to track faults and
+ errors, classify them and report them to software.
+
+ On CPUs with these extensions system software can use additional
+ barriers to determine if faults are pending and read the
+ classification from a new set of registers.
+
+ Selecting this feature will allow the kernel to use these barriers
+ and access the new registers if the system supports the extension.
+ Platform RAS features may additionally depend on firmware support.
+
 endmenu
 
 config ARM64_MODULE_CMODEL_LARGE
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 8da6216..4820d44 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -40,7 +40,8 @@
 #define ARM64_WORKAROUND_858921			19
 #define ARM64_WORKAROUND_CAVIUM_30115		20
 #define ARM64_HAS_DCPOP				21
+#define ARM64_HAS_RAS_EXTN			22
 
-#define ARM64_NCAPS				22
+#define ARM64_NCAPS				23
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index f707fed..64e2a80 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -332,6 +332,7 @@
 #define ID_AA64ISAR1_DPB_SHIFT 0
 
 /* id_aa64pfr0 */
+#define ID_AA64PFR0_RAS_SHIFT  28
 #define ID_AA64PFR0_GIC_SHIFT  24
 #define ID_AA64PFR0_ASIMD_SHIFT	20
 #define ID_AA64PFR0_FP_SHIFT   16
@@ -340,6 +341,7 @@
 #define ID_AA64PFR0_EL1_SHIFT  4
 #define ID_AA64PFR0_EL0_SHIFT  0
 
+#define ID_AA64PFR0_RAS_V1 0x1
 #define ID_AA64PFR0_FP_NI  0xf
 #define ID_AA64PFR0_FP_SUPPORTED   0x0
 #define ID_AA64PFR0_ASIMD_NI   0xf
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 21e2c95..4846974 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -125,6 +125,7 @@ static int __init register_cpu_hwcaps_dumper(void)
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_RAS_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_GIC_SHIFT, 4, 0),
 	S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
 	S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI),
@@ -900,6 +901,18 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
.min_field_value = 1,
},
 #endif
+#ifdef CONFIG_ARM64_RAS_EXTN
+	{
+		.desc = "RAS Extension Support",
+		.capability = ARM64_HAS_RAS_EXTN,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64PFR0_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64PFR0_RAS_SHIFT,
+		.min_field_value = ID_AA64PFR0_RAS_V1,
+	},
+#endif /* CONFIG_ARM64_RAS_EXTN */
{},
 };
 
-- 
1.9.1



[PATCH v9 7/7] arm64: kvm: handle guest SError Interrupt by categorization

2018-01-05 Thread Dongjiu Geng
If the SError is not a RAS SError, inject a virtual SError directly,
which keeps the existing behaviour. Otherwise, let the host ACPI
kernel driver handle it first. If the ACPI handling fails, KVM goes
on to categorize the error by ESR_ELx.

A recoverable error (UER) has not been silently propagated and has
not (yet) been architecturally consumed by the PE, so the exception
is precise. To keep things simple, we temporarily shut down the VM
to isolate the error.

Signed-off-by: Dongjiu Geng 
---
Changes since v8:
1. Check handle_guest_sei()'s return value
2. Temporarily shut down the VM to isolate the error for the
   recoverable error (UER)
3. Remove some unused macro definitions
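
Reader's note: the categorization step described above can be sketched
as a pure function in user-space C. The AET and FSC_SERROR values are
copied from the esr.h hunk of this patch. The names ESR_ELx_IDS,
enum sei_action and classify_sei() are hypothetical, the 0x3F mask is
the value I believe the kernel uses for ESR_ELx_FSC, and SEI_FATAL is
only a placeholder because the default branch of the patch is cut off
in this archive.

```c
#include <stdint.h>

#define ESR_ELx_IDS		(UINT32_C(1) << 24)	/* bit 24; the patch tests it via ESR_ELx_ISV */
#define ESR_ELx_FSC		UINT32_C(0x3F)		/* assumed mask value */
#define ESR_ELx_FSC_SERROR	UINT32_C(0x11)
#define ESR_ELx_AET_SHIFT	10
#define ESR_ELx_AET		(UINT32_C(0x7) << ESR_ELx_AET_SHIFT)
#define ESR_ELx_AET_UEO		(UINT32_C(2) << ESR_ELx_AET_SHIFT)	/* restartable */
#define ESR_ELx_AET_UER		(UINT32_C(3) << ESR_ELx_AET_SHIFT)	/* recoverable */
#define ESR_ELx_AET_CE		(UINT32_C(6) << ESR_ELx_AET_SHIFT)	/* corrected */

/* Hypothetical outcome codes, one per branch of the handler. */
enum sei_action {
	SEI_INJECT_VABT,	/* pass a virtual SError to the guest */
	SEI_RESUME_GUEST,	/* nothing to do, keep running the guest */
	SEI_EXIT_TO_USER,	/* stop the VM and report to user space */
	SEI_FATAL		/* placeholder for the truncated default branch */
};

/* Classify an SError syndrome after host handling has already failed. */
static enum sei_action classify_sei(uint32_t esr)
{
	/* IMPLEMENTATION DEFINED or uncategorized syndrome: inject directly. */
	if ((esr & ESR_ELx_IDS) || (esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR)
		return SEI_INJECT_VABT;

	switch (esr & ESR_ELx_AET) {
	case ESR_ELx_AET_CE:
	case ESR_ELx_AET_UEO:
		return SEI_RESUME_GUEST;
	case ESR_ELx_AET_UER:
		return SEI_EXIT_TO_USER;
	default:
		return SEI_FATAL;
	}
}
```

The sketch deliberately leaves out the two earlier steps of the handler
(the RAS capability check and the call to handle_guest_sei()), since
both depend on kernel state.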
---
 arch/arm64/include/asm/esr.h         | 11 ++
 arch/arm64/include/asm/system_misc.h |  1 +
 arch/arm64/kvm/handle_exit.c         | 68 +---
 arch/arm64/mm/fault.c                | 16 +
 4 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 66ed8b6..a751e86 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -102,6 +102,7 @@
 #define ESR_ELx_FSC_ACCESS (0x08)
 #define ESR_ELx_FSC_FAULT  (0x04)
 #define ESR_ELx_FSC_PERM   (0x0C)
+#define ESR_ELx_FSC_SERROR (0x11)
 
 /* ISS field definitions for Data Aborts */
 #define ESR_ELx_ISV_SHIFT  (24)
@@ -119,6 +120,16 @@
 #define ESR_ELx_CM_SHIFT   (8)
 #define ESR_ELx_CM (UL(1) << ESR_ELx_CM_SHIFT)
 
+/* ISS field definitions for SError interrupt */
+#define ESR_ELx_AET_SHIFT	(10)
+#define ESR_ELx_AET		(UL(0x7) << ESR_ELx_AET_SHIFT)
+/* Restartable error */
+#define ESR_ELx_AET_UEO		(UL(2) << ESR_ELx_AET_SHIFT)
+/* Recoverable error */
+#define ESR_ELx_AET_UER		(UL(3) << ESR_ELx_AET_SHIFT)
+/* Corrected error */
+#define ESR_ELx_AET_CE		(UL(6) << ESR_ELx_AET_SHIFT)
+
 /* ISS field definitions for exceptions taken in to Hyp */
 #define ESR_ELx_CV (UL(1) << 24)
 #define ESR_ELx_COND_SHIFT (20)
diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h
index 07aa8e3..9ee13ad 100644
--- a/arch/arm64/include/asm/system_misc.h
+++ b/arch/arm64/include/asm/system_misc.h
@@ -57,6 +57,7 @@ void hook_debug_fault_code(int nr, int (*fn)(unsigned long, unsigned int,
 })
 
 int handle_guest_sea(phys_addr_t addr, unsigned int esr);
+int handle_guest_sei(void);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 7debb74..5b806d4 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
@@ -178,6 +179,67 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu)
return arm_exit_handlers[hsr_ec];
 }
 
+/**
+ * kvm_handle_guest_sei - handle an SError interrupt or asynchronous abort
+ * @vcpu:	the VCPU pointer
+ * @run:	access to the kvm_run structure for results
+ *
+ * For a RAS SError interrupt, let the host kernel handle it first. If that
+ * fails, categorize the error by the ESR.
+ */
+static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	unsigned int esr = kvm_vcpu_get_hsr(vcpu);
+	bool impdef_syndrome = esr & ESR_ELx_ISV;	/* aka IDS */
+	unsigned int aet = esr & ESR_ELx_AET;
+
+	/* Without the RAS extension this is not a RAS SError. */
+	if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
+		kvm_inject_vabt(vcpu);
+		return 1;
+	}
+
+	/* For RAS the host kernel may handle this abort. */
+	if (!handle_guest_sei())
+		return 1;
+
+	/*
+	 * Inject the virtual SError directly in either of these cases:
+	 * 1. The syndrome is IMPLEMENTATION DEFINED
+	 * 2. It is an uncategorized SEI
+	 */
+	if (impdef_syndrome ||
+	    ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR)) {
+		kvm_inject_vabt(vcpu);
+		return 1;
+	}
+
+	switch (aet) {
+	case ESR_ELx_AET_CE:	/* corrected error */
+	case ESR_ELx_AET_UEO:	/* restartable error, not yet consumed */
+		return 1;	/* continue processing the guest exit */
+	case ESR_ELx_AET_UER:	/* recoverable error */
+		/*
+		 * The exception is precise: the error has not been silently
+		 * propagated and has not been consumed by the CPU. Temporarily
+		 * shut down the VM to isolate the error so that it is not
+		 * touched again.
+		 */
+		run->exit_reason = KVM_EXIT_EXCEPTION;
+		return 0;
+	default:
+		/*
+		 * Until now, the CPU supports RAS, SError interrupt is fatal
+		 * and host does not successfully handle it.
+

[PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

2018-01-05 Thread Dongjiu Geng
The ARM64 RAS SError Interrupt (SEI) syndrome value is specific to the
guest, and user space needs a way to tell KVM this value, so we add a
new ioctl. Before user space specifies the Exception Syndrome Register
(ESR) value, it first checks whether KVM has the capability to set the
guest ESR. If it does, user space sets the value; otherwise there is
nothing to do.

Specifying the ESR is supported only for AArch64, not for AArch32.

Signed-off-by: Dongjiu Geng 
---
change the name to KVM_CAP_ARM_INJECT_SERROR_ESR instead of
X_ARM_RAS_EXTENSION, suggested here

https://patchwork.kernel.org/patch/9925203/
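
Reader's note: the api.txt text in this patch says user space may set
only the ISS field of the syndrome, not the EC field. A minimal sketch
of what that constraint means is given below. This is not the
validation KVM performs (kvm_arm_set_sei_esr() is only a stub returning
-EINVAL in this patch); check_user_syndrome() and compose_guest_esr()
are hypothetical helpers, and the bit layout (ISS in bits [24:0], IL in
bit 25, EC in bits [31:26], SError EC value 0x2F) is my reading of the
ARMv8 ESR_ELx format.

```c
#include <errno.h>
#include <stdint.h>

#define ESR_ISS_MASK	UINT32_C(0x01FFFFFF)	/* ISS: bits [24:0] */
#define ESR_IL		(UINT32_C(1) << 25)	/* instruction length bit */
#define ESR_EC_SHIFT	26			/* EC: bits [31:26] */
#define ESR_EC_SERROR	UINT32_C(0x2F)		/* exception class for SError */

/* Accept a user-supplied syndrome only if it stays inside the ISS field. */
static int check_user_syndrome(uint32_t syndrome)
{
	return (syndrome & ~ESR_ISS_MASK) ? -EINVAL : 0;
}

/*
 * Model of the value the guest would read from ESR_EL1: EC and IL are
 * fixed by the exception itself, only the ISS comes from user space.
 */
static uint32_t compose_guest_esr(uint32_t iss)
{
	return (ESR_EC_SERROR << ESR_EC_SHIFT) | ESR_IL | (iss & ESR_ISS_MASK);
}
```

For example, an ISS of 0x11 would be seen by the guest as 0xBE000011
under these assumptions, while a value with any EC bit set is rejected.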
---
 Documentation/virtual/kvm/api.txt | 11 +++
 arch/arm/include/asm/kvm_host.h   |  1 +
 arch/arm/kvm/guest.c              |  9 +
 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/kvm/guest.c            |  5 +
 arch/arm64/kvm/reset.c            |  3 +++
 include/uapi/linux/kvm.h          |  3 +++
 virt/kvm/arm/arm.c                |  7 +++
 8 files changed, 40 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index e63a35f..6dfb9fc 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4347,3 +4347,14 @@ This capability indicates that userspace can load HV_X64_MSR_VP_INDEX msr.  Its
 value is used to denote the target vcpu for a SynIC interrupt.  For
 compatibilty, KVM initializes this msr to KVM's internal vcpu index.  When this
 capability is absent, userspace can still query this msr's value.
+
+8.13 KVM_CAP_ARM_SET_SERROR_ESR
+
+Architectures: arm, arm64
+
+This capability indicates that userspace can specify the syndrome value reported
+to the guest OS when the guest takes a virtual SError interrupt exception.
+If KVM has this capability, userspace can specify only the ISS field of the ESR
+syndrome; it cannot specify the EC field, which is not under KVM's control.
+If this virtual SError is taken to EL1 using AArch64, this value is reported
+in the ISS field of ESR_EL1.
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 4a879f6..6cf5c7b 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -211,6 +211,7 @@ struct kvm_vcpu_stat {
 int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
 int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
 int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome);
 unsigned long kvm_call_hyp(void *hypfn, ...);
 void force_vm_exit(const cpumask_t *mask);
 
diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index 1e0784e..1e15fa2 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -248,6 +248,15 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
return -EINVAL;
 }
 
+/*
+ * Specifying the guest SError syndrome is supported only on ARM64,
+ * not on ARM32.
+ */
+int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome)
+{
+   return -EINVAL;
+}
+
 int __attribute_const__ kvm_target_cpu(void)
 {
switch (read_cpuid_part()) {
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e923b58..769cc58 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -317,6 +317,7 @@ struct kvm_vcpu_stat {
 int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
 int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
 int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome);
 
 #define KVM_ARCH_WANT_MMU_NOTIFIER
 int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 5c7f657..738ae90 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -277,6 +277,11 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
return -EINVAL;
 }
 
+int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome)
+{
+   return -EINVAL;
+}
+
 int __attribute_const__ kvm_target_cpu(void)
 {
unsigned long implementor = read_cpuid_implementor();
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 3256b92..38c8a64 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -77,6 +77,9 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_ARM_PMU_V3:
r = kvm_arm_support_pmu_v3();
break;
+   case KVM_CAP_ARM_INJECT_SERROR_ESR:
+   r = cpus_have_const_cap(ARM64_HAS_RAS_EXTN);
+   break;
case KVM_CAP_SET_GUEST_DEBUG:
case KVM_CAP_VCPU_ATTRIBUTES:
r = 1;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 7e9..0c861c4 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -931,6 +931,7 @@ struct kvm_ppc_resize_hpt {
 

[PATCH v9 1/7] arm64: cpufeature: Detect CPU RAS Extentions

2018-01-05 Thread Dongjiu Geng
From: Xie XiuQi 

ARM's v8.2 Extentions add support for Reliability, Availability and
Serviceability (RAS). On CPUs with these extensions system software
can use additional barriers to isolate errors and determine if faults
are pending.

Add cpufeature detection and a barrier in the context-switch code.
There is no need to use alternatives for this as CPUs that don't
support this feature will treat the instruction as a nop.

Platform level RAS support may require additional firmware support.

Signed-off-by: Xie XiuQi 
[Rebased, added config option, reworded commit message]
Signed-off-by: James Morse 
Signed-off-by: Dongjiu Geng 
Reviewed-by: Catalin Marinas 
---
 arch/arm64/Kconfig   | 16 
 arch/arm64/include/asm/cpucaps.h |  3 ++-
 arch/arm64/include/asm/sysreg.h  |  2 ++
 arch/arm64/kernel/cpufeature.c   | 13 +
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0df64a6..cc00d10 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -973,6 +973,22 @@ config ARM64_PMEM
  operations if DC CVAP is not supported (following the behaviour of
  DC CVAP itself if the system does not define a point of persistence).
 
+config ARM64_RAS_EXTN
+   bool "Enable support for RAS CPU Extensions"
+   default y
+   help
+ CPUs that support the Reliability, Availability and Serviceability
+ (RAS) Extensions, part of ARMv8.2 are able to track faults and
+ errors, classify them and report them to software.
+
+ On CPUs with these extensions system software can use additional
+ barriers to determine if faults are pending and read the
+ classification from a new set of registers.
+
+ Selecting this feature will allow the kernel to use these barriers
+ and access the new registers if the system supports the extension.
+ Platform RAS features may additionally depend on firmware support.
+
 endmenu
 
 config ARM64_MODULE_CMODEL_LARGE
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 8da6216..4820d44 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -40,7 +40,8 @@
 #define ARM64_WORKAROUND_858921        19
 #define ARM64_WORKAROUND_CAVIUM_30115  20
 #define ARM64_HAS_DCPOP                21
+#define ARM64_HAS_RAS_EXTN             22
 
-#define ARM64_NCAPS                    22
+#define ARM64_NCAPS                    23
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index f707fed..64e2a80 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -332,6 +332,7 @@
 #define ID_AA64ISAR1_DPB_SHIFT 0
 
 /* id_aa64pfr0 */
+#define ID_AA64PFR0_RAS_SHIFT  28
 #define ID_AA64PFR0_GIC_SHIFT  24
 #define ID_AA64PFR0_ASIMD_SHIFT        20
 #define ID_AA64PFR0_FP_SHIFT   16
@@ -340,6 +341,7 @@
 #define ID_AA64PFR0_EL1_SHIFT  4
 #define ID_AA64PFR0_EL0_SHIFT  0
 
+#define ID_AA64PFR0_RAS_V1 0x1
 #define ID_AA64PFR0_FP_NI  0xf
 #define ID_AA64PFR0_FP_SUPPORTED   0x0
 #define ID_AA64PFR0_ASIMD_NI   0xf
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 21e2c95..4846974 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -125,6 +125,7 @@ static int __init register_cpu_hwcaps_dumper(void)
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
+   ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_RAS_SHIFT, 4, 0),
    ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_GIC_SHIFT, 4, 0),
    S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
    S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI),
@@ -900,6 +901,18 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
.min_field_value = 1,
},
 #endif
+#ifdef CONFIG_ARM64_RAS_EXTN
+   {
+   .desc = "RAS Extension Support",
+   .capability = ARM64_HAS_RAS_EXTN,
+   .def_scope = SCOPE_SYSTEM,
+   .matches = has_cpuid_feature,
+   .sys_reg = SYS_ID_AA64PFR0_EL1,
+   .sign = FTR_UNSIGNED,
+   .field_pos = ID_AA64PFR0_RAS_SHIFT,
+   .min_field_value = ID_AA64PFR0_RAS_V1,
+   },
+#endif /* CONFIG_ARM64_RAS_EXTN */
{},
 };
 
-- 
1.9.1
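
The capability entry added to cpufeature.c above matches on an unsigned 4-bit field of ID_AA64PFR0_EL1. As a rough user-space illustration of that comparison (not the kernel's has_cpuid_feature() itself), the sketch below extracts the field at ID_AA64PFR0_RAS_SHIFT and compares it with ID_AA64PFR0_RAS_V1. The two constants are copied from the patch; id_field() and has_ras_extn() are hypothetical helper names introduced only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from the sysreg.h hunk of the patch. */
#define ID_AA64PFR0_RAS_SHIFT 28
#define ID_AA64PFR0_RAS_V1    0x1

/* Hypothetical helper: extract an unsigned 4-bit ID register field. */
static inline unsigned int id_field(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}

/*
 * Hypothetical helper: an unsigned feature field matches when its value
 * is at least the minimum required value.
 */
static inline bool has_ras_extn(uint64_t id_aa64pfr0)
{
	return id_field(id_aa64pfr0, ID_AA64PFR0_RAS_SHIFT) >= ID_AA64PFR0_RAS_V1;
}
```

A register value with the RAS field set to 0 would not match, while 1 or any larger value would.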



[PATCH v9 4/7] KVM: arm64: Trap RAS error registers and set HCR_EL2's TERR & TEA

2018-01-05 Thread Dongjiu Geng
ARMv8.2 adds a new bit HCR_EL2.TEA which routes synchronous external
aborts to EL2, and adds a trap control bit HCR_EL2.TERR which traps
all Non-secure EL1&0 error record accesses to EL2.

This patch enables the two bits for the guest OS, guaranteeing that
KVM takes external aborts and traps attempts to access the physical
error registers.

ERRIDR_EL1 advertises the number of error records. We return zero,
which means we can treat all the other registers as RAZ/WI too.

Signed-off-by: Dongjiu Geng 
[removed specific emulation, use trap_raz_wi() directly for everything,
 rephrased parts of the commit message]
Signed-off-by: James Morse 
---
 arch/arm64/include/asm/kvm_arm.h |  2 ++
 arch/arm64/include/asm/kvm_emulate.h |  7 +++
 arch/arm64/include/asm/sysreg.h  | 10 ++
 arch/arm64/kvm/sys_regs.c| 10 ++
 4 files changed, 29 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 61d694c..1188272 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -23,6 +23,8 @@
 #include 
 
 /* Hyp Configuration Register (HCR) bits */
+#define HCR_TEA    (UL(1) << 37)
+#define HCR_TERR   (UL(1) << 36)
 #define HCR_E2H    (UL(1) << 34)
 #define HCR_ID     (UL(1) << 33)
 #define HCR_CD     (UL(1) << 32)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index e5df3fc..555b28b 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -47,6 +47,13 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
if (is_kernel_in_hyp_mode())
vcpu->arch.hcr_el2 |= HCR_E2H;
+   if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
+   /* route synchronous external abort exceptions to EL2 */
+   vcpu->arch.hcr_el2 |= HCR_TEA;
+   /* trap error record accesses */
+   vcpu->arch.hcr_el2 |= HCR_TERR;
+   }
+
if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features))
vcpu->arch.hcr_el2 &= ~HCR_RW;
 }
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 64e2a80..47b967d 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -169,6 +169,16 @@
 #define SYS_AFSR0_EL1          sys_reg(3, 0, 5, 1, 0)
 #define SYS_AFSR1_EL1          sys_reg(3, 0, 5, 1, 1)
 #define SYS_ESR_EL1            sys_reg(3, 0, 5, 2, 0)
+
+#define SYS_ERRIDR_EL1         sys_reg(3, 0, 5, 3, 0)
+#define SYS_ERRSELR_EL1        sys_reg(3, 0, 5, 3, 1)
+#define SYS_ERXFR_EL1          sys_reg(3, 0, 5, 4, 0)
+#define SYS_ERXCTLR_EL1        sys_reg(3, 0, 5, 4, 1)
+#define SYS_ERXSTATUS_EL1      sys_reg(3, 0, 5, 4, 2)
+#define SYS_ERXADDR_EL1        sys_reg(3, 0, 5, 4, 3)
+#define SYS_ERXMISC0_EL1       sys_reg(3, 0, 5, 5, 0)
+#define SYS_ERXMISC1_EL1       sys_reg(3, 0, 5, 5, 1)
+
 #define SYS_FAR_EL1            sys_reg(3, 0, 6, 0, 0)
 #define SYS_PAR_EL1            sys_reg(3, 0, 7, 4, 0)
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 2e070d3..2b1fafa 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -953,6 +953,16 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu,
{ SYS_DESC(SYS_AFSR0_EL1), access_vm_reg, reset_unknown, AFSR0_EL1 },
{ SYS_DESC(SYS_AFSR1_EL1), access_vm_reg, reset_unknown, AFSR1_EL1 },
{ SYS_DESC(SYS_ESR_EL1), access_vm_reg, reset_unknown, ESR_EL1 },
+
+   { SYS_DESC(SYS_ERRIDR_EL1), trap_raz_wi },
+   { SYS_DESC(SYS_ERRSELR_EL1), trap_raz_wi },
+   { SYS_DESC(SYS_ERXFR_EL1), trap_raz_wi },
+   { SYS_DESC(SYS_ERXCTLR_EL1), trap_raz_wi },
+   { SYS_DESC(SYS_ERXSTATUS_EL1), trap_raz_wi },
+   { SYS_DESC(SYS_ERXADDR_EL1), trap_raz_wi },
+   { SYS_DESC(SYS_ERXMISC0_EL1), trap_raz_wi },
+   { SYS_DESC(SYS_ERXMISC1_EL1), trap_raz_wi },
+
{ SYS_DESC(SYS_FAR_EL1), access_vm_reg, reset_unknown, FAR_EL1 },
{ SYS_DESC(SYS_PAR_EL1), NULL, reset_unknown, PAR_EL1 },
 
-- 
1.9.1
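
Two ideas from this patch can be modelled in a few lines of user-space C: OR-ing the TEA and TERR bits into the guest's HCR_EL2 value when the RAS extension is present, and handling every trapped error-record register as RAZ/WI (reads return zero, writes are ignored). This is only a sketch under stated assumptions. The bit positions come from the kvm_arm.h hunk above, while hcr_with_ras_traps(), struct reg_access and raz_wi() are hypothetical names for this example and are not KVM's vcpu_reset_hcr() or trap_raz_wi().

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in the kvm_arm.h hunk of the patch. */
#define HCR_TEA   (UINT64_C(1) << 37)
#define HCR_TERR  (UINT64_C(1) << 36)

/*
 * Hypothetical stand-in for the RAS part of vcpu_reset_hcr(): set both
 * trap bits only when the RAS extension was detected.
 */
static inline uint64_t hcr_with_ras_traps(uint64_t hcr, bool has_ras_extn)
{
	if (has_ras_extn)
		hcr |= HCR_TEA | HCR_TERR;
	return hcr;
}

/* Hypothetical, simplified model of one trapped register access. */
struct reg_access {
	bool is_write;
	uint64_t regval; /* value written by the guest, or returned to it */
};

/* RAZ/WI: reads return zero, writes are silently dropped. */
static inline bool raz_wi(struct reg_access *p)
{
	if (!p->is_write)
		p->regval = 0;
	return true; /* the access was handled */
}
```

Because a read of the modelled ERRIDR_EL1 returns zero, a guest following the architecture would conclude there are no error records and has no reason to expect meaningful values from the other registers.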



[PATCH v9 2/7] KVM: arm64: Save ESR_EL2 on guest SError

2018-01-05 Thread Dongjiu Geng
From: James Morse 

When we exit a guest due to an SError the vcpu fault info isn't updated
with the ESR. Today this is only done for traps.

The v8.2 RAS Extensions define ISS values for SError. Update the vcpu's
fault_info with the ESR on SError so that handle_exit() can determine
if this was a RAS SError and decode its severity.

Signed-off-by: James Morse 
---
 arch/arm64/kvm/hyp/switch.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 945e79c..fb5a538 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -228,11 +228,12 @@ static bool __hyp_text __translate_far_to_hpfar(u64 far, u64 *hpfar)
 
 static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
 {
-   u64 esr = read_sysreg_el2(esr);
-   u8 ec = ESR_ELx_EC(esr);
+   u8 ec;
+   u64 esr;
u64 hpfar, far;
 
-   vcpu->arch.fault.esr_el2 = esr;
+   esr = vcpu->arch.fault.esr_el2;
+   ec = ESR_ELx_EC(esr);
 
if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
return true;
@@ -313,6 +314,8 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
exit_code = __guest_enter(vcpu, host_ctxt);
/* And we're baaack! */
 
+   if (ARM_EXCEPTION_CODE(exit_code) != ARM_EXCEPTION_IRQ)
+   vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
/*
 * We're using the raw exception code in order to only process
 * the trap if no SError is pending. We will come back to the
-- 
1.9.1
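
For readers unfamiliar with the syndrome layout, the sketch below shows how an exception class (EC) can be pulled out of a saved ESR value and used to tell an SError apart from a lower-EL instruction or data abort, which is the distinction __populate_fault_info() relies on. The EC field position and the three EC values follow the arm64 ESR_ELx encoding. The helper names esr_is_serror() and esr_needs_fault_addresses() are assumptions made for this example and do not appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* ESR_ELx exception class field: bits [31:26]. */
#define ESR_ELx_EC_SHIFT     26
#define ESR_ELx_EC_MASK      (UINT64_C(0x3F) << ESR_ELx_EC_SHIFT)
#define ESR_ELx_EC(esr)      (((esr) & ESR_ELx_EC_MASK) >> ESR_ELx_EC_SHIFT)

#define ESR_ELx_EC_IABT_LOW  0x20 /* instruction abort from a lower EL */
#define ESR_ELx_EC_DABT_LOW  0x24 /* data abort from a lower EL */
#define ESR_ELx_EC_SERROR    0x2F /* SError interrupt */

/* Hypothetical helper: was the saved syndrome produced by an SError? */
static inline bool esr_is_serror(uint64_t esr)
{
	return ESR_ELx_EC(esr) == ESR_ELx_EC_SERROR;
}

/*
 * Hypothetical helper mirroring the early return in
 * __populate_fault_info(): only guest aborts need FAR/HPFAR decoding.
 */
static inline bool esr_needs_fault_addresses(uint64_t esr)
{
	uint64_t ec = ESR_ELx_EC(esr);

	return ec == ESR_ELx_EC_DABT_LOW || ec == ESR_ELx_EC_IABT_LOW;
}
```

With the ESR saved on every non-IRQ exit, a later handler can apply a check like esr_is_serror() before looking at the ISS bits.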



Re: [PATCH 05/23] x86, kaiser: unmap kernel from userspace page tables (core patch)

2018-01-05 Thread Dave Hansen
On 01/05/2018 10:53 PM, Hanjun Guo wrote:
>>  +   /*
>>  +* PTI poisons low addresses in the kernel page tables in the
>>  +* name of making them unusable for userspace.  To execute
>>  +* code at such a low address, the poison must be cleared.
>>  +*/
>>  +   pgd->pgd &= ~_PAGE_NX;
>>
>> We will have a try in a minute, and report back later.
> And it works: we can boot and reboot the system successfully. Thank
> you all for the quick response and debugging!

I think I'll just submit the attached patch if there are no objections
(and if it works, of course!).

I just stuck the NX clearing at the bottom.

From: Dave Hansen 

This is another case similar to what EFI does: create a new set of
page tables, map some code at a low address, and jump to it.  PTI
mistakes this low address for userspace and mistakenly marks it
non-executable in an effort to make it unusable for userspace.  Undo
the poison to allow execution.

Signed-off-by: Dave Hansen 
Cc: Ning Sun 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: tboot-de...@lists.sourceforge.net
Cc: linux-kernel@vger.kernel.org
---

 b/arch/x86/kernel/tboot.c |   11 +++
 1 file changed, 11 insertions(+)

diff -puN arch/x86/kernel/tboot.c~pti-tboot-fix arch/x86/kernel/tboot.c
--- a/arch/x86/kernel/tboot.c~pti-tboot-fix	2018-01-05 21:50:55.74960 -0800
+++ b/arch/x86/kernel/tboot.c	2018-01-05 23:51:41.368536890 -0800
@@ -138,6 +138,17 @@ static int map_tboot_page(unsigned long
 		return -1;
 	set_pte_at(_mm, vaddr, pte, pfn_pte(pfn, prot));
 	pte_unmap(pte);
+
+	/*
+	 * PTI poisons low addresses in the kernel page tables in the
+	 * name of making them unusable for userspace.  To execute
+	 * code at such a low address, the poison must be cleared.
+	 *
+	 * Note: 'pgd' actually gets set in p4d_alloc() _or_
+	 * pud_alloc() depending on 4/5-level paging.
+	 */
+	pgd->pgd &= ~_PAGE_NX;
+
 	return 0;
 }
 
_
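
The one-line fix is a plain bit clear on the top-level page table entry. The user-space sketch below shows the same operation on a 64-bit entry value, assuming the x86-64 layout in which the no-execute bit is bit 63. PAGE_NX_MASK, sketch_pgd_t and pgd_clear_nx() are hypothetical names for this example; the kernel's own _PAGE_NX and pgd_t are not used here.

```c
#include <stdint.h>

/* Assumption: x86-64 page table entries keep the NX (no-execute) bit in bit 63. */
#define PAGE_NX_MASK (UINT64_C(1) << 63)

/* Hypothetical user-space stand-in for the kernel's pgd_t. */
typedef struct {
	uint64_t pgd;
} sketch_pgd_t;

/* Clear the NX "poison" so code mapped below this entry can execute. */
static inline void pgd_clear_nx(sketch_pgd_t *pgd)
{
	pgd->pgd &= ~PAGE_NX_MASK;
}
```

The operation leaves every other bit of the entry untouched and is idempotent, so applying it to an entry that was never poisoned is harmless.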


Re: [PATCH 05/23] x86, kaiser: unmap kernel from userspace page tables (core patch)

2018-01-05 Thread Dave Hansen
On 01/05/2018 10:28 PM, Hanjun Guo wrote:
>> +
>>  p4d = p4d_alloc(_mm, pgd, vaddr);
> Seems pgd will be re-set after p4d_alloc(), so should
> we put the code behind (or after pud_alloc())?

Yes, it has to go below where the PGD actually gets set, which is
after pud_alloc().  You can put it anywhere later in the function.



Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

2018-01-05 Thread Kohli, Gaurav



On 1/6/2018 2:35 AM, Alan Cox wrote:

On Sat, 6 Jan 2018 01:54:36 +0530
"Kohli, Gaurav"  wrote:


Hi Alan,

Sorry correcting the typo here:
+retval = tty_ldisc_lock(tty, 5 * HZ);
+if (retval)
+    goto err_release_lock;
tty->port->itty = tty;
/*
 * Structures all installed ... call the ldisc open routines.
 * If we fail here just call release_tty to clean up.  No need
 * to decrement the use counts, as release_tty doesn't care.
 */
retval = tty_ldisc_setup(tty, tty->link);
if (retval)
    goto err_release_tty;
tty_ldisc_unlock(tty);
err_release_tty:
tty_info_ratelimited(tty, "ldisc open failed (%d), clearing slot %d\n",
                     retval, idx);
+err_release_lock:
+tty_unlock(tty);
+release_tty(tty, idx);
+tty_ldisc_unlock(tty);
+return ERR_PTR(retval);

Thanks - can you give that a test, since for some reason yours seems to be
the only system able to hit this, and confirm that it's now working
properly? Then I'll push it upstream.

And thanks for doing all the debug work to find this and identify what
was going on.

Alan



Thanks, Alan, for your support. Yes, we will try to reproduce it and get
back to you.
It typically takes 2-3 days to reproduce the issue, but it is
consistently reproducible.


Regards
Gaurav

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. 
is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
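
The snippet discussed in this thread uses the common kernel idiom of goto labels for error unwinding. As a generic, user-space illustration of that idiom only, the sketch below models two locks with integer flags and releases them in the reverse order of acquisition, so that a failure to take the second lock skips its unlock. It is an assumption-laden model and does not reproduce the exact label order of the snippet posted above or of the final upstream change. struct fake_tty and fake_init_dev() are hypothetical names.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: flags stand in for the tty lock and the ldisc lock. */
struct fake_tty {
	int tty_locked;
	int ldisc_locked;
	int released;
};

/*
 * Hypothetical init path: take the ldisc lock (which may time out), run
 * setup (which may fail), and unwind through labels on error.
 */
static int fake_init_dev(struct fake_tty *t, bool lock_fails, bool setup_fails)
{
	int retval;

	t->tty_locked = 1;   /* the tty lock is already held at this point */
	t->ldisc_locked = 0;
	t->released = 0;

	retval = lock_fails ? -EBUSY : 0;  /* models a lock timeout */
	if (retval)
		goto err_release_lock;
	t->ldisc_locked = 1;

	retval = setup_fails ? -EIO : 0;   /* models a failed ldisc setup */
	if (retval)
		goto err_release_tty;

	t->ldisc_locked = 0;
	return 0;            /* success: the tty stays locked for the caller */

err_release_tty:
	t->ldisc_locked = 0; /* only reached when the ldisc lock was taken */
err_release_lock:
	t->tty_locked = 0;
	t->released = 1;
	return retval;
}
```

Ordering the labels this way means each failure point jumps past the cleanup steps for resources it never acquired.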



Re: [PATCH 05/23] x86, kaiser: unmap kernel from userspace page tables (core patch)

2018-01-05 Thread Hanjun Guo
On 2018/1/6 14:28, Hanjun Guo wrote:
> Hi Dave,
> 
> Thank you very much for the quick response! Minor comments inline.
> 
> On 2018/1/6 14:06, Dave Hansen wrote:
>> On 01/05/2018 08:54 PM, Hanjun Guo wrote:
>>> Do you mean the NX bit will be brought back later? I'm asking because
>>> I tested this patch, which fixed the boot panic issue, but the system
>>> hangs when rebooting, because rebooting also calls EFI and then
>>> panics as the NX bit is set.
>> Wow, you're running a lot of very lightly-used code paths!  You actually
>> found a similar but totally separate issue from what I gather.  Thank
>> you immensely for the quick testing and bug reports!
>>
>> Could you test the attached fix?
>>
>> For those playing along at home, I think this will end up being needed
>> for 4.15 and probably all the backports.  I want to see if it works
>> before I submit it for real, though.
>>
>>
>> pti-tboot-fix.patch
>>
>>
>> From: Dave Hansen 
>>
>> This is another case similar to what EFI does: create a new set of
>> page tables, map some code at a low address, and jump to it.  PTI
>> mistakes this low address for userspace and mistakenly marks it
>> non-executable in an effort to make it unusable for userspace.  Undo
>> the poison to allow execution.
>>
>> Signed-off-by: Dave Hansen 
>> Cc: Ning Sun 
>> Cc: Thomas Gleixner 
>> Cc: Ingo Molnar 
>> Cc: "H. Peter Anvin" 
>> Cc: x...@kernel.org
>> Cc: tboot-de...@lists.sourceforge.net
>> Cc: linux-kernel@vger.kernel.org
>> ---
>>
>>  b/arch/x86/kernel/tboot.c |7 +++
>>  1 file changed, 7 insertions(+)
>>
>> diff -puN arch/x86/kernel/tboot.c~pti-tboot-fix arch/x86/kernel/tboot.c
>> --- a/arch/x86/kernel/tboot.c~pti-tboot-fix  2018-01-05 21:50:55.74960 -0800
>> +++ b/arch/x86/kernel/tboot.c  2018-01-05 22:01:51.393553325 -0800
>> @@ -124,6 +124,13 @@ static int map_tboot_page(unsigned long
>>  pte_t *pte;
>>  
>>  pgd = pgd_offset(_mm, vaddr);
>> +/*
>> + * PTI poisons low addresses in the kernel page tables in the
>> + * name of making them unusable for userspace.  To execute
>> + * code at such a low address, the poison must be cleared.
>> + */
>> +pgd->pgd &= ~_PAGE_NX;
> 
> ...
> 
>> +
>>  p4d = p4d_alloc(_mm, pgd, vaddr);
> 
> Seems pgd will be re-set after p4d_alloc(), so should
> we put the code behind (or after pud_alloc())?
> 
>>  if (!p4d)
>>  return -1;
> 
>  +/*
>  + * PTI poisons low addresses in the kernel page tables in the
>  + * name of making them unusable for userspace.  To execute
>  + * code at such a low address, the poison must be cleared.
>  + */
>  +pgd->pgd &= ~_PAGE_NX;
> 
> We will have a try in a minute, and report back later.

And it works: we can boot and reboot the system successfully. Thank
you all for the quick response and debugging!

Thanks
Hanjun



Re: KASAN: use-after-free Read in sctp_packet_transmit

2018-01-05 Thread Xin Long
On Sat, Jan 6, 2018 at 6:07 AM, syzbot
 wrote:
> Hello,
>
> syzkaller hit the following crash on
> 8a4816cad00bf14642f0ed6043b32d29a05006ce
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> Unfortunately, I don't have any reproducer for this bug yet.
>
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+5adcca18fca253b4c...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
>
> ==
> BUG: KASAN: use-after-free in sctp_packet_transmit+0x3505/0x3750
> net/sctp/output.c:643
> Read of size 8 at addr 8801bda9fb80 by task modprobe/23740
>
> CPU: 1 PID: 23740 Comm: modprobe Not tainted 4.15.0-rc5+ #175
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>  print_address_description+0x73/0x250 mm/kasan/report.c:252
>  kasan_report_error mm/kasan/report.c:351 [inline]
>  kasan_report+0x25b/0x340 mm/kasan/report.c:409
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
>  sctp_packet_transmit+0x3505/0x3750 net/sctp/output.c:643
>  sctp_outq_flush+0x121b/0x4060 net/sctp/outqueue.c:1197
>  sctp_outq_uncork+0x5a/0x70 net/sctp/outqueue.c:776
>  sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1807 [inline]
>  sctp_side_effects net/sctp/sm_sideeffect.c:1210 [inline]
>  sctp_do_sm+0x4e0/0x6ed0 net/sctp/sm_sideeffect.c:1181
>  sctp_generate_heartbeat_event+0x292/0x3f0 net/sctp/sm_sideeffect.c:406
>  call_timer_fn+0x228/0x820 kernel/time/timer.c:1320
>  expire_timers kernel/time/timer.c:1357 [inline]
>  __run_timers+0x7ee/0xb70 kernel/time/timer.c:1660
>  run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686
>  __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
>  invoke_softirq kernel/softirq.c:365 [inline]
>  irq_exit+0x1cc/0x200 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:540 [inline]
>  smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
>  apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:904
>  
> RIP: 0010:__preempt_count_add arch/x86/include/asm/preempt.h:76 [inline]
> RIP: 0010:__rcu_read_lock include/linux/rcupdate.h:83 [inline]
> RIP: 0010:rcu_read_lock include/linux/rcupdate.h:629 [inline]
> RIP: 0010:__is_insn_slot_addr+0x8f/0x330 kernel/kprobes.c:303
> RSP: 0018:8801d4937430 EFLAGS: 0283 ORIG_RAX: ff11
> RAX: 8801bf13c000 RBX: 8656dd00 RCX: 8170bd88
> RDX:  RSI:  RDI: 8656dd00
> RBP: 8801d4937518 R08:  R09: 11003a926e67
> R10: 8801d4937300 R11:  R12: 
> R13:  R14: 8801d49374f0 R15: 8801dae230c0
>  is_kprobe_insn_slot include/linux/kprobes.h:318 [inline]
>  kernel_text_address+0x132/0x140 kernel/extable.c:150
>  __kernel_text_address+0xd/0x40 kernel/extable.c:107
>  unwind_get_return_address+0x61/0xa0 arch/x86/kernel/unwind_frame.c:18
>  __save_stack_trace+0x7e/0xd0 arch/x86/kernel/stacktrace.c:45
>  save_stack_trace+0x1a/0x20 arch/x86/kernel/stacktrace.c:60
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489
>  kmem_cache_alloc+0x12e/0x760 mm/slab.c:3544
>  kmem_cache_zalloc include/linux/slab.h:678 [inline]
>  file_alloc_security security/selinux/hooks.c:369 [inline]
>  selinux_file_alloc_security+0xae/0x190 security/selinux/hooks.c:3454
>  security_file_alloc+0x6d/0xa0 security/security.c:873
>  get_empty_filp+0x189/0x4f0 fs/file_table.c:129
>  path_openat+0xed/0x3530 fs/namei.c:3496
>  do_filp_open+0x25b/0x3b0 fs/namei.c:3554
>  do_sys_open+0x502/0x6d0 fs/open.c:1059
>  SYSC_open fs/open.c:1077 [inline]
>  SyS_open+0x2d/0x40 fs/open.c:1072
>  entry_SYSCALL_64_fastpath+0x23/0x9a
> RIP: 0033:0x7efdff1bb120
> RSP: 002b:7ffde6213c08 EFLAGS: 0246 ORIG_RAX: 0002
> RAX: ffda RBX: 55c34fab4090 RCX: 7efdff1bb120
> RDX: 01b6 RSI: 0008 RDI: 7ffde6213d20
> RBP: 7ffde6214d90 R08: 0008 R09: 0001
> R10:  R11: 0246 R12: 55c34fab4090
> R13: 7ffde6215de0 R14:  R15: 
>
> Allocated by task 23739:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489
>  kmem_cache_alloc+0x12e/0x760 mm/slab.c:3544
>  kmem_cache_zalloc include/linux/slab.h:678 

Re: [RFC] boot failed when enable KAISER/KPTI

2018-01-05 Thread Xishi Qiu
On 2018/1/6 2:33, Jiri Kosina wrote:

> On Fri, 5 Jan 2018, Xishi Qiu wrote:
> 
>> I run the latest RHEL 7.2 with the KAISER/KPTI patch, and boot failed.
>>
>> ...
>> [0.00] PM: Registered nosave memory: [mem 
>> 0x810-0x8ff]
>> [0.00] PM: Registered nosave memory: [mem 
>> 0x910-0xfff]
>> [0.00] PM: Registered nosave memory: [mem 
>> 0x1010-0x10ff]
>> [0.00] PM: Registered nosave memory: [mem 
>> 0x1110-0x17ff]
>> [0.00] PM: Regitered nosave memory: [mem 
>> 0x1810-0x18ff]
>> [0.00] e820: [mem 0x9000-0xfed1bfff] available for PCI devices
>> [0.00] Booting paravirtualized kernel on bare hardware
>> [0.00] setup_percpu: NR_CPUS:5120 nr_cpumask_bits:1536 
>> nr_cpu_ids:1536 nr_node_ids:8
>> [0.00] PERCPU: max_distance=0x180ffe24 too large for vmalloc 
>> space 0x1fff
>> [0.00] setup_percpu: auto allocator failed (-22), falling back to 
>> page size
>> [0.00] PERCPU: 32 4K pages/cpu @c900 s107200 r8192 d15680
>> [0.00] Built 8 zonelists in Zone order, mobility grouping on.  Total 
>> pages: 132001804
>> [0.00] Policy zone: Normal
>> iosdevname=0 8250.nr_uarts=8 efi=old_map rdloaddriver=usb_storage 
>> rdloaddriver=sd_mod udev.event-timeout=600 softlockup_panic=0 
>> rcupdate.rcu_cpu_stall_timeout=300
>> [0.00] Intel-IOMMU: enabled
>> [0.00] PID hash table entries: 4096 (order: 3, 32768 bytes)
>> [0.00] x86/fpu: xstate_offset[2]: 0240, xstate_sizes[2]: 0100
>> [0.00] xsave: enabled xstate_bv 0x7, cntxt size 0x340
>> [0.00] AGP: Checking aperture...
>> [0.00] AGP: No AGP bridge found
>> [0.00] Memory: 526901612k/26910638080k available (6528k kernel code, 
>> 26374249692k absent, 9486776k reserved, 4302k data, 1676k init)
>> [0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1536, Nodes=8
>> [0.00] x86/pti: Unmapping kernel while in userspace
>> [0.00] Hierarchical RCU implementation.
>> [0.00]   RCU restricting CPUs from NR_CPUS=5120 to 
>> nr_cpu_ids=1536.
>> [0.00]   Offload RCU callbacks from all CPUs
>> [0.00]   Offload RCU callbacks from CPUs: 0-1535.
>> [0.00] NR_IRQS:327936 nr_irqs:15976 0
>> [0.00] Console: colour dummy device 80x25
>> [0.00] console [tty0] enabled
>> [0.00] console [ttyS0] enabled
>> [0.00] allocated 2145910784 bytes of page_cgroup
>> [0.00] please try 'cgroup_disable=memory' option if you don't want 
>> memory cgroups
>> [0.00] Enabling automatic NUMA balancing. Configure with 
>> numa_balancing= or the kernel.numa_balancing sysctl
>> [0.00] tsc: Fast TSC calibration using PIT
>> [0.00] tsc: Detected 2799.999 MHz processor
>> [0.001803] Calibrating delay loop (skipped), value calculated using 
>> timer frequency.. 5599.99 BogoMIPS (lpj=279)
>> [0.012408] pid_max: default: 1572864 minimum: 12288
>> [0.017987] init_memory_mapping: [mem 0x5947f000-0x5b47efff]
>> [0.023701] init_memory_mapping: [mem 0x5b47f000-0x5b87efff]
>> [0.029369] init_memory_mapping: [mem 0x6d368000-0x6d3edfff]
>> [0.039130] BUG: unable to handle kernel paging request at 
>> 5b835f90
>> [0.046101] IP: [<5b835f90>] 0x5b835f8f
>> [0.050637] PGD 81f61067 PUD 190ffefff067 PMD 190ffeffd067 PTE 
>> 5b835063
>> [0.057989] Oops: 0011 [#1] SMP 
>> [0.061241] Modules linked in:
>> [0.064304] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
>> 3.10.0-327.59.59.46.h42.x86_64 #1
>> [0.072280] Hardware name: Huawei FusionServer9032/IT91SMUB, BIOS 
>> BLXSV316 11/14/2017
>> [0.080082] task: 8196e440 ti: 81958000 task.ti: 
>> 81958000
>> [0.087539] RIP: 0010:[<5b835f90>]  [<5b835f90>] 
>> 0x5b835f8f
>> [0.094494] RSP: :8195be28  EFLAGS: 00010046
>> [0.099788] RAX: 80050033 RBX: 910fbc802000 RCX: 
>> 02d0
>> [0.106897] RDX: 0030 RSI: 02d0 RDI: 
>> 5b835f90
>> [0.114006] RBP: 8195bf38 R08: 0001 R09: 
>> 090fbc802000
>> [0.121116] R10: 88ffbcc07340 R11: 0001 R12: 
>> 0001
>> [0.128225] R13: 090fbc802000 R14: 02d0 R15: 
>> 0001
>> [0.135336] FS:  () GS:c900() 
>> knlGS:
>> [0.143398] CS:  0010 DS:  ES:  CR0: 80050033
>> [0.149124] CR2: 5b835f90 CR3: 01966000 CR4: 
>> 000606b0
>> [0.156234] DR0:  DR1:  DR2: 
>> 
>> [0.163344] DR3:  DR6: fffe0ff0 DR7: 
>> 0400
>> [0.170454] Call Trace:
>> [0.172899]  [] ? efi_call4+0x6c/0xf0
> 
> EFI old memmap have NX bit 
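
For reference, the "Oops: 0011" value in the log above is the x86 page-fault error code in hex. Below is a minimal sketch of decoding it, assuming the architectural bit layout (bit 0: protection violation on a present page, bit 1: write, bit 2: user mode, bit 3: reserved bit set, bit 4: instruction fetch). The enum and helper names are invented for this example and are not kernel identifiers.

```c
/* Page-fault error code bits as defined by the x86 architecture.
 * The names are illustrative only, not the kernel's own. */
enum pf_bits_sketch {
	PF_SKETCH_PROT  = 1 << 0, /* fault on a present page (protection) */
	PF_SKETCH_WRITE = 1 << 1, /* access was a write */
	PF_SKETCH_USER  = 1 << 2, /* access came from user mode */
	PF_SKETCH_RSVD  = 1 << 3, /* reserved bit set in a paging entry */
	PF_SKETCH_INSTR = 1 << 4  /* access was an instruction fetch */
};

/* Returns 1 when the error code describes a kernel-mode instruction
 * fetch from a present page, which is how an NX violation shows up. */
static int is_kernel_nx_fetch_sketch(unsigned long code)
{
	return (code & PF_SKETCH_PROT) &&
	       (code & PF_SKETCH_INSTR) &&
	       !(code & PF_SKETCH_USER);
}
```

With the value from the log (0x11) the helper returns 1, which is consistent with the kernel faulting while trying to execute a mapped but non-executable page.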

Re: [PATCH v2 0/8] IBRS patch series

2018-01-05 Thread Tim Chen


On 01/05/2018 06:12 PM, Tim Chen wrote:
> Thanks to everyone for the feedback on the initial posting.
> This is an updated patchset and I hope I've captured all
> the review comments.  I've done a lot of code clean up
> per everyone's comments. Please let me know if I've missed
> something.
> 

Should also have mentioned that the patch series applies on
top of 4.15-rc6.

Tim


[PATCH v7] regmap: Add SoundWire bus support

2018-01-05 Thread Vinod Koul
SoundWire bus provides sdw_read() and sdw_write() APIs for Slave
devices to program the registers. Provide support in regmap for
SoundWire bus.

Signed-off-by: Hardik T Shah 
Signed-off-by: Sanyog Kale 
Reviewed-by: Philippe Ombredanne 
Acked-by: Pierre-Louis Bossart 
Reviewed-by: Takashi Iwai 
Signed-off-by: Vinod Koul 
---
changes in v7:
 drop SDW bus select
 drop stride check

 drivers/base/regmap/Kconfig  |  7 
 drivers/base/regmap/Makefile |  1 +
 drivers/base/regmap/regmap-sdw.c | 88 
 include/linux/regmap.h   | 37 +
 4 files changed, 133 insertions(+)
 create mode 100644 drivers/base/regmap/regmap-sdw.c

diff --git a/drivers/base/regmap/Kconfig b/drivers/base/regmap/Kconfig
index 0368fd7b3a41..1eb90c8ad82e 100644
--- a/drivers/base/regmap/Kconfig
+++ b/drivers/base/regmap/Kconfig
@@ -37,3 +37,10 @@ config REGMAP_MMIO
 
 config REGMAP_IRQ
bool
+
+config REGMAP_HWSPINLOCK
+   bool
+
+config REGMAP_SOUNDWIRE
+   tristate
+   depends on SOUNDWIRE_BUS
diff --git a/drivers/base/regmap/Makefile b/drivers/base/regmap/Makefile
index 0d298c446108..22d263cca395 100644
--- a/drivers/base/regmap/Makefile
+++ b/drivers/base/regmap/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_REGMAP_SPMI) += regmap-spmi.o
 obj-$(CONFIG_REGMAP_MMIO) += regmap-mmio.o
 obj-$(CONFIG_REGMAP_IRQ) += regmap-irq.o
 obj-$(CONFIG_REGMAP_W1) += regmap-w1.o
+obj-$(CONFIG_REGMAP_SOUNDWIRE) += regmap-sdw.o
diff --git a/drivers/base/regmap/regmap-sdw.c b/drivers/base/regmap/regmap-sdw.c
new file mode 100644
index ..50a66382d87d
--- /dev/null
+++ b/drivers/base/regmap/regmap-sdw.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright(c) 2015-17 Intel Corporation.
+
+#include 
+#include 
+#include 
+#include 
+#include "internal.h"
+
+static int regmap_sdw_write(void *context, unsigned int reg, unsigned int val)
+{
+   struct device *dev = context;
+   struct sdw_slave *slave = dev_to_sdw_dev(dev);
+
+   return sdw_write(slave, reg, val);
+}
+
+static int regmap_sdw_read(void *context, unsigned int reg, unsigned int *val)
+{
+   struct device *dev = context;
+   struct sdw_slave *slave = dev_to_sdw_dev(dev);
+   int read;
+
+   read = sdw_read(slave, reg);
+   if (read < 0)
+   return read;
+
+   *val = read;
+   return 0;
+}
+
+static struct regmap_bus regmap_sdw = {
+   .reg_read = regmap_sdw_read,
+   .reg_write = regmap_sdw_write,
+   .reg_format_endian_default = REGMAP_ENDIAN_LITTLE,
+   .val_format_endian_default = REGMAP_ENDIAN_LITTLE,
+};
+
+static int regmap_sdw_config_check(const struct regmap_config *config)
+{
+   /* All registers are 8 bits wide as per the MIPI SoundWire 1.0 spec */
+   if (config->val_bits != 8)
+   return -ENOTSUPP;
+
+   /* Register addresses are 32 bits wide */
+   if (config->reg_bits != 32)
+   return -ENOTSUPP;
+
+   if (config->pad_bits != 0)
+   return -ENOTSUPP;
+
+   return 0;
+}
+
+struct regmap *__regmap_init_sdw(struct sdw_slave *sdw,
+const struct regmap_config *config,
+struct lock_class_key *lock_key,
+const char *lock_name)
+{
+   int ret;
+
+   ret = regmap_sdw_config_check(config);
+   if (ret)
+   return ERR_PTR(ret);
+
+   return __regmap_init(&sdw->dev, &regmap_sdw,
+   &sdw->dev, config, lock_key, lock_name);
+}
+EXPORT_SYMBOL_GPL(__regmap_init_sdw);
+
+struct regmap *__devm_regmap_init_sdw(struct sdw_slave *sdw,
+ const struct regmap_config *config,
+ struct lock_class_key *lock_key,
+ const char *lock_name)
+{
+   int ret;
+
+   ret = regmap_sdw_config_check(config);
+   if (ret)
+   return ERR_PTR(ret);
+
+   return __devm_regmap_init(&sdw->dev, &regmap_sdw,
+   &sdw->dev, config, lock_key, lock_name);
+}
+EXPORT_SYMBOL_GPL(__devm_regmap_init_sdw);
+
+MODULE_DESCRIPTION("Regmap SoundWire Module");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/regmap.h b/include/linux/regmap.h
index 671515911b03..27e242bed62c 100644
--- a/include/linux/regmap.h
+++ b/include/linux/regmap.h
@@ -30,6 +30,7 @@ struct regmap;
 struct regmap_range_cfg;
 struct regmap_field;
 struct snd_ac97;
+struct sdw_slave;
 
 /* An enum of all the supported cache types */
 enum regcache_type {
@@ -531,6 +532,10 @@ struct regmap *__regmap_init_ac97(struct snd_ac97 *ac97,
  const struct regmap_config *config,
  struct lock_class_key *lock_key,
  const char *lock_name);
+struct regmap 
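
The constraints enforced by regmap_sdw_config_check() in the patch above can be tried outside the kernel. This is a minimal userspace sketch, not the driver code: the struct is a cut-down, hypothetical stand-in for struct regmap_config, and the error constant is defined locally because ENOTSUPP is a kernel-internal errno that userspace headers do not provide.

```c
/* Kernel-internal errno value; defined here only for the sketch. */
#define SKETCH_ENOTSUPP 524

/* Hypothetical stand-in holding only the fields the check looks at. */
struct sdw_cfg_sketch {
	int reg_bits; /* width of a register address, in bits */
	int pad_bits; /* padding between address and value, in bits */
	int val_bits; /* width of a register value, in bits */
};

/* Mirrors the three tests in the patch: 8-bit values, 32-bit
 * register addresses, and no padding. Returns 0 when accepted. */
static int sdw_cfg_check_sketch(const struct sdw_cfg_sketch *cfg)
{
	if (cfg->val_bits != 8)
		return -SKETCH_ENOTSUPP;

	if (cfg->reg_bits != 32)
		return -SKETCH_ENOTSUPP;

	if (cfg->pad_bits != 0)
		return -SKETCH_ENOTSUPP;

	return 0;
}
```

A configuration with reg_bits = 32, val_bits = 8 and pad_bits = 0 passes; changing any one of the three makes the check fail, which in the patch turns into ERR_PTR(-ENOTSUPP) from the init helpers.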

Re: [PATCH 05/23] x86, kaiser: unmap kernel from userspace page tables (core patch)

2018-01-05 Thread Hanjun Guo
Hi Dave,

Thank you very much for the quick response! Minor comments inline.

On 2018/1/6 14:06, Dave Hansen wrote:
> On 01/05/2018 08:54 PM, Hanjun Guo wrote:
>> Do you mean the NX bit will be brought back later? I'm asking because
>> I tested this patch: it fixed the boot panic, but the system hangs on
>> reboot, because rebooting also calls into EFI and then panics as the
>> NX bit is set.
> Wow, you're running a lot of very lightly-used code paths!  You actually
> found a similar but totally separate issue from what I gather.  Thank
> you immensely for the quick testing and bug reports!
> 
> Could you test the attached fix?
> 
> For those playing along at home, I think this will end up being needed
> for 4.15 and probably all the backports.  I want to see if it works
> before I submit it for real, though.
> 
> 
> pti-tboot-fix.patch
> 
> 
> From: Dave Hansen 
> 
> This is another case similar to what EFI does: create a new set of
> page tables, map some code at a low address, and jump to it.  PTI
> mistakes this low address for userspace and mistakenly marks it
> non-executable in an effort to make it unusable for userspace.  Undo
> the poison to allow execution.
> 
> Signed-off-by: Dave Hansen 
> Cc: Ning Sun 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: x...@kernel.org
> Cc: tboot-de...@lists.sourceforge.net
> Cc: linux-kernel@vger.kernel.org
> ---
> 
>  b/arch/x86/kernel/tboot.c |7 +++
>  1 file changed, 7 insertions(+)
> 
> diff -puN arch/x86/kernel/tboot.c~pti-tboot-fix arch/x86/kernel/tboot.c
> --- a/arch/x86/kernel/tboot.c~pti-tboot-fix   2018-01-05 21:50:55.74960 
> -0800
> +++ b/arch/x86/kernel/tboot.c 2018-01-05 22:01:51.393553325 -0800
> @@ -124,6 +124,13 @@ static int map_tboot_page(unsigned long
>   pte_t *pte;
>  
>   pgd = pgd_offset(&tboot_mm, vaddr);
> + /*
> +  * PTI poisons low addresses in the kernel page tables in the
> +  * name of making them unusable for userspace.  To execute
> +  * code at such a low address, the poison must be cleared.
> +  */
> + pgd->pgd &= ~_PAGE_NX;

...

> +
>   p4d = p4d_alloc(&tboot_mm, pgd, vaddr);

It seems pgd will be re-set by p4d_alloc(), so should we move
this code after it (or after pud_alloc())?

>   if (!p4d)
>   return -1;

 +  /*
 +   * PTI poisons low addresses in the kernel page tables in the
 +   * name of making them unusable for userspace.  To execute
 +   * code at such a low address, the poison must be cleared.
 +   */
 +  pgd->pgd &= ~_PAGE_NX;

We will give it a try in a minute and report back.

Thanks
Hanjun
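
For readers following the thread, the operation under discussion is a single bit clear on a page-table entry. This is a minimal standalone sketch, assuming only the architectural fact that bit 63 of an x86-64 paging entry is the no-execute bit; the macro and helper names are invented here and the entry value in the usage note is made up.

```c
#include <stdint.h>

/* Bit 63 of an x86-64 paging entry is the no-execute (NX) bit.
 * Illustrative name, not the kernel's _PAGE_NX definition. */
#define SKETCH_PAGE_NX (UINT64_C(1) << 63)

/* Reports whether the entry currently forbids instruction fetches. */
static int entry_is_nx_sketch(uint64_t entry)
{
	return (entry & SKETCH_PAGE_NX) != 0;
}

/* Returns the entry with the NX bit cleared, leaving every other
 * bit (address and flags) untouched. */
static uint64_t entry_clear_nx_sketch(uint64_t entry)
{
	return entry & ~SKETCH_PAGE_NX;
}
```

For a made-up entry such as 0x8000000000000067, entry_clear_nx_sketch() yields 0x67. The ordering question raised above matters because a later write to the same entry would undo the clear.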



Re: [PATCH 00/18] prevent bounds-check bypass via speculative execution

2018-01-05 Thread Dan Williams
On Fri, Jan 5, 2018 at 6:22 PM, Eric W. Biederman  wrote:
> Dan Williams  writes:
>
>> Quoting Mark's original RFC:
>>
>> "Recently, Google Project Zero discovered several classes of attack
>> against speculative execution. One of these, known as variant-1, allows
>> explicit bounds checks to be bypassed under speculation, providing an
>> arbitrary read gadget. Further details can be found on the GPZ blog [1]
>> and the Documentation patch in this series."
>>
>> This series incorporates Mark Rutland's latest api and adds the x86
>> specific implementation of nospec_barrier. The
>> nospec_{array_ptr,ptr,barrier} helpers are then combined with a kernel
>> wide analysis performed by Elena Reshetova to address static analysis
>> reports where speculative execution on a userspace controlled value
>> could bypass a bounds check. The patches address a precondition for the
>> attack discussed in the Spectre paper [2].
>
> Please expand this.
>
> It is not clear what the static analysis is looking for.  Having a clear
> description of what is being fixed is crucial for allowing any of these
> changes.
>
> From the details given in the change description, what I read is magic
> changes because a magic process says this code is vulnerable.

Yes, that was my first reaction to the patches as well. I try below to
add some more background and guidance, but in the end these are static
analysis reports across a wide swath of sub-systems. It's going to
take some iteration with domain experts to improve the patch
descriptions, and that's the point of this series: to get the
better-trained eyes of the actual sub-system owners to take a look at
these reports.

For example, I'm looking for feedback like what Srinivas gave, where he
identified that the report is bogus because the branch condition cannot
be seeded with bad values in that path. Be like Srinivas.

> Given the similarities in the code that is being patched to many other
> places in the kernel it is not at all clear that this small set of
> changes is sufficient for any purpose.

I find this assertion absurd. When in the past have we as kernel
developers ever been handed a static analysis report and then
questioned why the static analysis did not flag other call sites
before first reviewing the ones it did find?

>> A consideration worth noting for reviewing these patches is to weigh the
>> dramatic cost of being wrong about whether a given report is exploitable
>> vs the overhead nospec_{array_ptr,ptr} may introduce. In other words,
>> let's make the bar for applying these patches be "can you prove that the
>> bounds check bypass is *not* exploitable". Consider that the Spectre
>> paper reports one example of a speculation window being ~180 cycles.
>
>
>> Note that there is also a proposal from Linus, array_access [3], that
>> attempts to quash speculative execution past a bounds check without
>> introducing an lfence instruction. That may be a future optimization
>> possibility that is compatible with this api, but it would appear to
>> need guarantees from the compiler that it is not clear the kernel can
>> rely on at this point. It is also not clear that it would be a
>> significant performance win vs lfence.
>
> It is also not clear that these changes fix anything, or are in any
> sense correct for the problem they are trying to fix as the problem
> is not clearly described.

I'll try my best. I don't have first hand knowledge of how the static
analyzer is doing this job, and I don't think it matters for
evaluating these reports. I'll give you my thoughts on how I would
handle one of these reports if it flagged one of the sub-systems I
maintain.

Start with the example from the Spectre paper:

	if (x < array1_size)
		y = array2[array1[x] * 256];

In all the patches 'x' and 'array1' are called out explicitly. For example:

    net: mpls: prevent bounds-check bypass via speculative execution

    Static analysis reports that 'index' may be a user controlled value that
    is used as a data dependency reading 'rt' from the 'platform_label'
    array...

So the first thing to review is whether the analyzer got it wrong and
'x' is not arbitrarily controllable by userspace to cause speculation
outside of the checked bounds. Be like Srinivas. The next step is to
ask whether the code can be refactored so that 'x' is sanitized
earlier in the call stack, especially if the nospec_array_ptr() lands
in a hot path. The next aspect that I expect most would be tempted to
go check is whether 'array2[array1[x]]' occurs later in the code
stream, but with speculation windows being architecture dependent and
potentially large (~180 cycles in one case says the paper) I submit
that we should err on the side of caution and not guess if that second
dependent read has been emitted somewhere in the instruction stream.
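
To make the pattern above concrete, here is a minimal userspace sketch of
a bounds-checked read whose index is also clamped by a branchless mask.
The names index_mask() and read_checked() are hypothetical and are not
the nospec_array_ptr() API from this series; the sketch assumes that
size fits in a signed long. It only illustrates the idea of making an
out-of-bounds index harmless even if the branch is mispredicted. In plain
C a compiler is free to optimize the mask away inside the checked branch,
so a real implementation would need architecture-specific help.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the API proposed in this series.
 * Returns ~0UL when idx < size and 0 otherwise, without a branch.
 * Assumes size <= LONG_MAX.
 */
static unsigned long index_mask(unsigned long idx, unsigned long size)
{
	unsigned long x = ~(idx | (size - 1UL - idx));

	/* The top bit of x is set only when idx < size. */
	return 0UL - (x >> (sizeof(unsigned long) * CHAR_BIT - 1));
}

/* Bounds-checked read in the shape of the Spectre paper's example. */
static unsigned char read_checked(const unsigned char *array1,
				  unsigned long array1_size,
				  unsigned long x)
{
	if (x < array1_size) {
		/* Forces x to 0 if it is out of bounds. */
		x &= index_mask(x, array1_size);
		return array1[x];
	}
	return 0;
}
```

On the architectural path the mask is all ones and changes nothing; the
point is that the dependent load can only ever see an in-bounds index.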

> In at least one place (mpls) you are patching a fast path.  Compile out
> or don't load mpls by all means.  But it is not acceptable to change the
> fast path without 

Re: [PATCH net-next 06/20] net: hns3: Modify the update period of packet statistics

2018-01-05 Thread lipeng (Y)



On 2018/1/5 22:54, Andrew Lunn wrote:

>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>> @@ -1126,6 +1126,7 @@ static int hns3_nic_set_features(struct net_device *netdev,
>>  {
>>  	struct hns3_nic_priv *priv = netdev_priv(netdev);
>>  	int queue_num = priv->ae_handle->kinfo.num_tqps;
>> +	struct hnae3_handle *handle = priv->ae_handle;
>>  	struct hns3_enet_ring *ring;
>>  	unsigned int start;
>>  	unsigned int idx;
>> @@ -1134,6 +1135,8 @@ static int hns3_nic_set_features(struct net_device *netdev,
>>  	u64 tx_pkts = 0;
>>  	u64 rx_pkts = 0;
>>
>> +	handle->ae_algo->ops->update_stats(handle, >stats);
>> +
>>  	for (idx = 0; idx < queue_num; idx++) {
>>  		/* fetch the tx stats */
>>  		ring = priv->ring_data[idx].ring;
>
> There is something odd going on with patch here. Notice how it says
> hns3_nic_set_features(). This is not the function being patched, it is
> actually the next one, hns3_nic_get_stats64(), which makes a lot more
> sense.
>
> Is it because the static void is on the previous line?

Yes, it is because the static void is on the previous line.

I can add one patch to fix the previous line, and then the function
in this patch will be reported correctly automatically.

Does it need a V2 patchset, or should I push a new patch after this
patchset?

> It would be nice if the function was correctly reported. It makes it
> easier to review the patch.
>
> Andrew






Re: [alsa-devel] [PATCH v6 07/14] regmap: Add SoundWire bus support

2018-01-05 Thread Vinod Koul
On Fri, Jan 05, 2018 at 05:05:52PM +, Mark Brown wrote:
> On Thu, Dec 14, 2017 at 11:19:38AM +0530, Vinod Koul wrote:
> > SoundWire bus provides sdw_read() and sdw_write() APIs for Slave
> > devices to program the registers. Provide support in regmap for
> > SoundWire bus.
> 
> I can't apply this because you've changed the soundwire Kconfig in this
> patch :(

Ah, I will resend, dropping that part.

Thanks
-- 
~Vinod


Re: [alsa-devel] [PATCH v6 07/14] regmap: Add SoundWire bus support

2018-01-05 Thread Vinod Koul
On Fri, Jan 05, 2018 at 11:22:15AM -0600, Pierre-Louis Bossart wrote:
> On 1/5/18 11:04 AM, Mark Brown wrote:
> >On Thu, Dec 14, 2017 at 11:19:38AM +0530, Vinod Koul wrote:
> >
> >>+   /* SoundWire register address are contiguous */
> >>+   if (config->reg_stride != 0)
> >>+   return -ENOTSUPP;
> >
> >That doesn't mean the chip hasn't decided not to use half the addresses
> >for some reason - this isn't something the bus should be enforcing.
> 
> Good point. The contiguous requirement is valid only for normative
> registers, where the device has no choice but to follow the standard. For
> the imp-def part where regmap would typically be used, then indeed there is
> no restriction, chip implementers can do whatever they want.
> I have a vague memory that regmap was only intended to be used for this
> latter case, but Vinod and team should clarify this.

Right now it is used by the codec for the imp-def area. We do plan to add
support for all registers eventually.

Thanks
-- 
~Vinod


Re: [PATCH v6 07/14] regmap: Add SoundWire bus support

2018-01-05 Thread Vinod Koul
On Fri, Jan 05, 2018 at 05:04:21PM +, Mark Brown wrote:
> On Thu, Dec 14, 2017 at 11:19:38AM +0530, Vinod Koul wrote:
> 
> > +   /* SoundWire register address are contiguous */
> > +   if (config->reg_stride != 0)
> > +   return -ENOTSUPP;
> 
> That doesn't mean the chip hasn't decided not to use half the addresses
> for some reason - this isn't something the bus should be enforcing.

Agreed, will drop this

-- 
~Vinod


Re: [PATCH 05/23] x86, kaiser: unmap kernel from userspace page tables (core patch)

2018-01-05 Thread Dave Hansen
On 01/05/2018 08:54 PM, Hanjun Guo wrote:
> Do you mean the NX bit will be brought back later? I'm asking because
> I tested this patch, which fixed the boot panic issue, but the system
> hangs when rebooting, because rebooting also calls into EFI and
> then panics as the NX bit is set.

Wow, you're running a lot of very lightly-used code paths!  You actually
found a similar but totally separate issue from what I gather.  Thank
you immensely for the quick testing and bug reports!

Could you test the attached fix?

For those playing along at home, I think this will end up being needed
for 4.15 and probably all the backports.  I want to see if it works
before I submit it for real, though.

From: Dave Hansen 

This is another case similar to what EFI does: create a new set of
page tables, map some code at a low address, and jump to it.  PTI
mistakes this low address for userspace and mistakenly marks it
non-executable in an effort to make it unusable for userspace.  Undo
the poison to allow execution.

Signed-off-by: Dave Hansen 
Cc: Ning Sun 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: tboot-de...@lists.sourceforge.net
Cc: linux-kernel@vger.kernel.org
---

 b/arch/x86/kernel/tboot.c |7 +++
 1 file changed, 7 insertions(+)

diff -puN arch/x86/kernel/tboot.c~pti-tboot-fix arch/x86/kernel/tboot.c
--- a/arch/x86/kernel/tboot.c~pti-tboot-fix	2018-01-05 21:50:55.74960 -0800
+++ b/arch/x86/kernel/tboot.c	2018-01-05 22:01:51.393553325 -0800
@@ -124,6 +124,13 @@ static int map_tboot_page(unsigned long
 	pte_t *pte;
 
 	pgd = pgd_offset(_mm, vaddr);
+	/*
+	 * PTI poisons low addresses in the kernel page tables in the
+	 * name of making them unusable for userspace.  To execute
+	 * code at such a low address, the poison must be cleared.
+	 */
+	pgd->pgd &= ~_PAGE_NX;
+
 	p4d = p4d_alloc(_mm, pgd, vaddr);
 	if (!p4d)
 		return -1;
_
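
For readers following the fix, the core operation is just clearing one
bit in a page-table entry value. Below is a minimal userspace sketch of
that bit manipulation. SKETCH_PAGE_NX, clear_nx() and is_executable() are
hypothetical names used only for illustration; the sketch assumes the
x86-64 layout where bit 63 of an entry is the no-execute bit, and it does
not touch real page tables.

```c
#include <stdint.h>

/* On x86-64, bit 63 of a page-table entry is the no-execute (NX) bit. */
#define SKETCH_PAGE_NX (UINT64_C(1) << 63)

/* Hypothetical stand-in for undoing the poison on an entry value. */
static uint64_t clear_nx(uint64_t entry)
{
	return entry & ~SKETCH_PAGE_NX;
}

/* An entry permits instruction fetch only when NX is clear. */
static int is_executable(uint64_t entry)
{
	return (entry & SKETCH_PAGE_NX) == 0;
}
```

All other bits of the entry are preserved, which mirrors the
"pgd->pgd &= ~_PAGE_NX" line in the patch.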


Re: [PATCH v4 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Randy Dunlap
On 01/04/18 18:00, David Woodhouse wrote:
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index d4fc98c..1009d1a 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -429,6 +429,19 @@ config GOLDFISH
> def_bool y
> depends on X86_GOLDFISH
>  
> +config RETPOLINE
> + bool "Avoid speculative indirect branches in kernel"
> + default y
> + help
> +   Compile kernel with the retpoline compiler options to guard against
> +   kernel to user data leaks by avoiding speculative indirect

On first reading, I encountered a parse err^W^W 
on "kernel to user data".  I get it after rereading it, but
kernel-to-user data leaks
would be better. (IMHO)

> +   branches. Requires a compiler with -mindirect-branch=thunk-extern
> +   support for full protection. The kernel may run slower.
> +
> +   Without compiler support, at least indirect branches in assembler
> +   code are eliminated. Since this includes the syscall entry path,
> +   it is not entirely pointless.


-- 
~Randy


Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok

2018-01-05 Thread Dan Williams
On Fri, Jan 5, 2018 at 6:52 PM, Linus Torvalds
 wrote:
> On Fri, Jan 5, 2018 at 5:10 PM, Dan Williams  wrote:
>> From: Andi Kleen 
>>
>> When access_ok fails we should always stop speculating.
>> Add the required barriers to the x86 access_ok macro.
>
> Honestly, this seems completely bogus.
>
> The description is pure garbage afaik.
>
> The fact is, we have to stop speculating when access_ok() does *not*
> fail - because that's when we'll actually do the access. And it's that
> access that needs to be non-speculative.
>
> That actually seems to be what the code does (it stops speculation
> when __range_not_ok() returns false, but access_ok() is
> !__range_not_ok()). But the explanation is crap, and dangerous.

Oh, bother, yes, good catch. It's been a long week.  I'll take a look
at moving this to uaccess_begin().
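
The polarity Linus describes can be modelled in a few lines of userspace
C. This is only a sketch under stated assumptions: range_not_ok_model(),
access_ok_model() and speculation_barrier_placeholder() are hypothetical
names, the range check is a simplified stand-in for the kernel's, and the
"barrier" is a counter rather than a real instruction such as lfence. It
shows that the barrier belongs on the path where the check succeeds,
because that is the path that goes on to perform the access.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Counts barrier invocations so the placement can be observed. */
static unsigned int barrier_calls;

/* Placeholder only: a real kernel would issue a serializing op here. */
static void speculation_barrier_placeholder(void)
{
	barrier_calls++;
}

/* True when [addr, addr + size) wraps around or extends past limit. */
static bool range_not_ok_model(uintptr_t addr, size_t size, uintptr_t limit)
{
	uintptr_t end = addr + size;

	return end < addr || end > limit;
}

/* access_ok is the negation of range_not_ok. */
static bool access_ok_model(uintptr_t addr, size_t size, uintptr_t limit)
{
	bool ok = !range_not_ok_model(addr, size, limit);

	/* Stop speculation when the access is going to happen. */
	if (ok)
		speculation_barrier_placeholder();
	return ok;
}
```

A failed check never reaches the barrier in this model, which matches
the behaviour of the code rather than the wording of the changelog.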


[PATCH] trace_uprobe: Display correct offset in uprobe_events

2018-01-05 Thread Ravi Bangoria
Commit ad67b74d2469 ("printk: hash addresses printed with %p")
recently changed how pointers are printed with %p. This causes a
regression when showing the offset in the uprobe_events file.
Instead of %p, use %px to display the offset.

Before patch:

  # perf probe -vv -x /tmp/a.out main
  Opening /sys/kernel/debug/tracing//uprobe_events write=1
  Writing event: p:probe_a/main /tmp/a.out:0x58c

  # cat /sys/kernel/debug/tracing/uprobe_events
  p:probe_a/main /tmp/a.out:0x49a0f352

After patch:

  # cat /sys/kernel/debug/tracing/uprobe_events
  p:probe_a/main /tmp/a.out:0x058c

Signed-off-by: Ravi Bangoria 
---
 kernel/trace/trace_uprobe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 40592e7b3568..268029ae1be6 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -608,7 +608,7 @@ static int probes_seq_show(struct seq_file *m, void *v)
 
/* Don't print "0x  (null)" when offset is 0 */
if (tu->offset) {
-   seq_printf(m, "0x%p", (void *)tu->offset);
+   seq_printf(m, "0x%px", (void *)tu->offset);
} else {
switch (sizeof(void *)) {
case 4:
-- 
2.13.6
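
As an illustration of the output shape the patch restores, here is a
small userspace sketch that prints an offset as zero-padded hex at
pointer width. It is not kernel code: %px is a kernel printk extension
and does not exist in userspace printf, so the sketch uses PRIxPTR
instead, and format_offset() is a hypothetical helper name.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Formats offset as 0x-prefixed hex, zero-padded to pointer width. */
static int format_offset(char *buf, size_t len, uintptr_t offset)
{
	/* Two hex digits per byte of a pointer. */
	int width = (int)(2 * sizeof(void *));

	return snprintf(buf, len, "0x%0*" PRIxPTR, width, offset);
}
```

The value printed is the raw offset, which is what a tool parsing
uprobe_events expects, rather than a hashed pointer value.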




[PATCH 6/7] arm64: allwinner: h6: add the basical Allwinner H6 DTSI file

2018-01-05 Thread Icenowy Zheng
Allwinner H6 is a new SoC with Cortex-A53 cores from Allwinner, with its
memory map fully reworked and some high-speed peripherals (PCIe, USB
3.0) introduced.

This commit adds its basic DTSI file, including clock support and
UART support.

Signed-off-by: Icenowy Zheng 
---
 arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi | 214 +++
 1 file changed, 214 insertions(+)
 create mode 100644 arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
new file mode 100644
index ..482f5cb64d07
--- /dev/null
+++ b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
@@ -0,0 +1,214 @@
+/*
+ * Copyright (C) 2017 Icenowy Zheng 
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This file is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ * obtaining a copy of this software and associated documentation
+ * files (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use,
+ * copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following
+ * conditions:
+ *
+ * The above copyright notice and this permission notice shall be
+ * included in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+
+/ {
+   interrupt-parent = <>;
+   #address-cells = <1>;
+   #size-cells = <1>;
+
+   cpus {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   cpu0: cpu@0 {
+   compatible = "arm,cortex-a53", "arm,armv8";
+   device_type = "cpu";
+   reg = <0>;
+   enable-method = "psci";
+   };
+
+   cpu1: cpu@1 {
+   compatible = "arm,cortex-a53", "arm,armv8";
+   device_type = "cpu";
+   reg = <1>;
+   enable-method = "psci";
+   };
+
+   cpu2: cpu@2 {
+   compatible = "arm,cortex-a53", "arm,armv8";
+   device_type = "cpu";
+   reg = <2>;
+   enable-method = "psci";
+   };
+
+   cpu3: cpu@3 {
+   compatible = "arm,cortex-a53", "arm,armv8";
+   device_type = "cpu";
+   reg = <3>;
+   enable-method = "psci";
+   };
+   };
+
+   iosc: internal-osc-clk {
+   #clock-cells = <0>;
+   compatible = "fixed-clock";
+   clock-frequency = <1600>;
+   clock-accuracy = <3>;
+   clock-output-names = "iosc";
+   };
+
+   osc24M: osc24M_clk {
+   #clock-cells = <0>;
+   compatible = "fixed-clock";
+   clock-frequency = <2400>;
+   clock-output-names = "osc24M";
+   };
+
+   osc32k: osc32k_clk {
+   #clock-cells = <0>;
+   compatible = "fixed-clock";
+   clock-frequency = <32768>;
+   clock-output-names = "osc32k";
+   };
+
+   psci {
+   compatible = "arm,psci-0.2";
+   method = "smc";
+   };
+
+   timer {
+   compatible = "arm,armv8-timer";
+   interrupts = ,
+,
+,
+;
+   };
+
+   soc {
+

[PATCH 6/7] arm64: allwinner: h6: add the basical Allwinner H6 DTSI file

2018-01-05 Thread Icenowy Zheng
Allwinner H6 is a new SoC with Cortex-A53 cores from Allwinner, with its
memory map fully reworked and some high-speed peripherals (PCIe, USB
3.0) introduced.

This commit adds the basical DTSI file of it, including the clock
support and UART support.

Signed-off-by: Icenowy Zheng 
---
 arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi | 214 +++
 1 file changed, 214 insertions(+)
 create mode 100644 arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi 
b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
new file mode 100644
index ..482f5cb64d07
--- /dev/null
+++ b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
@@ -0,0 +1,214 @@
+/*
+ * Copyright (C) 2017 Icenowy Zheng 
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This file is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ * obtaining a copy of this software and associated documentation
+ * files (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use,
+ * copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following
+ * conditions:
+ *
+ * The above copyright notice and this permission notice shall be
+ * included in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+
+/ {
+   interrupt-parent = <>;
+   #address-cells = <1>;
+   #size-cells = <1>;
+
+   cpus {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   cpu0: cpu@0 {
+   compatible = "arm,cortex-a53", "arm,armv8";
+   device_type = "cpu";
+   reg = <0>;
+   enable-method = "psci";
+   };
+
+   cpu1: cpu@1 {
+   compatible = "arm,cortex-a53", "arm,armv8";
+   device_type = "cpu";
+   reg = <1>;
+   enable-method = "psci";
+   };
+
+   cpu2: cpu@2 {
+   compatible = "arm,cortex-a53", "arm,armv8";
+   device_type = "cpu";
+   reg = <2>;
+   enable-method = "psci";
+   };
+
+   cpu3: cpu@3 {
+   compatible = "arm,cortex-a53", "arm,armv8";
+   device_type = "cpu";
+   reg = <3>;
+   enable-method = "psci";
+   };
+   };
+
+   iosc: internal-osc-clk {
+   #clock-cells = <0>;
+   compatible = "fixed-clock";
+   clock-frequency = <1600>;
+   clock-accuracy = <3>;
+   clock-output-names = "iosc";
+   };
+
+   osc24M: osc24M_clk {
+   #clock-cells = <0>;
+   compatible = "fixed-clock";
+   clock-frequency = <24000000>;
+   clock-output-names = "osc24M";
+   };
+
+   osc32k: osc32k_clk {
+   #clock-cells = <0>;
+   compatible = "fixed-clock";
+   clock-frequency = <32768>;
+   clock-output-names = "osc32k";
+   };
+
+   psci {
+   compatible = "arm,psci-0.2";
+   method = "smc";
+   };
+
+   timer {
+   compatible = "arm,armv8-timer";
+   interrupts = ,
+,
+,
+;
+   };
+
+   soc {
+   compatible = "simple-bus";

Re: [PATCH 01/18] asm-generic/barrier: add generic nospec helpers

2018-01-05 Thread Dan Williams
On Fri, Jan 5, 2018 at 6:55 PM, Linus Torvalds wrote:
> On Fri, Jan 5, 2018 at 5:09 PM, Dan Williams  wrote:
>> +#ifndef nospec_ptr
>> +#define nospec_ptr(ptr, lo, hi) \
>
> Do we actually want this horrible interface?
>
> It just causes the compiler - or inline asm - to generate worse code,
> because it needs to compare against both high and low limits.
>
> Basically all users are arrays that are zero-based, and where a
> comparison against the high _index_ limit would be sufficient.
>
> But the way this is all designed, it's literally designed for bad code
> generation for the unusual case, and the usual array case is written
> in the form of the unusual and wrong non-array case. That really seems
> excessively stupid.

Yes, it appears we can kill nospec_ptr() and move nospec_array_ptr()
to assume 0-based arrays rather than using nospec_ptr().
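
For illustration only, a zero-based bounds clamp that needs just the high index limit might look like the userspace sketch below. The names index_mask() and clamp_index() are hypothetical and are not the helpers from this series. The sketch assumes an arithmetic right shift on negative signed values and that both the index and the size fit in the positive range of intptr_t. It shows the single upper-bound comparison, not a real speculation barrier.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns all ones when idx < size and zero
 * otherwise, using arithmetic instead of a conditional branch.
 * Assumes idx and size are both <= INTPTR_MAX and that >> on a
 * negative signed value is an arithmetic shift.
 */
static inline uintptr_t index_mask(uintptr_t idx, uintptr_t size)
{
	return (uintptr_t)(~(intptr_t)(idx | (size - 1 - idx)) >>
			   (sizeof(intptr_t) * CHAR_BIT - 1));
}

/* Hypothetical helper: in-bounds indexes pass through, others become 0. */
static inline uintptr_t clamp_index(uintptr_t idx, uintptr_t size)
{
	return idx & index_mask(idx, size);
}
```

Because only the upper limit is checked, the caller passes the element count rather than a low/high pointer pair, which is the simplification being discussed.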


RE: [Intel-wired-lan] [PATCH 09/27] igb: Use timecounter_initialize interface

2018-01-05 Thread Brown, Aaron F
> From: Intel-wired-lan [mailto:intel-wired-lan-boun...@osuosl.org] On
> Behalf Of Sagar Arun Kamble
> Sent: Thursday, December 14, 2017 11:38 PM
> To: linux-kernel@vger.kernel.org
> Cc: intel-wired-...@lists.osuosl.org; Richard Cochran
> ; Kamble, Sagar A
> ; net...@vger.kernel.org
> Subject: [Intel-wired-lan] [PATCH 09/27] igb: Use timecounter_initialize
> interface
> 
> With new interface timecounter_initialize we can initialize timecounter
> fields and underlying cyclecounter together. Update igb ptp timecounter
> init with this new function.
> 
> Signed-off-by: Sagar Arun Kamble 
> Cc: Richard Cochran 
> Cc: Jeff Kirsher 
> Cc: intel-wired-...@lists.osuosl.org
> Cc: net...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/net/ethernet/intel/igb/igb.h |  4 
>  drivers/net/ethernet/intel/igb/igb_ptp.c | 23 ++-
>  2 files changed, 18 insertions(+), 9 deletions(-)
> 

Tested-by: Aaron Brown 


RE: [Intel-wired-lan] [PATCH 22/27] ixgbe: Use timecounter_reset interface

2018-01-05 Thread Brown, Aaron F


> -Original Message-
> From: Brown, Aaron F
> Sent: Friday, January 5, 2018 8:34 PM
> To: 'Sagar Arun Kamble' ; linux-
> ker...@vger.kernel.org
> Cc: intel-wired-...@lists.osuosl.org; Richard Cochran
> ; Kamble, Sagar A
> ; net...@vger.kernel.org
> Subject: RE: [Intel-wired-lan] [PATCH 22/27] ixgbe: Use timecounter_reset
> interface
> 
> > From: Intel-wired-lan [mailto:intel-wired-lan-boun...@osuosl.org] On
> > Behalf Of Sagar Arun Kamble
> > Sent: Thursday, December 14, 2017 11:39 PM
> > To: linux-kernel@vger.kernel.org
> > Cc: intel-wired-...@lists.osuosl.org; Richard Cochran
> > ; Kamble, Sagar A
> > ; net...@vger.kernel.org
> > Subject: [Intel-wired-lan] [PATCH 22/27] ixgbe: Use timecounter_reset
> > interface
> >
> > With new interface timecounter_reset we can update the start time for
> > timecounter. Update ixgbe_ptp_settime with this new function.
> >
> > Signed-off-by: Sagar Arun Kamble 
> > Cc: Richard Cochran 
> > Cc: Jeff Kirsher 
> > Cc: intel-wired-...@lists.osuosl.org
> > Cc: net...@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >  drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> 
> Tested-by: Aaron Brown 
Strike my Tested-by: for this (ixgbe) instance.  It was meant for igb.


[PATCH 5/7] clk: sunxi-ng: add support for the Allwinner H6 CCU

2018-01-05 Thread Icenowy Zheng
The Allwinner H6 SoC has a CCU which has been largely rearranged.

Add support for it in the sunxi-ng CCU framework.

Signed-off-by: Icenowy Zheng 
---
 .../devicetree/bindings/clock/sunxi-ccu.txt|1 +
 drivers/clk/sunxi-ng/Kconfig   |5 +
 drivers/clk/sunxi-ng/Makefile  |1 +
 drivers/clk/sunxi-ng/ccu-sun50i-h6.c   | 1206 
 drivers/clk/sunxi-ng/ccu-sun50i-h6.h   |   63 +
 include/dt-bindings/clock/sun50i-h6-ccu.h  |  159 +++
 include/dt-bindings/reset/sun50i-h6-ccu.h  |  110 ++
 7 files changed, 1545 insertions(+)
 create mode 100644 drivers/clk/sunxi-ng/ccu-sun50i-h6.c
 create mode 100644 drivers/clk/sunxi-ng/ccu-sun50i-h6.h
 create mode 100644 include/dt-bindings/clock/sun50i-h6-ccu.h
 create mode 100644 include/dt-bindings/reset/sun50i-h6-ccu.h

diff --git a/Documentation/devicetree/bindings/clock/sunxi-ccu.txt b/Documentation/devicetree/bindings/clock/sunxi-ccu.txt
index 4ca21c3a6fc9..9ae27881c924 100644
--- a/Documentation/devicetree/bindings/clock/sunxi-ccu.txt
+++ b/Documentation/devicetree/bindings/clock/sunxi-ccu.txt
@@ -20,6 +20,7 @@ Required properties :
- "allwinner,sun50i-a64-ccu"
- "allwinner,sun50i-a64-r-ccu"
- "allwinner,sun50i-h5-ccu"
+   - "allwinner,sun50i-h6-ccu"
- "nextthing,gr8-ccu"
 
 - reg: Must contain the registers base address and length
diff --git a/drivers/clk/sunxi-ng/Kconfig b/drivers/clk/sunxi-ng/Kconfig
index 6427d0ebe2de..4bc196a49b12 100644
--- a/drivers/clk/sunxi-ng/Kconfig
+++ b/drivers/clk/sunxi-ng/Kconfig
@@ -11,6 +11,11 @@ config SUN50I_A64_CCU
default ARM64 && ARCH_SUNXI
depends on (ARM64 && ARCH_SUNXI) || COMPILE_TEST
 
+config SUN50I_H6_CCU
+   bool "Support for the Allwinner H6 CCU"
+   default ARM64 && ARCH_SUNXI
+   depends on (ARM64 && ARCH_SUNXI) || COMPILE_TEST
+
 config SUN4I_A10_CCU
bool "Support for the Allwinner A10/A20 CCU"
select SUNXI_CCU_DIV
diff --git a/drivers/clk/sunxi-ng/Makefile b/drivers/clk/sunxi-ng/Makefile
index 4141c3fe08ae..128a40ee5c5e 100644
--- a/drivers/clk/sunxi-ng/Makefile
+++ b/drivers/clk/sunxi-ng/Makefile
@@ -22,6 +22,7 @@ lib-$(CONFIG_SUNXI_CCU)   += ccu_mp.o
 
 # SoC support
 obj-$(CONFIG_SUN50I_A64_CCU)   += ccu-sun50i-a64.o
+obj-$(CONFIG_SUN50I_H6_CCU)    += ccu-sun50i-h6.o
 obj-$(CONFIG_SUN4I_A10_CCU)    += ccu-sun4i-a10.o
 obj-$(CONFIG_SUN5I_CCU)        += ccu-sun5i.o
 obj-$(CONFIG_SUN6I_A31_CCU)    += ccu-sun6i-a31.o
diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h6.c b/drivers/clk/sunxi-ng/ccu-sun50i-h6.c
new file mode 100644
index ..18a1e08e7260
--- /dev/null
+++ b/drivers/clk/sunxi-ng/ccu-sun50i-h6.c
@@ -0,0 +1,1206 @@
+/*
+ * Copyright (c) 2017 Icenowy Zheng 
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+
+#include "ccu_common.h"
+#include "ccu_reset.h"
+
+#include "ccu_div.h"
+#include "ccu_gate.h"
+#include "ccu_mp.h"
+#include "ccu_mult.h"
+#include "ccu_nk.h"
+#include "ccu_nkm.h"
+#include "ccu_nkmp.h"
+#include "ccu_nm.h"
+
+#include "ccu-sun50i-h6.h"
+
+/*
+ * The CPU PLL is actually an NP clock, with P being /1, /2 or /4. However
+ * P should only be used for output frequencies lower than 288 MHz.
+ *
+ * For now we can just model it as a multiplier clock, and force P to /1.
+ *
+ * The M factor is present in the register's description, but not in the
+ * frequency formula, and it's documented as "M is only used for backdoor
+ * testing", so it's not modelled and is forced to 0.
+ */
+#define SUN50I_H6_PLL_CPUX_REG 0x000
+static struct ccu_mult pll_cpux_clk = {
+   .enable = BIT(31),
+   .lock   = BIT(28),
+   .mult   = _SUNXI_CCU_MULT_MIN(8, 8, 12),
+   .common = {
+   .reg        = 0x000,
+   .hw.init    = CLK_HW_INIT("pll-cpux", "osc24M",
+ &ccu_mult_ops,
+ CLK_SET_RATE_UNGATE),
+   },
+};
+
+/* Some PLLs are input * N / div1 / P. Model them as NKMP with no K */
+#define SUN50I_H6_PLL_DDR0_REG 0x010
+static struct ccu_nkmp pll_ddr0_clk = {
+   .enable = BIT(31),
+   .lock   = BIT(28),
+   .n  = _SUNXI_CCU_MULT_MIN(8, 8, 12),
+   .m  = _SUNXI_CCU_DIV(1, 1), /* input divider */
+   .p  = _SUNXI_CCU_DIV(0, 1), /* output divider */
+   
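
As a rough sketch of the multiplier model described in the CPU PLL comment above, the output rate can be derived from the N field alone. The field position (shift 8, width 8) and the minimum of 12 come from _SUNXI_CCU_MULT_MIN(8, 8, 12) in the patch. The assumption that the effective multiplier is the field value plus one, and the helper names pll_cpux_rate() and pll_cpux_n_field(), are illustrative and not taken from the patch.

```c
#include <stdint.h>

#define PLL_N_SHIFT	8
#define PLL_N_WIDTH	8
#define PLL_N_MIN	12			/* minimum multiplier */
#define PLL_N_MAX	(1u << PLL_N_WIDTH)	/* assumes offset of one */

/* Hypothetical helper: output rate for a given register value. */
static uint64_t pll_cpux_rate(uint64_t parent_hz, uint32_t reg)
{
	uint32_t field = (reg >> PLL_N_SHIFT) & ((1u << PLL_N_WIDTH) - 1);

	/* Assumed encoding: multiplier = field + 1; P is forced to /1. */
	return parent_hz * (field + 1);
}

/* Hypothetical helper: N field bits for the multiplier nearest below target. */
static uint32_t pll_cpux_n_field(uint64_t parent_hz, uint64_t target_hz)
{
	uint64_t n = target_hz / parent_hz;

	if (n < PLL_N_MIN)
		n = PLL_N_MIN;
	if (n > PLL_N_MAX)
		n = PLL_N_MAX;

	return (uint32_t)(n - 1) << PLL_N_SHIFT;
}
```

With a 24 MHz parent, a multiplier of 50 would give 1.2 GHz under these assumptions; requests below 12 times the parent rate are clamped to the minimum multiplier.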

Re: [PATCH 05/23] x86, kaiser: unmap kernel from userspace page tables (core patch)

2018-01-05 Thread Hanjun Guo
Hi Jiri,

Thanks for the fix, comments inline.

On 2018/1/6 2:19, Jiri Kosina wrote:
> 
> [ adding Hugh ]
> 
> On Thu, 4 Jan 2018, Dave Hansen wrote:
> 
>>> BTW, we have just reported a bug caused by kaiser[1], which looks like
>>> caused by SMEP. Could you please help to have a look?
>>>
>>> [1] https://lkml.org/lkml/2018/1/5/3
>>
>> Please report that to your kernel vendor.  Your EFI page tables have the
>> NX bit set on the low addresses.  There have been a bunch of iterations
>> of this, but you need to make sure that the EFI kernel mappings don't
>> get _PAGE_NX set on them.  Look at what __pti_set_user_pgd() does in
>> mainline.
> 
> Unfortunately this is more complicated.
> 
> The thing is -- efi=old_memmap is broken even upstream. We will probably 
> not receive too many reports about this against upstream PTI, as most of 
> the machines are using classic high-mapping of EFI regions; but older 
> kernels still force old_memmap on certain machines (or it can be specified 
> manually on kernel cmdline), where EFI has all its mapping in the 
> userspace range.
> 
> And that explodes, as those get marked NX in the kernel pagetables.
> 
> I've spent most of today tracking this down (the legacy EFI mmap is 
> horrid); the patch below is confirmed to fix it both on current upstream 
> kernel, as well as on original-KAISER based kernels (Hugh's backport) in 
> cases old_memmap is used by EFI.
> 
> I am not super happy about this, but I didn't really want to extend the 
> _set_pgd() code to always figure out whether it's dealing with low EFI 
> mapping or not, as that would be way too much overhead just for this 
> one-off call during boot.
> 
> 
> 
> From: Jiri Kosina 
> Subject: [PATCH] PTI: unbreak EFI old_memmap
> 
> old_memmap's efi_call_phys_prolog() calls set_pgd() with swapper PGD that 
> has PAGE_USER set, which makes PTI set NX on it, and therefore EFI can't 
> execute its code.
> 
> Fix that by forcefully clearing _PAGE_NX from the PGD (this can't be done
> by the pgprot API).
> 
> _PAGE_NX will be automatically reintroduced in efi_call_phys_epilog(), as 
> _set_pgd() will again notice that this is _PAGE_USER, and set _PAGE_NX on 
> it.
> 
> Signed-off-by: Jiri Kosina 
> ---
>  arch/x86/platform/efi/efi_64.c |6 ++
>  1 file changed, 6 insertions(+)
> 
> --- a/arch/x86/platform/efi/efi_64.c
> +++ b/arch/x86/platform/efi/efi_64.c
> @@ -95,6 +95,12 @@ pgd_t * __init efi_call_phys_prolog(void
>   save_pgd[pgd] = *pgd_offset_k(pgd * PGDIR_SIZE);
>   vaddress = (unsigned long)__va(pgd * PGDIR_SIZE);
>   set_pgd(pgd_offset_k(pgd * PGDIR_SIZE), 
> *pgd_offset_k(vaddress));
> + /*
> +  * pgprot API doesn't clear it for PGD
> +  *
> +  * Will be brought back automatically in _epilog()
> +  */
> + pgd_offset_k(pgd * PGDIR_SIZE)->pgd &= ~_PAGE_NX;

Do you mean the NX bit will be brought back later? I'm asking because
I tested this patch: it fixed the boot panic, but the system hangs on
reboot, because rebooting also calls into EFI and then panics as the NX
bit is set.

[ 1911.622675] BUG: unable to handle kernel paging request at 008041c0
[ 1911.629880] IP: [<008041c0>] 0x8041bf
[ 1911.634389] PGD 8010272cb067 PUD 2025178067 PMD 10272d8067 PTE 804063
[ 1911.641472] Oops: 0011 [#1] SMP
[ 1911.711748] Modules linked in: bum(O) ip_set nfnetlink prio(O) nat(O) 
vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre kboxdriver(O) kbox(O) 
signo_catch(O) vfat fat tg3 intel_powerclamp coretemp intel_rapl crc32_pclmul 
crc32c_intel ghash_clmulni_intel aesni_intel i2c_i801 kvm_intel(O) ptp lrw 
gf128mul i2c_core glue_helper ablk_helper pps_core kvm(O) cryptd iTCO_wdt 
iTCO_vendor_support sg pcspkr lpc_ich mfd_core sb_edac mei_me edac_core mei 
shpchp acpi_power_meter acpi_pad remote_trigger(O) nf_conntrack_ipv4 
nf_defrag_ipv4 vhost_net(O) tun(O) vhost(O) macvtap macvlan vfio_pci irqbypass 
vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp nf_nat 
nf_conntrack sctp libcrc32c ip_tables ext3 mbcache jbd sr_mod sd_mod cdrom lpfc 
crc_t10dif ahci crct10dif_generic crct10dif_pclmul libahci scsi_transport_fc 
scsi_tgt crct10dif_common libata usb_storage megaraid_sas dm_mod [last 
unloaded: dev_connlimit]
[ 1911.796711] CPU: 0 PID: 12033 Comm: reboot Tainted: G   OE   
---   3.10.0-327.61.59.66_22.x86_64 #1
[ 1911.807449] Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 3.79 11/07/2017
[ 1911.814702] task: 881025a91700 ti: 8810267fc000 task.ti: 
8810267fc000
[ 1911.822401] RIP: 0010:[<008041c0>]  [<008041c0>] 0x8041bf
[ 1911.829407] RSP: 0018:8810267ffd50  EFLAGS: 00010086
[ 1911.834877] RAX: 008041c0 RBX:  RCX: ff425000
[ 1911.842220] RDX: 8820a4e4 RSI: c000 RDI: 002024e4
[ 1911.849563] RBP: 8810267ffd60 R08: 882024e4 R09: 
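
The NX handling under discussion can be sketched in plain C as follows. This is a simplified model, not the kernel code: on x86-64 the user bit is bit 2 and the no-execute bit is bit 63 of a page-table entry, and the helper names pti_like_set_pgd() and clear_nx() are hypothetical. The real __pti_set_user_pgd() checks more conditions than this.

```c
#include <stdint.h>

#define ENTRY_PRESENT	(UINT64_C(1) << 0)
#define ENTRY_USER	(UINT64_C(1) << 2)
#define ENTRY_NX	(UINT64_C(1) << 63)

/*
 * Hypothetical model of the PTI behaviour described above: a present
 * entry that is marked user gets NX set when it is installed.
 */
static uint64_t pti_like_set_pgd(uint64_t entry)
{
	uint64_t want = ENTRY_PRESENT | ENTRY_USER;

	if ((entry & want) == want)
		entry |= ENTRY_NX;
	return entry;
}

/* Hypothetical model of the fix: drop NX again after the entry is set. */
static uint64_t clear_nx(uint64_t entry)
{
	return entry & ~ENTRY_NX;
}
```

In this model any later path that installs the same user entry again, without a matching clear, leaves NX set. That is one way the reboot-time EFI call could fault even though the boot-time call is fixed.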

[PATCH 4/7] clk: sunxi-ng: Support fixed post-dividers on NKMP style clocks

2018-01-05 Thread Icenowy Zheng
On the new Allwinner H6 SoC, multiple PLLs are NMP style clocks
(modelled as NKMP with no K) and have fixed post-dividers.

Add fixed post-divider support to the NKMP style clocks.

Signed-off-by: Icenowy Zheng 
---
 drivers/clk/sunxi-ng/ccu_nkmp.c | 20 +---
 drivers/clk/sunxi-ng/ccu_nkmp.h |  2 ++
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/clk/sunxi-ng/ccu_nkmp.c b/drivers/clk/sunxi-ng/ccu_nkmp.c
index e58c95787f94..497ac20deb19 100644
--- a/drivers/clk/sunxi-ng/ccu_nkmp.c
+++ b/drivers/clk/sunxi-ng/ccu_nkmp.c
@@ -81,7 +81,7 @@ static unsigned long ccu_nkmp_recalc_rate(struct clk_hw *hw,
unsigned long parent_rate)
 {
struct ccu_nkmp *nkmp = hw_to_ccu_nkmp(hw);
-   unsigned long n, m, k, p;
+   unsigned long n, m, k, p, rate;
u32 reg;
 
reg = readl(nkmp->common.base + nkmp->common.reg);
@@ -107,7 +107,11 @@ static unsigned long ccu_nkmp_recalc_rate(struct clk_hw *hw,
p = reg >> nkmp->p.shift;
p &= (1 << nkmp->p.width) - 1;
 
-   return (parent_rate * n * k >> p) / m;
+   rate = (parent_rate * n * k >> p) / m;
+   if (nkmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
+   rate /= nkmp->fixed_post_div;
+
+   return rate;
 }
 
 static long ccu_nkmp_round_rate(struct clk_hw *hw, unsigned long rate,
@@ -116,6 +120,9 @@ static long ccu_nkmp_round_rate(struct clk_hw *hw, unsigned long rate,
struct ccu_nkmp *nkmp = hw_to_ccu_nkmp(hw);
struct _ccu_nkmp _nkmp;
 
+   if (nkmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
+   rate *= nkmp->fixed_post_div;
+
_nkmp.min_n = nkmp->n.min ?: 1;
_nkmp.max_n = nkmp->n.max ?: 1 << nkmp->n.width;
_nkmp.min_k = nkmp->k.min ?: 1;
@@ -127,7 +134,11 @@ static long ccu_nkmp_round_rate(struct clk_hw *hw, unsigned long rate,
 
ccu_nkmp_find_best(*parent_rate, rate, &_nkmp);
 
-   return *parent_rate * _nkmp.n * _nkmp.k / (_nkmp.m * _nkmp.p);
+   rate = *parent_rate * _nkmp.n * _nkmp.k / (_nkmp.m * _nkmp.p);
+   if (nkmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
+   rate = rate / nkmp->fixed_post_div;
+
+   return rate;
 }
 
 static int ccu_nkmp_set_rate(struct clk_hw *hw, unsigned long rate,
@@ -138,6 +149,9 @@ static int ccu_nkmp_set_rate(struct clk_hw *hw, unsigned long rate,
unsigned long flags;
u32 reg;
 
+   if (nkmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
+   rate = rate * nkmp->fixed_post_div;
+
_nkmp.min_n = nkmp->n.min ?: 1;
_nkmp.max_n = nkmp->n.max ?: 1 << nkmp->n.width;
_nkmp.min_k = nkmp->k.min ?: 1;
diff --git a/drivers/clk/sunxi-ng/ccu_nkmp.h b/drivers/clk/sunxi-ng/ccu_nkmp.h
index a82facbc6144..6940503e7fc4 100644
--- a/drivers/clk/sunxi-ng/ccu_nkmp.h
+++ b/drivers/clk/sunxi-ng/ccu_nkmp.h
@@ -34,6 +34,8 @@ struct ccu_nkmp {
struct ccu_div_internal m;
struct ccu_div_internal p;
 
+   unsigned int    fixed_post_div;
+
struct ccu_common   common;
 };
 
-- 
2.14.2



[PATCH 4/7] clk: sunxi-ng: Support fixed post-dividers on NKMP style clocks

2018-01-05 Thread Icenowy Zheng
On the new Allwinner H6 SoC, multiple PLL's are NMP style clocks
(modelled as NKMP with no K) and have fixed post-dividers.

Add fixed post divider support to the NKMP style clocks.

Signed-off-by: Icenowy Zheng 
---
 drivers/clk/sunxi-ng/ccu_nkmp.c | 20 +---
 drivers/clk/sunxi-ng/ccu_nkmp.h |  2 ++
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/clk/sunxi-ng/ccu_nkmp.c b/drivers/clk/sunxi-ng/ccu_nkmp.c
index e58c95787f94..497ac20deb19 100644
--- a/drivers/clk/sunxi-ng/ccu_nkmp.c
+++ b/drivers/clk/sunxi-ng/ccu_nkmp.c
@@ -81,7 +81,7 @@ static unsigned long ccu_nkmp_recalc_rate(struct clk_hw *hw,
unsigned long parent_rate)
 {
struct ccu_nkmp *nkmp = hw_to_ccu_nkmp(hw);
-   unsigned long n, m, k, p;
+   unsigned long n, m, k, p, rate;
u32 reg;
 
reg = readl(nkmp->common.base + nkmp->common.reg);
@@ -107,7 +107,11 @@ static unsigned long ccu_nkmp_recalc_rate(struct clk_hw 
*hw,
p = reg >> nkmp->p.shift;
p &= (1 << nkmp->p.width) - 1;
 
-   return (parent_rate * n * k >> p) / m;
+   rate = (parent_rate * n * k >> p) / m;
+   if (nkmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
+   rate /= nkmp->fixed_post_div;
+
+   return rate;
 }
 
 static long ccu_nkmp_round_rate(struct clk_hw *hw, unsigned long rate,
@@ -116,6 +120,9 @@ static long ccu_nkmp_round_rate(struct clk_hw *hw, unsigned 
long rate,
struct ccu_nkmp *nkmp = hw_to_ccu_nkmp(hw);
struct _ccu_nkmp _nkmp;
 
+   if (nkmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
+   rate *= nkmp->fixed_post_div;
+
_nkmp.min_n = nkmp->n.min ?: 1;
_nkmp.max_n = nkmp->n.max ?: 1 << nkmp->n.width;
_nkmp.min_k = nkmp->k.min ?: 1;
@@ -127,7 +134,11 @@ static long ccu_nkmp_round_rate(struct clk_hw *hw, 
unsigned long rate,
 
ccu_nkmp_find_best(*parent_rate, rate, &_nkmp);
 
-   return *parent_rate * _nkmp.n * _nkmp.k / (_nkmp.m * _nkmp.p);
+   rate = *parent_rate * _nkmp.n * _nkmp.k / (_nkmp.m * _nkmp.p);
+   if (nkmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
+   rate = rate / nkmp->fixed_post_div;
+
+   return rate;
 }
 
 static int ccu_nkmp_set_rate(struct clk_hw *hw, unsigned long rate,
@@ -138,6 +149,9 @@ static int ccu_nkmp_set_rate(struct clk_hw *hw, unsigned long rate,
unsigned long flags;
u32 reg;
 
+   if (nkmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
+   rate = rate * nkmp->fixed_post_div;
+
_nkmp.min_n = nkmp->n.min ?: 1;
_nkmp.max_n = nkmp->n.max ?: 1 << nkmp->n.width;
_nkmp.min_k = nkmp->k.min ?: 1;
diff --git a/drivers/clk/sunxi-ng/ccu_nkmp.h b/drivers/clk/sunxi-ng/ccu_nkmp.h
index a82facbc6144..6940503e7fc4 100644
--- a/drivers/clk/sunxi-ng/ccu_nkmp.h
+++ b/drivers/clk/sunxi-ng/ccu_nkmp.h
@@ -34,6 +34,8 @@ struct ccu_nkmp {
struct ccu_div_internal m;
struct ccu_div_internal p;
 
+   unsigned int    fixed_post_div;
+
struct ccu_common   common;
 };
 
-- 
2.14.2



[PATCH] mm: ratelimit end_swap_bio_write() error

2018-01-05 Thread Sergey Senozhatsky
Use the ratelimited printk() version for swap-device write error
reporting. We can use ZRAM as a swap-device, and the tricky part
here is that zsmalloc() stores compressed objects in memory, thus
it has to allocate pages during swap-out. If the system is short
on memory, then we begin to flood the printk() log buffer with the
same "Write-error on swap-device XXX" error messages and sometimes
simply lock up the system.

Signed-off-by: Sergey Senozhatsky 
---
 mm/page_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
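
For illustration only, a minimal user-space model of the interval/burst
idea behind a ratelimited print. It is not the kernel's implementation:
the names (ratelimit_sketch, ratelimit_sketch_ok) are hypothetical, and
time is passed in explicitly so the behaviour can be checked by hand.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of an interval/burst rate limiter. */
struct ratelimit_sketch {
	unsigned long interval; /* window length, in arbitrary time units */
	unsigned int burst;     /* messages allowed per window */
	unsigned long begin;    /* start of the current window */
	unsigned int printed;   /* messages emitted in the current window */
	unsigned int missed;    /* messages suppressed in the current window */
	bool started;           /* false until the first message arrives */
};

/* Returns true if a message arriving at time 'now' may be printed. */
static bool ratelimit_sketch_ok(struct ratelimit_sketch *rs, unsigned long now)
{
	assert(rs->interval > 0);

	/* Open a new window on first use or once the interval has elapsed. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->started = true;
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	/* Over budget for this window: drop the message, but count it. */
	rs->missed++;
	return false;
}
```

The point for the swap-out case is that a storm of identical write
errors costs at most 'burst' log lines per window instead of one line
per failed bio.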

diff --git a/mm/page_io.c b/mm/page_io.c
index e93f1a4cacd7..422cd49bcba8 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -63,7 +63,7 @@ void end_swap_bio_write(struct bio *bio)
 * Also clear PG_reclaim to avoid rotate_reclaimable_page()
 */
set_page_dirty(page);
-   pr_alert("Write-error on swap-device (%u:%u:%llu)\n",
+   pr_alert_ratelimited("Write-error on swap-device (%u:%u:%llu)\n",
 MAJOR(bio_dev(bio)), MINOR(bio_dev(bio)),
 (unsigned long long)bio->bi_iter.bi_sector);
ClearPageReclaim(page);
-- 
2.15.1



RE: [Intel-wired-lan] [PATCH 22/27] ixgbe: Use timecounter_reset interface

2018-01-05 Thread Brown, Aaron F
> From: Intel-wired-lan [mailto:intel-wired-lan-boun...@osuosl.org] On
> Behalf Of Sagar Arun Kamble
> Sent: Thursday, December 14, 2017 11:39 PM
> To: linux-kernel@vger.kernel.org
> Cc: intel-wired-...@lists.osuosl.org; Richard Cochran
> ; Kamble, Sagar A
> ; net...@vger.kernel.org
> Subject: [Intel-wired-lan] [PATCH 22/27] ixgbe: Use timecounter_reset
> interface
> 
> With new interface timecounter_reset we can update the start time for
> timecounter. Update ixgbe_ptp_settime with this new function.
> 
> Signed-off-by: Sagar Arun Kamble 
> Cc: Richard Cochran 
> Cc: Jeff Kirsher 
> Cc: intel-wired-...@lists.osuosl.org
> Cc: net...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Tested-by: Aaron Brown 


RE: [Intel-wired-lan] [PATCH 21/27] igb: Use timecounter_reset interface

2018-01-05 Thread Brown, Aaron F
> From: Intel-wired-lan [mailto:intel-wired-lan-boun...@osuosl.org] On
> Behalf Of Sagar Arun Kamble
> Sent: Thursday, December 14, 2017 11:39 PM
> To: linux-kernel@vger.kernel.org
> Cc: intel-wired-...@lists.osuosl.org; Richard Cochran
> ; Kamble, Sagar A
> ; net...@vger.kernel.org
> Subject: [Intel-wired-lan] [PATCH 21/27] igb: Use timecounter_reset
> interface
> 
> With new interface timecounter_reset we can update the start time for
> timecounter. Update igb_ptp_settime_82576 with this new function.
> 
> Signed-off-by: Sagar Arun Kamble 
> Cc: Richard Cochran 
> Cc: Jeff Kirsher 
> Cc: intel-wired-...@lists.osuosl.org
> Cc: net...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/net/ethernet/intel/igb/igb_ptp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Tested-by: Aaron Brown 


RE: [Intel-wired-lan] [PATCH 08/27] e1000e: Use timecounter_initialize interface

2018-01-05 Thread Brown, Aaron F
> From: Intel-wired-lan [mailto:intel-wired-lan-boun...@osuosl.org] On
> Behalf Of Sagar Arun Kamble
> Sent: Thursday, December 14, 2017 11:38 PM
> To: linux-kernel@vger.kernel.org
> Cc: intel-wired-...@lists.osuosl.org; Richard Cochran
> ; Kamble, Sagar A
> ; net...@vger.kernel.org
> Subject: [Intel-wired-lan] [PATCH 08/27] e1000e: Use timecounter_initialize
> interface
> 
> With new interface timecounter_initialize we can initialize timecounter
> fields and underlying cyclecounter together. Update e1000e timecounter
> init with this new function.
> 
> Signed-off-by: Sagar Arun Kamble 
> Cc: Richard Cochran 
> Cc: Jeff Kirsher 
> Cc: intel-wired-...@lists.osuosl.org
> Cc: net...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/net/ethernet/intel/e1000e/e1000.h  |  4 
>  drivers/net/ethernet/intel/e1000e/netdev.c | 31 +-
> 
>  2 files changed, 22 insertions(+), 13 deletions(-)
> 

Tested-by: Aaron Brown 


RE: [Intel-wired-lan] [PATCH 20/27] e1000e: Use timecounter_reset interface

2018-01-05 Thread Brown, Aaron F
> From: Intel-wired-lan [mailto:intel-wired-lan-boun...@osuosl.org] On
> Behalf Of Sagar Arun Kamble
> Sent: Thursday, December 14, 2017 11:39 PM
> To: linux-kernel@vger.kernel.org
> Cc: intel-wired-...@lists.osuosl.org; Richard Cochran
> ; Kamble, Sagar A
> ; net...@vger.kernel.org
> Subject: [Intel-wired-lan] [PATCH 20/27] e1000e: Use timecounter_reset
> interface
> 
> With new interface timecounter_reset we can update the start time for
> timecounter. Update e1000e_phc_settime with this new function.
> 
> Signed-off-by: Sagar Arun Kamble 
> Cc: Richard Cochran 
> Cc: Jeff Kirsher 
> Cc: intel-wired-...@lists.osuosl.org
> Cc: net...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/net/ethernet/intel/e1000e/ptp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
Tested-by: Aaron Brown 


RE: [Intel-wired-lan] [PATCH 1/1] timecounter: Make cyclecounter struct part of timecounter struct

2018-01-05 Thread Brown, Aaron F
> From: Intel-wired-lan [mailto:intel-wired-lan-boun...@osuosl.org] On
> Behalf Of Jeff Kirsher
> Sent: Wednesday, December 6, 2017 8:25 AM
> To: Kamble, Sagar A ; linux-
> ker...@vger.kernel.org
> Cc: alsa-de...@alsa-project.org; linux-r...@vger.kernel.org;
> net...@vger.kernel.org; Richard Cochran ;
> Stephen Boyd ; Chris Wilson  wilson.co.uk>; John Stultz ; intel-wired-
> l...@lists.osuosl.org; Thomas Gleixner ;
> kvm...@lists.cs.columbia.edu; linux-arm-ker...@lists.infradead.org
> Subject: Re: [Intel-wired-lan] [PATCH 1/1] timecounter: Make cyclecounter
> struct part of timecounter struct
> 
> On Sat, 2017-12-02 at 10:01 +0530, Sagar Arun Kamble wrote:
> > There is no real need for the users of timecounters to define
> > cyclecounter
> > and timecounter variables separately. Since timecounter will always
> > be
> > based on cyclecounter, have cyclecounter struct as member of
> > timecounter
> > struct.
> >
> > Suggested-by: Chris Wilson 
> > Signed-off-by: Sagar Arun Kamble 
> > Cc: Chris Wilson 
> > Cc: Richard Cochran 
> > Cc: John Stultz 
> > Cc: Thomas Gleixner 
> > Cc: Stephen Boyd 
> > Cc: linux-kernel@vger.kernel.org
> > Cc: linux-arm-ker...@lists.infradead.org
> > Cc: net...@vger.kernel.org
> > Cc: intel-wired-...@lists.osuosl.org
> > Cc: linux-r...@vger.kernel.org
> > Cc: alsa-de...@alsa-project.org
> > Cc: kvm...@lists.cs.columbia.edu
> 
> Acked-by: Jeff Kirsher 
> 

Tested-by: Aaron Brown 

> For the changes to the Intel drivers.
> 
> > ---
> >  arch/microblaze/kernel/timer.c | 20 ++--
> >  drivers/clocksource/arm_arch_timer.c   | 19 ++--
> >  drivers/net/ethernet/amd/xgbe/xgbe-dev.c   |  3 +-
> >  drivers/net/ethernet/amd/xgbe/xgbe-ptp.c   |  9 +++---
> >  drivers/net/ethernet/amd/xgbe/xgbe.h   |  1 -
> >  drivers/net/ethernet/broadcom/bnx2x/bnx2x.h|  1 -
> >  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c   | 20 ++--
> >  drivers/net/ethernet/freescale/fec.h   |  1 -
> >  drivers/net/ethernet/freescale/fec_ptp.c   | 30 +---
> > --
> >  drivers/net/ethernet/intel/e1000e/e1000.h  |  1 -
> >  drivers/net/ethernet/intel/e1000e/netdev.c | 27 --
> > --
> >  drivers/net/ethernet/intel/e1000e/ptp.c|  2 +-
> >  drivers/net/ethernet/intel/igb/igb.h   |  1 -
> >  drivers/net/ethernet/intel/igb/igb_ptp.c   | 25 --
> > -
> >  drivers/net/ethernet/intel/ixgbe/ixgbe.h   |  1 -
> >  drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c   | 17 +-
> >  drivers/net/ethernet/mellanox/mlx4/en_clock.c  | 28 --
> > ---
> >  drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |  1 -
> >  .../net/ethernet/mellanox/mlx5/core/lib/clock.c| 34 ++
> > --
> >  drivers/net/ethernet/qlogic/qede/qede_ptp.c| 20 ++--
> >  drivers/net/ethernet/ti/cpts.c | 36
> > --
> >  drivers/net/ethernet/ti/cpts.h |  1 -
> >  include/linux/mlx5/driver.h|  1 -
> >  include/linux/timecounter.h|  4 +--
> >  include/sound/hdaudio.h|  1 -
> >  kernel/time/timecounter.c  | 28 --
> > ---
> >  sound/hda/hdac_stream.c|  7 +++--
> >  virt/kvm/arm/arch_timer.c  |  6 ++--
> >  28 files changed, 163 insertions(+), 182 deletions(-)
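
As a reading aid, a minimal user-space sketch of the layout change the
quoted patch describes: the cycle counter becomes a member of the time
counter, so a driver declares and initializes one object instead of two.
All names here (cc_sketch, tc_sketch, tc_sketch_init, tc_sketch_read,
sketch_fake_read) are hypothetical, and the sketch leaves out details
that real kernel code has to handle.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified cycle counter description. */
struct cc_sketch {
	uint64_t (*read)(const struct cc_sketch *cc); /* raw cycle count */
	uint64_t mask;  /* bitmask matching the counter width */
	uint32_t mult;  /* cycles -> ns multiplier */
	uint32_t shift; /* cycles -> ns right shift */
};

/* The cycle counter is embedded rather than referenced by pointer. */
struct tc_sketch {
	struct cc_sketch cc;
	uint64_t cycle_last; /* cycle value at the last read */
	uint64_t nsec;       /* accumulated nanoseconds */
};

/* Stand-in for a hardware register read, so the sketch runs anywhere. */
static uint64_t sketch_fake_cycles;

static uint64_t sketch_fake_read(const struct cc_sketch *cc)
{
	(void)cc;
	return sketch_fake_cycles;
}

/* One call sets up both the counter description and the time state. */
static void tc_sketch_init(struct tc_sketch *tc,
			   uint64_t (*read)(const struct cc_sketch *cc),
			   uint64_t mask, uint32_t mult, uint32_t shift,
			   uint64_t start_ns)
{
	assert(read != 0);
	tc->cc.read = read;
	tc->cc.mask = mask;
	tc->cc.mult = mult;
	tc->cc.shift = shift;
	tc->cycle_last = read(&tc->cc);
	tc->nsec = start_ns;
}

/* Advance the time by the cycles elapsed since the previous read. */
static uint64_t tc_sketch_read(struct tc_sketch *tc)
{
	uint64_t now = tc->cc.read(&tc->cc);
	uint64_t delta = (now - tc->cycle_last) & tc->cc.mask;

	tc->cycle_last = now;
	tc->nsec += (delta * tc->cc.mult) >> tc->cc.shift;
	return tc->nsec;
}
```

Masking the delta is what lets a narrow counter wrap around without the
accumulated time going backwards.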


[PATCH 3/7] pinctrl: sunxi: add support for the Allwinner H6 main pin controller

2018-01-05 Thread Icenowy Zheng
The Allwinner H6 SoC has two pin controllers: one main controller
(called CPUX-PORT in the user manual) and one controller in the CPUs
power domain (called CPUS-PORT in the user manual).

This commit introduces support for the main pin controller on H6.

Pin banks A and B are not wired out and are hidden from the SoC's
documentation; however, the "ATE" (an AC200 chip co-packaged with the
H6 die) is shown to be connected to the main SoC die via these pin
banks. The information about these banks is copied from the BSP
pinctrl driver, re-formatted to fit the mainline pinctrl driver
format.

Signed-off-by: Icenowy Zheng 
---
 .../bindings/pinctrl/allwinner,sunxi-pinctrl.txt   |   4 +-
 drivers/pinctrl/sunxi/Kconfig  |   4 +
 drivers/pinctrl/sunxi/Makefile |   1 +
 drivers/pinctrl/sunxi/pinctrl-sun50i-h6.c  | 679 +
 4 files changed, 687 insertions(+), 1 deletion(-)
 create mode 100644 drivers/pinctrl/sunxi/pinctrl-sun50i-h6.c

diff --git a/Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt b/Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt
index 09789fdfa749..4523e658b9f2 100644
--- a/Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt
+++ b/Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt
@@ -27,6 +27,7 @@ Required properties:
   "allwinner,sun50i-a64-pinctrl"
   "allwinner,sun50i-a64-r-pinctrl"
   "allwinner,sun50i-h5-pinctrl"
+  "allwinner,sun50i-h6-pinctrl"
   "nextthing,gr8-pinctrl"
 
 - reg: Should contain the register physical address and length for the
@@ -39,7 +40,8 @@ Required properties:
 
 Note: For backward compatibility reasons, the hosc and losc clocks are only
 required if you need to use the optional input-debounce property. Any new
-device tree should set them.
+device tree should set them. For the pin controllers on Allwinner H6 SoC,
+there's no APB bus gate, and the "apb" clock should be omitted.
 
 Optional properties:
   - input-debounce: Array of debouncing periods in microseconds. One period per
diff --git a/drivers/pinctrl/sunxi/Kconfig b/drivers/pinctrl/sunxi/Kconfig
index bfce99d86dfc..5de1f63b07bb 100644
--- a/drivers/pinctrl/sunxi/Kconfig
+++ b/drivers/pinctrl/sunxi/Kconfig
@@ -77,4 +77,8 @@ config PINCTRL_SUN50I_H5
def_bool ARM64 && ARCH_SUNXI
select PINCTRL_SUNXI
 
+config PINCTRL_SUN50I_H6
+   def_bool ARM64 && ARCH_SUNXI
+   select PINCTRL_SUNXI
+
 endif
diff --git a/drivers/pinctrl/sunxi/Makefile b/drivers/pinctrl/sunxi/Makefile
index 12a752e836ef..3c4aec6611e9 100644
--- a/drivers/pinctrl/sunxi/Makefile
+++ b/drivers/pinctrl/sunxi/Makefile
@@ -18,5 +18,6 @@ obj-$(CONFIG_PINCTRL_SUN8I_H3)+= pinctrl-sun8i-h3.o
 obj-$(CONFIG_PINCTRL_SUN8I_H3_R)   += pinctrl-sun8i-h3-r.o
 obj-$(CONFIG_PINCTRL_SUN8I_V3S)+= pinctrl-sun8i-v3s.o
 obj-$(CONFIG_PINCTRL_SUN50I_H5)+= pinctrl-sun50i-h5.o
+obj-$(CONFIG_PINCTRL_SUN50I_H6)+= pinctrl-sun50i-h6.o
 obj-$(CONFIG_PINCTRL_SUN9I_A80)+= pinctrl-sun9i-a80.o
 obj-$(CONFIG_PINCTRL_SUN9I_A80_R)  += pinctrl-sun9i-a80-r.o
diff --git a/drivers/pinctrl/sunxi/pinctrl-sun50i-h6.c b/drivers/pinctrl/sunxi/pinctrl-sun50i-h6.c
new file mode 100644
index ..bfc5df8719d8
--- /dev/null
+++ b/drivers/pinctrl/sunxi/pinctrl-sun50i-h6.c
@@ -0,0 +1,679 @@
+/*
+ * Allwinner H6 SoC pinctrl driver.
+ *
+ * Copyright (C) 2017 Icenowy Zheng 
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2.  This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pinctrl-sunxi.h"
+
+static const struct sunxi_desc_pin h6_pins[] = {
+   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 0),
+ SUNXI_FUNCTION(0x0, "gpio_in"),
+ SUNXI_FUNCTION(0x1, "gpio_out"),
+ SUNXI_FUNCTION(0x2, "emac")), /* ERXD1 */
+   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 1),
+ SUNXI_FUNCTION(0x0, "gpio_in"),
+ SUNXI_FUNCTION(0x1, "gpio_out"),
+ SUNXI_FUNCTION(0x2, "emac")), /* ERXD0 */
+   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 2),
+ SUNXI_FUNCTION(0x0, "gpio_in"),
+ SUNXI_FUNCTION(0x1, "gpio_out"),
+ SUNXI_FUNCTION(0x2, "emac")), /* ECRS_DV */
+   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 3),
+ SUNXI_FUNCTION(0x0, "gpio_in"),
+ SUNXI_FUNCTION(0x1, "gpio_out"),
+ SUNXI_FUNCTION(0x2, "emac")), /* ERXERR */
+   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 4),
+ SUNXI_FUNCTION(0x0, "gpio_in"),
+ SUNXI_FUNCTION(0x1, "gpio_out"),
+ SUNXI_FUNCTION(0x2, "emac")), /* ETXD1 */
+   SUNXI_PIN(SUNXI_PINCTRL_PIN(A, 5),
+ SUNXI_FUNCTION(0x0, "gpio_in"),
+

[PATCH 2/7] pinctrl: sunxi: support pin controllers with holes among IRQ banks

2018-01-05 Thread Icenowy Zheng
The Allwinner H6 SoC has pin controllers whose first IRQ-capable
GPIO bank is at IRQ bank 1 and whose second is at IRQ bank 5. The
current pinctrl IRQ code cannot handle this, as it only supports a
single offset applied to all IRQ banks.

Update the code to use a map from logical IRQ bank to hardware IRQ
bank, so that the H6 main pin controller can be handled. The old
special case that uses a constant offset (on A33 and V3s, both with
an offset of 1) can also be handled by the new code.

Signed-off-by: Icenowy Zheng 
---
 drivers/pinctrl/sunxi/pinctrl-sun8i-a33.c |  4 ++-
 drivers/pinctrl/sunxi/pinctrl-sun8i-v3s.c |  4 ++-
 drivers/pinctrl/sunxi/pinctrl-sunxi.c | 16 ++--
 drivers/pinctrl/sunxi/pinctrl-sunxi.h | 41 +--
 4 files changed, 42 insertions(+), 23 deletions(-)
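
A small user-space sketch of the lookup described above, to make the
logical-to-hardware bank translation concrete. Everything prefixed sk_
or SK_ is hypothetical: the register base, bank stride and per-register
IRQ count are illustrative values assumed for this sketch, and the NULL
fallback and the nbanks bounds check are additions of the sketch rather
than something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed, illustrative layout constants (not taken from the patch). */
#define SK_IRQ_PER_BANK      32u    /* interrupts per bank */
#define SK_IRQ_CFG_BASE      0x200u /* first IRQ config register */
#define SK_IRQ_BANK_STRIDE   0x20u  /* register space per hardware bank */
#define SK_IRQS_PER_CFG_REG  8u     /* interrupts configured per register */

/* Translate a logical IRQ bank into the hardware bank it lives in. */
static unsigned int sk_hw_bank(const unsigned int *map, unsigned int nbanks,
			       unsigned int logical_bank)
{
	assert(logical_bank < nbanks);
	/* Sketch-only fallback: no map means banks are numbered 1:1. */
	return map ? map[logical_bank] : logical_bank;
}

/* Offset of the config register that covers a given hardware IRQ. */
static uint32_t sk_irq_cfg_reg(uint16_t hwirq, const unsigned int *map,
			       unsigned int nbanks)
{
	unsigned int bank = hwirq / SK_IRQ_PER_BANK;
	unsigned int reg = (hwirq % SK_IRQ_PER_BANK) / SK_IRQS_PER_CFG_REG * 0x04u;

	return SK_IRQ_CFG_BASE +
	       sk_hw_bank(map, nbanks, bank) * SK_IRQ_BANK_STRIDE + reg;
}
```

With a map of { 1, 5 }, as the commit message describes for the H6,
logical bank 0 resolves to hardware bank 1 and logical bank 1 to
hardware bank 5. A map of { 1, 2 } reproduces the constant offset of 1
previously used on A33 and V3s.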

diff --git a/drivers/pinctrl/sunxi/pinctrl-sun8i-a33.c b/drivers/pinctrl/sunxi/pinctrl-sun8i-a33.c
index da387211a75e..f043afa1aac5 100644
--- a/drivers/pinctrl/sunxi/pinctrl-sun8i-a33.c
+++ b/drivers/pinctrl/sunxi/pinctrl-sun8i-a33.c
@@ -481,11 +481,13 @@ static const struct sunxi_desc_pin sun8i_a33_pins[] = {
  SUNXI_FUNCTION(0x3, "uart3")),/* CTS */
 };
 
+static const unsigned int sun8i_a33_pinctrl_irq_bank_map[] = { 1, 2 };
+
 static const struct sunxi_pinctrl_desc sun8i_a33_pinctrl_data = {
.pins = sun8i_a33_pins,
.npins = ARRAY_SIZE(sun8i_a33_pins),
.irq_banks = 2,
-   .irq_bank_base = 1,
+   .irq_bank_map = sun8i_a33_pinctrl_irq_bank_map,
.disable_strict_mode = true,
 };
 
diff --git a/drivers/pinctrl/sunxi/pinctrl-sun8i-v3s.c 
b/drivers/pinctrl/sunxi/pinctrl-sun8i-v3s.c
index 496ba34e1f5f..6704ce8e5e3d 100644
--- a/drivers/pinctrl/sunxi/pinctrl-sun8i-v3s.c
+++ b/drivers/pinctrl/sunxi/pinctrl-sun8i-v3s.c
@@ -293,11 +293,13 @@ static const struct sunxi_desc_pin sun8i_v3s_pins[] = {
  SUNXI_FUNCTION_IRQ_BANK(0x6, 1, 5)),  /* PG_EINT5 */
 };
 
+static const unsigned int sun8i_v3s_pinctrl_irq_bank_map[] = { 1, 2 };
+
 static const struct sunxi_pinctrl_desc sun8i_v3s_pinctrl_data = {
.pins = sun8i_v3s_pins,
.npins = ARRAY_SIZE(sun8i_v3s_pins),
.irq_banks = 2,
-   .irq_bank_base = 1,
+   .irq_bank_map = sun8i_v3s_pinctrl_irq_bank_map,
.irq_read_needs_mux = true
 };
 
diff --git a/drivers/pinctrl/sunxi/pinctrl-sunxi.c 
b/drivers/pinctrl/sunxi/pinctrl-sunxi.c
index 68cd505679d9..67ceb40fcb86 100644
--- a/drivers/pinctrl/sunxi/pinctrl-sunxi.c
+++ b/drivers/pinctrl/sunxi/pinctrl-sunxi.c
@@ -832,7 +832,7 @@ static void sunxi_pinctrl_irq_release_resources(struct 
irq_data *d)
 static int sunxi_pinctrl_irq_set_type(struct irq_data *d, unsigned int type)
 {
struct sunxi_pinctrl *pctl = irq_data_get_irq_chip_data(d);
-   u32 reg = sunxi_irq_cfg_reg(d->hwirq, pctl->desc->irq_bank_base);
+   u32 reg = sunxi_irq_cfg_reg(d->hwirq, pctl->desc->irq_bank_map);
u8 index = sunxi_irq_cfg_offset(d->hwirq);
unsigned long flags;
u32 regval;
@@ -880,7 +880,7 @@ static void sunxi_pinctrl_irq_ack(struct irq_data *d)
 {
struct sunxi_pinctrl *pctl = irq_data_get_irq_chip_data(d);
u32 status_reg = sunxi_irq_status_reg(d->hwirq,
- pctl->desc->irq_bank_base);
+ pctl->desc->irq_bank_map);
u8 status_idx = sunxi_irq_status_offset(d->hwirq);
 
/* Clear the IRQ */
@@ -890,7 +890,7 @@ static void sunxi_pinctrl_irq_ack(struct irq_data *d)
 static void sunxi_pinctrl_irq_mask(struct irq_data *d)
 {
struct sunxi_pinctrl *pctl = irq_data_get_irq_chip_data(d);
-   u32 reg = sunxi_irq_ctrl_reg(d->hwirq, pctl->desc->irq_bank_base);
+   u32 reg = sunxi_irq_ctrl_reg(d->hwirq, pctl->desc->irq_bank_map);
u8 idx = sunxi_irq_ctrl_offset(d->hwirq);
unsigned long flags;
u32 val;
@@ -907,7 +907,7 @@ static void sunxi_pinctrl_irq_mask(struct irq_data *d)
 static void sunxi_pinctrl_irq_unmask(struct irq_data *d)
 {
struct sunxi_pinctrl *pctl = irq_data_get_irq_chip_data(d);
-   u32 reg = sunxi_irq_ctrl_reg(d->hwirq, pctl->desc->irq_bank_base);
+   u32 reg = sunxi_irq_ctrl_reg(d->hwirq, pctl->desc->irq_bank_map);
u8 idx = sunxi_irq_ctrl_offset(d->hwirq);
unsigned long flags;
u32 val;
@@ -999,7 +999,7 @@ static void sunxi_pinctrl_irq_handler(struct irq_desc *desc)
if (bank == pctl->desc->irq_banks)
return;
 
-   reg = sunxi_irq_status_reg_from_bank(bank, pctl->desc->irq_bank_base);
+   reg = sunxi_irq_status_reg_from_bank(bank, pctl->desc->irq_bank_map);
val = readl(pctl->membase + reg);
 
if (val) {
@@ -1237,7 +1237,7 @@ static int sunxi_pinctrl_setup_debounce(struct 
sunxi_pinctrl *pctl,
writel(src | div << 4,
   pctl->membase +
   

[PATCH 1/7] pinctrl: sunxi: add support for pin controllers without bus gate

2018-01-05 Thread Icenowy Zheng
The Allwinner H6 pin controllers (both the main one and the CPUs one)
have no bus gate clocks.

Add support for this kind of pin controller.

Signed-off-by: Icenowy Zheng 
---
 drivers/pinctrl/sunxi/pinctrl-sunxi.c | 30 --
 drivers/pinctrl/sunxi/pinctrl-sunxi.h |  1 +
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/pinctrl/sunxi/pinctrl-sunxi.c b/drivers/pinctrl/sunxi/pinctrl-sunxi.c
index 4b6cb25bc796..68cd505679d9 100644
--- a/drivers/pinctrl/sunxi/pinctrl-sunxi.c
+++ b/drivers/pinctrl/sunxi/pinctrl-sunxi.c
@@ -1182,7 +1182,12 @@ static int sunxi_pinctrl_setup_debounce(struct sunxi_pinctrl *pctl,
unsigned int hosc_div, losc_div;
struct clk *hosc, *losc;
u8 div, src;
-   int i, ret;
+   int i, ret, clk_count;
+
+   if (pctl->desc->without_bus_gate)
+   clk_count = 2;
+   else
+   clk_count = 3;
 
/* Deal with old DTs that didn't have the oscillators */
if (of_count_phandle_with_args(node, "clocks", "#clock-cells") != 3)
@@ -1360,15 +1365,19 @@ int sunxi_pinctrl_init_with_variant(struct platform_device *pdev,
goto gpiochip_error;
}
 
-   clk = devm_clk_get(&pdev->dev, NULL);
-   if (IS_ERR(clk)) {
-   ret = PTR_ERR(clk);
-   goto gpiochip_error;
-   }
+   if (!desc->without_bus_gate) {
+   clk = devm_clk_get(&pdev->dev, NULL);
+   if (IS_ERR(clk)) {
+   ret = PTR_ERR(clk);
+   goto gpiochip_error;
+   }
 
-   ret = clk_prepare_enable(clk);
-   if (ret)
-   goto gpiochip_error;
+   ret = clk_prepare_enable(clk);
+   if (ret)
+   goto gpiochip_error;
+   } else {
+   clk = NULL;
+   }
 
pctl->irq = devm_kcalloc(&pdev->dev,
 pctl->desc->irq_banks,
@@ -1425,7 +1434,8 @@ int sunxi_pinctrl_init_with_variant(struct platform_device *pdev,
return 0;
 
 clk_error:
-   clk_disable_unprepare(clk);
+   if (clk)
+   clk_disable_unprepare(clk);
 gpiochip_error:
gpiochip_remove(pctl->chip);
return ret;
diff --git a/drivers/pinctrl/sunxi/pinctrl-sunxi.h b/drivers/pinctrl/sunxi/pinctrl-sunxi.h
index 11b128f54ed2..ccb6230f0bb5 100644
--- a/drivers/pinctrl/sunxi/pinctrl-sunxi.h
+++ b/drivers/pinctrl/sunxi/pinctrl-sunxi.h
@@ -113,6 +113,7 @@ struct sunxi_pinctrl_desc {
unsigned irq_bank_base;
bool irq_read_needs_mux;
bool disable_strict_mode;
+   bool without_bus_gate;
 };
 
 struct sunxi_pinctrl_function {
-- 
2.14.2
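
The control flow this patch introduces (skip the bus gate clock when the
descriptor says there is none, and guard the error path with a NULL
check) can be sketched in user space as below. This is only an
illustration under assumptions: struct mock_clk, struct mock_desc and all
function names are hypothetical stand-ins, not the kernel clk API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a clock; it only counts enables. */
struct mock_clk {
	int enable_count;
};

/* Hypothetical descriptor carrying the flag added by the patch. */
struct mock_desc {
	bool without_bus_gate;
};

static int mock_clk_prepare_enable(struct mock_clk *clk)
{
	clk->enable_count++;
	return 0;
}

static void mock_clk_disable_unprepare(struct mock_clk *clk)
{
	clk->enable_count--;
}

/*
 * Enable the bus gate if the controller has one.
 * On success *out is the enabled clock, or NULL when there is no gate.
 */
static int setup_bus_gate(const struct mock_desc *desc,
			  struct mock_clk *gate, struct mock_clk **out)
{
	int ret;

	*out = NULL;
	if (desc->without_bus_gate)
		return 0;

	ret = mock_clk_prepare_enable(gate);
	if (ret)
		return ret;

	*out = gate;
	return 0;
}

/* Error/teardown path: only touch the clock if one was acquired. */
static void teardown_bus_gate(struct mock_clk *clk)
{
	if (clk)
		mock_clk_disable_unprepare(clk);
}
```

Keeping the "no clock" case as a NULL pointer lets the cleanup label stay
shared between both kinds of controller, at the cost of one extra check.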



[PATCH 0/7] Initial Allwinner H6 support

2018-01-05 Thread Icenowy Zheng
This patchset adds initial support for the Allwinner H6 SoC.

It's quite different from earlier Allwinner SoCs. For example, the
memory map is refactored, and the CCU is rearranged. It's also the first
Allwinner SoC with a PCI Express interface, and the second one with USB
3.0 (the first one is A80).

This patchset adds the most basic support for it, including the main pin
controller, the main CCU and the basic device tree.

Icenowy Zheng (7):
  pinctrl: sunxi: add support for pin controllers without bus gate
  pinctrl: sunxi: support pin controllers with holes among IRQ banks
  pinctrl: sunxi: add support for the Allwinner H6 main pin controller
  clk: sunxi-ng: Support fixed post-dividers on NKMP style clocks
  clk: sunxi-ng: add support for the Allwinner H6 CCU
  arm64: allwinner: h6: add the basical Allwinner H6 DTSI file
  arm64: allwinner: h6: add support for Pine H64 board

 .../devicetree/bindings/clock/sunxi-ccu.txt|1 +
 .../bindings/pinctrl/allwinner,sunxi-pinctrl.txt   |4 +-
 arch/arm64/boot/dts/allwinner/Makefile |1 +
 .../boot/dts/allwinner/sun50i-h6-pine-h64.dts  |   66 ++
 arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi   |  214 
 drivers/clk/sunxi-ng/Kconfig   |5 +
 drivers/clk/sunxi-ng/Makefile  |1 +
 drivers/clk/sunxi-ng/ccu-sun50i-h6.c   | 1206 
 drivers/clk/sunxi-ng/ccu-sun50i-h6.h   |   63 +
 drivers/clk/sunxi-ng/ccu_nkmp.c|   20 +-
 drivers/clk/sunxi-ng/ccu_nkmp.h|2 +
 drivers/pinctrl/sunxi/Kconfig  |4 +
 drivers/pinctrl/sunxi/Makefile |1 +
 drivers/pinctrl/sunxi/pinctrl-sun50i-h6.c  |  679 +++
 drivers/pinctrl/sunxi/pinctrl-sun8i-a33.c  |4 +-
 drivers/pinctrl/sunxi/pinctrl-sun8i-v3s.c  |4 +-
 drivers/pinctrl/sunxi/pinctrl-sunxi.c  |   46 +-
 drivers/pinctrl/sunxi/pinctrl-sunxi.h  |   42 +-
 include/dt-bindings/clock/sun50i-h6-ccu.h  |  159 +++
 include/dt-bindings/reset/sun50i-h6-ccu.h  |  110 ++
 20 files changed, 2595 insertions(+), 37 deletions(-)
 create mode 100644 arch/arm64/boot/dts/allwinner/sun50i-h6-pine-h64.dts
 create mode 100644 arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
 create mode 100644 drivers/clk/sunxi-ng/ccu-sun50i-h6.c
 create mode 100644 drivers/clk/sunxi-ng/ccu-sun50i-h6.h
 create mode 100644 drivers/pinctrl/sunxi/pinctrl-sun50i-h6.c
 create mode 100644 include/dt-bindings/clock/sun50i-h6-ccu.h
 create mode 100644 include/dt-bindings/reset/sun50i-h6-ccu.h

-- 
2.14.2



Re: [PATCH v4 00/14] Modernization and fixes for NuBus subsystem

2018-01-05 Thread Finn Thain
Hi Geert,

On Fri, 5 Jan 2018, Geert Uytterhoeven wrote:

> 
> I assume you meant this to go in through the m68k tree?
> 

Yes, please. Because the NuBus-PowerMac port is out-of-tree, the m68k tree 
seems more appropriate than the powerpc tree for this submission.

> Can you please run this through checkpatch, and fix the reported white 
> space and other real (some are false positives) issues?
> 

Checkpatch is great but it is also incredibly noisy. So I missed a long 
line that should have been wrapped in patch 4/14 and another in 5/14. 
Sorry about that.

Checkpatch said, "Symbolic permissions are not preferred". But I didn't 
find any advice in the style guide, so I just retained the existing code 
style. What is your preference here?

Checkpatch also said, "code indent should use tabs where possible" though 
I've used only tabs to indent (according to scope, of course). Checkpatch 
also says, "please, no spaces at the start of a line". Yet it is common 
practice to put spaces at the start of a continuation (after any 
indentation tabs, of course) when wrapping lines*. Please let me know your 
preference.

Checkpatch asked, "added, moved or deleted file(s), does MAINTAINERS need 
updating?" Regarding drivers/nubus/*, that question is not a new one. The 
issue can be addressed in this patch or an earlier one, so as to keep 
checkpatch happy, or it can be addressed in a separate submission... Do we 
bring drivers/nubus/* under the Mac 68k subsystem? Isn't it a subsystem 
itself? (If I maintain that code, do I get to exercise my discretion 
regarding checkpatch limitations?)

The rest of the checkpatch output seems to be irrelevant (or am I missing 
something?) --

Macros with complex values should be enclosed in parentheses

trailing statements should be on next line

Possible unwrapped commit description

braces {} are not necessary for single statement blocks

file is marked as 'obsolete' in the MAINTAINERS hierarchy.  No 
unnecessary modifications please.

suspect code indent for conditional statements

Please let me know how you would like me to address these issues, and I'll 
re-submit.

Thanks for your review.

* IMO checkpatch is really good at certain things but line wrap isn't one 
of them. The git project's Documentation/CodingGuidelines seems to be a 
better description of Linux development practice than checkpatch's regexps:

   There are two schools of thought when it comes to splitting a long
   logical line into multiple lines.  Some people push the second and
   subsequent lines far enough to the right with tabs and align them:

        if (the_beginning_of_a_very_long_expression_that_has_to ||
                span_more_than_a_single_line_of ||
                the_source_text) {
                ...

   while other people prefer to align the second and the subsequent
   lines with the column immediately inside the opening parenthesis,
   with tabs and spaces, following our "tabstop is always a multiple
   of 8" convention:

        if (the_beginning_of_a_very_long_expression_that_has_to ||
            span_more_than_a_single_line_of ||
            the_source_text) {
                ...

   Both are valid, and we use both.  Again, just do not mix styles in
   the same part of the code and mimic existing styles in the
   neighbourhood.

-- 


Re: [PATCH v2 4/8] x86/spec_ctrl: Add sysctl knobs to enable/disable SPEC_CTRL feature

2018-01-05 Thread Dave Hansen
On 01/05/2018 06:12 PM, Tim Chen wrote:
>  .macro ENABLE_IBRS
> - ALTERNATIVE "jmp .Lskip_\@", "", X86_FEATURE_SPEC_CTRL
> + testl   $1, dynamic_ibrs
> + jz  .Lskip_\@

There was an earlier suggestion to use STATIC_JUMP_IF_... here.  That's
a good suggestion, but we encountered some issues with it either
crashing the kernel at boot or not properly turning on/off.

We would love to do that really soon, but we figured everyone
would rather see this version than wait for us to debug such a minor tweak.


Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok

2018-01-05 Thread Linus Torvalds
On Fri, Jan 5, 2018 at 6:52 PM, Linus Torvalds wrote:
>
> The fact is, we have to stop speculating when access_ok() does *not*
> fail - because that's when we'll actually do the access. And it's that
> access that needs to be non-speculative.

I also suspect we should probably do this entirely differently.

Maybe the whole lfence can be part of uaccess_begin() instead (ie
currently 'stac()'). That would fit the existing structure better, I
think. And it would avoid any confusion about the whole "when to stop
speculation".

 Linus
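
The suggestion above, folding the speculation barrier into
uaccess_begin() so that it runs only on the path that really performs the
access, can be sketched in user space as follows. This is an illustration
under assumptions, not kernel code: mock_stac() and
mock_speculation_barrier() are hypothetical stand-ins for the STAC and
LFENCE instructions, and the other function names are invented for the
sketch.

```c
/* Call counters so the sketch can be observed without real instructions. */
static int stac_calls;	/* stand-in for STAC */
static int fence_calls;	/* stand-in for LFENCE */

static void mock_stac(void)
{
	stac_calls++;
}

static void mock_speculation_barrier(void)
{
	fence_calls++;
}

/* Open the user access window and stop speculation in one place. */
static void uaccess_begin_sketch(void)
{
	mock_stac();
	mock_speculation_barrier();
}

/*
 * Read one int from "user" memory. access_is_ok models the result of
 * access_ok(); the barrier is only reached when the check passed,
 * which is the case where the access actually happens.
 */
static int get_user_sketch(int *dst, const int *src, int access_is_ok)
{
	if (!access_is_ok)
		return -14;	/* -EFAULT */

	uaccess_begin_sketch();
	*dst = *src;
	return 0;
}
```

Placing the barrier next to the access window means callers do not have to
decide where speculation must stop; every user access path gets it by
construction.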


RE: [PATCH 2/2] cpufreq: imx6q: add 696MHz operating point for i.mx6ul

2018-01-05 Thread Anson Huang
Hi, Rafael

Best Regards!
Anson Huang


> -Original Message-
> From: rjwyso...@gmail.com [mailto:rjwyso...@gmail.com] On Behalf Of Rafael
> J. Wysocki
> Sent: 2018-01-05 8:21 PM
> To: Anson Huang 
> Cc: linux-arm-ker...@lists.infradead.org; devicet...@vger.kernel.org; Linux
> PM ; Linux Kernel Mailing List  ker...@vger.kernel.org>; Shawn Guo ; Sascha Hauer
> ; Fabio Estevam ; Rob
> Herring ; Mark Rutland ;
> Russell King - ARM Linux ; Rafael J. Wysocki
> ; Viresh Kumar ; Jacky Bai
> ; A.s. Dong 
> Subject: Re: [PATCH 2/2] cpufreq: imx6q: add 696MHz operating point for
> i.mx6ul
> 
> On Tue, Jan 2, 2018 at 6:07 PM, Anson Huang  wrote:
> > Add the 696MHz operating point for i.MX6UL. Only parts with the speed
> > grading fuse set to 2b'10 support the 696MHz operating point, so a
> > speed grading check is also added for i.MX6UL in this patch. The clock
> > tree for each operating point is as below:
> >
> > 696MHz:
> >     pll1                       696000000
> >        pll1_bypass             696000000
> >           pll1_sys             696000000
> >              pll1_sw           696000000
> >                 arm            696000000
> > 528MHz:
> >     pll2                       528000000
> >        pll2_bypass             528000000
> >           pll2_bus             528000000
> >              ca7_secondary_sel 528000000
> >                 step           528000000
> >                    pll1_sw     528000000
> >                       arm      528000000
> > 396MHz:
> >     pll2_pfd2_396m             396000000
> >        ca7_secondary_sel       396000000
> >           step                 396000000
> >              pll1_sw           396000000
> >                 arm            396000000
> > 198MHz:
> >     pll2_pfd2_396m             396000000
> >        ca7_secondary_sel       396000000
> >           step                 396000000
> >              pll1_sw           396000000
> >                 arm            198000000
> >
> > Signed-off-by: Anson Huang 
> 
> This doesn't apply for me and in a nontrivial way.
> 
> What kernel is it against?

I did it based on linux-next, but it should have been based on the
linux-next-pm branch. I have redone the patch set as V2 based on
linux-next-pm and rerun the tests. Sorry for the inconvenience.

Anson.

> 
> > ---
> >  drivers/cpufreq/imx6q-cpufreq.c | 46
> > -
> >  1 file changed, 45 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
> > index d9b2c2d..cbda0cc 100644
> > --- a/drivers/cpufreq/imx6q-cpufreq.c
> > +++ b/drivers/cpufreq/imx6q-cpufreq.c
> > @@ -120,6 +120,10 @@ static int imx6q_set_target(struct cpufreq_policy *policy, unsigned int index)
> > clk_set_parent(secondary_sel_clk, pll2_pfd2_396m_clk);
> > clk_set_parent(step_clk, secondary_sel_clk);
> > clk_set_parent(pll1_sw_clk, step_clk);
> > +   if (freq_hz > clk_get_rate(pll2_bus_clk)) {
> > +   clk_set_rate(pll1_sys_clk, new_freq * 1000);
> > +   clk_set_parent(pll1_sw_clk, pll1_sys_clk);
> > +   }
> > } else {
> > clk_set_parent(step_clk, pll2_pfd2_396m_clk);
> > clk_set_parent(pll1_sw_clk, step_clk);
> > @@ -244,6 +248,43 @@ static void imx6q_opp_check_speed_grading(struct device *dev)
> > of_node_put(np);
> >  }
> >
> > +#define OCOTP_CFG3_6UL_SPEED_696MHZ 0x2
> > +
> > +static void imx6ul_opp_check_speed_grading(struct device *dev)
> > +{
> > +   struct device_node *np;
> > +   void __iomem *base;
> > +   u32 val;
> > +
> > +   np = of_find_compatible_node(NULL, NULL, "fsl,imx6ul-ocotp");
> > +   if (!np)
> > +   return;
> > +
> > +   base = of_iomap(np, 0);
> > +   if (!base) {
> > +   dev_err(dev, "failed to map ocotp\n");
> > +   goto put_node;
> > +   }
> > +
> > +   /*
> > +* Speed GRADING[1:0] defines the max speed of ARM:
> > +* 2b'00: Reserved;
> > +* 2b'01: 528000000Hz;
> > +* 2b'10: 696000000Hz;
> > +* 2b'11: Reserved;
> > +* We need to set the max speed of ARM according to fuse map.
> > +*/
> > +   val = readl_relaxed(base + OCOTP_CFG3);
> > +   val >>= OCOTP_CFG3_SPEED_SHIFT;
> > +   val &= 0x3;
> > +   if (val != OCOTP_CFG3_6UL_SPEED_696MHZ)
> > +   if (dev_pm_opp_disable(dev, 696000000))
> > +   dev_warn(dev, "failed to disable 696MHz OPP\n");
> > +   iounmap(base);
> > +put_node:
> > +   of_node_put(np);
> > +}
> > +
> >  static int imx6q_cpufreq_probe(struct platform_device *pdev)
> >  {
> > struct device_node *np;
> > @@ -311,7 +352,10 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
> > goto put_reg;
> > }
> >
> > -   imx6q_opp_check_speed_grading(cpu_dev);
> > +   if 

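The speed grading check quoted above reduces to decoding a two-bit fuse
field. A minimal sketch of that decode, assuming the field sits at bit 16
of the OCOTP_CFG3 word (the shift value is an assumption here, as are the
function names, which are hypothetical and not part of the driver):

```c
#include <stdint.h>

#define SKETCH_OCOTP_CFG3_SPEED_SHIFT	16	/* assumed position of GRADING[1:0] */
#define SKETCH_6UL_SPEED_528MHZ		0x1
#define SKETCH_6UL_SPEED_696MHZ		0x2

/* Extract GRADING[1:0] from the raw fuse word. */
static uint32_t sketch_speed_grade(uint32_t cfg3)
{
	return (cfg3 >> SKETCH_OCOTP_CFG3_SPEED_SHIFT) & 0x3;
}

/* Max ARM frequency in Hz for a fuse word, or 0 for a reserved encoding. */
static uint32_t sketch_max_freq_hz(uint32_t cfg3)
{
	switch (sketch_speed_grade(cfg3)) {
	case SKETCH_6UL_SPEED_528MHZ:
		return 528000000u;
	case SKETCH_6UL_SPEED_696MHZ:
		return 696000000u;
	default:
		return 0;	/* 2b'00 and 2b'11 are reserved */
	}
}

/* Same decision as the patch: drop the 696MHz OPP unless the fuse is 2b'10. */
static int sketch_must_disable_696mhz(uint32_t cfg3)
{
	return sketch_speed_grade(cfg3) != SKETCH_6UL_SPEED_696MHZ;
}
```

Note that, as in the patch, a reserved encoding also disables the 696MHz
operating point, so unknown parts fall back to the lower frequencies.
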
[PATCH V2 1/2] ARM: dts: imx6ul: add 696MHz operating point

2018-01-05 Thread Anson Huang
Add 696MHz operating point according to datasheet
(Rev. 0, 12/2015).

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx6ul.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/boot/dts/imx6ul.dtsi b/arch/arm/boot/dts/imx6ul.dtsi
index d5181f8..963e169 100644
--- a/arch/arm/boot/dts/imx6ul.dtsi
+++ b/arch/arm/boot/dts/imx6ul.dtsi
@@ -68,12 +68,14 @@
clock-latency = <61036>; /* two CLK32 periods */
operating-points = <
/* kHz  uV */
+   696000  1275000
528000  1175000
396000  1025000
198000  950000
>;
fsl,soc-operating-points = <
/* KHz  uV */
+   696000  1275000
528000  1175000
396000  1175000
198000  1175000
-- 
1.9.1



[PATCH V2 2/2] cpufreq: imx6q: add 696MHz operating point for i.mx6ul

2018-01-05 Thread Anson Huang
Add the 696MHz operating point for i.MX6UL. Only parts with the speed
grading fuse set to 2b'10 support the 696MHz operating point, so a
speed grading check is also added for i.MX6UL in this patch. The clock
tree for each operating point is as below:

696MHz:
    pll1                       696000000
       pll1_bypass             696000000
          pll1_sys             696000000
             pll1_sw           696000000
                arm            696000000
528MHz:
    pll2                       528000000
       pll2_bypass             528000000
          pll2_bus             528000000
             ca7_secondary_sel 528000000
                step           528000000
                   pll1_sw     528000000
                      arm      528000000
396MHz:
    pll2_pfd2_396m             396000000
       ca7_secondary_sel       396000000
          step                 396000000
             pll1_sw           396000000
                arm            396000000
198MHz:
    pll2_pfd2_396m             396000000
       ca7_secondary_sel       396000000
          step                 396000000
             pll1_sw           396000000
                arm            198000000

Signed-off-by: Anson Huang 
---
changes since v1:
redo the patch based on linux-next-pm.
 drivers/cpufreq/imx6q-cpufreq.c | 46 -
 1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
index 8bfb077..741f22e 100644
--- a/drivers/cpufreq/imx6q-cpufreq.c
+++ b/drivers/cpufreq/imx6q-cpufreq.c
@@ -136,6 +136,10 @@ static int imx6q_set_target(struct cpufreq_policy *policy, unsigned int index)
   clks[PLL2_PFD2_396M].clk);
clk_set_parent(clks[STEP].clk, clks[SECONDARY_SEL].clk);
clk_set_parent(clks[PLL1_SW].clk, clks[STEP].clk);
+   if (freq_hz > clk_get_rate(clks[PLL2_BUS].clk)) {
+   clk_set_rate(clks[PLL1_SYS].clk, new_freq * 1000);
+   clk_set_parent(clks[PLL1_SW].clk, clks[PLL1_SYS].clk);
+   }
} else {
clk_set_parent(clks[STEP].clk, clks[PLL2_PFD2_396M].clk);
clk_set_parent(clks[PLL1_SW].clk, clks[STEP].clk);
@@ -260,6 +264,43 @@ static void imx6q_opp_check_speed_grading(struct device *dev)
of_node_put(np);
 }
 
+#define OCOTP_CFG3_6UL_SPEED_696MHZ    0x2
+
+static void imx6ul_opp_check_speed_grading(struct device *dev)
+{
+   struct device_node *np;
+   void __iomem *base;
+   u32 val;
+
+   np = of_find_compatible_node(NULL, NULL, "fsl,imx6ul-ocotp");
+   if (!np)
+       return;
+
+   base = of_iomap(np, 0);
+   if (!base) {
+       dev_err(dev, "failed to map ocotp\n");
+       goto put_node;
+   }
+
+   /*
+    * Speed GRADING[1:0] defines the max speed of ARM:
+    * 2b'00: Reserved;
+    * 2b'01: 528000000Hz;
+    * 2b'10: 696000000Hz;
+    * 2b'11: Reserved;
+    * We need to set the max speed of ARM according to fuse map.
+    */
+   val = readl_relaxed(base + OCOTP_CFG3);
+   val >>= OCOTP_CFG3_SPEED_SHIFT;
+   val &= 0x3;
+   if (val != OCOTP_CFG3_6UL_SPEED_696MHZ)
+       if (dev_pm_opp_disable(dev, 696000000))
+           dev_warn(dev, "failed to disable 696MHz OPP\n");
+   iounmap(base);
+put_node:
+   of_node_put(np);
+}
+
 static int imx6q_cpufreq_probe(struct platform_device *pdev)
 {
struct device_node *np;
@@ -314,7 +355,10 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
goto put_reg;
}
 
-   imx6q_opp_check_speed_grading(cpu_dev);
+   if (of_machine_is_compatible("fsl,imx6ul"))
+   imx6ul_opp_check_speed_grading(cpu_dev);
+   else
+   imx6q_opp_check_speed_grading(cpu_dev);
 
/* Because we have added the OPPs here, we must free them */
free_opp = true;
-- 
1.9.1
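
For readers following the speed grading check in this patch, the fuse
decode can be tried out in user space with a small sketch like the one
below. It is not the driver code: the shift value of 16 is an
assumption (OCOTP_CFG3_SPEED_SHIFT is defined outside the hunks shown),
and the SKETCH_* macros and the function name are made up for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for this sketch; the real shift is not visible in the patch. */
#define SKETCH_SPEED_SHIFT      16
/* Grading value 2b'10, as in OCOTP_CFG3_6UL_SPEED_696MHZ from the patch. */
#define SKETCH_6UL_SPEED_696MHZ 0x2u

static_assert(SKETCH_SPEED_SHIFT + 2 <= 32,
	      "grading field must fit in the 32-bit fuse word");

/*
 * Returns true when the 696MHz OPP may stay enabled for this CFG3 fuse
 * word. Every other grading value, including the reserved ones, is
 * treated as "disable 696MHz", mirroring the != test in the patch.
 */
static bool imx6ul_allows_696mhz(uint32_t cfg3)
{
	uint32_t grade = (cfg3 >> SKETCH_SPEED_SHIFT) & 0x3u;

	return grade == SKETCH_6UL_SPEED_696MHZ;
}
```

Under these assumptions a fuse word with bits [17:16] equal to 2b'10
keeps the 696MHz OPP, and any other value limits the part to 528MHz.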



Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-05 Thread Mike Galbraith
On Fri, 2018-01-05 at 15:28 -0800, Hugh Dickins wrote:
> On Fri, Jan 5, 2018 at 6:03 AM, Mike Galbraith  wrote:
> > On Fri, 2018-01-05 at 14:34 +0100, Greg Kroah-Hartman wrote:
> >>
> >> Ok, we found two patches that were missing in 4.4-stable that were in
> >> the SLES12 tree (thanks to Jamie Iles), now I only have 19k more to sift
> >> through :)
> >
> > As you know, in enterprise, uname -r means you might find something
> > this old in your kernel if you look hard enough :)
> 
> Mike, I think there's a good chance that Greg's 4.4.110 final will fix
> your "segfault at ff5ff100" crashes: please give it a try when
> you can, and let us know - thanks.

Already done, and yes, it did.

-Mike


Re: [PATCH 01/18] asm-generic/barrier: add generic nospec helpers

2018-01-05 Thread Linus Torvalds
On Fri, Jan 5, 2018 at 5:09 PM, Dan Williams  wrote:
> +#ifndef nospec_ptr
> +#define nospec_ptr(ptr, lo, hi) \

Do we actually want this horrible interface?

It just causes the compiler - or inline asm - to generate worse code,
because it needs to compare against both high and low limits.

Basically all users are arrays that are zero-based, and where a
comparison against the high _index_ limit would be sufficient.

But the way this is all designed, it's literally designed for bad code
generation for the unusual case, and the usual array case is written
in the form of the unusual and wrong non-array case. That really seems
excessively stupid.

 Linus
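
To make the comparison-count argument concrete, here is a minimal
sketch of the two shapes of check in ordinary C. It is not the
nospec_ptr() macro from the patch and it does nothing about
speculation; check_ptr() and check_idx() are hypothetical names used
only to show that the pointer form needs two comparisons while the
zero-based index form needs one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two-sided form: the pointer is compared against a low and a high limit. */
static int *check_ptr(int *ptr, int *lo, int *hi)
{
	uintptr_t p = (uintptr_t)ptr;

	assert((uintptr_t)lo <= (uintptr_t)hi);
	return (p >= (uintptr_t)lo && p < (uintptr_t)hi) ? ptr : NULL;
}

/*
 * One-sided form: for a zero-based array an unsigned index only has to
 * be compared against the size. A "negative" index wraps to a huge
 * unsigned value and fails the same test.
 */
static int *check_idx(int *base, size_t idx, size_t size)
{
	return idx < size ? &base[idx] : NULL;
}
```

Both helpers return NULL for an out-of-range access; the index form
gets there with a single unsigned comparison.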

