from:"Marc Zyngier"

Re: [PATCH -next] irq-chip/gic-v3-its: Fixed an issue where the ITS executes the residual commands in the queue again when the ITS wakes up from sleep mode.

2020-11-07 Thread Marc Zyngier


[dropping Jason, whose email address has been bouncing for weeks now]

On 2020-11-07 10:42, Xu Qiang wrote:

On my platform, ITS_FLAGS_SAVE_SUSPEND_STATE is not set,thus do nothing


Which platform?


in its suspend and resuse function.On the other hand,firmware stores
GITS_CTRL,GITS_CBASER,GITS_CWRITER and GITS_BASER in the suspend,
and restores these registers in the resume. As a result, the ITS 
executes

the residual commands in the queue.


Which firmware are you using? I just had a look at the trusted firmware 
source
code, and while it definitely does something that *looks* like what you 
are

describing, it doesn't re-enable the ITS on resume.

So what are you running?



Memory corruption may occur in the following scenarios:

The kernel sends three commands in the following sequence:
1.mapd(deviceA, ITT_addr1, valid:1)
2.mapti(deviceA):ITS write ITT_addr1 memory;
3.mapd(deviceA, ITT_addr1, valid:0) and kfree(ITT_addr1);


The ITS doesn't 'kfree' stuff.


4.mapd(deviceA, ITT_addr2, valid:1);
5.mapti(deviceA):ITS write ITT_addr2 memory;


I don't think this example is relevant. The core of the problem is that
the ITS gets re-enabled by your firmware. What are the affected systems?



To solve this problem,dropping the checks for 
ITS_FLAGS_SAVE_SUSPEND_STATE.


Signed-off-by: Xu Qiang 
---
 drivers/irqchip/irq-gic-v3-its.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c 
b/drivers/irqchip/irq-gic-v3-its.c

index 0fec31931e11..06f2c1c252b9 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -42,7 +42,6 @@
 #define ITS_FLAGS_CMDQ_NEEDS_FLUSHING  (1ULL << 0)
 #define ITS_FLAGS_WORKAROUND_CAVIUM_22375  (1ULL << 1)
 #define ITS_FLAGS_WORKAROUND_CAVIUM_23144  (1ULL << 2)
-#define ITS_FLAGS_SAVE_SUSPEND_STATE   (1ULL << 3)

 #define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING(1 << 0)
 #define RDIST_FLAGS_RD_TABLES_PREALLOCATED (1 << 1)
@@ -4741,9 +4740,6 @@ static int its_save_disable(void)
list_for_each_entry(its, _nodes, entry) {
void __iomem *base;

-   if (!(its->flags & ITS_FLAGS_SAVE_SUSPEND_STATE))
-   continue;
-
base = its->base;
its->ctlr_save = readl_relaxed(base + GITS_CTLR);
err = its_force_quiescent(base);
@@ -4762,9 +4758,6 @@ static int its_save_disable(void)
list_for_each_entry_continue_reverse(its, _nodes, entry) {
void __iomem *base;

-   if (!(its->flags & ITS_FLAGS_SAVE_SUSPEND_STATE))
-   continue;
-
base = its->base;
writel_relaxed(its->ctlr_save, base + GITS_CTLR);
}
@@ -4784,9 +4777,6 @@ static void its_restore_enable(void)
void __iomem *base;
int i;

-   if (!(its->flags & ITS_FLAGS_SAVE_SUSPEND_STATE))
-   continue;
-
base = its->base;

/*
@@ -5074,9 +5064,6 @@ static int __init its_probe_one(struct resource 
*res,

ctlr |= GITS_CTLR_ImDe;
writel_relaxed(ctlr, its->base + GITS_CTLR);

-   if (GITS_TYPER_HCC(typer))
-   its->flags |= ITS_FLAGS_SAVE_SUSPEND_STATE;
-
err = its_init_domain(handle, its);
if (err)
goto out_free_tables;


I'm OK with the patch itself, but I don't want to paper over broken 
firmware.
I'll get TF-A fixed one way or another, but I want to be sure yours is 
too.

If firmware does its job correctly, we shouldn't have to do all of this.

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH] KVM: arm64: Fix build error in user_mem_abort()

2020-11-06 Thread Marc Zyngier

On Tue, 3 Nov 2020 11:30:09 +1100, Gavin Shan wrote:
> The PUD and PMD are folded into PGD when the following options are
> enabled. In that case, PUD_SHIFT is equal to PMD_SHIFT and we fail
> to build with the indicated errors:
> 
>CONFIG_ARM64_VA_BITS_42=y
>CONFIG_ARM64_PAGE_SHIFT=16
>CONFIG_PGTABLE_LEVELS=3
> 
> [...]

Applied to next, thanks!

[1/1] KVM: arm64: Fix build error in user_mem_abort()
  commit: faf000397e7f103df9953a312e1df21df1dc797f

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH v2 3/5] ARM: implement support for SMCCC TRNG entropy source

2020-11-06 Thread Marc Zyngier


On 2020-11-06 15:30, Ard Biesheuvel wrote:

On Fri, 6 Nov 2020 at 16:30, Marc Zyngier  wrote:


[...]


I don't think this cast is safe. At least not on 64bit.


True, but this is arch/arm


I think the glasses theme becomes recurrent. Apologies for the noise.

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v2 3/5] ARM: implement support for SMCCC TRNG entropy source

2020-11-06 Thread Marc Zyngier


On 2020-11-05 12:56, Andre Przywara wrote:

From: Ard Biesheuvel 

Implement arch_get_random_seed_*() for ARM based on the firmware
or hypervisor provided entropy source described in ARM DEN0098.

This will make the kernel's random number generator consume entropy
provided by this interface, at early boot, and periodically at
runtime when reseeding.

Cc: Linus Walleij 
Cc: Russell King 
Signed-off-by: Ard Biesheuvel 
[Andre: rework to be initialised by the SMCCC firmware driver]
Signed-off-by: Andre Przywara 
---
 arch/arm/Kconfig  |  4 ++
 arch/arm/include/asm/archrandom.h | 64 +++
 2 files changed, 68 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index fe2f17eb2b50..06fda4f954fd 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1667,6 +1667,10 @@ config STACKPROTECTOR_PER_TASK
  Enable this option to switch to a different method that uses a
  different canary value for each task.

+config ARCH_RANDOM
+   def_bool y
+   depends on HAVE_ARM_SMCCC
+
 endmenu

 menu "Boot options"
diff --git a/arch/arm/include/asm/archrandom.h
b/arch/arm/include/asm/archrandom.h
index a8e84ca5c2ee..f3e96a5b65f8 100644
--- a/arch/arm/include/asm/archrandom.h
+++ b/arch/arm/include/asm/archrandom.h
@@ -2,9 +2,73 @@
 #ifndef _ASM_ARCHRANDOM_H
 #define _ASM_ARCHRANDOM_H

+#ifdef CONFIG_ARCH_RANDOM
+
+#include 
+#include 
+
+#define ARM_SMCCC_TRNG_MIN_VERSION 0x1UL
+
+extern bool smccc_trng_available;
+
+static inline bool __init smccc_probe_trng(void)
+{
+   struct arm_smccc_res res;
+
+   arm_smccc_1_1_invoke(ARM_SMCCC_TRNG_VERSION, );
+   if ((s32)res.a0 < 0)
+   return false;
+   if (res.a0 >= ARM_SMCCC_TRNG_MIN_VERSION) {
+   /* double check that the 32-bit flavor is available */
+   arm_smccc_1_1_invoke(ARM_SMCCC_TRNG_FEATURES,
+ARM_SMCCC_TRNG_RND32,
+);
+   if ((s32)res.a0 >= 0)
+   return true;
+   }
+
+   return false;
+}
+
+static inline bool __must_check arch_get_random_long(unsigned long *v)
+{
+   return false;
+}
+
+static inline bool __must_check arch_get_random_int(unsigned int *v)
+{
+   return false;
+}
+
+static inline bool __must_check arch_get_random_seed_long(unsigned 
long *v)

+{
+   struct arm_smccc_res res;
+
+   if (smccc_trng_available) {
+   arm_smccc_1_1_invoke(ARM_SMCCC_TRNG_RND32, 8 * sizeof(*v), 
);
+
+   if (res.a0 != 0)
+   return false;
+
+   *v = res.a3;
+   return true;
+   }
+
+   return false;
+}
+
+static inline bool __must_check arch_get_random_seed_int(unsigned int 
*v)

+{
+   return arch_get_random_seed_long((unsigned long *)v);


I don't think this cast is safe. At least not on 64bit.

M.
--
Who you jivin' with that Cosmik Debris?

Re: [RFC PATCH 00/26] kvm: arm64: Always-on nVHE hypervisor

2020-11-06 Thread Marc Zyngier


On 2020-11-04 18:36, David Brazdil wrote:

As we progress towards being able to keep guest state private to the
host running nVHE hypervisor, this series allows the hypervisor to
install itself on newly booted CPUs before the host is allowed to run
on them.

To this end, the hypervisor starts trapping host SMCs and intercepting
host's PSCI CPU_ON/OFF/SUSPEND calls. It replaces the host's entry 
point

with its own, initializes the EL2 state of the new CPU and installs
the nVHE hyp vector before ERETing to the host's entry point.

Other PSCI SMCs are forwarded to EL3, though only the known set of SMCs
implemented in the kernel is allowed. Non-PSCI SMCs are also forwarded
to EL3. Future changes will need to ensure the safety of all SMCs wrt.
private guests.

The host is still allowed to reset EL2 back to the stub vector, eg. for
hibernation or kexec, but will not disable nVHE when there are no VMs.

Tested on Rock Pi 4b.


Sending this as an RFC to get feedback on the following decisions:

1) The kernel checks new cores' features against the finalized system
capabilities. To avoid the need to move this code/data to EL2, the
implementation only allows to boot cores that were online at the time 
of

KVM initialization.

2) Trapping and forwarding SMCs cannot be switched off. This could 
cause
issues eg. if EL3 always returned to EL1. A kernel command line flag 
may

be needed to turn the feature off on such platforms.


I'd rather have it the other way around (buy-in rather than turn off).
On top of the potential issue with stupid EL3s, there is the issue that
PSCI is optional, and that protected VMs won't be able to work without
it. Another related thing is that EL3 itself is optional.

Note that this flag shouldn't be specific to PSCI proxying. It should 
also

control Stage-2 wrapping, and the whole pKVM.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: Aw: Re: Re: [PATCH] pci: mediatek: fix warning in msi.h

2020-11-06 Thread Marc Zyngier


On 2020-11-05 23:00, Thomas Gleixner wrote:

On Thu, Nov 05 2020 at 09:20, Marc Zyngier wrote:

On 2020-11-04 23:14, Thomas Gleixner wrote:

/* Resource alignment requirements */
resource_size_t (*align_resource)(struct pci_dev *dev,


If that's the direction of travel, we also need something like this
for configuration where the host bridge relies on an external MSI 
block

that uses MSI domains (boot-tested in a GICv3 guest).


Some more context would be helpful. Brain fails to decode the logic
here.


OK, let me try again.

The MSI controller, which is the thing that deals with MSIs in the 
system
(GICv2m, GICv3-ITS, and a number of others), is optional, is not part of 
the
host bridge (it has nothing to do with PCI at all), and the bridge 
driver has

absolutely no idea  whether:

- there is something that provides MSI or not
- that something has successfully been initialised or not (which 
translates

  into an MSI domain being present or not)

This is the case for most ARM systems, and all KVM/arm guests. Booting a 
VM
without MSIs is absolutely trivial, and actually makes sense for some of 
the

smaller guests.

In these conditions, your no_msi attribute doesn't work as is: we can't 
decide
on its value at probe time without extracting all of the OF/ACPI logic 
that
deals with MSI domains from the core code, and making it available to 
the host

bridge drivers for systems that follow that model.

Using the flow you insist on requires parsing the topology twice:

- once to find out whether there is actually a MSI provider registered
  for the host bridge in order to set the no_msi flag

- once to actually be able to store the domain into the pci_bus 
structure,

  as it isn't available at probe time.

My last suggestion is to indicate to the core code that there is a 
*possible*
MSI controller available in the form of a MSI domain. This is still 
suboptimal

compared to checking the presence an MSI domain in core code (my initial
suggestion), but the fallback stuff gets in the way (though I still 
think it

can be made to work).

Anyway, this was my last attempt at addressing the problem. Most people
won't see it. The couple of drivers that require the fallback hack are
usually selected in distro kernels, and do a good job hiding the error.

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v2 3/5] gpio: msc313: MStar MSC313 GPIO driver

2020-11-05 Thread Marc Zyngier

On 2020-11-05 15:23, Daniel Palmer wrote:

Hi Marc,

On Thu, 5 Nov 2020 at 21:08, Marc Zyngier  wrote:

On 2020-11-05 09:40, Linus Walleij wrote:
> On Mon, Oct 19, 2020 at 4:10 PM Daniel Palmer  wrote:

[...]

>> +/* The parent interrupt controller needs the GIC interrupt type set
>> to GIC_SPI
>> + * so we need to provide the fwspec. Essentially
>> gpiochip_populate_parent_fwspec_twocell
>> + * that puts GIC_SPI into the first cell.
>> + */

nit: comment style.

I've fixed these and some other bits for the v3.
I've held off on pushing that until the rest of it seemed right.

>> +static void *msc313_gpio_populate_parent_fwspec(struct gpio_chip *gc,
>> +unsigned int
>> parent_hwirq,
>> +unsigned int parent_type)
>> +{
>> +   struct irq_fwspec *fwspec;
>> +
>> +   fwspec = kmalloc(sizeof(*fwspec), GFP_KERNEL);
>> +   if (!fwspec)
>> +   return NULL;
>> +
>> +   fwspec->fwnode = gc->irq.parent_domain->fwnode;
>> +   fwspec->param_count = 3;
>> +   fwspec->param[0] = GIC_SPI;
>> +   fwspec->param[1] = parent_hwirq;
>> +   fwspec->param[2] = parent_type;
>> +
>> +   return fwspec;
>> +}
>
> Clever. Looping in Marc Z so he can say if this looks allright to him.

Yup, this looks correct. However, looking at the bit of the patch that
isn't quoted here, I see that msc313_gpio_irqchip doesn't have a
.irq_set_affinity callback. Is this system UP only?

What is in mainline right now is UP only but there are chips with a
second cortex A7 that I have working in my tree.
So I will add that in for v3 if I can work out what I should actually
do there. :)

Probably nothing more than setting the callback to 
irq_chip_set_affinity_parent,

I'd expect.

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v2 5/5] KVM: arm64: implement the TRNG hypervisor call

2020-11-05 Thread Marc Zyngier


On 2020-11-05 12:56, Andre Przywara wrote:

From: Ard Biesheuvel 

Provide a hypervisor implementation of the ARM architected TRNG 
firmware
interface described in ARM spec DEN0098. All function IDs are 
implemented,
including both 32-bit and 64-bit versions of the TRNG_RND service, 
which

is the centerpiece of the API.

The API is backed by arch_get_unsigned_seed_long(), which is 
implemented

in terms of RNDRRS currently, and will be alternatively backed by a SMC
call to the secure firmware using same interface after a future patch.
If neither are available, the kernel's entropy pool is used instead.

Signed-off-by: Ard Biesheuvel 
Signed-off-by: Andre Przywara 
---
 arch/arm64/include/asm/kvm_host.h |  2 +
 arch/arm64/kvm/Makefile   |  2 +-
 arch/arm64/kvm/hypercalls.c   |  6 ++
 arch/arm64/kvm/trng.c | 91 +++
 4 files changed, 100 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/trng.c

diff --git a/arch/arm64/include/asm/kvm_host.h
b/arch/arm64/include/asm/kvm_host.h
index 781d029b8aa8..615932bacf76 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -652,4 +652,6 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu 
*vcpu);

 #define kvm_arm_vcpu_sve_finalized(vcpu) \
((vcpu)->arch.flags & KVM_ARM64_VCPU_SVE_FINALIZED)

+int kvm_trng_call(struct kvm_vcpu *vcpu);
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 1504c81fbf5d..a510037e3270 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -16,7 +16,7 @@ kvm-y := $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o
$(KVM)/eventfd.o \
 inject_fault.o regmap.o va_layout.o handle_exit.o \
 guest.o debug.o reset.o sys_regs.o \
 vgic-sys-reg-v3.o fpsimd.o pmu.o \
-aarch32.o arch_timer.o \
+aarch32.o arch_timer.o trng.o \
 vgic/vgic.o vgic/vgic-init.o \
 vgic/vgic-irqfd.o vgic/vgic-v2.o \
 vgic/vgic-v3.o vgic/vgic-v4.o \
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 25ea4ecb6449..ead21b98b620 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -71,6 +71,12 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
if (gpa != GPA_INVALID)
val = gpa;
break;
+   case ARM_SMCCC_TRNG_VERSION:
+   case ARM_SMCCC_TRNG_FEATURES:
+   case ARM_SMCCC_TRNG_GET_UUID:
+   case ARM_SMCCC_TRNG_RND32:
+   case ARM_SMCCC_TRNG_RND64:
+   return kvm_trng_call(vcpu);
default:
return kvm_psci_call(vcpu);
}
diff --git a/arch/arm64/kvm/trng.c b/arch/arm64/kvm/trng.c
new file mode 100644
index ..5a27b2d99977
--- /dev/null
+++ b/arch/arm64/kvm/trng.c
@@ -0,0 +1,91 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2020 Arm Ltd.
+
+#include 
+#include 
+
+#include 
+
+#include 
+
+#define ARM_SMCCC_TRNG_VERSION_1_0 0x1UL
+
+#define TRNG_SUCCESS   0UL


SMCCC_RET_SUCCESS


+#define TRNG_NOT_SUPPORTED ((unsigned long)-1)


SMCCC_RET_NOT_SUPPORTED


+#define TRNG_INVALID_PARAMETER ((unsigned long)-2)


*crap*. Why isn't that the same value as SMCCC_RET_INVALID_PARAMETER?
Is it too late to fix the spec?


+#define TRNG_NO_ENTROPY((unsigned long)-3)
+
+#define MAX_BITS32 96
+#define MAX_BITS64 192


Nothing seems to be using these definitions.


+
+static const uuid_t arm_smc_trng_uuid __aligned(4) = UUID_INIT(
+	0x023534a2, 0xe0bc, 0x45ec, 0x95, 0xdd, 0x33, 0x34, 0xc1, 0xcc, 0x31, 
0x89);


I object to the lack of Easter egg in this UUID. Or at least one I can
understand. ;-)


+
+static int kvm_trng_do_rnd(struct kvm_vcpu *vcpu, int size)
+{
+   u32 num_bits = smccc_get_arg1(vcpu);
+   u64 bits[3];
+   int i;
+
+   if (num_bits > 3 * size) {
+   smccc_set_retval(vcpu, TRNG_INVALID_PARAMETER, 0, 0, 0);
+   return 1;
+   }
+
+   /* get as many bits as we need to fulfil the request */
+   for (i = 0; i < DIV_ROUND_UP(num_bits, 64); i++)
+   /* use the arm64 specific backend directly if one exists */
+   if (!arch_get_random_seed_long((unsigned long *)[i]))
+   bits[i] = get_random_long();
+
+   if (num_bits % 64)
+   bits[i - 1] &= U64_MAX >> (64 - (num_bits % 64));
+
+   while (i < ARRAY_SIZE(bits))
+   bits[i++] = 0;


I just wasted 3 minutes trying to understand what this was doing, only 
to

realise this is clearing the [MAX_BITS:num_bits] range.

How about using a bitmap instead? It would result in the exact same data
structure, only much more readable (and no u64->unsigned long casts).


+
+   if (size == 32)
+   smccc_set_retval(vcpu, TRNG_SUCCESS, lower_32_bits(bits[1]),
+

Re: [PATCH v2 4/5] arm64: Add support for SMCCC TRNG entropy source

2020-11-05 Thread Marc Zyngier


On 2020-11-05 14:34, Ard Biesheuvel wrote:

On Thu, 5 Nov 2020 at 15:30, Mark Rutland  wrote:


On Thu, Nov 05, 2020 at 03:04:57PM +0100, Ard Biesheuvel wrote:
> On Thu, 5 Nov 2020 at 15:03, Mark Rutland  wrote:
> > On Thu, Nov 05, 2020 at 01:41:42PM +, Mark Brown wrote:
> > > On Thu, Nov 05, 2020 at 12:56:55PM +, Andre Przywara wrote:

> > That said, I'm not sure it's great to plumb this under the
> > arch_get_random*() interfaces, e.g. given this measn that
> > add_interrupt_randomness() will end up trapping to the host all the time
> > when it calls arch_get_random_seed_long().
>
> As it turns out, add_interrupt_randomness() isn't actually used on ARM.

It's certainly called on arm64, per a warning I just hacked in:

[1.083802] [ cut here ]
[1.084802] add_interrupt_randomness called
[1.085685] WARNING: CPU: 1 PID: 0 at drivers/char/random.c:1267 
add_interrupt_randomness+0x2e8/0x318

[1.087599] Modules linked in:
[1.088258] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 
5.10.0-rc2-dirty #13

[1.089672] Hardware name: linux,dummy-virt (DT)
[1.090659] pstate: 60400085 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
[1.091910] pc : add_interrupt_randomness+0x2e8/0x318
[1.092965] lr : add_interrupt_randomness+0x2e8/0x318
[1.094021] sp : 80001000be80
[1.094732] x29: 80001000be80 x28: 2d0c80209840
[1.095859] x27: 137c3e3a x26: 8000100abdd0
[1.096978] x25: 0035 x24: 67918bda8000
[1.098100] x23: c57c31923fe8 x22: fffedc14
[1.099224] x21: 2d0dbef796a0 x20: c57c331d16a0
[1.100339] x19: c57c33720a48 x18: 0010
[1.101459] x17:  x16: 0002
[1.102578] x15: 00e7 x14: 80001000bb20
[1.103706] x13: ffea x12: c57c337b56e8
[1.104821] x11: 0003 x10: c57c3379d6a8
[1.105944] x9 : c57c3379d700 x8 : 00017fe8
[1.107073] x7 : c000efff x6 : 0001
[1.108186] x5 : 00057fa8 x4 : 
[1.109305] x3 :  x2 : c57c337455d0
[1.110428] x1 : db8dc9c2a1e0f600 x0 : 
[1.111552] Call trace:
[1.112083]  add_interrupt_randomness+0x2e8/0x318
[1.113074]  handle_irq_event_percpu+0x48/0x90
[1.114016]  handle_irq_event+0x48/0xf8
[1.114826]  handle_fasteoi_irq+0xa4/0x130
[1.115689]  generic_handle_irq+0x30/0x48
[1.116528]  __handle_domain_irq+0x64/0xc0
[1.117392]  gic_handle_irq+0xc0/0x138
[1.118194]  el1_irq+0xbc/0x180
[1.118870]  arch_cpu_idle+0x20/0x30
[1.119630]  default_idle_call+0x8c/0x350
[1.120479]  do_idle+0x224/0x298
[1.121163]  cpu_startup_entry+0x28/0x70
[1.121994]  secondary_start_kernel+0x184/0x198

... and I couldn't immediately spot why 32-bit arm  would be 
different.




Hmm, I actually meant both arm64 and ARM.

Marc looked into this at my request a while ago, and I had a look
myself as well at the time, and IIRC, we both concluded that we don't
hit that code path. Darn.


Yes, I remember checking this. Obviously, I need a new pair of 
glasses...


M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH -next] irq-chip/gic-v3-its: Fixed an issue where the ITS executes the residual commands in the queue again when the ITS wakes up from sleep mode.

2020-11-05 Thread Marc Zyngier


On 2020-11-05 14:06, xuqiang (M) wrote:

在 2020/11/5 21:12, Marc Zyngier 写道:

Please don't top-post.

On 2020-11-05 11:54, xuqiang (M) wrote:

The kernel sends three commands in the following sequence:

1.mapd(deviceA, ITT_addr1, valid:1)

2.mapti(deviceA):ITS write ITT_addr1 memory;

3.mapd(deviceA, ITT_addr1, valid:0) and kfree(ITT_addr1);

4.mapd(deviceA, ITT_addr2, valid:1);

5.mapti(deviceA):ITS write ITT_addr2 memory;

In this case, the processor enters the sleep mode. After the kernel
performs the suspend operation, the firmware performs the store
operation and saves GITS_CBASER and GITS_CWRITER registers.

Then, the processor is woken up, and the firmware restores 
GITS_CBASER

and GITS_CWRITER registers. Because GITS_CWRITER register is not 0,
ITS will read the above command sequence execution from the command
queue, causing ITT_addr1 memory to be trampled.


This cannot work. By doing a memset on the command queue, you are
only feeding crap to the ITS (command 0 simply does not exist).
Consider yourself lucky that it doesn't just lock-up.

What needs to happen is the restore sequence that is already in the
driver, so that the command queue is in a sane state before 
re-enabling

the ITS.

    M.



On my platform, ITS_FLAGS_SAVE_SUSPEND_STATE is not set, thus
the first if condition in its_save_disable under list_for_each_entry 
goes

to the continue, however, I want to set the GITS_CWRITER to 0 at the
end of list_for_each_entry.

Do you have any suggestions about how to do this?


That's pretty much dropping the checks for ITS_FLAGS_SAVE_SUSPEND_STATE,
isn't it? But I assume your ITS is already enabled by the time you 
reenter

the kernel? If so, I bet your firmware is doing more than just writing
to CBASER and CWRITER...

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH -next] irq-chip/gic-v3-its: Fixed an issue where the ITS executes the residual commands in the queue again when the ITS wakes up from sleep mode.

2020-11-05 Thread Marc Zyngier


Please don't top-post.

On 2020-11-05 11:54, xuqiang (M) wrote:

The kernel sends three commands in the following sequence:

1.mapd(deviceA, ITT_addr1, valid:1)

2.mapti(deviceA):ITS write ITT_addr1 memory;

3.mapd(deviceA, ITT_addr1, valid:0) and kfree(ITT_addr1);

4.mapd(deviceA, ITT_addr2, valid:1);

5.mapti(deviceA):ITS write ITT_addr2 memory;

In this case, the processor enters the sleep mode. After the kernel
performs the suspend operation, the firmware performs the store
operation and saves GITS_CBASER and GITS_CWRITER registers.

Then, the processor is woken up, and the firmware restores GITS_CBASER
and GITS_CWRITER registers. Because GITS_CWRITER register is not 0,
ITS will read the above command sequence execution from the command
queue, causing ITT_addr1 memory to be trampled.


This cannot work. By doing a memset on the command queue, you are
only feeding crap to the ITS (command 0 simply does not exist).
Consider yourself lucky that it doesn't just lock-up.

What needs to happen is the restore sequence that is already in the
driver, so that the command queue is in a sane state before re-enabling
the ITS.

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v2 3/5] gpio: msc313: MStar MSC313 GPIO driver

2020-11-05 Thread Marc Zyngier


On 2020-11-05 09:40, Linus Walleij wrote:

On Mon, Oct 19, 2020 at 4:10 PM Daniel Palmer  wrote:


[...]

+/* The parent interrupt controller needs the GIC interrupt type set 
to GIC_SPI
+ * so we need to provide the fwspec. Essentially 
gpiochip_populate_parent_fwspec_twocell

+ * that puts GIC_SPI into the first cell.
+ */


nit: comment style.


+static void *msc313_gpio_populate_parent_fwspec(struct gpio_chip *gc,
+unsigned int 
parent_hwirq,

+unsigned int parent_type)
+{
+   struct irq_fwspec *fwspec;
+
+   fwspec = kmalloc(sizeof(*fwspec), GFP_KERNEL);
+   if (!fwspec)
+   return NULL;
+
+   fwspec->fwnode = gc->irq.parent_domain->fwnode;
+   fwspec->param_count = 3;
+   fwspec->param[0] = GIC_SPI;
+   fwspec->param[1] = parent_hwirq;
+   fwspec->param[2] = parent_type;
+
+   return fwspec;
+}


Clever. Looping in Marc Z so he can say if this looks allright to him.


Yup, this looks correct. However, looking at the bit of the patch that
isn't quoted here, I see that msc313_gpio_irqchip doesn't have a
.irq_set_affinity callback. Is this system UP only?

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [RFC PATCH 18/26] kvm: arm64: Intercept PSCI_CPU_OFF host SMC calls

2020-11-05 Thread Marc Zyngier


On 2020-11-04 18:36, David Brazdil wrote:
Add a handler of the CPU_OFF PSCI host SMC trapped in KVM nVHE hyp 
code.

When invoked, it changes the recorded state of the core to OFF before
forwarding the call to EL3. If the call fails, it changes the state 
back

to ON and returns the error to the host.

Signed-off-by: David Brazdil 
---
 arch/arm64/kvm/hyp/nvhe/psci.c | 30 +-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/psci.c 
b/arch/arm64/kvm/hyp/nvhe/psci.c

index c3d0a6246c66..00dc0cab860c 100644
--- a/arch/arm64/kvm/hyp/nvhe/psci.c
+++ b/arch/arm64/kvm/hyp/nvhe/psci.c
@@ -13,6 +13,8 @@
 #include 
 #include 

+#include 
+
 /* Config options set by the host. */
 u32 kvm_host_psci_version = PSCI_VERSION(0, 0);
 u32 kvm_host_psci_function_id[PSCI_FN_MAX];
@@ -20,6 +22,7 @@ s64 hyp_physvirt_offset;

 #define __hyp_pa(x) ((phys_addr_t)(x) + hyp_physvirt_offset)

+static DEFINE_PER_CPU(hyp_spinlock_t, psci_cpu_lock);
 DEFINE_PER_CPU(enum kvm_nvhe_psci_state, psci_cpu_state);

 static u64 get_psci_func_id(struct kvm_cpu_context *host_ctxt)
@@ -76,9 +79,32 @@ static __noreturn unsigned long
psci_forward_noreturn(struct kvm_cpu_context *ho
hyp_panic(); /* unreachable */
 }

+static int psci_cpu_off(u64 func_id, struct kvm_cpu_context 
*host_ctxt)

+{
+   hyp_spinlock_t *cpu_lock = this_cpu_ptr(_cpu_lock);
+   enum kvm_nvhe_psci_state *cpu_power = this_cpu_ptr(_cpu_state);
+   u32 power_state = (u32)host_ctxt->regs.regs[1];
+   int ret;
+
+   /* Change the recorded state to OFF before forwarding the call. */
+   hyp_spin_lock(cpu_lock);
+   *cpu_power = KVM_NVHE_PSCI_CPU_OFF;
+   hyp_spin_unlock(cpu_lock);


So at this point, another CPU can observe the vcpu being "off", and 
issue

a PSCI_ON, which may result in an "already on". I'm not sure this is an
actual issue, but it is worth documenting.

What is definitely missing is a rational about *why* we need to track 
the

state of the vcpus. I naively imagined that we could directly proxy the
PSCI calls to EL3, only repainting PC for PSCI_ON.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [RFC PATCH 12/26] kvm: arm64: Add SMC handler in nVHE EL2

2020-11-05 Thread Marc Zyngier


On 2020-11-04 18:36, David Brazdil wrote:

Add handler of host SMCs in KVM nVHE trap handler. Forward all SMCs to
EL3 and propagate the result back to EL1. This is done in preparation
for validating host SMCs.

Signed-off-by: David Brazdil 
---
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 36 ++
 1 file changed, 36 insertions(+)

diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 19332c20fcde..fffc2dc09a1f 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -106,6 +106,38 @@ static void handle_host_hcall(struct
kvm_cpu_context *host_ctxt)
host_ctxt->regs.regs[1] = ret;
 }

+static void skip_host_instruction(void)
+{
+   write_sysreg_el2(read_sysreg_el2(SYS_ELR) + 4, SYS_ELR);
+}
+
+static void forward_host_smc(struct kvm_cpu_context *host_ctxt)
+{
+   struct arm_smccc_res res;
+
+   arm_smccc_1_1_smc(host_ctxt->regs.regs[0], host_ctxt->regs.regs[1],
+ host_ctxt->regs.regs[2], host_ctxt->regs.regs[3],
+ host_ctxt->regs.regs[4], host_ctxt->regs.regs[5],
+ host_ctxt->regs.regs[6], host_ctxt->regs.regs[7],
+ );
+   host_ctxt->regs.regs[0] = res.a0;
+   host_ctxt->regs.regs[1] = res.a1;
+   host_ctxt->regs.regs[2] = res.a2;
+   host_ctxt->regs.regs[3] = res.a3;
+}
+
+static void handle_host_smc(struct kvm_cpu_context *host_ctxt)
+{
+   /*
+* Unlike HVC, the return address of an SMC is the instruction's PC.
+* Move the return address past the instruction.
+*/
+   skip_host_instruction();
+
+   /* Forward SMC not handled in EL2 to EL3. */
+   forward_host_smc(host_ctxt);
+}
+
 void handle_trap(struct kvm_cpu_context *host_ctxt)
 {
u64 esr = read_sysreg_el2(SYS_ESR);
@@ -114,6 +146,10 @@ void handle_trap(struct kvm_cpu_context 
*host_ctxt)

case ESR_ELx_EC_HVC64:
handle_host_hcall(host_ctxt);
break;
+   case ESR_ELx_EC_SMC32:


How is that even possible? Host EL1 is strictly 64bit, so SMC32 cannot 
occur.



+   case ESR_ELx_EC_SMC64:
+   handle_host_smc(host_ctxt);
+   break;
default:
hyp_panic();
}


Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [RFC PATCH 02/26] psci: Export configured PSCI function IDs

2020-11-05 Thread Marc Zyngier


On 2020-11-04 18:36, David Brazdil wrote:

Function IDs used by PSCI are configurable for v0.1 via DT/APCI. If the
host is using PSCI v0.1, KVM's PSCI proxy needs to use the same IDs.
Expose the array holding the information.

Signed-off-by: David Brazdil 
---
 drivers/firmware/psci/psci.c | 10 +-
 include/linux/psci.h | 10 ++
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/firmware/psci/psci.c 
b/drivers/firmware/psci/psci.c

index ff523bdbfe3f..ffcb88f60e21 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -60,15 +60,7 @@ typedef unsigned long (psci_fn)(unsigned long, 
unsigned long,

unsigned long, unsigned long);
 static psci_fn *invoke_psci_fn;

-enum psci_function {
-   PSCI_FN_CPU_SUSPEND,
-   PSCI_FN_CPU_ON,
-   PSCI_FN_CPU_OFF,
-   PSCI_FN_MIGRATE,
-   PSCI_FN_MAX,
-};
-
-static u32 psci_function_id[PSCI_FN_MAX];
+u32 psci_function_id[PSCI_FN_MAX];

 #define PSCI_0_2_POWER_STATE_MASK  \
(PSCI_0_2_POWER_STATE_ID_MASK | \
diff --git a/include/linux/psci.h b/include/linux/psci.h
index cb35b90d1746..877d844ee6d9 100644
--- a/include/linux/psci.h
+++ b/include/linux/psci.h
@@ -29,6 +29,16 @@ bool psci_has_osi_support(void);
  */
 extern int psci_driver_version;

+enum psci_function {
+   PSCI_FN_CPU_SUSPEND,
+   PSCI_FN_CPU_ON,
+   PSCI_FN_CPU_OFF,
+   PSCI_FN_MIGRATE,
+   PSCI_FN_MAX,
+};
+
+extern u32 psci_function_id[PSCI_FN_MAX];


I am very reluctant to expose this array to the rest of the kernel.
The temptation becomes huge for people to start writing to it
from random drivers in order to route PSCI calls somewhere else.
Consider exporting an accessor instead.

Same thing for the following patch (there already are a couple of
accessors for psci_cpu_suspend_feature, which you could make visible).

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [RFC PATCH 01/26] psci: Export configured PSCI version

2020-11-05 Thread Marc Zyngier


On 2020-11-04 18:36, David Brazdil wrote:

The version of PSCI that the kernel should use to communicate with
firmware is typically obtained from probing PSCI_VERSION. However, that
doesn't work for PSCI v0.1 where the host gets the information from
DT/ACPI, or if PSCI is not supported / was disabled.

KVM's PSCI proxy for the host needs to be configured with the same
version used by the host driver. Expose the PSCI version used by the
host.

Signed-off-by: David Brazdil 
---
 drivers/firmware/psci/psci.c | 6 ++
 include/linux/psci.h | 8 
 2 files changed, 14 insertions(+)

diff --git a/drivers/firmware/psci/psci.c 
b/drivers/firmware/psci/psci.c

index 00af99b6f97c..ff523bdbfe3f 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -49,6 +49,8 @@ static int resident_cpu = -1;
 struct psci_operations psci_ops;
 static enum arm_smccc_conduit psci_conduit = SMCCC_CONDUIT_NONE;

+int psci_driver_version = PSCI_VERSION(0, 0);
+
 bool psci_tos_resident_on(int cpu)
 {
return cpu == resident_cpu;
@@ -461,6 +463,8 @@ static int __init psci_probe(void)
return -EINVAL;
}

+   psci_driver_version = ver;
+
psci_0_2_set_functions();

psci_init_migrate();
@@ -514,6 +518,8 @@ static int __init psci_0_1_init(struct device_node 
*np)


pr_info("Using PSCI v0.1 Function IDs from DT\n");

+   psci_driver_version = PSCI_VERSION(0, 1);
+
if (!of_property_read_u32(np, "cpu_suspend", )) {
psci_function_id[PSCI_FN_CPU_SUSPEND] = id;
psci_ops.cpu_suspend = psci_cpu_suspend;
diff --git a/include/linux/psci.h b/include/linux/psci.h
index 2a1bfb890e58..cb35b90d1746 100644
--- a/include/linux/psci.h
+++ b/include/linux/psci.h
@@ -21,6 +21,14 @@ bool psci_power_state_is_valid(u32 state);
 int psci_set_osi_mode(bool enable);
 bool psci_has_osi_support(void);

+/**
+ * The version of the PSCI specification followed by the driver.
+ * This is equivalent to calling PSCI_VERSION except:
+ *   (a) it also works for PSCI v0.1, which does not support 
PSCI_VERSION, and

+ *   (b) it is set to v0.0 if the PSCI driver was not initialized.
+ */
+extern int psci_driver_version;
+
 struct psci_operations {
u32 (*get_version)(void);
int (*cpu_suspend)(u32 state, unsigned long entry_point);


How about providing a get_version callback for pre-0.2 implementations
instead? This would avoid exposing more symbols (psci_ops is already
global).

Thanks,

M.

diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
index 00af99b6f97c..b84454e12d92 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -500,6 +500,11 @@ static int __init psci_0_2_init(struct device_node 
*np)

return psci_probe();
 }

+static u32 psci_0_1_get_version(void)
+{
+   return PSCI_VERSION(0, 1);
+}
+
 /*
  * PSCI < v0.2 get PSCI Function IDs via DT.
  */
@@ -514,6 +519,8 @@ static int __init psci_0_1_init(struct device_node 
*np)


pr_info("Using PSCI v0.1 Function IDs from DT\n");

+   psci_ops.get_version = psci_0_1_get_version;
+
if (!of_property_read_u32(np, "cpu_suspend", )) {
psci_function_id[PSCI_FN_CPU_SUSPEND] = id;
psci_ops.cpu_suspend = psci_cpu_suspend;

--
Jazz is not dead. It just smells funny...

Re: [PATCH 4/4] arm64: cpu_errata: Apply Erratum 845719 to KRYO2XX Silver

2020-11-05 Thread Marc Zyngier


On 2020-11-04 23:22, Konrad Dybcio wrote:

QCOM KRYO2XX Silver cores are Cortex-A53 based and are
susceptible to the 845719 erratum. Add them to the lookup
list to apply the erratum.

Signed-off-by: Konrad Dybcio 
---
 arch/arm64/kernel/cpu_errata.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/kernel/cpu_errata.c 
b/arch/arm64/kernel/cpu_errata.c

index 61314fd70f13..cafaf0da05b7 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -299,6 +299,8 @@ static const struct midr_range 
erratum_845719_list[] = {

MIDR_REV_RANGE(MIDR_CORTEX_A53, 0, 0, 4),
/* Brahma-B53 r0p[0] */
MIDR_REV(MIDR_BRAHMA_B53, 0, 0),
+   /* Kryo2XX Silver rAp4 */
+   MIDR_REV(MIDR_QCOM_KRYO_2XX_SILVER, 0xa, 0x4),


Is this the only affected version? If this is actually an A53, how do 
the

revisions map between Kryo and Cortex cores?

M.
--
Jazz is not dead. It just smells funny...

Re: Aw: Re: Re: [PATCH] pci: mediatek: fix warning in msi.h

2020-11-05 Thread Marc Zyngier


On 2020-11-04 23:14, Thomas Gleixner wrote:

[...]


TBH, that's butt ugly. So after staring long enough into the PCI code I
came up with a way to transport that information to the probe code.

That allows a particular device to say 'I can't do MSI' and at the same
time keeps the warning machinery intact which tells us that a 
particular

host controller driver is broken.

Uncompiled and untested as usual :)

Thanks,

tglx
---
 drivers/pci/controller/pcie-mediatek.c |4 
 drivers/pci/probe.c|3 +++
 include/linux/pci.h|1 +
 3 files changed, 8 insertions(+)

--- a/drivers/pci/controller/pcie-mediatek.c
+++ b/drivers/pci/controller/pcie-mediatek.c
@@ -143,6 +143,7 @@ struct mtk_pcie_port;
  * struct mtk_pcie_soc - differentiate between host generations
  * @need_fix_class_id: whether this host's class ID needed to be fixed 
or not
  * @need_fix_device_id: whether this host's device ID needed to be 
fixed or not

+ * @no_msi: Bridge has no MSI support
  * @device_id: device ID which this host need to be fixed
  * @ops: pointer to configuration access functions
  * @startup: pointer to controller setting functions
@@ -151,6 +152,7 @@ struct mtk_pcie_port;
 struct mtk_pcie_soc {
bool need_fix_class_id;
bool need_fix_device_id;
+   bool no_msi;
unsigned int device_id;
struct pci_ops *ops;
int (*startup)(struct mtk_pcie_port *port);
@@ -1084,6 +1086,7 @@ static int mtk_pcie_probe(struct platfor

host->ops = pcie->soc->ops;
host->sysdata = pcie;
+   host->no_msi = pcie->soc->no_msi;

err = pci_host_probe(host);
if (err)
@@ -1173,6 +1176,7 @@ static const struct dev_pm_ops mtk_pcie_
 };

 static const struct mtk_pcie_soc mtk_pcie_soc_v1 = {
+   .no_msi = true,
.ops = _pcie_ops,
.startup = mtk_pcie_startup_port,
 };
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -889,6 +889,9 @@ static int pci_register_host_bridge(stru
if (!bus)
return -ENOMEM;

+   if (bridge->no_msi)
+   bus->bus_flags |= PCI_BUS_FLAGS_NO_MSI;
+
bridge->bus = bus;

/* Temporarily move resources off the list */
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -545,6 +545,7 @@ struct pci_host_bridge {
unsigned intnative_dpc:1;   /* OS may use PCIe DPC */
unsigned intpreserve_config:1;  /* Preserve FW resource setup */
unsigned intsize_windows:1; /* Enable root bus sizing */
+   unsigned intno_msi:1;   /* Bridge has no MSI support */

/* Resource alignment requirements */
resource_size_t (*align_resource)(struct pci_dev *dev,


If that's the direction of travel, we also need something like this
for configuration where the host bridge relies on an external MSI block
that uses MSI domains (boot-tested in a GICv3 guest).

M.

diff --git a/drivers/pci/controller/pci-host-common.c 
b/drivers/pci/controller/pci-host-common.c

index 6ce34a1deecb..603f6fbbe68a 100644
--- a/drivers/pci/controller/pci-host-common.c
+++ b/drivers/pci/controller/pci-host-common.c
@@ -77,6 +77,7 @@ int pci_host_common_probe(struct platform_device 
*pdev)


bridge->sysdata = cfg;
bridge->ops = (struct pci_ops *)>pci_ops;
+   bridge->msi_domain = true;

platform_set_drvdata(pdev, bridge);

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 16fb150fbb8d..f421b2869bca 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -889,9 +889,6 @@ static int pci_register_host_bridge(struct 
pci_host_bridge *bridge)

if (!bus)
return -ENOMEM;

-   if (bridge->no_msi)
-   bus->bus_flags |= PCI_BUS_FLAGS_NO_MSI;
-
bridge->bus = bus;

/* Temporarily move resources off the list */
@@ -928,6 +925,9 @@ static int pci_register_host_bridge(struct 
pci_host_bridge *bridge)

device_enable_async_suspend(bus->bridge);
pci_set_bus_of_node(bus);
pci_set_bus_msi_domain(bus);
+   if (bridge->no_msi ||
+   (bridge->msi_domain && !bus->dev.msi_domain))
+   bus->bus_flags |= PCI_BUS_FLAGS_NO_MSI;

if (!parent)
set_dev_node(bus->bridge, pcibus_to_node(bus));
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c2a0c1d471d6..81f72fd46e06 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -546,6 +546,7 @@ struct pci_host_bridge {
unsigned intpreserve_config:1;  /* Preserve FW resource setup */
unsigned intsize_windows:1; /* Enable root bus sizing */
unsigned intno_msi:1;   /* Bridge has no MSI support */
+   unsigned intmsi_domain:1;   /* Bridge wants MSI domain */

/* Resource alignment requirements */
resource_size_t (*align_resource)(struct pci_dev *dev,

--
Jazz is not dead. It just smells funny...

Re: [PATCH -next] irq-chip/gic-v3-its: Fixed an issue where the ITS executes the residual commands in the queue again when the ITS wakes up from sleep mode.

2020-11-03 Thread Marc Zyngier

On Tue, 03 Nov 2020 08:11:23 +,
Xu Qiang  wrote:
> 
> During wakeup, the ATF restore interface restores the values of
> the cbaser and cwriter registers. As a result, the ITS executes
> the residual commands in the queue, which may cause memory corruption.
> 
> To solve this problem, clear all data in the command queue
> in the suspend interface of the ITS driver.
> 
> Signed-off-by: Xu Qiang 
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c 
> b/drivers/irqchip/irq-gic-v3-its.c
> index 0fec31931e11..b8487f78ac21 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -4741,6 +4741,14 @@ static int its_save_disable(void)
>   list_for_each_entry(its, _nodes, entry) {
>   void __iomem *base;
>  
> + /*
> +  * Clear the command queue so that the ITS will not re-execute
> +  * the remaining commands in the command queue when
> +  * the cwriter and cbaser registers are restored
> +  * in the restore interface of the firmware.
> +  */
> + memset(its->cmd_base, 0, ITS_CMD_QUEUE_SZ);
> +
>   if (!(its->flags & ITS_FLAGS_SAVE_SUSPEND_STATE))
>   continue;

You are wiping the ITS queue before even stopping the ITS. How well is
that going to work? What if there is something in flight?

I don't understand what you are trying to do here, nor how ATF is
involved. So please describe the whole sequence of events, and we'll
decide whether that's something we need to fix.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.

Re: Aw: Re: [PATCH] pci: mediatek: fix warning in msi.h

2020-11-03 Thread Marc Zyngier


On 2020-11-03 10:31, Thomas Gleixner wrote:

On Tue, Nov 03 2020 at 09:54, Marc Zyngier wrote:

On 2020-11-02 22:18, Thomas Gleixner wrote:

So we really need some other solution and removing the warning is not
an option. If MSI is enabled then we want to get a warning when a PCI
device has no MSI domain associated. Explicitly expressing the PCIE
brigde misfeature of not supporting MSI is way better than silently
returning an error code which is swallowed anyway.


I don't disagree here, though the PCI_MSI_ARCH_FALLBACKS mechanism
makes it more difficult to establish.


Only for the few leftovers which implement msi_controller, i.e.

drivers/pci/controller/pci-hyperv.c
drivers/pci/controller/pci-tegra.c
drivers/pci/controller/pcie-rcar-host.c
drivers/pci/controller/pcie-xilinx.c

The architectures which select PCI_MSI_ARCH_FALLBACKS are:

arch/ia64/Kconfig:  select PCI_MSI_ARCH_FALLBACKS if PCI_MSI
arch/mips/Kconfig:  select PCI_MSI_ARCH_FALLBACKS if PCI_MSI
arch/powerpc/Kconfig:   select PCI_MSI_ARCH_FALLBACKS   if 
PCI_MSI

arch/s390/Kconfig:  select PCI_MSI_ARCH_FALLBACKS   if PCI_MSI
arch/sparc/Kconfig: select PCI_MSI_ARCH_FALLBACKS if PCI_MSI

implement arch_setup_msi_irq() which makes it magically work :)

Whatever the preferred way is via flags at host probe time or 
flagging

it post probe I don't care much as long as it is consistent.


Host probe time is going to require some changes in the core PCI api,
as everything that checks for a MSI domain is based on the pci_bus
structure, which is only allocated much later.


Yeah, it's nasty. One possible solution is to add flags or a callback 
to

pci_ops, but it's not pretty either.

I think we should go with the 'mark it after pci_host_probe()' hack for
5.10-rc. The real fix will be larger and go into 5.11.

Thoughts?


We can do that, although I worried that it isn't 100% reliable:

pci_host_probe() ends up calling pci_add_device(), and will start
probing devices if the endpoint drivers have already registered
with the core code, long before the flag gets set.

Here's what I've hacked together for a guest that doesn't have
any MSI capability:

diff --git a/drivers/pci/controller/pci-host-common.c 
b/drivers/pci/controller/pci-host-common.c

index 6ce34a1deecb..7dd5145cd38d 100644
--- a/drivers/pci/controller/pci-host-common.c
+++ b/drivers/pci/controller/pci-host-common.c
@@ -55,6 +55,7 @@ int pci_host_common_probe(struct platform_device 
*pdev)

struct pci_host_bridge *bridge;
struct pci_config_window *cfg;
const struct pci_ecam_ops *ops;
+   int ret;

ops = of_device_get_match_data(>dev);
if (!ops)
@@ -80,7 +81,10 @@ int pci_host_common_probe(struct platform_device 
*pdev)


platform_set_drvdata(pdev, bridge);

-   return pci_host_probe(bridge);
+   ret = pci_host_probe(bridge);
+   bridge->bus->bus_flags |= PCI_BUS_FLAGS_NO_MSI;
+
+   return ret;
 }
 EXPORT_SYMBOL_GPL(pci_host_common_probe);

(plus another hack to get the host controller to initialise a bit
later, though building it as a module will achieve the same thing):

[0.369114] 9pnet: Installing 9P2000 support
[0.369807] mpls_gso: MPLS GSO support
[0.370512] registered taskstats version 1
[0.371204] Loading compiled-in X.509 certificates
[0.371988] zswap: loaded using pool lzo/zbud
[0.373041] pci-host-generic 4000.pci: host bridge /pci ranges:
[0.374045] pci-host-generic 4000.pci:   IO 
0x00..0x00 -> 0x00
[0.375458] pci-host-generic 4000.pci:  MEM 
0x004100..0x007fff -> 0x004100
[0.376848] pci-host-generic 4000.pci: ECAM at [mem 
0x4000-0x40ff] for [bus 00]
[0.378204] pci-host-generic 4000.pci: PCI host bridge to bus 
:00

[0.379316] pci_bus :00: root bus resource [bus 00]
[0.380146] pci_bus :00: root bus resource [io  0x-0x]
[0.381131] pci_bus :00: root bus resource [mem 
0x4100-0x7fff]

[0.382267] pci :00:00.0: [1af4:1009] type 00 class 0xff
[0.383369] pci :00:00.0: reg 0x10: [io  0x6200-0x62ff]
[0.384286] pci :00:00.0: reg 0x14: [mem 0x4100-0x41ff]
[0.385324] pci :00:00.0: reg 0x18: [mem 0x41000200-0x410003ff]
[0.386680] pci :00:01.0: [1af4:1009] type 00 class 0xff
[0.387778] pci :00:01.0: reg 0x10: [io  0x6300-0x63ff]
[0.388696] pci :00:01.0: reg 0x14: [mem 0x41000400-0x410004ff]
[0.389730] pci :00:01.0: reg 0x18: [mem 0x41000600-0x410007ff]
[0.391070] pci :00:02.0: [1af4:1000] type 00 class 0x02
[0.392212] pci :00:02.0: reg 0x10: [io  0x6400-0x64ff]
[0.393137] pci :00:02.0: reg 0x14: [mem 0x41000800-0x410008ff]
[0.394163] pci :00:02.0: reg 0x18: [mem 0x41000a00-0x41000bff]
[0.395678] pci :00:00.0: BAR 2: assigned [mem 
0x4100-0x410001ff]
[0.396762] pci :00:01.0: BAR 2: assigned [mem 
0x4100

Re: Aw: Re: [PATCH] pci: mediatek: fix warning in msi.h

2020-11-03 Thread Marc Zyngier


On 2020-11-03 10:16, Thomas Gleixner wrote:

On Tue, Nov 03 2020 at 09:54, Marc Zyngier wrote:

On 2020-11-02 22:18, Thomas Gleixner wrote:

On Mon, Nov 02 2020 at 17:16, Thomas Gleixner wrote:

On Mon, Nov 02 2020 at 11:30, Marc Zyngier wrote:

--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -871,6 +871,8 @@ static void pci_set_bus_msi_domain(struct 
pci_bus

*bus)
d = pci_host_bridge_msi_domain(b);

dev_set_msi_domain(>dev, d);
+   if (!d)
+   bus->bus_flags |= PCI_BUS_FLAGS_NO_MSI;


Hrm, that might break legacy setups (no irqdomain support). I'd 
rather

prefer to explicitly tell the pci core at host registration time.


s/might break/ breaks / Just validated :)


For my own edification, can you point me to the failing case?


Any architecture which selects PCI_MSI_ARCH_FALLBACKS and does not have
irqdomain support runs into:

if (!d)
bus->bus_flags |= PCI_BUS_FLAGS_NO_MSI;

which in turn makes pci_msi_supported() return 0 and consequently makes
pci_enable_msi[x]() fail.


I pointer that out in [1], together with a potential fix. Not sure if
anything else breaks though.

Thanks,

M.

[1] 
https://lore.kernel.org/r/336d6588567949029c52ecfbb8766...@kernel.org/

--
Jazz is not dead. It just smells funny...

Re: Aw: Re: [PATCH] pci: mediatek: fix warning in msi.h

2020-11-03 Thread Marc Zyngier


On 2020-11-02 22:18, Thomas Gleixner wrote:

On Mon, Nov 02 2020 at 17:16, Thomas Gleixner wrote:

On Mon, Nov 02 2020 at 11:30, Marc Zyngier wrote:

--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -871,6 +871,8 @@ static void pci_set_bus_msi_domain(struct pci_bus
*bus)
d = pci_host_bridge_msi_domain(b);

dev_set_msi_domain(>dev, d);
+   if (!d)
+   bus->bus_flags |= PCI_BUS_FLAGS_NO_MSI;


Hrm, that might break legacy setups (no irqdomain support). I'd rather
prefer to explicitly tell the pci core at host registration time.


s/might break/ breaks / Just validated :)


For my own edification, can you point me to the failing case?

So we really need some other solution and removing the warning is not 
an

option. If MSI is enabled then we want to get a warning when a PCI
device has no MSI domain associated. Explicitly expressing the PCIE
brigde misfeature of not supporting MSI is way better than silently
returning an error code which is swallowed anyway.


I don't disagree here, though the PCI_MSI_ARCH_FALLBACKS mechanism
makes it more difficult to establish.


Whatever the preferred way is via flags at host probe time or flagging
it post probe I don't care much as long as it is consistent.


Host probe time is going to require some changes in the core PCI api,
as everything that checks for a MSI domain is based on the pci_bus
structure, which is only allocated much later.

I'll have a think.

M.
--
Jazz is not dead. It just smells funny...

Re: Using fixed LPI number for some Device ID

2020-11-03 Thread Marc Zyngier


On 2020-11-03 05:22, Dongjiu Geng wrote:

On 2020/10/31 17:55, Marc Zyngier wrote:

Dongjiu,

On Sat, 31 Oct 2020 02:19:19 +,
Dongjiu Geng  wrote:


Hi Marc,
Sorry to disturb you, Currently the LPI number is not fixed for the
device. The LPI number is dynamically allocated start from 8092.
For two OS which shares the ITS, One OS needs to configure the
device interrupt required by another OS, and the other OS uses a
fixed interrupt ID to respond the interrupt. Therefore, the LPI IRQ
number of the device needed be fixed. I want to upstream this
feature that allocate fixed LPI number for the device that is
specified through the DTS. What is your meaning?  Thanks


I think you are starting from the wrong premises.

You can't "share" an ITS directly between two operating systems. The
ITS can only be controlled by a single operating system, because its
function goes way beyond allocating an LPI. How would you deal with
simple things such as masking an interrupt, which requires:

- Access to memory (configuration table)
- Access to the command queue (to insert an invalidation command)
- Access to MMIO registers (to kick the command queue into action)

all of which needs to be exclusive of concurrent modifications. How do
you propose this is implemented in a safe manner by two operating
systems which, by nature, distrust each other? Allocating LPIs is the
least of your problems, really.

Yes， I agree with you it . But in my HW platform, using
virtualization, the performance
deteriorates greatly.  So I distributed the I/O devices to different
operation systems. During the startup of one OS,
interrupts are bound to different OS in one OS, which can be exclusive
of concurrent modifications.

In fact it has some limitations as you said, such mask/enable/route
Interrupts, If want to
mask interrupts, need to mask interrupts on the source device.

If you think it is not a common feature, I will used it as a local
customization function and not upstream.


I don't think this makes sense for Linux, at least not in a way
that limits the way the kernel deals with simple things such as
LPI allocation.

We have systems in the tree where Linux route interrupts on behalf
of other agents in the system (see what the TI PRUSS subsystem does,
for example), and even direct interrupt injection is, to an extent,
doing that. This requires a standardised way for describing the routing,
the allocation, and potentially the life cycle of the interrupt.

But hardcoding the allocation based on some non-standard scheme
is not something I'm considering.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: Aw: Re: Re: [PATCH] pci: mediatek: fix warning in msi.h

2020-11-02 Thread Marc Zyngier


On 2020-11-02 11:56, Frank Wunderlich wrote:

looks good on bananapi-r2, no warning, pcie-card and hdd recognized


Thanks for giving it a shot. Still needs a bit of tweaking, as I expect
it to break configurations that select CONFIG_PCI_MSI_ARCH_FALLBACKS
(we have to assume that MSIs can be handled until we hit the 
arch-specific

stuff).

There is also a small nit in the way we allow userspace to mess with
this flag via sysfs, and similar restrictions should probably apply.

Updated patch below.

M.

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index d15c881e2e7e..5bb1306162c7 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -387,10 +387,20 @@ static ssize_t msi_bus_store(struct device *dev, 
struct device_attribute *attr,

return count;
}

-   if (val)
+   if (val) {
+   /*
+* If there is no possibility for this bus to deal with
+* MSIs, then allowing them to be requested would lead to
+* the kernel complaining loudly. In this situation, don't
+* let userspace mess things up.
+*/
+   if (!pci_bus_is_msi_capable(subordinate))
+   return -EINVAL;
+
subordinate->bus_flags &= ~PCI_BUS_FLAGS_NO_MSI;
-   else
+   } else {
subordinate->bus_flags |= PCI_BUS_FLAGS_NO_MSI;
+   }

 	dev_info(>dev, "MSI/MSI-X %s for future drivers of 
devices on this bus\n",

 val ? "allowed" : "disallowed");
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4289030b0fff..28861cc6435a 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -871,6 +871,8 @@ static void pci_set_bus_msi_domain(struct pci_bus 
*bus)

d = pci_host_bridge_msi_domain(b);

dev_set_msi_domain(>dev, d);
+   if (!pci_bus_is_msi_capable(bus))
+   bus->bus_flags |= PCI_BUS_FLAGS_NO_MSI;
 }

 static int pci_register_host_bridge(struct pci_host_bridge *bridge)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 22207a79762c..6aadb863dff4 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2333,6 +2333,12 @@ pci_host_bridge_acpi_msi_domain(struct pci_bus 
*bus) { return NULL; }
 static inline bool pci_pr3_present(struct pci_dev *pdev) { return 
false; }

 #endif

+static inline bool pci_bus_is_msi_capable(struct pci_bus *bus)
+{
+   return (IS_ENABLED(CONFIG_PCI_MSI_ARCH_FALLBACKS) ||
+   dev_get_msi_domain(>dev));
+}
+
 #ifdef CONFIG_EEH
 static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
 {

--
Jazz is not dead. It just smells funny...

Re: Aw: Re: [PATCH] pci: mediatek: fix warning in msi.h

2020-11-02 Thread Marc Zyngier


On 2020-11-01 22:27, Thomas Gleixner wrote:

On Sun, Nov 01 2020 at 21:47, Marc Zyngier wrote:

On Sun, 01 Nov 2020 18:27:13 +,
Frank Wunderlich  wrote:
Thinking of it a bit more, I think this is the wrong solution.

PCI MSIs are optional, and not a requirement. I can trivially spin a
VM with PCI devices and yet no MSI capability (yes, it is more
difficult with real HW), and this results in a bunch of warning, none
of which are actually indicative of anything being wrong.


Well. No.

The problem is that the device enumerates MSI capability, but the host
bridge is not proving support for MSI.

The host bridge fails to mark the bus with PCI_BUS_FLAGS_NO_MSI. That's
the reason why this runs into this issue.


Right, that's the piece I was missing, thanks for that.

However, that doesn't really address the issue when the host bridge and
the MSI widget are two separate entities, oblivious of each other (which
is a pretty common thing on the ARM side).

In this configuration, you can't really decide whether you have a MSI
domain in the host bridge driver (the association is done in the code
PCI code, after you have registered it with the core code), and by the
time you get a pointer to the bus, the endpoints have already been 
probed.


The following patch makes it work for me (GICv3 guest without an ITS)by
checking for the presence of an MSI domain at the point where we 
actually

perform this association, and before starting to scan for endpoints.

I *think* this should work for the MTK thingy, but someone needs to
go and check.

Thanks,

M.

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4289030b0fff..bb363eb103a2 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -871,6 +871,8 @@ static void pci_set_bus_msi_domain(struct pci_bus 
*bus)

d = pci_host_bridge_msi_domain(b);

dev_set_msi_domain(>dev, d);
+   if (!d)
+   bus->bus_flags |= PCI_BUS_FLAGS_NO_MSI;
 }

 static int pci_register_host_bridge(struct pci_host_bridge *bridge)

--
Jazz is not dead. It just smells funny...

Re: Aw: Re: [PATCH] pci: mediatek: fix warning in msi.h

2020-11-01 Thread Marc Zyngier

On Sun, 01 Nov 2020 18:27:13 +,
Frank Wunderlich  wrote:
> 
> > Gesendet: Sonntag, 01. November 2020 um 18:54 Uhr
> > Von: "Ryder Lee" 
> 
> > Yea, mt7623 (mtk_pcie_soc_v1) does not support MSI, so that's a way to
> > handle it.
> >
> > @Frank, could you help to test it?
> >
> > Ryder
> 
> compiles clean for mt7623/armhf and mt7622/aarch64 so far
> 
> at least bananapi-r2/mt7623 booting is clean now - no warning pcie
> and sata/ahci seems still working as expected. I have a mt7615 card
> in pcie-slot (firmware-load and init without errors) and hdd
> connected to outer sata port (can access partitions witout errors)
> 
> booted r64 too, still see no warning, but have not yet connected
> hdd/pcie-device, but i guess this should not break anything here
> 
> so Marc, if you post the patch separately, you can add my tested-by
> ;) thank you for this

Thinking of it a bit more, I think this is the wrong solution.

PCI MSIs are optional, and not a requirement. I can trivially spin a
VM with PCI devices and yet no MSI capability (yes, it is more
difficult with real HW), and this results in a bunch of warning, none
of which are actually indicative of anything being wrong.

I think the real fix is to get rid of the warnings altogether. If we
could detect that there should be an MSI domain associated with the
device and that it wasn't there, that'd be a good reason to scream.
But on its own, the absence of a MSI domain is not an indication of
anything being amiss.

M.

-- 
Without deviation from the norm, progress is not possible.

[tip: irq/urgent] irqchip/mips: Drop selection of IRQ_DOMAIN_HIERARCHY

2020-11-01 Thread tip-bot2 for Marc Zyngier

The following commit has been merged into the irq/urgent branch of tip:

Commit-ID: d26dd4131d0d6ad7aa294a7f8d18782b47c27c93
Gitweb:
https://git.kernel.org/tip/d26dd4131d0d6ad7aa294a7f8d18782b47c27c93
Author:Marc Zyngier 
AuthorDate:Fri, 16 Oct 2020 09:28:23 +01:00
Committer: Marc Zyngier 
CommitterDate: Fri, 16 Oct 2020 10:51:12 +01:00

irqchip/mips: Drop selection of IRQ_DOMAIN_HIERARCHY

Now that GENERIC_IRQ_IPI selects IRQ_DOMAIN_HIERARCHY, there is no
need to have this conditional select for IRQ_MIPS_CPU. Similarily,
MIPS_GIC only needs selecting GENERIC_IRQ_IPI.

Suggested-by: Thomas Gleixner 
Signed-off-by: Marc Zyngier 
---
 drivers/irqchip/Kconfig | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index cd734df..38785a0 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -180,7 +180,6 @@ config IRQ_MIPS_CPU
select GENERIC_IRQ_CHIP
select GENERIC_IRQ_IPI if SYS_SUPPORTS_MULTITHREADING
select IRQ_DOMAIN
-   select IRQ_DOMAIN_HIERARCHY if GENERIC_IRQ_IPI
select GENERIC_IRQ_EFFECTIVE_AFF_MASK
 
 config CLPS711X_IRQCHIP
@@ -307,7 +306,6 @@ config KEYSTONE_IRQ
 config MIPS_GIC
bool
select GENERIC_IRQ_IPI
-   select IRQ_DOMAIN_HIERARCHY
select MIPS_CM
 
 config INGENIC_IRQ

[tip: irq/urgent] irqchip/mst: Make mst_intc_of_init static

2020-11-01 Thread tip-bot2 for Marc Zyngier

The following commit has been merged into the irq/urgent branch of tip:

Commit-ID: 893a7cfb6b0bea650fafa43838d7f7f8f0f076bc
Gitweb:
https://git.kernel.org/tip/893a7cfb6b0bea650fafa43838d7f7f8f0f076bc
Author:Marc Zyngier 
AuthorDate:Thu, 15 Oct 2020 22:26:26 +01:00
Committer: Marc Zyngier 
CommitterDate: Thu, 15 Oct 2020 22:32:31 +01:00

irqchip/mst: Make mst_intc_of_init static

mst_intc_of_init has no external caller, so let's make it static.

Reported-by: kernel test robot 
Signed-off-by: Marc Zyngier 
---
 drivers/irqchip/irq-mst-intc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-mst-intc.c b/drivers/irqchip/irq-mst-intc.c
index 4be0775..143657b 100644
--- a/drivers/irqchip/irq-mst-intc.c
+++ b/drivers/irqchip/irq-mst-intc.c
@@ -154,8 +154,8 @@ static const struct irq_domain_ops mst_intc_domain_ops = {
.free   = irq_domain_free_irqs_common,
 };
 
-int __init
-mst_intc_of_init(struct device_node *dn, struct device_node *parent)
+static int __init mst_intc_of_init(struct device_node *dn,
+  struct device_node *parent)
 {
struct irq_domain *domain, *domain_parent;
struct mst_intc_chip_data *cd;

[tip: irq/urgent] irqchip/bcm2836: Fix missing __init annotation

2020-11-01 Thread tip-bot2 for Marc Zyngier

The following commit has been merged into the irq/urgent branch of tip:

Commit-ID: 57733e009f0c7e0526e10a18be12f56996c5460e
Gitweb:
https://git.kernel.org/tip/57733e009f0c7e0526e10a18be12f56996c5460e
Author:Marc Zyngier 
AuthorDate:Sun, 25 Oct 2020 11:10:29 
Committer: Marc Zyngier 
CommitterDate: Sun, 25 Oct 2020 11:10:29 

irqchip/bcm2836: Fix missing __init annotation

bcm2836_arm_irqchip_smp_init() calls set_smp_ipi_range(), which has
an __init annotation. Make sure the caller has the same annotation.

Reported-by: kernel test robot 
Signed-off-by: Marc Zyngier 
---
 drivers/irqchip/irq-bcm2836.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-bcm2836.c b/drivers/irqchip/irq-bcm2836.c
index 97838eb..cbc7c74 100644
--- a/drivers/irqchip/irq-bcm2836.c
+++ b/drivers/irqchip/irq-bcm2836.c
@@ -244,7 +244,7 @@ static int bcm2836_cpu_dying(unsigned int cpu)
 
 #define BITS_PER_MBOX  32
 
-static void bcm2836_arm_irqchip_smp_init(void)
+static void __init bcm2836_arm_irqchip_smp_init(void)
 {
struct irq_fwspec ipi_fwspec = {
.fwnode = intc.domain->fwnode,

[tip: irq/urgent] genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY

2020-11-01 Thread tip-bot2 for Marc Zyngier

The following commit has been merged into the irq/urgent branch of tip:

Commit-ID: 151a535171be6ff824a0a3875553ea38570f4c05
Gitweb:
https://git.kernel.org/tip/151a535171be6ff824a0a3875553ea38570f4c05
Author:Marc Zyngier 
AuthorDate:Thu, 15 Oct 2020 21:41:44 +01:00
Committer: Marc Zyngier 
CommitterDate: Thu, 15 Oct 2020 21:41:44 +01:00

genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY

kernel/irq/ipi.c otherwise fails to compile if nothing else
selects it.

Fixes: 379b656446a3 ("genirq: Add GENERIC_IRQ_IPI Kconfig symbol")
Reported-by: Pavel Machek 
Tested-by: Pavel Machek 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20201015101222.GA32747@amd
---
 kernel/irq/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index 10a5aff..164a031 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -82,6 +82,7 @@ config IRQ_FASTEOI_HIERARCHY_HANDLERS
 # Generic IRQ IPI support
 config GENERIC_IRQ_IPI
bool
+   select IRQ_DOMAIN_HIERARCHY
 
 # Generic MSI interrupt support
 config GENERIC_MSI_IRQ

[PATCH 1/2] genirq: Allow an interrupt to be marked as 'naked'

2020-11-01 Thread Marc Zyngier

Some interrupts (such as the rescheduling IPI) rely on not going through
the irq_enter()/irq_exit() calls. To distinguish such interrupts, add
a new IRQ flag that allows the low-level handling code to sidestep the
enter()/exit() calls.

Only the architecture code is expected to use this. It will do the wrong
thing on normal interrupts.

Signed-off-by: Marc Zyngier 
---
 include/linux/irq.h   |  4 +++-
 kernel/irq/debugfs.c  |  1 +
 kernel/irq/irqdesc.c  | 17 -
 kernel/irq/settings.h |  7 +++
 4 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index c54365309e97..af5ba7336925 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -72,6 +72,7 @@ enum irqchip_irq_state;
  *   mechanism and from core side polling.
  * IRQ_DISABLE_UNLAZY  - Disable lazy irq disable
  * IRQ_HIDDEN  - Don't show up in /proc/interrupts
+ * IRQ_NAKED   - Bypass irq_enter()/irq_exit()
  */
 enum {
IRQ_TYPE_NONE   = 0x,
@@ -99,13 +100,14 @@ enum {
IRQ_IS_POLLED   = (1 << 18),
IRQ_DISABLE_UNLAZY  = (1 << 19),
IRQ_HIDDEN  = (1 << 20),
+   IRQ_NAKED   = (1 << 21),
 };
 
 #define IRQF_MODIFY_MASK   \
(IRQ_TYPE_SENSE_MASK | IRQ_NOPROBE | IRQ_NOREQUEST | \
 IRQ_NOAUTOEN | IRQ_MOVE_PCNTXT | IRQ_LEVEL | IRQ_NO_BALANCING | \
 IRQ_PER_CPU | IRQ_NESTED_THREAD | IRQ_NOTHREAD | IRQ_PER_CPU_DEVID | \
-IRQ_IS_POLLED | IRQ_DISABLE_UNLAZY | IRQ_HIDDEN)
+IRQ_IS_POLLED | IRQ_DISABLE_UNLAZY | IRQ_HIDDEN | IRQ_NAKED)
 
 #define IRQ_NO_BALANCING_MASK  (IRQ_PER_CPU | IRQ_NO_BALANCING)
 
diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c
index e4cff358b437..e031d6afc0f8 100644
--- a/kernel/irq/debugfs.c
+++ b/kernel/irq/debugfs.c
@@ -140,6 +140,7 @@ static const struct irq_bit_descr irqdesc_states[] = {
BIT_MASK_DESCR(_IRQ_IS_POLLED),
BIT_MASK_DESCR(_IRQ_DISABLE_UNLAZY),
BIT_MASK_DESCR(_IRQ_HIDDEN),
+   BIT_MASK_DESCR(_IRQ_NAKED),
 };
 
 static const struct irq_bit_descr irqdesc_istates[] = {
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 1a7723604399..c08a1c19d061 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -667,10 +667,9 @@ int __handle_domain_irq(struct irq_domain *domain, 
unsigned int hwirq,
 {
struct pt_regs *old_regs = set_irq_regs(regs);
unsigned int irq = hwirq;
+   struct irq_desc *desc;
int ret = 0;
 
-   irq_enter();
-
 #ifdef CONFIG_IRQ_DOMAIN
if (lookup)
irq = irq_find_mapping(domain, hwirq);
@@ -680,14 +679,22 @@ int __handle_domain_irq(struct irq_domain *domain, 
unsigned int hwirq,
 * Some hardware gives randomly wrong interrupts.  Rather
 * than crashing, do something sensible.
 */
-   if (unlikely(!irq || irq >= nr_irqs)) {
+   desc = irq_to_desc(irq);
+   if (unlikely(!desc || irq >= nr_irqs)) {
ack_bad_irq(irq);
ret = -EINVAL;
+   goto out;
+   }
+
+   if (unlikely(irq_settings_is_naked(desc))) {
+   generic_handle_irq_desc(desc);
} else {
-   generic_handle_irq(irq);
+   irq_enter();
+   generic_handle_irq_desc(desc);
+   irq_exit();
}
 
-   irq_exit();
+out:
set_irq_regs(old_regs);
return ret;
 }
diff --git a/kernel/irq/settings.h b/kernel/irq/settings.h
index 403378b9947b..587e67f9c302 100644
--- a/kernel/irq/settings.h
+++ b/kernel/irq/settings.h
@@ -18,6 +18,7 @@ enum {
_IRQ_IS_POLLED  = IRQ_IS_POLLED,
_IRQ_DISABLE_UNLAZY = IRQ_DISABLE_UNLAZY,
_IRQ_HIDDEN = IRQ_HIDDEN,
+   _IRQ_NAKED  = IRQ_NAKED,
_IRQF_MODIFY_MASK   = IRQF_MODIFY_MASK,
 };
 
@@ -33,6 +34,7 @@ enum {
 #define IRQ_IS_POLLED  GOT_YOU_MORON
 #define IRQ_DISABLE_UNLAZY GOT_YOU_MORON
 #define IRQ_HIDDEN GOT_YOU_MORON
+#define IRQ_NAKED  GOT_YOU_MORON
 #undef IRQF_MODIFY_MASK
 #define IRQF_MODIFY_MASK   GOT_YOU_MORON
 
@@ -174,3 +176,8 @@ static inline bool irq_settings_is_hidden(struct irq_desc 
*desc)
 {
return desc->status_use_accessors & _IRQ_HIDDEN;
 }
+
+static inline bool irq_settings_is_naked(struct irq_desc *desc)
+{
+   return desc->status_use_accessors & _IRQ_NAKED;
+}
-- 
2.28.0

[PATCH 0/2] arm64: Allow the rescheduling IPI to bypass irq_enter/exit

2020-11-01 Thread Marc Zyngier

Vincent recently reported [1] that 5.10-rc1 showed a significant
regression when running "perf bench sched pipe" on arm64, and
pinpointed it to the recent move to handling IPIs as normal
interrupts.

The culprit is the use of irq_enter/irq_exit around the handling of
the rescheduling IPI, meaning that we enter the scheduler right after
the handling of the IPI instead of deferring it to the next preemption
event. This accounts for most of the overhead introduced.

On architectures that have architected IPIs at the CPU level (x86
being the obvious one), the absence of irq_enter/exit is natural. ARM
(both 32 and 64bits) mimicked this behaviour by having some
arch-specific handling for the interrupts that are used to implement
IPIs. Moving IPIs on the normal interrupt path introduced the
regression.

This couple of patches try to acknowledge the fact that some IPIs are
"special", in the sense that they don't need to follow the standard
interrupt handling flow.

The good news is that it cures the regression on arm64, and could
be similarly beneficial to both 32bit ARM, MIPS, or any other
architecture that uses a unique IRQ to represent the scheduler IPI.

The bad news is that these patches are ugly as sin, and I really don't
like them. I specially hate that they can give driver authors the idea
that they can make random interrupts "faster".

Comments, suggestions and hate mails appreciated, as always.

[1]
https://lore.kernel.org/r/cakftptdjppri5gt6klefp_b_zjuz5dyxeqtj+0vkohu-y9b...@mail.gmail.com

Marc Zyngier (2):
genirq: Allow an interrupt to be marked as 'naked'
arm64: Mark the recheduling IPI as naked interrupt

--
2.28.0

[PATCH 2/2] arm64: Mark the recheduling IPI as naked interrupt

2020-11-01 Thread Marc Zyngier

Flag the rescheduling IPI as 'naked', making sure such interrupt
doesn't trigger a rescheduling event by itself.

Signed-off-by: Marc Zyngier 
---
 arch/arm64/kernel/smp.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 82e75fc2c903..6c11be3e74d3 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -993,6 +993,10 @@ void __init set_smp_ipi_range(int ipi_base, int n)
 
ipi_desc[i] = irq_to_desc(ipi_base + i);
irq_set_status_flags(ipi_base + i, IRQ_HIDDEN);
+
+   /* The recheduling IPI is special... */
+   if (i == IPI_RESCHEDULE)
+   irq_set_status_flags(ipi_base + i, IRQ_NAKED);
}
 
ipi_irq_base = ipi_base;
-- 
2.28.0

[GIT PULL] irqchip updates for 5.10, take #1

2020-11-01 Thread Marc Zyngier

Hi Thomas,

Here's a smallish set of fixes for 5.10. Some fixes after the
IPI-as-IRQ (I expect a couple more next week), two significant bug
fixes for the SiFive PLIC, and a TI update to handle their "unmapped
events". The rest is the usual set of cleanups and tidying up.

Please pull,

M.

The following changes since commit 63ea38a402213d8c9c16e58ee4901ff51bc8fe3c:

  Merge branch 'irq/mstar' into irq/irqchip-next (2020-10-10 12:46:54 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git 
tags/irqchip-fixes-5.10-1

for you to fetch changes up to d95bdca75b3fb41bf185efe164e05aed820081a5:

  irqchip/ti-sci-inta: Add support for unmapped event handling (2020-11-01 
12:00:50 +)


irqchip fixes for Linux 5.10, take #1

- A couple of fixes after the IPI as IRQ patches (Kconfig, bcm2836)
- Two SiFive PLIC fixes (irq_set_affinity, hierarchy handling)
- "unmapped events" handling for the ti-sci-inta controller
- Tidying up for the irq-mst driver (static functions, Kconfig)
- Small cleanup in the Renesas irqpin driver
- STM32 exti can now handle LP timer events


Fabrice Gasnier (1):
  irqchip/stm32-exti: Add all LP timer exti direct events support

Geert Uytterhoeven (2):
  irqchip/mst: MST_IRQ should depend on ARCH_MEDIATEK or ARCH_MSTARV7
  irqchip/renesas-intc-irqpin: Merge irlm_bit and needs_irlm

Greentime Hu (2):
  irqchip/sifive-plic: Fix broken irq_set_affinity() callback
  irqchip/sifive-plic: Fix chip_data access within a hierarchy

Marc Zyngier (4):
  genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY
  irqchip/mst: Make mst_intc_of_init static
  irqchip/mips: Drop selection of IRQ_DOMAIN_HIERARCHY
  irqchip/bcm2836: Fix missing __init annotation

Peter Ujfalusi (2):
  dt-bindings: irqchip: ti, sci-inta: Update for unmapped event handling
  irqchip/ti-sci-inta: Add support for unmapped event handling

 .../bindings/interrupt-controller/ti,sci-inta.yaml | 10 +++
 drivers/irqchip/Kconfig|  3 +-
 drivers/irqchip/irq-bcm2836.c  |  2 +-
 drivers/irqchip/irq-mst-intc.c |  4 +-
 drivers/irqchip/irq-renesas-intc-irqpin.c  |  8 +--
 drivers/irqchip/irq-sifive-plic.c  | 10 +--
 drivers/irqchip/irq-stm32-exti.c   |  4 ++
 drivers/irqchip/irq-ti-sci-inta.c  | 83 +-
 kernel/irq/Kconfig |  1 +
 9 files changed, 107 insertions(+), 18 deletions(-)

Re: [RFC PATCH] irqchip/sifive-plic: Fix getting wrong chip_data when interrupt is hierarchy

2020-11-01 Thread Marc Zyngier

On Thu, 29 Oct 2020 10:37:38 +0800, Greentime Hu wrote:
> This oops is caused by a wrong chip_data and it is because plic_irq_unmask
> uses irq_get_chip_data(irq_data->irq) to get the chip_data. However it may
> get another irq_data with the same irq_data->irq if it is hierarchy.
> 
> In this case, it will get irq_data of sifive_gpio_irqchip instead of
> plic_chip so that it will get a wrong chip_data and then the wrong lmask
> of it to cause this oops.
> 
> [...]

Applied to irq/irqchip-next, thanks!

[1/1] irqchip/sifive-plic: Fix chip_data access within a hierarchy
  commit: f9ac7bbd6e4540dcc6df621b9c9b6eb2e26ded1d

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH] irqchip/renesas-intc-irqpin: Merge irlm_bit and needs_irlm

2020-11-01 Thread Marc Zyngier

On Wed, 28 Oct 2020 16:39:55 +0100, Geert Uytterhoeven wrote:
> Get rid of the separate flag to indicate if the IRLM bit is present in
> the INTC/Interrupt Control Register 0, by considering -1 an invalid
> irlm_bit value.

Applied to irq/irqchip-next, thanks!

[1/1] irqchip/renesas-intc-irqpin: Merge irlm_bit and needs_irlm
  commit: b388bdf2bac7aedac9bde5ab63eaf7646f29fc00

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH v3 0/2] irqchip/ti-sci-inta: Support for unmapped events

2020-11-01 Thread Marc Zyngier

On Tue, 20 Oct 2020 10:32:41 +0300, Peter Ujfalusi wrote:
> Changes since v2:
> - Extended the block diagram of INTA in the DT documentation
> - Use less creative variable names for unmapped events in the driver
> - Short comment section to describe the unmapped event handling in driver
> - Use u16 array to store the TI-SCI device identifiers instead of u32
> - Use printk format specifier instead of_node_full_name
> 
> [...]

Applied to irq/irqchip-next, thanks!

[1/2] dt-bindings: irqchip: ti, sci-inta: Update for unmapped event handling
  commit: bb2bd7c7f3d0946acc2104db31df228d10f7b598
[2/2] irqchip/ti-sci-inta: Add support for unmapped event handling
  commit: d95bdca75b3fb41bf185efe164e05aed820081a5

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH] pci: mediatek: fix warning in msi.h

2020-11-01 Thread Marc Zyngier

On Sun, 01 Nov 2020 09:25:04 +,
Frank Wunderlich  wrote:
> 
> Am 31. Oktober 2020 22:49:14 MEZ schrieb Thomas Gleixner :
> 
> >That's not a fix. It's just supressing the warning.
> 
> Ok sorry
> 
> >So it needs to be figured out why the domain association is not there.
> 
> It looks like for mt7623 there is no msi domain setup (done via
> mtk_pcie_setup_irq callback + mtk_pcie_init_irq_domain) in mtk pcie
> driver.

Does this mean that this SoC never handled MSIs the first place? Which
would explain the warning, as there is no MSI domain registered for
the device, and we end-up falling back to arch_setup_msi_irqs().

If this system truly is unable to handle MSIs, one potential
workaround would be to register a PCI-MSI domain that would always
fail its allocation with -ENOSPC. It is really ugly, but would keep
the horror localised. See the patchlet below, which I can't test.

If this situation is more common than we expect, we may need something
in core code instead.

M.

diff --git a/drivers/pci/controller/pcie-mediatek.c 
b/drivers/pci/controller/pcie-mediatek.c
index cf4c18f0c25a..52758b546d40 100644
--- a/drivers/pci/controller/pcie-mediatek.c
+++ b/drivers/pci/controller/pcie-mediatek.c
@@ -151,6 +151,7 @@ struct mtk_pcie_port;
 struct mtk_pcie_soc {
bool need_fix_class_id;
bool need_fix_device_id;
+   bool no_msi;
unsigned int device_id;
struct pci_ops *ops;
int (*startup)(struct mtk_pcie_port *port);
@@ -435,6 +436,9 @@ static int mtk_pcie_irq_domain_alloc(struct irq_domain 
*domain, unsigned int vir
struct mtk_pcie_port *port = domain->host_data;
unsigned long bit;
 
+   if (port->pcie->soc->no_msi)
+   return -ENOSPC;
+
WARN_ON(nr_irqs != 1);
mutex_lock(>lock);
 
@@ -966,11 +970,13 @@ static int mtk_pcie_parse_port(struct mtk_pcie *pcie,
port->slot = slot;
port->pcie = pcie;
 
-   if (pcie->soc->setup_irq) {
+   if (pcie->soc->setup_irq)
err = pcie->soc->setup_irq(port, node);
-   if (err)
-   return err;
-   }
+   else
+   err = mtk_pcie_allocate_msi_domains(port);
+
+   if (err)
+   return err;
 
INIT_LIST_HEAD(>list);
list_add_tail(>list, >ports);
@@ -1173,6 +1179,7 @@ static const struct dev_pm_ops mtk_pcie_pm_ops = {
 };
 
 static const struct mtk_pcie_soc mtk_pcie_soc_v1 = {
+   .no_msi = true,
.ops = _pcie_ops,
.startup = mtk_pcie_startup_port,
 };

-- 
Without deviation from the norm, progress is not possible.

Re: Using fixed LPI number for some Device ID

2020-10-31 Thread Marc Zyngier

On Sat, 31 Oct 2020 03:10:24 +,
Dongjiu Geng  wrote:

[...]

>   Sorry for the noise, Because Marc rarely uses the ARM email address,
>   so I replace to use Marc's kernel.org address instead of ARM email address.

Rarely is quite the understatement. I left ARM over a year ago, so the
likelihood of me answering at this address in vanishingly small.

Maybe in a parallel universe? ;-)

M.

-- 
Without deviation from the norm, progress is not possible.

Re: Using fixed LPI number for some Device ID

2020-10-31 Thread Marc Zyngier

Dongjiu,

On Sat, 31 Oct 2020 02:19:19 +,
Dongjiu Geng  wrote:
> 
> Hi Marc,
> Sorry to disturb you, Currently the LPI number is not fixed for the
> device. The LPI number is dynamically allocated start from 8092.
> For two OS which shares the ITS, One OS needs to configure the
> device interrupt required by another OS, and the other OS uses a
> fixed interrupt ID to respond the interrupt. Therefore, the LPI IRQ
> number of the device needed be fixed. I want to upstream this
> feature that allocate fixed LPI number for the device that is
> specified through the DTS. What is your meaning?  Thanks

I think you are starting from the wrong premises.

You can't "share" an ITS directly between two operating systems. The
ITS can only be controlled by a single operating system, because its
function goes way beyond allocating an LPI. How would you deal with
simple things such as masking an interrupt, which requires:

- Access to memory (configuration table)
- Access to the command queue (to insert an invalidation command)
- Access to MMIO registers (to kick the command queue into action)

all of which needs to be exclusive of concurrent modifications. How do
you propose this is implemented in a safe manner by two operating
systems which, by nature, distrust each other? Allocating LPIs is the
least of your problems, really.

If you need two concurrent OSs taking interrupts, use virtualisation.
That is its purpose. On your HW, you'll even get direct injection.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH] KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT

2020-10-29 Thread Marc Zyngier

On Mon, 26 Oct 2020 14:44:23 +, Will Deacon wrote:
> For consistency with the rest of the stage-2 page-table page allocations
> (performing using a kvm_mmu_memory_cache), ensure that __GFP_ACCOUNT is
> included in the GFP flags for the PGD pages.

Applied to next, thanks!

[1/1] KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT
  commit: 7efe8ef274024ef1d5c495c79dfcbbff38c5f366

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH v2 0/1] KVM: arm64: fix the mmio faulting

2020-10-29 Thread Marc Zyngier

On Mon, 26 Oct 2020 16:54:06 +0530, Santosh Shukla wrote:
> Description of the Reproducer scenario as asked in the thread [1].
> 
> Tried to create the reproducer scenario with vfio-pci driver using
> nvidia GPU in PT mode, As because vfio-pci driver now supports
> vma faulting (/vfio_pci_mmap_fault) so could create a crude reproducer
> situation with that.
> 
> [...]

Applied to next, thanks!

[1/1] KVM: arm64: Force PTE mapping on fault resulting in a device mapping
  commit: 91a2c34b7d6fadc9c5d9433c620ea4c32ee7cae8

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH] KVM: arm64: Fix masks in stage2_pte_cacheable()

2020-10-29 Thread Marc Zyngier

On Thu, 29 Oct 2020 14:47:16 +, Will Deacon wrote:
> stage2_pte_cacheable() tries to figure out whether the mapping installed
> in its 'pte' parameter is cacheable or not. Unfortunately, it fails
> miserably because it extracts the memory attributes from the entry using
> FIELD_GET(), which returns the attributes shifted down to bit 0, but then
> compares this with the unshifted value generated by the PAGE_S2_MEMATTR()
> macro.
> 
> [...]

Applied to next, thanks!

[1/1] KVM: arm64: Fix masks in stage2_pte_cacheable()
  commit: e2fc6a9f686d037cbd9b08b9fb657685b4a722d3

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH v2] KVM: arm64: Failback on unsupported huge page sizes

2020-10-29 Thread Marc Zyngier

On Mon, 26 Oct 2020 10:06:26 +1100, Gavin Shan wrote:
> The huge page could be mapped through multiple contiguous PMDs or PTEs.
> The corresponding huge page sizes aren't supported by the page table
> walker currently.
> 
> This fails the unsupported huge page sizes to the near one. Otherwise,
> the guest can't boot successfully: CONT_PMD_SHIFT and CONT_PTE_SHIFT
> fail back to PMD_SHIFT and PAGE_SHIFT separately.

Applied to next, thanks!

[1/1] KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes
  commit: 2f40c46021bbb3ecd5c5f05764ecccbc276bc690

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH v3] KVM: arm64: ARM_SMCCC_ARCH_WORKAROUND_1 doesn't return SMCCC_RET_NOT_REQUIRED

2020-10-28 Thread Marc Zyngier


On 2020-10-26 13:25, Will Deacon wrote:

On Fri, Oct 23, 2020 at 08:47:50AM -0700, Stephen Boyd wrote:

According to the SMCCC spec[1](7.5.2 Discovery) the
ARM_SMCCC_ARCH_WORKAROUND_1 function id only returns 0, 1, and
SMCCC_RET_NOT_SUPPORTED.

 0 is "workaround required and safe to call this function"
 1 is "workaround not required but safe to call this function"
 SMCCC_RET_NOT_SUPPORTED is "might be vulnerable or might not be, who 
knows, I give up!"


SMCCC_RET_NOT_SUPPORTED might as well mean "workaround required, 
except
calling this function may not work because it isn't implemented in 
some

cases". Wonderful. We map this SMC call to

 0 is SPECTRE_MITIGATED
 1 is SPECTRE_UNAFFECTED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

For KVM hypercalls (hvc), we've implemented this function id to return
SMCCC_RET_NOT_SUPPORTED, 0, and SMCCC_RET_NOT_REQUIRED. One of those
isn't supposed to be there. Per the code we call
arm64_get_spectre_v2_state() to figure out what to return for this
feature discovery call.

 0 is SPECTRE_MITIGATED
 SMCCC_RET_NOT_REQUIRED is SPECTRE_UNAFFECTED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

Let's clean this up so that KVM tells the guest this mapping:

 0 is SPECTRE_MITIGATED
 1 is SPECTRE_UNAFFECTED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

Note: SMCCC_RET_NOT_AFFECTED is 1 but isn't part of the SMCCC spec

Cc: Andre Przywara 
Cc: Steven Price 
Cc: Marc Zyngier 
Cc: sta...@vger.kernel.org
Link: https://developer.arm.com/documentation/den0028/latest [1]
Fixes: c118bbb52743 ("arm64: KVM: Propagate full Spectre v2 workaround 
state to KVM guests")

Signed-off-by: Stephen Boyd 
---

I see that before commit c118bbb52743 ("arm64: KVM: Propagate full
Spectre v2 workaround state to KVM guests") we had this mapping:

 0 is SPECTRE_MITIGATED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

so the return value '1' wasn't there then. Once the commit was merged 
we

introduced the notion of NOT_REQUIRED here when it shouldn't have been
introduced.

Changes from v2:
 * Moved define to header file and used it

Changes from v1:
 * Way longer commit text, more background (sorry)
 * Dropped proton-pack part because it was wrong
 * Rebased onto other patch accepted upstream

 arch/arm64/kernel/proton-pack.c | 2 --
 arch/arm64/kvm/hypercalls.c | 2 +-
 include/linux/arm-smccc.h   | 2 ++
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/proton-pack.c 
b/arch/arm64/kernel/proton-pack.c

index 25f3c80b5ffe..c18eb7d41274 100644
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -135,8 +135,6 @@ static enum mitigation_state 
spectre_v2_get_cpu_hw_mitigation_state(void)

return SPECTRE_VULNERABLE;
 }

-#define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED   (1)
-
 static enum mitigation_state 
spectre_v2_get_cpu_fw_mitigation_state(void)

 {
int ret;
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 9824025ccc5c..25ea4ecb6449 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -31,7 +31,7 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
val = SMCCC_RET_SUCCESS;
break;
case SPECTRE_UNAFFECTED:
-   val = SMCCC_RET_NOT_REQUIRED;
+   val = SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED;
break;
}
break;
diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index 15c706fb0a37..0e50ba3e88d7 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -86,6 +86,8 @@
   ARM_SMCCC_SMC_32,\
   0, 0x7fff)

+#define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED   1


I thought we'd stick this in asm/spectre.h, but here is also good:

Acked-by: Will Deacon 


Acked-by: Marc Zyngier 

Will, if you're about to send fixes to Linus, can you please pick
this one up?

M.
--
Jazz is not dead. It just smells funny...

Re: [RFC PATCH kernel 1/2] irq: Add reference counting to IRQ mappings

2020-10-27 Thread Marc Zyngier


Hi Alexey,

On 2020-10-27 09:06, Alexey Kardashevskiy wrote:
PCI devices share 4 legacy INTx interrupts from the same PCI host 
bridge.

Device drivers map/unmap hardware interrupts via irq_create_mapping()/
irq_dispose_mapping(). The problem with that these interrupts are
shared and when performing hot unplug, we need to unmap the interrupt
only when the last device is released.

This reuses already existing irq_desc::kobj for this purpose.
The refcounter is naturally 1 when the descriptor is allocated already;
this adds kobject_get() in places where already existing mapped virq
is returned.


That's quite interesting, as I was about to revive a patch series that
rework the irqdomain subsystem to directly cache irq_desc instead of
raw interrupt numbers. And for that, I needed some form of 
refcounting...




This reorganizes irq_dispose_mapping() to release the kobj and let
the release callback do the cleanup.

If some driver or platform does its own reference counting, this 
expects

those parties to call irq_find_mapping() and call irq_dispose_mapping()
for every irq_create_fwspec_mapping()/irq_create_mapping().

Signed-off-by: Alexey Kardashevskiy 
---
 kernel/irq/irqdesc.c   | 35 +++
 kernel/irq/irqdomain.c | 27 +--
 2 files changed, 36 insertions(+), 26 deletions(-)

diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 1a7723604399..dae096238500 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -419,20 +419,39 @@ static struct irq_desc *alloc_desc(int irq, int
node, unsigned int flags,
return NULL;
 }

+static void delayed_free_desc(struct rcu_head *rhp);
 static void irq_kobj_release(struct kobject *kobj)
 {
struct irq_desc *desc = container_of(kobj, struct irq_desc, kobj);
+   struct irq_domain *domain;
+   unsigned int virq = desc->irq_data.irq;

-   free_masks(desc);
-   free_percpu(desc->kstat_irqs);
-   kfree(desc);
+   domain = desc->irq_data.domain;
+   if (domain) {
+   if (irq_domain_is_hierarchy(domain)) {
+   irq_domain_free_irqs(virq, 1);


How does this work with hierarchical domains? Each domain should
contribute as a reference on the irq_desc. But if you got here,
it means the refcount has already dropped to 0.

So either there is nothing to free here, or you don't track the
references implied by the hierarchy. I suspect the latter.


+   } else {
+   irq_domain_disassociate(domain, virq);
+   irq_free_desc(virq);
+   }
+   }
+
+   /*
+* We free the descriptor, masks and stat fields via RCU. That
+* allows demultiplex interrupts to do rcu based management of
+* the child interrupts.
+* This also allows us to use rcu in kstat_irqs_usr().
+*/
+   call_rcu(>rcu, delayed_free_desc);
 }

 static void delayed_free_desc(struct rcu_head *rhp)
 {
struct irq_desc *desc = container_of(rhp, struct irq_desc, rcu);

-   kobject_put(>kobj);
+   free_masks(desc);
+   free_percpu(desc->kstat_irqs);
+   kfree(desc);
 }

 static void free_desc(unsigned int irq)
@@ -453,14 +472,6 @@ static void free_desc(unsigned int irq)
 */
irq_sysfs_del(desc);
delete_irq_desc(irq);
-
-   /*
-* We free the descriptor, masks and stat fields via RCU. That
-* allows demultiplex interrupts to do rcu based management of
-* the child interrupts.
-* This also allows us to use rcu in kstat_irqs_usr().
-*/
-   call_rcu(>rcu, delayed_free_desc);
 }

 static int alloc_descs(unsigned int start, unsigned int cnt, int node,
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index cf8b374b892d..02733ddc321f 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -638,6 +638,7 @@ unsigned int irq_create_mapping(struct irq_domain 
*domain,

 {
struct device_node *of_node;
int virq;
+   struct irq_desc *desc;

pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq);

@@ -655,7 +656,9 @@ unsigned int irq_create_mapping(struct irq_domain 
*domain,

/* Check if mapping already exists */
virq = irq_find_mapping(domain, hwirq);
if (virq) {
+   desc = irq_to_desc(virq);
pr_debug("-> existing mapping on virq %d\n", virq);
+   kobject_get(>kobj);


My worry with this is that there is probably a significant amount of
code out there that relies on multiple calls to irq_create_mapping()
with the same parameters not to have any side effects. They would
expect a subsequent irq_dispose_mapping() to drop the translation
altogether, and that's obviously not the case here.

Have you audited the various call sites to see what could break?


return virq;
}

@@ -751,6 +754,7 @@ unsigned int irq_create_fwspec_mapping(struct
irq_fwspec *fwspec)

Re: [PATCH v3 03/16] arm64: Allow IPIs to be handled as normal interrupts

2020-10-27 Thread Marc Zyngier


On 2020-10-27 11:21, Vincent Guittot wrote:

On Tue, 27 Oct 2020 at 11:50, Vincent Guittot
 wrote:


On Tue, 27 Oct 2020 at 11:37, Marc Zyngier  wrote:
>
> On 2020-10-27 10:12, Vincent Guittot wrote:
> > HI Marc,
> >
> > On Mon, 19 Oct 2020 at 17:43, Vincent Guittot
> >  wrote:
> >>
> >> On Mon, 19 Oct 2020 at 15:04, Marc Zyngier  wrote:
> >> >
> >
> > ...
> >
> >> > >>
> >> > >> One of the major difference is that we end up, in some cases
> >> > >> (such as when performing IRQ time accounting on the scheduler
> >> > >> IPI), end up with nested irq_enter()/irq_exit() pairs.
> >> > >> Other than the (relatively small) overhead, there should be
> >> > >> no consequences to it (these pairs are designed to nest
> >> > >> correctly, and the accounting shouldn't be off).
> >> > >
> >> > > While rebasing on mainline, I have faced a performance regression for
> >> > > the benchmark:
> >> > > perf bench sched pipe
> >> > > on my arm64 dual quad core (hikey) and my 2 nodes x 112 CPUS (thx2)
> >> > >
> >> > > The regression comes from:
> >> > > commit: d3afc7f12987 ("arm64: Allow IPIs to be handled as normal
> >> > > interrupts")
> >> >
> >> > That's interesting, as this patch doesn't really change anything (most
> >> > of the potential overhead comes in later). The only potential overhead
> >> > I can see is that the scheduler_ipi() call is now wrapped around
> >> > irq_enter()/irq_exit().
> >> >
> >> > >
> >> > >   v5.9  + this patch
> >> > > hikey :   48818(+/- 0.31)   37503(+/- 0.15%)  -23.2%
> >> > > thx2  :  132410(+/- 1.72)  122646(+/- 1.92%)   -7.4%
> >> > >
> >> > > By + this patch,  I mean merging branch from this patch. Whereas
> >> > > merging the previous:
> >> > > commit: 83cfac95c018 ("genirq: Allow interrupts to be excluded from
> >> > > /proc/interrupts")
> >> > >  It doesn't show any regression
> >> >
> >> > Since you are running perf, can you spot where the overhead occurs?
> >
> > Any idea about the root cause of the regression ?
> > I have faced it on more arm64 platforms in the meantime
>
> two possible causes:
>
> (1) irq_enter/exit on the rescheduling IPI means we reschedule much more
> often
> (2) irq_domain lookups add some overhead.
>
> For (1), I have this series[1] which is ugly as sin and needs much more
> testing.

Ok, I'm going to test this series to see if it fixes the perf 
regression


You have spotted the root cause of the regression. We are back to ~1%
performance diff on the hikey


Yeah. Only thing is that I can't look at this hack without vomiting...

M.
--
Jazz is not dead. It just smells funny...

Re: [v2 01/11] irqchip: ls-extirq: Add LS1043A, LS1088A external interrupt

2020-10-27 Thread Marc Zyngier

On 2020-10-27 10:35, Biwen Li (OSS) wrote:

On 2020-10-27 04:46, Biwen Li wrote:
> From: Hou Zhiqiang 
>
> Add an new IRQ chip declaration for LS1043A and LS1088A
> - compatible "fsl,ls1043a-extirq" for LS1043A, LS1046A.
> SCFG_INTPCR[31:0]
>   of these SoCs is stored/read as SCFG_INTPCR[0:31] defaultly(bit
>   reverse)
> - compatible "fsl,ls1088a-extirq" for LS1088A, LS208xA, LX216xA
>
> Signed-off-by: Hou Zhiqiang 
> Signed-off-by: Biwen Li 

You clearly couldn't be bothered to read what I wrote in my earlier 
replies. I'm

thus ignoring this series...

Okay, got it.

> ---
> Change in v2:
>- add despcription of bit reverse
>- update copyright
>
>  drivers/irqchip/irq-ls-extirq.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-ls-extirq.c
> b/drivers/irqchip/irq-ls-extirq.c index 4d1179fed77c..9587bc2607fc
> 100644
> --- a/drivers/irqchip/irq-ls-extirq.c
> +++ b/drivers/irqchip/irq-ls-extirq.c
> @@ -1,5 +1,8 @@
>  // SPDX-License-Identifier: GPL-2.0
> -
> +/*
> + * Author: Rasmus Villemoes 
> + * Copyright 2020 NXP

... specially when you keep attributing someone else's copyright to 
NXP.

Then I don't know how to add the copyright, any suggestions?

Simple. You don't add anything. NXP's copyright doesn't apply to this
file before this patch, and your changes are so trivial that they don't
really warrant a mention. Furthermore, the git history already keeps 
track

of who did what.

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v3 03/16] arm64: Allow IPIs to be handled as normal interrupts

2020-10-27 Thread Marc Zyngier


On 2020-10-27 10:12, Vincent Guittot wrote:

HI Marc,

On Mon, 19 Oct 2020 at 17:43, Vincent Guittot
 wrote:


On Mon, 19 Oct 2020 at 15:04, Marc Zyngier  wrote:
>


...


> >>
> >> One of the major difference is that we end up, in some cases
> >> (such as when performing IRQ time accounting on the scheduler
> >> IPI), end up with nested irq_enter()/irq_exit() pairs.
> >> Other than the (relatively small) overhead, there should be
> >> no consequences to it (these pairs are designed to nest
> >> correctly, and the accounting shouldn't be off).
> >
> > While rebasing on mainline, I have faced a performance regression for
> > the benchmark:
> > perf bench sched pipe
> > on my arm64 dual quad core (hikey) and my 2 nodes x 112 CPUS (thx2)
> >
> > The regression comes from:
> > commit: d3afc7f12987 ("arm64: Allow IPIs to be handled as normal
> > interrupts")
>
> That's interesting, as this patch doesn't really change anything (most
> of the potential overhead comes in later). The only potential overhead
> I can see is that the scheduler_ipi() call is now wrapped around
> irq_enter()/irq_exit().
>
> >
> >   v5.9  + this patch
> > hikey :   48818(+/- 0.31)   37503(+/- 0.15%)  -23.2%
> > thx2  :  132410(+/- 1.72)  122646(+/- 1.92%)   -7.4%
> >
> > By + this patch,  I mean merging branch from this patch. Whereas
> > merging the previous:
> > commit: 83cfac95c018 ("genirq: Allow interrupts to be excluded from
> > /proc/interrupts")
> >  It doesn't show any regression
>
> Since you are running perf, can you spot where the overhead occurs?


Any idea about the root cause of the regression ?
I have faced it on more arm64 platforms in the meantime


two possible causes:

(1) irq_enter/exit on the rescheduling IPI means we reschedule much more 
often

(2) irq_domain lookups add some overhead.

For (1), I have this series[1] which is ugly as sin and needs much more 
testing.


For (2), I have some ideas which need more work (let the irq domain 
resolve to
an irq_desc instead of an interrupt number, avoiding another radix-tree 
lookup).


M.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/ipi-fixes

--
Jazz is not dead. It just smells funny...

Re: [v2 01/11] irqchip: ls-extirq: Add LS1043A, LS1088A external interrupt

2020-10-27 Thread Marc Zyngier


On 2020-10-27 04:46, Biwen Li wrote:

From: Hou Zhiqiang 

Add an new IRQ chip declaration for LS1043A and LS1088A
- compatible "fsl,ls1043a-extirq" for LS1043A, LS1046A. 
SCFG_INTPCR[31:0]

  of these SoCs is stored/read as SCFG_INTPCR[0:31] defaultly(bit
  reverse)
- compatible "fsl,ls1088a-extirq" for LS1088A, LS208xA, LX216xA

Signed-off-by: Hou Zhiqiang 
Signed-off-by: Biwen Li 


You clearly couldn't be bothered to read what I wrote in my earlier
replies. I'm thus ignoring this series...


---
Change in v2:
- add despcription of bit reverse
- update copyright

 drivers/irqchip/irq-ls-extirq.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-ls-extirq.c 
b/drivers/irqchip/irq-ls-extirq.c

index 4d1179fed77c..9587bc2607fc 100644
--- a/drivers/irqchip/irq-ls-extirq.c
+++ b/drivers/irqchip/irq-ls-extirq.c
@@ -1,5 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
-
+/*
+ * Author: Rasmus Villemoes 
+ * Copyright 2020 NXP


... specially when you keep attributing someone else's copyright to NXP.

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH 2/2] irqchip: bcm2836: fix section mismatch warning

2020-10-27 Thread Marc Zyngier


On 2020-10-27 08:51, ba...@kernel.org wrote:

From: Felipe Balbi 

Fix the following warning:

WARNING: modpost: vmlinux.o(.text.unlikely+0x17b2c): Section mismatch
in reference from the function bcm2836_arm_irqchip_smp_init() to the
function .init.text:set_smp_ipi_range()
The function bcm2836_arm_irqchip_smp_init() references
the function __init set_smp_ipi_range().
This is often because bcm2836_arm_irqchip_smp_init lacks a __init
annotation or the annotation of set_smp_ipi_range is wrong.

Signed-off-by: Felipe Balbi 
---
 drivers/irqchip/irq-bcm2836.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-bcm2836.c 
b/drivers/irqchip/irq-bcm2836.c

index 97838eb705f9..cbc7c740e4dc 100644
--- a/drivers/irqchip/irq-bcm2836.c
+++ b/drivers/irqchip/irq-bcm2836.c
@@ -244,7 +244,7 @@ static int bcm2836_cpu_dying(unsigned int cpu)

 #define BITS_PER_MBOX  32

-static void bcm2836_arm_irqchip_smp_init(void)
+static void __init bcm2836_arm_irqchip_smp_init(void)
 {
struct irq_fwspec ipi_fwspec = {
.fwnode = intc.domain->fwnode,


I already have a fix for this one[1], which should be in -next.

Thanks,

M.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=irq/irqchip-next=57733e009f0c7e0526e10a18be12f56996c5460e

--
Jazz is not dead. It just smells funny...

Re: [PATCH AUTOSEL 5.9 087/147] genirq: Add stub for set_handle_irq() when !GENERIC_IRQ_MULTI_HANDLER

2020-10-27 Thread Marc Zyngier


On 2020-10-26 23:48, Sasha Levin wrote:

From: Zhen Lei 

[ Upstream commit ea0c80d1764449acf2f70fdb25aec33800cd0348 ]

In order to avoid compilation errors when a driver references 
set_handle_irq(),

but that the architecture doesn't select GENERIC_IRQ_MULTI_HANDLER,
add a stub function that will just WARN_ON_ONCE() if ever used.

Signed-off-by: Zhen Lei 
[maz: commit message]
Signed-off-by: Marc Zyngier 
Link: 
https://lore.kernel.org/r/20200924071754.4509-2-thunder.leiz...@huawei.com

Signed-off-by: Sasha Levin 
---
 include/linux/irq.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 1b7f4dfee35b3..b167baef88c0b 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -1252,6 +1252,12 @@ int __init set_handle_irq(void
(*handle_irq)(struct pt_regs *));
  * top-level IRQ handler.
  */
 extern void (*handle_arch_irq)(struct pt_regs *) __ro_after_init;
+#else
+#define set_handle_irq(handle_irq) \
+   do {\
+   (void)handle_irq;   \
+   WARN_ON(1); \
+   } while (0)
 #endif

 #endif /* _LINUX_IRQ_H */


What is the reason for this backport? The only user is a driver that
isn't getting backported (d59f7d159891 and following patches).

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH 0/2] irq-meson-gpio: make it possible to build as a module

2020-10-26 Thread Marc Zyngier


On 2020-10-26 17:28, Kevin Hilman wrote:

Marc Zyngier  writes:


On 2020-10-26 16:18, Kevin Hilman wrote:

Marc Zyngier  writes:


On Tue, 20 Oct 2020 08:25:30 +0100,
Neil Armstrong  wrote:


In order to reduce the kernel Image size on multi-platform
distributions,
make it possible to build the Amlogic GPIO IRQ controller as a 
module

by switching it to a platform driver.

The second patch removes MESON_IRQ_GPIO selection from ARCH_MESON 
to

allow
building the driver as module.

Neil Armstrong (2):
  irqchip: irq-meson-gpio: make it possible to build as a module
  arm64: meson: remove MESON_IRQ_GPIO selection

 arch/arm64/Kconfig.platforms |  1 -
 drivers/irqchip/Kconfig  |  5 +-
 drivers/irqchip/irq-meson-gpio.c | 89

 3 files changed, 59 insertions(+), 36 deletions(-)


I've tried this series on my vim3l with the this driver compiled as 
a

module, and lost the Ethernet interface in the process, as the phy
wasn't able to resolve its interrupt and things fail later on:

[   72.238291] meson8b-dwmac ff3f.ethernet eth1: no phy at addr 
-1

[   72.238917] meson8b-dwmac ff3f.ethernet eth1: stmmac_open:
Cannot attach to PHY (error: -19)

This is a generic problem with making DT-based interrupt controllers
modular when not *all* the drivers can deal with probing deferral.


Yes, but this series still keeps the default as built-in.

If you build as a module, and you add `fw_devlink=on` to the kernel
command-line, device-links will be created based on DT dependencies
which will ensure the right module load order.


It doesn't work here. I get the exact same error (well, with eth0
instead
of eth1). In my experience, fw_devlink isn't reliable yet. Config on
request.


Other than CONFIG_MESON_IRQ_GPIO=m, are you using default upstream
defconfig?


I use something that is much closer to a Debian configuration, given 
that

the same kernel as to run on *all* the systems I have access to.


I just double-checked with this series on top of v5.10-rc1, and I have
eth0 working with interrupts without needing fw_devlink=on.

With the default upstream defconfig all the networking for these 
devices

is already configured as modules.


dmesg: http://www.loen.fr/tmp/dmesg
config: http://www.loen.fr/tmp/Config.full-arm64

root@tiger-roach:~# lsmod
Module  Size  Used by
macvtap16384  0
macvlan32768  1 macvtap
tap32768  1 macvtap
nls_ascii  16384  1
nls_cp437  20480  1
vfat   28672  1
fat81920  1 vfat
aes_ce_blk 36864  0
crypto_simd24576  1 aes_ce_blk
cryptd 28672  1 crypto_simd
aes_ce_cipher  20480  1 aes_ce_blk
ghash_ce   24576  0
gf128mul   16384  1 ghash_ce
sha2_ce20480  0
sha256_arm64   28672  1 sha2_ce
sha1_ce20480  0
panfrost   69632  0
gpu_sched  45056  1 panfrost
meson_saradc   24576  0
industrialio   90112  1 meson_saradc
irq_meson_gpio 20480  0
pwm_meson  20480  1
meson_dw_hdmi  24576  0
meson_drm  61440  1 meson_dw_hdmi
meson_canvas   16384  1 meson_drm
dw_hdmi53248  1 meson_dw_hdmi
cec57344  1 dw_hdmi
drm_kms_helper258048  4 meson_dw_hdmi,meson_drm,dw_hdmi
meson_rng  16384  0
rng_core   24576  1 meson_rng
cpufreq_dt 20480  0
leds_gpio  16384  0
drm   606208  7 
gpu_sched,meson_dw_hdmi,meson_drm,drm_kms_helper,dw_hdmi,panfrost

ip_tables  32768  0
x_tables   45056  1 ip_tables
autofs449152  2
xhci_plat_hcd  20480  0
dwc2  249856  0
dwc3  151552  0
ulpi   20480  1 dwc3
udc_core   69632  2 dwc2,dwc3
rtc_hym856320480  0
meson_gxl  20480  0
realtek24576  0
dwmac_generic  16384  0
dwc3_meson_g12a24576  0
meson_gx_mmc   24576  0
xhci_pci   24576  0
igb   237568  0
xhci_hcd  290816  2 xhci_pci,xhci_plat_hcd
i2c_meson  20480  0
mdio_mux_meson_g12a16384  0
mdio_mux   16384  1 mdio_mux_meson_g12a
nvme   45056  2
nvme_core 110592  4 nvme
i2c_algo_bit   16384  1 igb
t10_pi 16384  1 nvme_core
usbcore   311296  4 xhci_hcd,dwc2,xhci_pci,xhci_plat_hcd
dwmac_meson8b  16384  0
stmmac_platform24576  2 dwmac_meson8b,dwmac_generic
stmmac204800  3 
dwmac_meson8b,stmmac_platform,dwmac_generic

pcs_xpcs   20480  1 stmmac
phylink45056  1 stmmac
of_mdio20480  4 stmmac_platform,mdio_mux,stmmac,phylink
fixed_phy  16384  1 of_mdio
pwm_regulator  20480  1
libphy

Re: [PATCH 0/2] irq-meson-gpio: make it possible to build as a module

2020-10-26 Thread Marc Zyngier


On 2020-10-26 16:18, Kevin Hilman wrote:

Marc Zyngier  writes:


On Tue, 20 Oct 2020 08:25:30 +0100,
Neil Armstrong  wrote:


In order to reduce the kernel Image size on multi-platform 
distributions,

make it possible to build the Amlogic GPIO IRQ controller as a module
by switching it to a platform driver.

The second patch removes MESON_IRQ_GPIO selection from ARCH_MESON to 
allow

building the driver as module.

Neil Armstrong (2):
  irqchip: irq-meson-gpio: make it possible to build as a module
  arm64: meson: remove MESON_IRQ_GPIO selection

 arch/arm64/Kconfig.platforms |  1 -
 drivers/irqchip/Kconfig  |  5 +-
 drivers/irqchip/irq-meson-gpio.c | 89 


 3 files changed, 59 insertions(+), 36 deletions(-)


I've tried this series on my vim3l with the this driver compiled as a
module, and lost the Ethernet interface in the process, as the phy
wasn't able to resolve its interrupt and things fail later on:

[   72.238291] meson8b-dwmac ff3f.ethernet eth1: no phy at addr -1
[   72.238917] meson8b-dwmac ff3f.ethernet eth1: stmmac_open: 
Cannot attach to PHY (error: -19)


This is a generic problem with making DT-based interrupt controllers
modular when not *all* the drivers can deal with probing deferral.


Yes, but this series still keeps the default as built-in.

If you build as a module, and you add `fw_devlink=on` to the kernel
command-line, device-links will be created based on DT dependencies
which will ensure the right module load order.


It doesn't work here. I get the exact same error (well, with eth0 
instead
of eth1). In my experience, fw_devlink isn't reliable yet. Config on 
request.



I've tested this series with `fw_devlink=on` on several Amlogic
platforms and it works just fine, but since it requires the extra
cmdline option, I think the default should remain built-in.

So, I'd still like to see this series merged so that at least it's an
option to enable this as a module.


I have taken similar patches in 5.9 for other SoC families (qcomm, mtk),
and ended up reverting them in -rc2, because there is simply too much
breakage. Even keeping it as built in changes the init order, which
tons of drivers depend on. I proposed a middle-of-the-road approach
(modules can break, built-in stays the same) which Rob pushed back on.

So either we fix fw_devlink to work for everything and be on by default,
or we keep the current setup.


Also, another reason to make it optional is that not all platforms need
this feature at all, but right now we select it for all Amlogic SoCs.


I understand that, but I don't want another episode of widespread
breakages, and this series definitely breaks things.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [RESEND 01/11] irqchip: ls-extirq: Add LS1043A, LS1088A external interrupt

2020-10-26 Thread Marc Zyngier

On 2020-10-26 15:06, Leo Li wrote:

-Original Message-
From: Marc Zyngier 
Sent: Monday, October 26, 2020 4:23 AM
To: Rasmus Villemoes 
Cc: Biwen Li (OSS) ; shawn...@kernel.org;
robh...@kernel.org; mark.rutl...@arm.com; Leo Li ;
Z.q. Hou ; t...@linutronix.de;
ja...@lakedaemon.net; devicet...@vger.kernel.org; linux-
ker...@vger.kernel.org; Jiafei Pan ; Xiaobo Xie
; linux-arm-ker...@lists.infradead.org; Biwen Li

Subject: Re: [RESEND 01/11] irqchip: ls-extirq: Add LS1043A, LS1088A 
external

interrupt

On 2020-10-26 09:06, Rasmus Villemoes wrote:
> On 26/10/2020 09.44, Marc Zyngier wrote:
>> On 2020-10-26 08:01, Biwen Li wrote:
>>> From: Hou Zhiqiang 
>>>
>>> Add an new IRQ chip declaration for LS1043A and LS1088A
>>> - compatible "fsl,ls1043a-extirq" for LS1043A, LS1046A
>>> - compatible "fsl,ls1088a-extirq" for LS1088A, LS208xA, LX216xA
>>
>> Three things:
>> - This commit message doesn't describe the bit_reverse change
>
> Yeah, please elaborate on that, as the RM for 1043 or 1046 doesn't
> mention anything about bit reversal for the scfg registers - they
> don't seem to have the utter nonsense that is SCFG_SCFGREVCR, but
> perhaps, instead of removing it, that has just become a hard-coded
> part of the IP.
>
> Also, IANAL etc., but
>
>>> +// Copyright 2019-2020 NXP
>
> really? Seems to be a bit of a stretch.
>
> At the very least, cc'ing the original author and only person to ever
> touch that file would have been appreciated.

Huh. Well spotted. That's definitely not on.
NXP people, please talk to your legal department.

We do have an internal policy to require developer adding/updating NXP
copyright on non-trivial changes.  I'm not sure if this change should
be considered trivial, but adding copyright claim on a file without
prior copyright claims could causing confusion like in this case.

The copyright exists implicitly, and doesn't require a copyright claim
in the file itself. Please talk to your legal department.

One
potential solution is to add a more specific description on the NXP
change together with the copyright claim.  But maybe an easier
solution is to add Rasmus your Copyright claim first if you are ok
with it.

That's for Rasmus to decide whether he wants to add such a claim,
given that it exists implicitly. Adding copyright claims for any
odd change you make isn't acceptable either (your changes are already
unambiguously identified in git).

For the time being, I'm not taking any NXP patch carrying additional
copyright update until this is clarified.

M.
--
Jazz is not dead. It just smells funny...

Re: [RESEND 01/11] irqchip: ls-extirq: Add LS1043A, LS1088A external interrupt

2020-10-26 Thread Marc Zyngier


On 2020-10-26 09:06, Rasmus Villemoes wrote:

On 26/10/2020 09.44, Marc Zyngier wrote:

On 2020-10-26 08:01, Biwen Li wrote:

From: Hou Zhiqiang 

Add an new IRQ chip declaration for LS1043A and LS1088A
- compatible "fsl,ls1043a-extirq" for LS1043A, LS1046A
- compatible "fsl,ls1088a-extirq" for LS1088A, LS208xA, LX216xA


Three things:
- This commit message doesn't describe the bit_reverse change


Yeah, please elaborate on that, as the RM for 1043 or 1046 doesn't
mention anything about bit reversal for the scfg registers - they don't
seem to have the utter nonsense that is SCFG_SCFGREVCR, but perhaps,
instead of removing it, that has just become a hard-coded part of the 
IP.


Also, IANAL etc., but


+// Copyright 2019-2020 NXP


really? Seems to be a bit of a stretch.

At the very least, cc'ing the original author and only person to ever
touch that file would have been appreciated.


Huh. Well spotted. That's definitely not on.
NXP people, please talk to your legal department.

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH 3/3] KVM: arm64: Failback on unsupported huge page sizes

2020-10-26 Thread Marc Zyngier


On 2020-10-25 23:04, Gavin Shan wrote:

Hi Marc,

On 10/25/20 9:48 PM, Marc Zyngier wrote:

On Sun, 25 Oct 2020 01:27:39 +0100,
Gavin Shan  wrote:


The huge page could be mapped through multiple contiguous PMDs or 
PTEs.

The corresponding huge page sizes aren't supported by the page table
walker currently.

This fails the unsupported huge page sizes to the near one. 
Otherwise,

the guest can't boot successfully: CONT_PMD_SHIFT and CONT_PTE_SHIFT
fail back to PMD_SHIFT and PAGE_SHIFT separately.

Signed-off-by: Gavin Shan 
---
  arch/arm64/kvm/mmu.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 0f51585adc04..81cbdc368246 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -793,12 +793,20 @@ static int user_mem_abort(struct kvm_vcpu 
*vcpu, phys_addr_t fault_ipa,

vma_shift = PMD_SHIFT;
  #endif
  + if (vma_shift == CONT_PMD_SHIFT)
+   vma_shift = PMD_SHIFT;
+
if (vma_shift == PMD_SHIFT &&
!fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
force_pte = true;
vma_shift = PAGE_SHIFT;
}
  + if (vma_shift == CONT_PTE_SHIFT) {
+   force_pte = true;
+   vma_shift = PAGE_SHIFT;
+   }
+
vma_pagesize = 1UL << vma_shift;
if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
fault_ipa &= ~(vma_pagesize - 1);


Yup, nice catch. However, I think we should take this opportunity to
rationalise the logic here, and catch future discrepancies (should
someone add contiguous PUD or something similarly silly). How about
something like this (untested):



Yeah, I started the work to support contiguous PMDs/PTEs, but I'm not
sure when I can post the patches for review as my time becomes a bit
fragmented recently. At least, I need focus on "async page fault" in
the coming weeks :)

Thanks for the suggested code and it worked for me. I'll post v2 to
integrate them. However, I would like to drop PATCH[1] and PATCH[2]
as I really don't have strong reasons to have them.


Yes, please drop these patches, and focus on the actual bug fix.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH 1/3] KVM: arm64: Check if 52-bits PA is enabled

2020-10-26 Thread Marc Zyngier


On 2020-10-25 22:23, Gavin Shan wrote:

Hi Marc,

On 10/25/20 8:52 PM, Marc Zyngier wrote:

On Sun, 25 Oct 2020 01:27:37 +0100,
Gavin Shan  wrote:


The 52-bits physical address is disabled until 
CONFIG_ARM64_PA_BITS_52
is chosen. This uses option for that check, to avoid the 
unconditional

check on PAGE_SHIFT in the hot path and thus save some CPU cycles.


PAGE_SHIFT is known at compile time, and this code is dropped by the
compiler if the selected page size is not 64K. This patch really only
makes the code slightly less readable and the "CPU cycles" argument
doesn't hold at all.

So what are you trying to solve exactly?



There are two points covered by the patch: (1) The 52-bits physical 
address
is visible only when CONFIG_ARM64_PA_BITS_52 is enabled in arch/arm64 
code.
The code looks consistent with this option used here. (2) I had the 
assumption
that gcc doesn't optimize the code and PAGE_SHIFT is always checked in 
order
to get higher 4 physical address bits, but you said gcc should optimize 
the
code accordingly. However, it would be still nice to make the code 
explicit.


Conditional compilation only results in more breakages, specially for 
configs
that hardly anyone uses (big-endian and 64K pages are the two that 
bitrot very

quickly).

So if anything can build without #ifdef, I'll take that anytime. If the 
compiler

doesn't optimize it away, let's fix the compiler.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [RESEND 01/11] irqchip: ls-extirq: Add LS1043A, LS1088A external interrupt

2020-10-26 Thread Marc Zyngier


On 2020-10-26 08:01, Biwen Li wrote:

From: Hou Zhiqiang 

Add an new IRQ chip declaration for LS1043A and LS1088A
- compatible "fsl,ls1043a-extirq" for LS1043A, LS1046A
- compatible "fsl,ls1088a-extirq" for LS1088A, LS208xA, LX216xA


Three things:
- This commit message doesn't describe the bit_reverse change
- Please add a cover letter
- Sending the same series again after 4 days is not OK, specially when
  the initial one was during the merge window.

Thanks,

M.



Signed-off-by: Hou Zhiqiang 
Signed-off-by: Biwen Li 
---
 drivers/irqchip/irq-ls-extirq.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/irqchip/irq-ls-extirq.c 
b/drivers/irqchip/irq-ls-extirq.c

index 4d1179fed77c..564e6de0bd8e 100644
--- a/drivers/irqchip/irq-ls-extirq.c
+++ b/drivers/irqchip/irq-ls-extirq.c
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+// Copyright 2019-2020 NXP

 #define pr_fmt(fmt) "irq-ls-extirq: " fmt

@@ -183,6 +184,9 @@ ls_extirq_of_init(struct device_node *node, struct
device_node *parent)
priv->bit_reverse = (revcr != 0);
}

+   if (of_device_is_compatible(node, "fsl,ls1043a-extirq"))
+   priv->bit_reverse = true;
+
domain = irq_domain_add_hierarchy(parent_domain, 0, priv->nirq, node,
  _domain_ops, priv);
if (!domain)
@@ -195,3 +199,5 @@ ls_extirq_of_init(struct device_node *node, struct
device_node *parent)
 }

 IRQCHIP_DECLARE(ls1021a_extirq, "fsl,ls1021a-extirq", 
ls_extirq_of_init);
+IRQCHIP_DECLARE(ls1043a_extirq, "fsl,ls1043a-extirq", 
ls_extirq_of_init);
+IRQCHIP_DECLARE(ls1088a_extirq, "fsl,ls1088a-extirq", 
ls_extirq_of_init);


--
Jazz is not dead. It just smells funny...

Re: [PATCH] irqchip/sifive-plic: Fix broken irq_set_affinity() callback

2020-10-25 Thread Marc Zyngier

On Tue, 20 Oct 2020 16:15:32 +0800, Greentime Hu wrote:
> It will always enable the interrupt after calling plic_set_affinity()
> however it should set to it previous setting. Staying disabled or enabled.
> 
> This patch can also fix this pwm hang issue in Unleashed board.
> 
> [  919.015783] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [  919.020922] rcu: 0-...0: (0 ticks this GP)
> idle=7d2/1/0x4002 softirq=1424/1424 fqs=105807
> [  919.030295]  (detected by 1, t=225825 jiffies, g=1561, q=3496)
> [  919.036109] Task dump for CPU 0:
> [  919.039321] kworker/0:1 R  running task030  2 
> 0x0008
> [  919.046359] Workqueue: events set_brightness_delayed
> [  919.051302] Call Trace:
> [  919.053738] [] __schedule+0x194/0x4de
> [  982.035783] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [  982.040923] rcu: 0-...0: (0 ticks this GP)
> idle=7d2/1/0x4002 softirq=1424/1424 fqs=113325
> [  982.050294]  (detected by 1, t=241580 jiffies, g=1561, q=3509)
> [  982.056108] Task dump for CPU 0:
> [  982.059321] kworker/0:1 R  running task030  2 
> 0x0008
> [  982.066359] Workqueue: events set_brightness_delayed
> [  982.071302] Call Trace:
> [  982.073739] [] __schedule+0x194/0x4de
> [..]

Applied to irq/irqchip-next, with some commit message adjustments.

[1/1] irqchip/sifive-plic: Fix broken irq_set_affinity() callback
  commit: a7480c5d725c4ecfc627e70960f249c34f5d13e8

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH 0/3] Add STM32 LP timer EXTI interrupts

2020-10-25 Thread Marc Zyngier

On Fri, 16 Oct 2020 16:40:16 +0200, Fabrice Gasnier wrote:
> STM32 LP timer that's available on STM32MP15x can wakeup the platform
> using EXTI interrupts.
> 
> This series add:
> - LP timer EXTI - GIC interrupt events to EXTI driver and device-tree
> - LP timer wakeup-source to device-tree
> 
> [...]

Applied to irq/irqchip-next, thanks!

[1/3] irqchip/stm32-exti: Add all LP timer exti direct events support
  commit: a00e85b581fd5ee47e770b6b8d2038dbebbe81f9

Please route the last two patches via arm-soc.

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH 0/2] irq-meson-gpio: make it possible to build as a module

2020-10-25 Thread Marc Zyngier

On Tue, 20 Oct 2020 08:25:30 +0100,
Neil Armstrong  wrote:
> 
> In order to reduce the kernel Image size on multi-platform distributions,
> make it possible to build the Amlogic GPIO IRQ controller as a module
> by switching it to a platform driver.
> 
> The second patch removes MESON_IRQ_GPIO selection from ARCH_MESON to allow
> building the driver as module.
> 
> Neil Armstrong (2):
>   irqchip: irq-meson-gpio: make it possible to build as a module
>   arm64: meson: remove MESON_IRQ_GPIO selection
> 
>  arch/arm64/Kconfig.platforms |  1 -
>  drivers/irqchip/Kconfig  |  5 +-
>  drivers/irqchip/irq-meson-gpio.c | 89 
>  3 files changed, 59 insertions(+), 36 deletions(-)

I've tried this series on my vim3l with the this driver compiled as a
module, and lost the Ethernet interface in the process, as the phy
wasn't able to resolve its interrupt and things fail later on:

[   72.238291] meson8b-dwmac ff3f.ethernet eth1: no phy at addr -1
[   72.238917] meson8b-dwmac ff3f.ethernet eth1: stmmac_open: Cannot attach 
to PHY (error: -19)

This is a generic problem with making DT-based interrupt controllers
modular when not *all* the drivers can deal with probing deferral.

M.

-- 
Without deviation from the norm, progress is not possible.

Re: WARNING: modpost: vmlinux.o(.text.unlikely+0x11494): Section mismatch in reference from the function bcm2836_arm_irqchip_smp_init() to the function .init.text:set_smp_ipi_range()

2020-10-25 Thread Marc Zyngier

On Sun, 25 Oct 2020 01:09:21 +0100,
kernel test robot  wrote:
> 
> [1  ]
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> master
> head:   d76913908102044f14381df865bb74df17a538cb
> commit: 0809ae724904c3c5dbdddf4169d48aac9c6fcdc8 irqchip/bcm2836: Configure 
> mailbox interrupts as standard interrupts
> date:   5 weeks ago
> config: arm64-randconfig-p001-20201025 (attached as .config)
> compiler: aarch64-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0809ae724904c3c5dbdddf4169d48aac9c6fcdc8
> git remote add linus 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> git fetch --no-tags linus master
> git checkout 0809ae724904c3c5dbdddf4169d48aac9c6fcdc8
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
> ARCH=arm64 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 

Fix queued, thanks for the report.

M.

-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH 3/3] KVM: arm64: Failback on unsupported huge page sizes

2020-10-25 Thread Marc Zyngier

On Sun, 25 Oct 2020 01:27:39 +0100,
Gavin Shan  wrote:
> 
> The huge page could be mapped through multiple contiguous PMDs or PTEs.
> The corresponding huge page sizes aren't supported by the page table
> walker currently.
> 
> This fails the unsupported huge page sizes to the near one. Otherwise,
> the guest can't boot successfully: CONT_PMD_SHIFT and CONT_PTE_SHIFT
> fail back to PMD_SHIFT and PAGE_SHIFT separately.
> 
> Signed-off-by: Gavin Shan 
> ---
>  arch/arm64/kvm/mmu.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 0f51585adc04..81cbdc368246 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -793,12 +793,20 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>   vma_shift = PMD_SHIFT;
>  #endif
>  
> + if (vma_shift == CONT_PMD_SHIFT)
> + vma_shift = PMD_SHIFT;
> +
>   if (vma_shift == PMD_SHIFT &&
>   !fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
>   force_pte = true;
>   vma_shift = PAGE_SHIFT;
>   }
>  
> + if (vma_shift == CONT_PTE_SHIFT) {
> + force_pte = true;
> + vma_shift = PAGE_SHIFT;
> + }
> +
>   vma_pagesize = 1UL << vma_shift;
>   if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
>   fault_ipa &= ~(vma_pagesize - 1);

Yup, nice catch. However, I think we should take this opportunity to
rationalise the logic here, and catch future discrepancies (should
someone add contiguous PUD or something similarly silly). How about
something like this (untested):

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index cc323d96c9d4..d9a13a8a82e0 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -787,14 +787,31 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
vma_shift = PAGE_SHIFT;
}
 
-   if (vma_shift == PUD_SHIFT &&
-   !fault_supports_stage2_huge_mapping(memslot, hva, PUD_SIZE))
-  vma_shift = PMD_SHIFT;
+   switch (vma_shift) {
+   case PUD_SHIFT:
+   if (fault_supports_stage2_huge_mapping(memslot, hva, PUD_SIZE))
+   break;
+   fallthrough;
 
-   if (vma_shift == PMD_SHIFT &&
-   !fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
-   force_pte = true;
+   case CONT_PMD_SHIFT:
+   vma_shift = PMD_SHIFT;
+   fallthrough;
+
+   case PMD_SHIFT:
+   if (fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE))
+   break;
+   fallthrough;
+
+   case CONT_PTE_SHIFT:
vma_shift = PAGE_SHIFT;
+   force_pte = true;
+   fallthrough;
+
+   case PAGE_SHIFT:
+   break;
+
+   default:
+   WARN_ONCE(1, "Unknown vma_shift %d", vma_shift);
}
 
vma_pagesize = 1UL << vma_shift;


Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH 2/3] KVM: arm64: Don't map PUD huge page if it's not available

2020-10-25 Thread Marc Zyngier

On Sun, 25 Oct 2020 01:27:38 +0100,
Gavin Shan  wrote:
> 
> PUD huge page isn't available when CONFIG_ARM64_4K_PAGES is disabled.
> In this case, we needn't try to map the memory through PUD huge pages
> to save some CPU cycles in the hot path.
> 
> This also corrects the code style issue, which was introduced by
> commit <523b3999e5f6> ("KVM: arm64: Try PMD block mappings if PUD mappings
> are not supported").
> 
> Signed-off-by: Gavin Shan 
> ---
>  arch/arm64/kvm/mmu.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index a816cb8e619b..0f51585adc04 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -787,9 +787,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>   vma_shift = PAGE_SHIFT;
>   }
>  
> +#ifdef CONFIG_ARM64_4K_PAGES
>   if (vma_shift == PUD_SHIFT &&
>   !fault_supports_stage2_huge_mapping(memslot, hva, PUD_SIZE))
> -vma_shift = PMD_SHIFT;
> + vma_shift = PMD_SHIFT;
> +#endif
>  
>   if (vma_shift == PMD_SHIFT &&
>   !fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {


I really don't buy the "CPU cycles" argument here either. Can you
actually measure any difference here?

You have taken a fault, gone through a full guest exit, triaged it,
and are about to into a page mapping operation which may result in a
TLBI, and reenter the guest. It only happen a handful of times per
page over the lifetime of the guest unless you start swapping. Hot
path? I don't think so.

M.

-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH 1/3] KVM: arm64: Check if 52-bits PA is enabled

2020-10-25 Thread Marc Zyngier

On Sun, 25 Oct 2020 01:27:37 +0100,
Gavin Shan  wrote:
> 
> The 52-bits physical address is disabled until CONFIG_ARM64_PA_BITS_52
> is chosen. This uses option for that check, to avoid the unconditional
> check on PAGE_SHIFT in the hot path and thus save some CPU cycles.

PAGE_SHIFT is known at compile time, and this code is dropped by the
compiler if the selected page size is not 64K. This patch really only
makes the code slightly less readable and the "CPU cycles" argument
doesn't hold at all.

So what are you trying to solve exactly?

M.

> 
> Signed-off-by: Gavin Shan 
> ---
>  arch/arm64/kvm/hyp/pgtable.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 0cdf6e461cbd..fd850353ee89 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -132,8 +132,9 @@ static u64 kvm_pte_to_phys(kvm_pte_t pte)
>  {
>   u64 pa = pte & KVM_PTE_ADDR_MASK;
>  
> - if (PAGE_SHIFT == 16)
> - pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
> +#ifdef CONFIG_ARM64_PA_BITS_52
> + pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
> +#endif
>  
>   return pa;
>  }
> @@ -142,8 +143,9 @@ static kvm_pte_t kvm_phys_to_pte(u64 pa)
>  {
>   kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK;
>  
> - if (PAGE_SHIFT == 16)
> - pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
> +#ifdef CONFIG_ARM64_PA_BITS_52
> + pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
> +#endif
>  
>   return pte;
>  }
> -- 
> 2.23.0
> 
> 

-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH v3 22/35] genirq/irqdomain: Implement get_name() method on irqchip fwnodes

2020-10-25 Thread Marc Zyngier

Hi David,

nit: please use my kernel.org address for kernel related stuff.

On Sat, 24 Oct 2020 22:35:22 +0100,
David Woodhouse  wrote:
> 
> From: David Woodhouse 
> 
> Prerequesite to make x86 more irqdomain compliant.

Prerequisite?

> 
> Signed-off-by: David Woodhouse 
> Signed-off-by: Thomas Gleixner 
> ---
>  kernel/irq/irqdomain.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index cf8b374b892d..673fa64c1c44 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -42,7 +42,16 @@ static inline void debugfs_add_domain_dir(struct 
> irq_domain *d) { }
>  static inline void debugfs_remove_domain_dir(struct irq_domain *d) { }
>  #endif
>  
> -const struct fwnode_operations irqchip_fwnode_ops;
> +static const char *irqchip_fwnode_get_name(const struct fwnode_handle 
> *fwnode)
> +{
> + struct irqchip_fwid *fwid = container_of(fwnode, struct irqchip_fwid, 
> fwnode);
> +
> + return fwid->name;
> +}
> +
> +const struct fwnode_operations irqchip_fwnode_ops = {
> + .get_name = irqchip_fwnode_get_name,
> +};
>  EXPORT_SYMBOL_GPL(irqchip_fwnode_ops);
>  
>  /**

Acked-by: Marc Zyngier 

-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH] KVM: arm64: Correctly handle the mmio faulting

2020-10-23 Thread Marc Zyngier


Hi Santosh,

Thanks for this.

On 2020-10-21 17:16, Santosh Shukla wrote:

The Commit:6d674e28 introduces a notion to detect and handle the
device mapping. The commit checks for the VM_PFNMAP flag is set
in vma->flags and if set then marks force_pte to true such that
if force_pte is true then ignore the THP function check
(/transparent_hugepage_adjust()).

There could be an issue with the VM_PFNMAP flag setting and checking.
For example consider a case where the mdev vendor driver register's
the vma_fault handler named vma_mmio_fault(), which maps the
host MMIO region in-turn calls remap_pfn_range() and maps
the MMIO's vma space. Where, remap_pfn_range implicitly sets
the VM_PFNMAP flag into vma->flags.

Now lets assume a mmio fault handing flow where guest first access
the MMIO region whose 2nd stage translation is not present.
So that results to arm64-kvm hypervisor executing guest abort handler,
like below:

kvm_handle_guest_abort() -->
 user_mem_abort()--> {

...
0. checks the vma->flags for the VM_PFNMAP.
1. Since VM_PFNMAP flag is not yet set so force_pte _is_ false;
2. gfn_to_pfn_prot() -->
__gfn_to_pfn_memslot() -->
fixup_user_fault() -->
handle_mm_fault()-->
__do_fault() -->
   vma_mmio_fault() --> // vendor's mdev fault 
handler

remap_pfn_range()--> // Here sets the VM_PFNMAP
flag into vma->flags.
3. Now that force_pte is set to false in step-2),
   will execute transparent_hugepage_adjust() func and
   that lead to Oops [4].
 }


Hmmm. Nice. Any chance you could provide us with an actual reproducer?



The proposition is to check is_iomap flag before executing the THP
function transparent_hugepage_adjust().

[4] THP Oops:

pc: kvm_is_transparent_hugepage+0x18/0xb0
...
...
user_mem_abort+0x340/0x9b8
kvm_handle_guest_abort+0x248/0x468
handle_exit+0x150/0x1b0
kvm_arch_vcpu_ioctl_run+0x4d4/0x778
kvm_vcpu_ioctl+0x3c0/0x858
ksys_ioctl+0x84/0xb8
__arm64_sys_ioctl+0x28/0x38


Tested on Huawei Kunpeng Taishan-200 arm64 server, Using VFIO-mdev 
device.

Linux tip: 583090b1

Fixes: 6d674e28 ("KVM: arm/arm64: Properly handle faulting of device 
mappings")

Signed-off-by: Santosh Shukla 
---
 arch/arm64/kvm/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 3d26b47..ff15357 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1947,7 +1947,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu,
phys_addr_t fault_ipa,
 * If we are not forced to use page mapping, check if we are
 * backed by a THP and thus use block mapping if possible.
 */
-   if (vma_pagesize == PAGE_SIZE && !force_pte)
+   if (vma_pagesize == PAGE_SIZE && !force_pte && !is_iomap(flags))
vma_pagesize = transparent_hugepage_adjust(memslot, hva,
   , _ipa);
if (writable)


Why don't you directly set force_pte to true at the point where we 
update

the flags? It certainly would be a bit more readable:

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 3d26b47a1343..7a4ad984d54e 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1920,6 +1920,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,

if (kvm_is_device_pfn(pfn)) {
mem_type = PAGE_S2_DEVICE;
flags |= KVM_S2PTE_FLAG_IS_IOMAP;
+   force_pte = true;
} else if (logging_active) {
/*
 * Faults on pages in a memslot with logging enabled

and almost directly applies to what we have queued for 5.10.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v5 5/5] arm64: ipi_nmi: Add support for NMI backtrace

2020-10-21 Thread Marc Zyngier


On 2020-10-20 10:13, Sumit Garg wrote:

On Mon, 19 Oct 2020 at 17:50, Marc Zyngier  wrote:


On 2020-10-14 12:12, Sumit Garg wrote:
> Enable NMI backtrace support on arm64 using IPI turned as an NMI
> leveraging pseudo NMIs support. It is now possible for users to get a
> backtrace of a CPU stuck in hard-lockup using magic SYSRQ.
>
> Signed-off-by: Sumit Garg 
> ---
>  arch/arm64/include/asm/irq.h |  6 ++
>  arch/arm64/kernel/ipi_nmi.c  | 12 +++-
>  2 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/irq.h
> b/arch/arm64/include/asm/irq.h
> index b2b0c64..e840bf1 100644
> --- a/arch/arm64/include/asm/irq.h
> +++ b/arch/arm64/include/asm/irq.h
> @@ -6,6 +6,12 @@
>
>  #include 
>
> +#ifdef CONFIG_SMP
> +extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
> +bool exclude_self);
> +#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
> +#endif
> +
>  struct pt_regs;
>
>  static inline int nr_legacy_irqs(void)
> diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> index e0a9e03..e1dc19d 100644
> --- a/arch/arm64/kernel/ipi_nmi.c
> +++ b/arch/arm64/kernel/ipi_nmi.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>
>  #include 
> @@ -25,12 +26,21 @@ void arch_send_call_nmi_func_ipi_mask(cpumask_t
> *mask)
>   __ipi_send_mask(ipi_desc, mask);
>  }
>
> +void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool
> exclude_self)
> +{
> + nmi_trigger_cpumask_backtrace(mask, exclude_self,
> +   arch_send_call_nmi_func_ipi_mask);
> +}
> +
>  static irqreturn_t ipi_nmi_handler(int irq, void *data)
>  {
>   unsigned int cpu = smp_processor_id();
>
> - ipi_kgdb_nmicallback(cpu, get_irq_regs());
> + if (nmi_cpu_backtrace(get_irq_regs()))
> + goto out;
>
> + ipi_kgdb_nmicallback(cpu, get_irq_regs());
> +out:
>   return IRQ_HANDLED;
>  }

Can't you have *both* a request for a backtrace and a KGDB call?
It really shouldn't be either/or. It also outlines how well shared
interrupts work with edge triggered signalling...


Unfortunately, NMIs doesn't seem to support shared mode [1].


You are totally missing the point: shared interrupts *cannot* work
reliably with edge signalling. What I am saying here is that you need
implement the sharing yourself in this function.

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v5 1/5] arm64: Add framework to turn IPI as NMI

2020-10-21 Thread Marc Zyngier

On 2020-10-20 12:22, Sumit Garg wrote:

On Tue, 20 Oct 2020 at 15:38, Marc Zyngier  wrote:

On 2020-10-20 07:43, Sumit Garg wrote:
> On Mon, 19 Oct 2020 at 17:07, Marc Zyngier  wrote:

[...]

>> > +{
>> > + if (!ipi_desc)
>> > + return;
>> > +
>> > + if (is_nmi) {
>> > + if (!prepare_percpu_nmi(ipi_id))
>> > + enable_percpu_nmi(ipi_id, IRQ_TYPE_NONE);
>> > + } else {
>> > + enable_percpu_irq(ipi_id, IRQ_TYPE_NONE);
>>
>> I'm not keen on this. Normal IRQs can't reliably work, so why do you
>> even bother with this?
>
> Yeah I agree but we need to support existing functionality for kgdb
> roundup and sysrq backtrace using normal IRQs as well.

When has this become a requirement? I don't really see the point in
implementing something that is known not to work.

For kgdb:

Default implementation [1] uses smp_call_function_single_async() which
in turn will invoke IPI as a normal IRQ to roundup CPUs.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/debug/debug_core.c#n244

For sysrq backtrace:

Default implementation [2] fallbacks to smp_call_function() (IPI as a
normal IRQ) to print backtrace in case architecture doesn't provide
arch_trigger_cpumask_backtrace() hook.

[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/tty/sysrq.c#n250

So in general, IPI as a normal IRQ is still useful for debugging but
it can't debug a core which is stuck in deadlock with interrupts
disabled.

And that's not something we implement today for good reasons:
it *cannot* work reliably. What changed that we all of a sudden need it?

And since we choose override default implementations for pseudo NMI
support, we need to be backwards compatible for platforms which don't
possess pseudo NMI support.

No. There is nothing to be "backward compatible" with, because
- this isn't a userspace visible feature
- *it doesn't work*

So please drop this non-feature from this series.

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH 1/2] arm64: ARM_SMCCC_ARCH_WORKAROUND_1 doesn't return SMCCC_RET_NOT_REQUIRED

2020-10-21 Thread Marc Zyngier


On 2020-10-21 08:57, Will Deacon wrote:

On Tue, Oct 20, 2020 at 02:45:43PM -0700, Stephen Boyd wrote:

According to the SMCCC spec (7.5.2 Discovery) the
ARM_SMCCC_ARCH_WORKAROUND_1 function id only returns 0, 1, and
SMCCC_RET_NOT_SUPPORTED corresponding to "workaround required",
"workaround not required but implemented", and "who knows, you're on
your own" respectively. For kvm hypercalls (hvc), we've implemented 
this

function id to return SMCCC_RET_NOT_SUPPORTED, 1, and
SMCCC_RET_NOT_REQUIRED. The SMCCC_RET_NOT_REQUIRED return value is not 
a

thing for this function id, and is probably copy/pasted from the
SMCCC_ARCH_WORKAROUND_2 function id that does support it.

Clean this up by returning 0, 1, and SMCCC_RET_NOT_SUPPORTED
appropriately. Changing this exposes the problem that
spectre_v2_get_cpu_fw_mitigation_state() assumes a
SMCCC_RET_NOT_SUPPORTED return value means we are vulnerable, but 
really

it means we have no idea and should assume we can't do anything about
mitigation. Put another way, it better be unaffected because it can't 
be

mitigated in the firmware (in this case kvm) as the call isn't
implemented!

Cc: Andre Przywara 
Cc: Steven Price 
Cc: Marc Zyngier 
Cc: sta...@vger.kernel.org
Fixes: c118bbb52743 ("arm64: KVM: Propagate full Spectre v2 workaround 
state to KVM guests")
Fixes: 73f381660959 ("arm64: Advertise mitigation of Spectre-v2, or 
lack thereof")

Signed-off-by: Stephen Boyd 
---

This will require a slightly different backport to stable kernels, but
at least it looks like this is a problem given that this return value
isn't valid per the spec and we've been going around it by returning
something invalid for some time.

 arch/arm64/kernel/proton-pack.c | 3 +--
 arch/arm64/kvm/hypercalls.c | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/proton-pack.c 
b/arch/arm64/kernel/proton-pack.c

index 68b710f1b43f..00bd54f63f4f 100644
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -149,10 +149,9 @@ static enum mitigation_state 
spectre_v2_get_cpu_fw_mitigation_state(void)

case SMCCC_RET_SUCCESS:
return SPECTRE_MITIGATED;
case SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED:
+	case SMCCC_RET_NOT_SUPPORTED: /* Good luck w/ the Gatekeeper of 
Gozer */

return SPECTRE_UNAFFECTED;


Hmm, I'm not sure this is correct. The SMCCC spec is terrifically
unhelpful:

  NOT_SUPPORTED:
  Either:
  * None of the PEs in the system require firmware mitigation for 
CVE-2017-5715.

  * The system contains at least 1 PE affected by CVE-2017-5715 that
has no firmware
mitigation available.
  * The firmware does not provide any information about whether
firmware mitigation is
required.

so we can't tell whether the thing is vulnerable or not in this case, 
and

have to assume that it is.


default:
-   fallthrough;
-   case SMCCC_RET_NOT_SUPPORTED:
return SPECTRE_VULNERABLE;
}
 }
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 9824025ccc5c..868486957808 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -31,7 +31,7 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
val = SMCCC_RET_SUCCESS;
break;
case SPECTRE_UNAFFECTED:
-   val = SMCCC_RET_NOT_REQUIRED;
+   val = SMCCC_RET_NOT_SUPPORTED;


Which means we need to return SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED 
here, I

suppose?


Gahh, I keep mixing Spectre-v2 and WA2. Not good. I *think* the patch
below is enough, but I need to give it a go...

M.

diff --git a/arch/arm64/kernel/proton-pack.c 
b/arch/arm64/kernel/proton-pack.c

index 68b710f1b43f..3f417d6305ef 100644
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -134,8 +134,6 @@ static enum mitigation_state 
spectre_v2_get_cpu_hw_mitigation_state(void)

return SPECTRE_VULNERABLE;
 }

-#define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED   (1)
-
 static enum mitigation_state 
spectre_v2_get_cpu_fw_mitigation_state(void)

 {
int ret;
@@ -148,7 +146,7 @@ static enum mitigation_state 
spectre_v2_get_cpu_fw_mitigation_state(void)

switch (ret) {
case SMCCC_RET_SUCCESS:
return SPECTRE_MITIGATED;
-   case SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED:
+   case SMCCC_RET_UNAFFECTED:
return SPECTRE_UNAFFECTED;
default:
fallthrough;
@@ -474,7 +472,7 @@ static enum mitigation_state 
spectre_v4_get_cpu_fw_mitigation_state(void)

switch (ret) {
case SMCCC_RET_SUCCESS:
return SPECTRE_MITIGATED;
-   case SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED:
+   case SMCCC_RET_UNAFFECTED:
fallthrough;
case SMCCC_RET_NOT_REQUIRED:

Re: [PATCH v5 1/5] arm64: Add framework to turn IPI as NMI

2020-10-20 Thread Marc Zyngier


On 2020-10-20 13:25, Daniel Thompson wrote:

On Tue, Oct 20, 2020 at 04:52:43PM +0530, Sumit Garg wrote:


[...]


So in general, IPI as a normal IRQ is still useful for debugging but
it can't debug a core which is stuck in deadlock with interrupts
disabled.

And since we choose override default implementations for pseudo NMI
support, we need to be backwards compatible for platforms which don't
possess pseudo NMI support.


Do the fallback implementations require a new IPI? The fallbacks
could rely on existing mechanisms such as the smp_call_function
family.


Indeed. I'd be worried of using that mechanism for NMIs, but normal
IPIs should stick to the normal cross-call stuff.

The jury is still out on why this is a good idea, given that it
doesn't work in the only interesting case (deadlocked CPUs).

 M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v5 1/5] arm64: Add framework to turn IPI as NMI

2020-10-20 Thread Marc Zyngier

On 2020-10-20 07:43, Sumit Garg wrote:

On Mon, 19 Oct 2020 at 17:07, Marc Zyngier  wrote:

[...]

> +{
> + if (!ipi_desc)
> + return;
> +
> + if (is_nmi) {
> + if (!prepare_percpu_nmi(ipi_id))
> + enable_percpu_nmi(ipi_id, IRQ_TYPE_NONE);
> + } else {
> + enable_percpu_irq(ipi_id, IRQ_TYPE_NONE);

I'm not keen on this. Normal IRQs can't reliably work, so why do you
even bother with this?

Yeah I agree but we need to support existing functionality for kgdb
roundup and sysrq backtrace using normal IRQs as well.

When has this become a requirement? I don't really see the point in
implementing something that is known not to work.

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v3 03/16] arm64: Allow IPIs to be handled as normal interrupts

2020-10-19 Thread Marc Zyngier


Hi Vincent,

On 2020-10-19 13:42, Vincent Guittot wrote:

Hi Marc,

On Tue, 1 Sep 2020 at 16:44, Marc Zyngier  wrote:


In order to deal with IPIs as normal interrupts, let's add
a new way to register them with the architecture code.

set_smp_ipi_range() takes a range of interrupts, and allows
the arch code to request them as if the were normal interrupts.
A standard handler is then called by the core IRQ code to deal
with the IPI.

This means that we don't need to call irq_enter/irq_exit, and
that we don't need to deal with set_irq_regs either. So let's
move the dispatcher into its own function, and leave handle_IPI()
as a compatibility function.

On the sending side, let's make use of ipi_send_mask, which
already exists for this purpose.

One of the major difference is that we end up, in some cases
(such as when performing IRQ time accounting on the scheduler
IPI), end up with nested irq_enter()/irq_exit() pairs.
Other than the (relatively small) overhead, there should be
no consequences to it (these pairs are designed to nest
correctly, and the accounting shouldn't be off).


While rebasing on mainline, I have faced a performance regression for
the benchmark:
perf bench sched pipe
on my arm64 dual quad core (hikey) and my 2 nodes x 112 CPUS (thx2)

The regression comes from:
commit: d3afc7f12987 ("arm64: Allow IPIs to be handled as normal 
interrupts")


That's interesting, as this patch doesn't really change anything (most
of the potential overhead comes in later). The only potential overhead
I can see is that the scheduler_ipi() call is now wrapped around
irq_enter()/irq_exit().



  v5.9  + this patch
hikey :   48818(+/- 0.31)   37503(+/- 0.15%)  -23.2%
thx2  :  132410(+/- 1.72)  122646(+/- 1.92%)   -7.4%

By + this patch,  I mean merging branch from this patch. Whereas
merging the previous:
commit: 83cfac95c018 ("genirq: Allow interrupts to be excluded from
/proc/interrupts")
 It doesn't show any regression


Since you are running perf, can you spot where the overhead occurs?

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH] perf: arm_spe: Use Inner Shareable DSB when draining the buffer

2020-10-19 Thread Marc Zyngier

On 2020-10-19 13:24, Mark Rutland wrote:

On Tue, Oct 06, 2020 at 05:13:31PM +0100, Alexandru Elisei wrote:

Hi Marc,

Thank you for having a look at the patch!

On 10/6/20 4:32 PM, Marc Zyngier wrote:
> Hi Alex,
>
> On Tue, 06 Oct 2020 16:05:20 +0100,
> Alexandru Elisei  wrote:
>> From ARM DDI 0487F.b, page D9-2807:
>>
>> "Although the Statistical Profiling Extension acts as another observer in
>> the system, for determining the Shareability domain of the DSB
>> instructions, the writes of sample records are treated as coming from the
>> PE that is being profiled."
>>
>> Similarly, on page D9-2801:
>>
>> "The memory type and attributes that are used for a write by the
>> Statistical Profiling Extension to the Profiling Buffer is taken from the
>> translation table entries for the virtual address being written to. That
>> is:
>> - The writes are treated as coming from an observer that is coherent with
>>   all observers in the Shareability domain that is defined by the
>>   translation tables."
>>
>> All the PEs are in the Inner Shareable domain, use a DSB ISH to make sure
>> writes to the profiling buffer have completed.
> I'm a bit sceptical of this change. The SPE writes are per-CPU, and
> all we are trying to ensure is that the CPU we are running on has
> drained its own queue of accesses.
>
> The accesses being made within the IS domain doesn't invalidate the
> fact that they are still per-CPU, because "the writes of sample
> records are treated as coming from the PE that is being profiled.".
>
> So why should we have an IS-wide synchronisation for accesses that are
> purely local?

I think I might have misunderstood how perf spe works. Below is my 
original train

of thought.

In the buffer management event interrupt we drain the buffer, and if 
the buffer is
full, we call arm_spe_perf_aux_output_end() -> perf_aux_output_end(). 
The comment
for perf_aux_output_end() says "Commit the data written by hardware 
into the ring
buffer by adjusting aux_head and posting a PERF_RECORD_AUX into the 
perf buffer.
It is the pmu driver's responsibility to observe ordering rules of the 
hardware,
so that all the data is externally visible before this is called." My 
conclusion
was that after we drain the buffer, the data must be visible to all 
CPUs.

FWIW, this reasoning sounds correct to me. The DSB NSH will be
sufficient to drain the buffer, but we need the DSB ISH to ensure that
it's visbile to other CPUs at the instant we call 
perf_aux_output_end().

Right. I think I missed that last bit (and Alex's email at the same 
time).

Otherwise, if CPU x is reading the ring-buffer written by CPU y, it
might see the aux buffer pointers updated before the samples are
viisble, and hence read junk from the buffer.

We can add a comment to that effect (or rework perf_aux_output_end()
somehow to handle that ordering).

I'd rather this is done in perf_aux_output_end(), as a full blown DSB 
ISH
on guest entry is pretty harsh... It would also nicely split the 
responsibilities:

- KVM stops SPE and make sure the output is drained
- Perf makes the data visible to all CPUs

Thoughts?

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v5 5/5] arm64: ipi_nmi: Add support for NMI backtrace

2020-10-19 Thread Marc Zyngier


On 2020-10-14 12:12, Sumit Garg wrote:

Enable NMI backtrace support on arm64 using IPI turned as an NMI
leveraging pseudo NMIs support. It is now possible for users to get a
backtrace of a CPU stuck in hard-lockup using magic SYSRQ.

Signed-off-by: Sumit Garg 
---
 arch/arm64/include/asm/irq.h |  6 ++
 arch/arm64/kernel/ipi_nmi.c  | 12 +++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/irq.h 
b/arch/arm64/include/asm/irq.h

index b2b0c64..e840bf1 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -6,6 +6,12 @@

 #include 

+#ifdef CONFIG_SMP
+extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+  bool exclude_self);
+#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
+#endif
+
 struct pt_regs;

 static inline int nr_legacy_irqs(void)
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
index e0a9e03..e1dc19d 100644
--- a/arch/arm64/kernel/ipi_nmi.c
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 #include 
@@ -25,12 +26,21 @@ void arch_send_call_nmi_func_ipi_mask(cpumask_t 
*mask)

__ipi_send_mask(ipi_desc, mask);
 }

+void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool 
exclude_self)

+{
+   nmi_trigger_cpumask_backtrace(mask, exclude_self,
+ arch_send_call_nmi_func_ipi_mask);
+}
+
 static irqreturn_t ipi_nmi_handler(int irq, void *data)
 {
unsigned int cpu = smp_processor_id();

-   ipi_kgdb_nmicallback(cpu, get_irq_regs());
+   if (nmi_cpu_backtrace(get_irq_regs()))
+   goto out;

+   ipi_kgdb_nmicallback(cpu, get_irq_regs());
+out:
return IRQ_HANDLED;
 }


Can't you have *both* a request for a backtrace and a KGDB call?
It really shouldn't be either/or. It also outlines how well shared
interrupts work with edge triggered signalling...

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v5 4/5] arm64: kgdb: Round up cpus using IPI as NMI

2020-10-19 Thread Marc Zyngier


On 2020-10-14 12:12, Sumit Garg wrote:

arm64 platforms with GICv3 or later supports pseudo NMIs which can be
leveraged to round up CPUs which are stuck in hard lockup state with
interrupts disabled that wouldn't be possible with a normal IPI.

So instead switch to round up CPUs using IPI turned as NMI. And in
case a particular arm64 platform doesn't supports pseudo NMIs,
this IPI will act as a normal IPI which maintains existing kgdb
functionality.

Signed-off-by: Sumit Garg 
---
 arch/arm64/include/asm/kgdb.h |  8 
 arch/arm64/kernel/ipi_nmi.c   |  5 -
 arch/arm64/kernel/kgdb.c  | 21 +
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kgdb.h 
b/arch/arm64/include/asm/kgdb.h

index 21fc85e..6f3d3af 100644
--- a/arch/arm64/include/asm/kgdb.h
+++ b/arch/arm64/include/asm/kgdb.h
@@ -24,6 +24,14 @@ static inline void arch_kgdb_breakpoint(void)
 extern void kgdb_handle_bus_error(void);
 extern int kgdb_fault_expected;

+#ifdef CONFIG_KGDB
+extern void ipi_kgdb_nmicallback(int cpu, void *regs);
+#else
+static inline void ipi_kgdb_nmicallback(int cpu, void *regs)
+{
+}
+#endif
+
 #endif /* !__ASSEMBLY__ */

 /*
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
index a959256..e0a9e03 100644
--- a/arch/arm64/kernel/ipi_nmi.c
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -8,6 +8,7 @@

 #include 
 #include 
+#include 
 #include 

 #include 
@@ -26,7 +27,9 @@ void arch_send_call_nmi_func_ipi_mask(cpumask_t 
*mask)


 static irqreturn_t ipi_nmi_handler(int irq, void *data)
 {
-   /* nop, NMI handlers for special features can be added here. */
+   unsigned int cpu = smp_processor_id();
+
+   ipi_kgdb_nmicallback(cpu, get_irq_regs());


Please add a return value to ipi_kgdb_nmicallback(), and check it
before returning IRQ_HANDLED.

Thinking a bit more about the whole thing, you should have a way to
avoid requesting the NMI if there is no user for it (there is nothing
worse than an enabled interrupt without handlers...).



return IRQ_HANDLED;
 }
diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index 1a157ca3..0991275 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -17,6 +17,7 @@

 #include 
 #include 
+#include 
 #include 

 struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
@@ -353,3 +354,23 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt 
*bpt)

return aarch64_insn_write((void *)bpt->bpt_addr,
*(u32 *)bpt->saved_instr);
 }
+
+void ipi_kgdb_nmicallback(int cpu, void *regs)
+{
+   if (atomic_read(_active) != -1)
+   kgdb_nmicallback(cpu, regs);
+}
+
+#ifdef CONFIG_SMP


There is no such thing as an arm64 UP kernel.


+void kgdb_roundup_cpus(void)
+{
+   struct cpumask mask;
+
+   cpumask_copy(, cpu_online_mask);
+   cpumask_clear_cpu(raw_smp_processor_id(), );
+   if (cpumask_empty())
+   return;
+
+   arch_send_call_nmi_func_ipi_mask();


Surely you can come up with a less convoluted name for this function.
arm64_send_nmi() would be plenty in my opinion.


+}
+#endif


Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v5 2/5] irqchip/gic-v3: Enable support for SGIs to act as NMIs

2020-10-19 Thread Marc Zyngier


On 2020-10-14 12:12, Sumit Garg wrote:
Add support to handle SGIs as regular NMIs. As SGIs or IPIs defaults to 
a


There is nothing "regular" about NMIs. Drop "or IPIs". 
s/defaults/default/



special flow handler: handle_percpu_devid_fasteoi_ipi(), so skip NMI
handler update in case of SGIs.

Also, enable NMI support prior to gic_smp_init() as allocation of SGIs
as IRQs/NMIs happen as part of this routine.

Signed-off-by: Sumit Garg 
---
 drivers/irqchip/irq-gic-v3.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c 
b/drivers/irqchip/irq-gic-v3.c

index 16fecc0..5efc865 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -477,6 +477,11 @@ static int gic_irq_nmi_setup(struct irq_data *d)
if (WARN_ON(gic_irq(d) >= 8192))
return -EINVAL;

+   if (get_intid_range(d) == SGI_RANGE) {
+   gic_irq_set_prio(d, GICD_INT_NMI_PRI);
+   return 0;
+   }
+


Please follow the existing control flow, or rework it to be organised by 
range.



/* desc lock should already be held */
if (gic_irq_in_rdist(d)) {
u32 idx = gic_get_ppi_index(d);
@@ -514,6 +519,11 @@ static void gic_irq_nmi_teardown(struct irq_data 
*d)

if (WARN_ON(gic_irq(d) >= 8192))
return;

+   if (get_intid_range(d) == SGI_RANGE) {
+   gic_irq_set_prio(d, GICD_INT_DEF_PRI);
+   return;
+   }


Same here.


+
/* desc lock should already be held */
if (gic_irq_in_rdist(d)) {
u32 idx = gic_get_ppi_index(d);
@@ -1708,6 +1718,7 @@ static int __init gic_init_bases(void __iomem 
*dist_base,


gic_dist_init();
gic_cpu_init();
+   gic_enable_nmi_support();
gic_smp_init();
gic_cpu_pm_init();

@@ -1719,8 +1730,6 @@ static int __init gic_init_bases(void __iomem 
*dist_base,

gicv2m_init(handle, gic_data.domain);
}

-   gic_enable_nmi_support();
-
return 0;

 out_free:


Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v5 3/5] arm64: smp: Allocate and setup IPI as NMI

2020-10-19 Thread Marc Zyngier


On 2020-10-14 12:12, Sumit Garg wrote:
Allocate an unused IPI that can be turned as NMI using ipi_nmi 
framework.


This doesn't do any allocation, as far as I can see. It relies on
the initial grant from the interrupt controller to be larger than
what the kernel currently uses.


Also, invoke corresponding NMI setup/teardown APIs.

Signed-off-by: Sumit Garg 
---
 arch/arm64/kernel/smp.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 82e75fc..129ebfb 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -962,6 +963,8 @@ static void ipi_setup(int cpu)

for (i = 0; i < nr_ipi; i++)
enable_percpu_irq(ipi_irq_base + i, 0);
+
+   ipi_nmi_setup(cpu);
 }

 #ifdef CONFIG_HOTPLUG_CPU
@@ -974,6 +977,8 @@ static void ipi_teardown(int cpu)

for (i = 0; i < nr_ipi; i++)
disable_percpu_irq(ipi_irq_base + i);
+
+   ipi_nmi_teardown(cpu);
 }
 #endif

@@ -995,6 +1000,9 @@ void __init set_smp_ipi_range(int ipi_base, int n)
irq_set_status_flags(ipi_base + i, IRQ_HIDDEN);
}

+   if (n > nr_ipi)
+   set_smp_ipi_nmi(ipi_base + nr_ipi);
+
ipi_irq_base = ipi_base;

/* Setup the boot CPU immediately */


Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v5 1/5] arm64: Add framework to turn IPI as NMI

2020-10-19 Thread Marc Zyngier


On 2020-10-14 12:12, Sumit Garg wrote:

Introduce framework to turn an IPI as NMI using pseudo NMIs. In case a
particular platform doesn't support pseudo NMIs, then request IPI as a
regular IRQ.

The main motivation for this feature is to have an IPI that can be
leveraged to invoke NMI functions on other CPUs. And current 
prospective

users are NMI backtrace and KGDB CPUs round-up whose support is added
via future patches.

Signed-off-by: Sumit Garg 
---
 arch/arm64/include/asm/nmi.h | 16 +
 arch/arm64/kernel/Makefile   |  2 +-
 arch/arm64/kernel/ipi_nmi.c  | 77 


 3 files changed, 94 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/nmi.h
 create mode 100644 arch/arm64/kernel/ipi_nmi.c


[...]


+   irq_set_status_flags(ipi, IRQ_HIDDEN);


Another thing is this. Why are you hiding this from /proc/interrupts?
The only reason the other IPIs are hidden is that displaying them as
"normal" interrupts would be a change in userspace ABI.

In your case, this is something new that can perfectly appear as
a standard interrupt (and I don't see how you'd display the
statistics otherwise).

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v5 1/5] arm64: Add framework to turn IPI as NMI

2020-10-19 Thread Marc Zyngier


On 2020-10-14 12:12, Sumit Garg wrote:

Introduce framework to turn an IPI as NMI using pseudo NMIs. In case a
particular platform doesn't support pseudo NMIs, then request IPI as a
regular IRQ.

The main motivation for this feature is to have an IPI that can be
leveraged to invoke NMI functions on other CPUs. And current 
prospective

users are NMI backtrace and KGDB CPUs round-up whose support is added
via future patches.

Signed-off-by: Sumit Garg 
---
 arch/arm64/include/asm/nmi.h | 16 +
 arch/arm64/kernel/Makefile   |  2 +-
 arch/arm64/kernel/ipi_nmi.c  | 77 


 3 files changed, 94 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/nmi.h
 create mode 100644 arch/arm64/kernel/ipi_nmi.c

diff --git a/arch/arm64/include/asm/nmi.h 
b/arch/arm64/include/asm/nmi.h

new file mode 100644
index 000..3433c55
--- /dev/null
+++ b/arch/arm64/include/asm/nmi.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_NMI_H
+#define __ASM_NMI_H
+
+#ifndef __ASSEMBLER__
+
+#include 
+
+extern void arch_send_call_nmi_func_ipi_mask(cpumask_t *mask);
+
+void set_smp_ipi_nmi(int ipi);
+void ipi_nmi_setup(int cpu);
+void ipi_nmi_teardown(int cpu);
+
+#endif /* !__ASSEMBLER__ */
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index bbaf0bc..525a1e0 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -17,7 +17,7 @@ obj-y := debug-monitors.o entry.o irq.o 
fpsimd.o  \
   return_address.o cpuinfo.o cpu_errata.o  
\
   cpufeature.o alternative.o cacheinfo.o   
\
   smp.o smp_spin_table.o topology.o smccc-call.o   
\
-  syscall.o proton-pack.o
+  syscall.o proton-pack.o ipi_nmi.o

 targets+= efi-entry.o

diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
new file mode 100644
index 000..a959256
--- /dev/null
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * NMI support for IPIs
+ *
+ * Copyright (C) 2020 Linaro Limited
+ * Author: Sumit Garg 
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+
+static struct irq_desc *ipi_desc __read_mostly;
+static int ipi_id __read_mostly;
+static bool is_nmi __read_mostly;
+
+void arch_send_call_nmi_func_ipi_mask(cpumask_t *mask)
+{
+   if (WARN_ON_ONCE(!ipi_desc))
+   return;
+
+   __ipi_send_mask(ipi_desc, mask);
+}
+
+static irqreturn_t ipi_nmi_handler(int irq, void *data)
+{
+   /* nop, NMI handlers for special features can be added here. */
+
+   return IRQ_HANDLED;


This definitely is the *wrong* default. If nothing is explicitly
handling the interrupt, it should be reported as such to the core
code to be disabled if this happens too often.


+}
+
+void ipi_nmi_setup(int cpu)


The naming is awful. "ipi" is nowhere specific enough (we already have
another 7 of them), and this doesn't necessarily uses pseudo-NMIs, since
you are requesting IRQs.


+{
+   if (!ipi_desc)
+   return;
+
+   if (is_nmi) {
+   if (!prepare_percpu_nmi(ipi_id))
+   enable_percpu_nmi(ipi_id, IRQ_TYPE_NONE);
+   } else {
+   enable_percpu_irq(ipi_id, IRQ_TYPE_NONE);


I'm not keen on this. Normal IRQs can't reliably work, so why do you
even bother with this?


+   }
+}
+
+void ipi_nmi_teardown(int cpu)
+{
+   if (!ipi_desc)
+   return;
+
+   if (is_nmi) {
+   disable_percpu_nmi(ipi_id);
+   teardown_percpu_nmi(ipi_id);
+   } else {
+   disable_percpu_irq(ipi_id);
+   }
+}
+
+void __init set_smp_ipi_nmi(int ipi)
+{
+   int err;
+
+   err = request_percpu_nmi(ipi, ipi_nmi_handler, "IPI", _number);
+   if (err) {
+   err = request_percpu_irq(ipi, ipi_nmi_handler, "IPI",
+_number);
+   WARN_ON(err);
+   is_nmi = false;
+   } else {
+   is_nmi = true;
+   }
+
+   ipi_desc = irq_to_desc(ipi);
+   irq_set_status_flags(ipi, IRQ_HIDDEN);
+   ipi_id = ipi;


Updating global state without checking for errors doesn't seem
like a great move.

M.
--
Jazz is not dead. It just smells funny...

Re: 5.10-rc0: build error in ipi.c

2020-10-16 Thread Marc Zyngier


On 2020-10-16 00:24, Thomas Gleixner wrote:

On Thu, Oct 15 2020 at 20:41, Marc Zyngier wrote:

On 2020-10-15 18:18, Pavel Machek wrote:
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index 10a5aff4eecc..db923e0da162 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -81,6 +81,7 @@ config IRQ_FASTEOI_HIERARCHY_HANDLERS

  # Generic IRQ IPI support
  config GENERIC_IRQ_IPI
+   select IRQ_DOMAIN_HIERARCHY
bool


which makes some of the MIPS GENERIC_IRQ_IPI/IRQ_DOMAIN_HIERARCHY
Kconfig magic in drivers/irqchip/Kconfig obsolete.


Good point. I'll queue this on top:

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index cd734df57c42..d2a651372e15 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -180,7 +180,6 @@ config IRQ_MIPS_CPU
select GENERIC_IRQ_CHIP
select GENERIC_IRQ_IPI if SYS_SUPPORTS_MULTITHREADING
select IRQ_DOMAIN
-   select IRQ_DOMAIN_HIERARCHY if GENERIC_IRQ_IPI
select GENERIC_IRQ_EFFECTIVE_AFF_MASK

 config CLPS711X_IRQCHIP

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH] irqchip: MST_IRQ should depend on ARCH_MEDIATEK or ARCH_MSTARV7

2020-10-15 Thread Marc Zyngier

On Wed, 14 Oct 2020 15:17:03 +0200, Geert Uytterhoeven wrote:
> The MStar interrupt controller is only found on MStar, SigmaStar, and
> Mediatek SoCs.  Hence add dependencies on ARCH_MEDIATEK and
> ARCH_MSTARV7, to prevent asking the user about the MStar interrupt
> controller driver when configuring a kernel without support for MStar,
> SigmaStar, and Mediatek SoCs.

Applied to irq/irqchip-next, thanks!

[1/1] irqchip/mst: MST_IRQ should depend on ARCH_MEDIATEK or ARCH_MSTARV7
  commit: 61b0648d569aca932eab87a67f7ca0ffd3ea2b68

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH 2/2] irqchip/sifive-plic: Fix the interrupt was enabled accidentally issue.

2020-10-15 Thread Marc Zyngier


On 2020-10-12 14:57, Greentime Hu wrote:

In commit 2ca0b460bbcb ("genirq/affinity: Make affinity setting if
activated opt-in"),
it added irqd_affinity_on_activate() checking in the function
irq_set_affinity_deactivated() so it will return false here.
In that case, it will call irq_try_set_affinity() -> plic_irq_toggle()
which will enable the interrupt to cause the CPU hang.

if (irq_set_affinity_deactivated())
return 0;
...
irq_try_set_affinity(data, mask, force);

[  919.015783] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  919.020922] rcu: 0-...0: (0 ticks this GP)
idle=7d2/1/0x4002 softirq=1424/1424 fqs=105807
[  919.030295]  (detected by 1, t=225825 jiffies, g=1561, q=3496)
[  919.036109] Task dump for CPU 0:
[  919.039321] kworker/0:1 R  running task030  2 
0x0008

[  919.046359] Workqueue: events set_brightness_delayed
[  919.051302] Call Trace:
[  919.053738] [] __schedule+0x194/0x4de
[  982.035783] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  982.040923] rcu: 0-...0: (0 ticks this GP)
idle=7d2/1/0x4002 softirq=1424/1424 fqs=113325
[  982.050294]  (detected by 1, t=241580 jiffies, g=1561, q=3509)
[  982.056108] Task dump for CPU 0:
[  982.059321] kworker/0:1 R  running task030  2 
0x0008

[  982.066359] Workqueue: events set_brightness_delayed
[  982.071302] Call Trace:
[  982.073739] [] __schedule+0x194/0x4de
[..]

After applying this patch, the CPU hang issue can be fixed.

Signed-off-by: Greentime Hu 
---
 drivers/irqchip/irq-sifive-plic.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/irqchip/irq-sifive-plic.c
b/drivers/irqchip/irq-sifive-plic.c
index 4cc8a2657a6d..8a673d9cff69 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -183,10 +183,14 @@ static int plic_irqdomain_map(struct irq_domain
*d, unsigned int irq,
  irq_hw_number_t hwirq)
 {
struct plic_priv *priv = d->host_data;
+   struct irq_data *irqd;

irq_domain_set_info(d, irq, hwirq, _chip, d->host_data,
handle_fasteoi_irq, NULL, NULL);
irq_set_noprobe(irq);
+   irqd = irq_get_irq_data(irq);
+   irqd_set_single_target(irqd);
+   irqd_set_affinity_on_activate(irqd);
irq_set_affinity(irq, >lmask);
return 0;
 }


How does this fix anything? The plic driver doesn't have an activate
callback, so how does it make any difference? I get the feeling this
papers over another issue.

M.
--
Jazz is not dead. It just smells funny...

Re: 5.10-rc0: build error in ipi.c

2020-10-15 Thread Marc Zyngier


On 2020-10-15 18:18, Pavel Machek wrote:

Hi!


> > I'm getting build problems in 5.10-rc0 in config for n900. ARM board.
> >
> > CONFIG_SMP=y
> > CONFIG_SMP_ON_UP=y

On its own, this doesn't break anything with multi_v7_defconfig.


I sent config off-list. Let me know if it does not arrive or if you
need more info.


Try this for size:

diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index 10a5aff4eecc..db923e0da162 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -81,6 +81,7 @@ config IRQ_FASTEOI_HIERARCHY_HANDLERS

 # Generic IRQ IPI support
 config GENERIC_IRQ_IPI
+   select IRQ_DOMAIN_HIERARCHY
bool

 # Generic MSI interrupt support


N,
--
Jazz is not dead. It just smells funny...

Re: 5.10-rc0: build error in ipi.c

2020-10-15 Thread Marc Zyngier


On 2020-10-15 15:23, Thomas Gleixner wrote:

On Thu, Oct 15 2020 at 12:12, Pavel Machek wrote:

Cc+ Marc


Thanks Thomas.




I'm getting build problems in 5.10-rc0 in config for n900. ARM board.

CONFIG_SMP=y
CONFIG_SMP_ON_UP=y


On its own, this doesn't break anything with multi_v7_defconfig.




  CC  net/devres.o
  kernel/irq/ipi.c: In function ‘irq_reserve_ipi’:
  kernel/irq/ipi.c:84:9: error: implicit declaration of function
  ‘__irq_domain_alloc_irqs’; did you mean ‘irq_domain_alloc_irqs’?
  [-Werror=implicit-function-declaration]
virq = __irq_domain_alloc_irqs(domain, virq, nr_irqs,
  NUMA_NO_NODE,
   ^~~
irq_domain_alloc_irqs
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:283:
  kernel/irq/ipi.o] Error 1


That probably comes from the ipi as irq rework for arm/arm64.


Most probably.

Pawel, can you please stash your config somewhere where I can get it?

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH] arm64: KVM: marking pages as XN in Stage-2 does not care about CTR_EL0.DIC

2020-10-13 Thread Marc Zyngier


On 2020-10-13 13:56, limingwang (A) wrote:

Hi Li,

On 2020-10-12 02:08, l00484210 wrote:

From: MingWang Li 

When testing the ARMv8.2-TTS2UXN feature, setting bits of XN is
unavailable.
Because the control bit CTR_EL0.DIC is set by default on system.

But when CTR_EL0.DIC is set, software does not need to flush icache
actively, instead of clearing XN bits.The patch, the commit id of
which is 6ae4b6e0578886eb36cedbf99f04031d93f9e315, has implemented 
the

function of CTR_EL0.DIC.

Signed-off-by: MingWang Li 
Signed-off-by: Henglong Fan 
---
 arch/arm64/include/asm/pgtable-prot.h | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-prot.h
b/arch/arm64/include/asm/pgtable-prot.h
index 4d867c6446c4..5feb94882bf7 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -79,17 +79,7 @@ extern bool arm64_use_ng_mappings;
__val;  \
 })

-#define PAGE_S2_XN \
-   ({  \
-   u64 __val;  \
-   if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))   \
-   __val = 0;  \
-   else\
-   __val = PTE_S2_XN;  \
-   __val;  \
-   })
-
-#define PAGE_S2__pgprot(_PROT_DEFAULT | 
PAGE_S2_MEMATTR(NORMAL) |
PTE_S2_RDONLY | PAGE_S2_XN)
+#define PAGE_S2__pgprot(_PROT_DEFAULT | 
PAGE_S2_MEMATTR(NORMAL) |
PTE_S2_RDONLY | PTE_S2_XN)
 #define PAGE_S2_DEVICE __pgprot(_PROT_DEFAULT |
PAGE_S2_MEMATTR(DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_S2_XN)

 #define PAGE_NONE  __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) |
PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)


I don't understand what you are trying to achieve here.

This whole point of not setting XN in the page tables when DIC is 
present is to avoid a pointless permission fault at run time. At you 
noticed above, no icache invalidation is necessary. So why would you 
ever want to take a fault on exec the first place?


M.
--
Jazz is not dead. It just smells funny...



Hi Marc,

According to ARMv8.2-TTS2UXN feature, which extends the stage 2
translation table access
permissions to provide control of whether memory is executable at EL0
independent of whether
it is executable at EL1.

Testing this feature in some security scenario, for example, if I want
to grant execute permission
to some memory only for EL0, but it will failed. Because KVM clears XN
bits at first, this means that
the memory can be executable in both EL0 and El1.


KVM currently offers no support for this, and the only use we have for
XN so far is to to ensure Data/Instruction coherency when the CPU
doesn't offer it in HW.


So the execute permission is not granted when the page table is
created for the first time, then
grant the execute permission by setting xn, based on the actual 
requirements.


And according to spec:
DIC, bit [29]
	Instruction cache invalidation requirements for data to instruction 
coherence.

0b0 Instruction cache invalidation to the Point of Unification is
required for data to instruction coherence.
0b1 Instruction cache invalidation to the Point of Unification is not
required for data to instruction coherence.
So when DIC is set, if the memory is changed to executable, the
hardware will flush icache.


No. The Icache *snoops* the Dcache at all times. Which is why we don't
need to trap on execution, and we can leave the guest run without
any intervention.


If as you said, I feel that DIC conflicts with ARMv8.2-TTS2UXN feature.


There is no conflict. KVM doesn't make use of all the bells and whistle
in the architecture, which is probably a good thing. If you feel that 
there
is a need for S2UXN as a security feature, we can discuss how to expose 
this

to the guest (because it definitely needs to know about that).

But setting XN when DIC is present for no other reason than "it may be
useful one day" doesn't make sense, and results in a massive performance
drop on the platforms that have DIC (and I really wish they all had it).

   M.
--
Jazz is not dead. It just smells funny...

[PATCH] arm64: tegra186: Add missing CPU PMUs

2020-10-13 Thread Marc Zyngier

Add the description of CPU PMUs for both the Denver and A57 clusters,
which enables the perf subsystem.

Signed-off-by: Marc Zyngier 
---
 arch/arm64/boot/dts/nvidia/tegra186.dtsi | 28 +++-
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index fd44545e124d..6bb03668a8d3 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -1321,7 +1321,7 @@ cpus {
#address-cells = <1>;
#size-cells = <0>;
 
-   cpu@0 {
+   denver_0: cpu@0 {
compatible = "nvidia,tegra186-denver";
device_type = "cpu";
i-cache-size = <0x2>;
@@ -1334,7 +1334,7 @@ cpu@0 {
reg = <0x000>;
};
 
-   cpu@1 {
+   denver_1: cpu@1 {
compatible = "nvidia,tegra186-denver";
device_type = "cpu";
i-cache-size = <0x2>;
@@ -1347,7 +1347,7 @@ cpu@1 {
reg = <0x001>;
};
 
-   cpu@2 {
+   ca57_0: cpu@2 {
compatible = "arm,cortex-a57";
device_type = "cpu";
i-cache-size = <0xC000>;
@@ -1360,7 +1360,7 @@ cpu@2 {
reg = <0x100>;
};
 
-   cpu@3 {
+   ca57_1: cpu@3 {
compatible = "arm,cortex-a57";
device_type = "cpu";
i-cache-size = <0xC000>;
@@ -1373,7 +1373,7 @@ cpu@3 {
reg = <0x101>;
};
 
-   cpu@4 {
+   ca57_2: cpu@4 {
compatible = "arm,cortex-a57";
device_type = "cpu";
i-cache-size = <0xC000>;
@@ -1386,7 +1386,7 @@ cpu@4 {
reg = <0x102>;
};
 
-   cpu@5 {
+   ca57_3: cpu@5 {
compatible = "arm,cortex-a57";
device_type = "cpu";
i-cache-size = <0xC000>;
@@ -1418,6 +1418,22 @@ L2_A57: l2-cache1 {
};
};
 
+   pmu_denver {
+   compatible = "nvidia,denver-pmu", "arm,armv8-pmuv3";
+   interrupts = ,
+;
+   interrupt-affinity = <_0 _1>;
+   };
+
+   pmu_a57 {
+   compatible = "arm,cortex-a57-pmu", "arm,armv8-pmuv3";
+   interrupts = ,
+,
+,
+;
+   interrupt-affinity = <_0 _1 _2 _3>;
+   };
+
thermal-zones {
a57 {
polling-delay = <0>;
-- 
2.28.0

[PATCH] phy: tegra: xusb: Fix dangling pointer on probe failure

2020-10-13 Thread Marc Zyngier

If, for some reason, the xusb PHY fails to probe, it leaves
a dangling pointer attached to the platform device structure.

This would normally be harmless, but the Tegra XHCI driver then
goes and extract that pointer from the PHY device. Things go
downhill from there:

8.752082] [004d554e5145533c] address between user and kernel address ranges
[8.752085] Internal error: Oops: 9604 [#1] PREEMPT SMP
[8.752088] Modules linked in: max77620_regulator(E+) xhci_tegra(E+) 
sdhci_tegra(E+) xhci_hcd(E) sdhci_pltfm(E) cqhci(E) fixed(E) usbcore(E) 
scsi_mod(E) sdhci(E) host1x(E+)
[8.752103] CPU: 4 PID: 158 Comm: systemd-udevd Tainted: G S  W   E 
5.9.0-rc7-00298-gf6337624c4fe #1980
[8.752105] Hardware name: NVIDIA Jetson TX2 Developer Kit (DT)
[8.752108] pstate: 2005 (nzCv daif -PAN -UAO BTYPE=--)
[8.752115] pc : kobject_put+0x1c/0x21c
[8.752120] lr : put_device+0x20/0x30
[8.752121] sp : ffc012eb3840
[8.752122] x29: ffc012eb3840 x28: ffc010e82638
[8.752125] x27: ffc008d56440 x26: 
[8.752128] x25: ff81eb508200 x24: 
[8.752130] x23: ff81eb538800 x22: 
[8.752132] x21: fdfb x20: ff81eb538810
[8.752134] x19: 3d4d554e51455300 x18: 0020
[8.752136] x17: ffc008d00270 x16: ffc008d00c94
[8.752138] x15: 0004 x14: ff81ebd4ae90
[8.752140] x13:  x12: ff81eb86a4e8
[8.752142] x11: ff81eb86a480 x10: ff81eb862fea
[8.752144] x9 : ffc01055fb28 x8 : ff81eb86a4a8
[8.752146] x7 : 0001 x6 : 0001
[8.752148] x5 : ff81dff8bc38 x4 : 
[8.752150] x3 : 0001 x2 : 0001
[8.752152] x1 : 0002 x0 : 3d4d554e51455300
[8.752155] Call trace:
[8.752157]  kobject_put+0x1c/0x21c
[8.752160]  put_device+0x20/0x30
[8.752164]  tegra_xusb_padctl_put+0x24/0x3c
[8.752170]  tegra_xusb_probe+0x8b0/0xd10 [xhci_tegra]
[8.752174]  platform_drv_probe+0x60/0xb4
[8.752176]  really_probe+0xf0/0x504
[8.752179]  driver_probe_device+0x100/0x170
[8.752181]  device_driver_attach+0xcc/0xd4
[8.752183]  __driver_attach+0xb0/0x17c
[8.752185]  bus_for_each_dev+0x7c/0xd4
[8.752187]  driver_attach+0x30/0x3c
[8.752189]  bus_add_driver+0x154/0x250
[8.752191]  driver_register+0x84/0x140
[8.752193]  __platform_driver_register+0x54/0x60
[8.752197]  tegra_xusb_init+0x40/0x1000 [xhci_tegra]
[8.752201]  do_one_initcall+0x54/0x2d0
[8.752205]  do_init_module+0x68/0x29c
[8.752207]  load_module+0x2178/0x26c0
[8.752209]  __do_sys_finit_module+0xb0/0x120
[8.752211]  __arm64_sys_finit_module+0x2c/0x40
[8.752215]  el0_svc_common.constprop.0+0x80/0x240
[8.752218]  do_el0_svc+0x30/0xa0
[8.752220]  el0_svc+0x18/0x50
[8.752223]  el0_sync_handler+0x90/0x318
[8.752225]  el0_sync+0x158/0x180
[8.752230] Code: a9bd7bfd 910003fd a90153f3 aa0003f3 (3940f000)
[8.752232] ---[ end trace 90f6c89d62d85ff5 ]---

Reset the pointer on probe failure fixes the issue.

Fixes: 53d2a715c2403 ("phy: Add Tegra XUSB pad controller support")
Signed-off-by: Marc Zyngier 
---
 drivers/phy/tegra/xusb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/phy/tegra/xusb.c b/drivers/phy/tegra/xusb.c
index de4a46fe1763..ad88d74c1884 100644
--- a/drivers/phy/tegra/xusb.c
+++ b/drivers/phy/tegra/xusb.c
@@ -1242,6 +1242,7 @@ static int tegra_xusb_padctl_probe(struct platform_device 
*pdev)
 reset:
reset_control_assert(padctl->rst);
 remove:
+   platform_set_drvdata(pdev, NULL);
soc->ops->remove(padctl);
return err;
 }
-- 
2.28.0

[PATCH] drm/tegra: sor: Ensure regulators are disabled on teardown

2020-10-13 Thread Marc Zyngier

The Tegra SOR driver uses the devm infrastructure to request regulators,
but enables them without registering them with the infrastructure.

This results in the following splat if probing fails for any odd resaon
(such as dependencies not being available):

[8.974187] tegra-sor 1558.sor: cannot get HDMI supply: -517
[9.414403] tegra-sor 1558.sor: failed to probe HDMI: -517
[9.421240] [ cut here ]
[9.425879] WARNING: CPU: 1 PID: 164 at drivers/regulator/core.c:2089 
_regulator_put.part.0+0x16c/0x174
[9.435259] Modules linked in: tegra_drm(E+) cec(E) ahci_tegra(E) 
drm_kms_helper(E) drm(E) libahci_platform(E) libahci(E) max77620_regulator(E) 
xhci_tegra(E+) sdhci_tegra(E) xhci_hcd(E) libata(E) sdhci_pltfm(E) cqhci(E) 
fixed(E) usbcore(E) scsi_mod(E) sdhci(E) host1x(E)
[9.459211] CPU: 1 PID: 164 Comm: systemd-udevd Tainted: G SD W   E 
5.9.0-rc7-00298-gf6337624c4fe #1980
[9.469285] Hardware name: NVIDIA Jetson TX2 Developer Kit (DT)
[9.475202] pstate: 8005 (Nzcv daif -PAN -UAO BTYPE=--)
[9.480784] pc : _regulator_put.part.0+0x16c/0x174
[9.485581] lr : regulator_put+0x44/0x60
[9.489501] sp : ffc011d837b0
[9.492814] x29: ffc011d837b0 x28: ff81dd085900
[9.498141] x27: ff81de1c8ec0 x26: ff81de1c8c10
[9.503464] x25: ff81dd085800 x24: ffc008f2c6b0
[9.508790] x23: ffc0117373f0 x22: 0005
[9.514101] x21: ff81dd085900 x20: ffc01172b098
[9.515822] ata1: SATA link down (SStatus 0 SControl 300)
[9.519426] x19: ff81dd085100 x18: 0030
[9.530122] x17:  x16: 
[9.535453] x15:  x14: 038f
[9.540777] x13: 0003 x12: 0040
[9.546105] x11: ff81eb80 x10: 0ae0
[9.551417] x9 : ffc0106fea24 x8 : ff81de83e6c0
[9.556728] x7 : 0018 x6 : 03c3
[9.562064] x5 : 5660 x4 : 
[9.567392] x3 : ffc01172b388 x2 : ff81de83db80
[9.572702] x1 :  x0 : 0001
[9.578034] Call trace:
[9.580494]  _regulator_put.part.0+0x16c/0x174
[9.584940]  regulator_put+0x44/0x60
[9.588522]  devm_regulator_release+0x20/0x2c
[9.592885]  release_nodes+0x1c8/0x2c0
[9.596636]  devres_release_all+0x44/0x6c
[9.600649]  really_probe+0x1ec/0x504
[9.604316]  driver_probe_device+0x100/0x170
[9.608589]  device_driver_attach+0xcc/0xd4
[9.612774]  __driver_attach+0xb0/0x17c
[9.616614]  bus_for_each_dev+0x7c/0xd4
[9.620450]  driver_attach+0x30/0x3c
[9.624027]  bus_add_driver+0x154/0x250
[9.627867]  driver_register+0x84/0x140
[9.631719]  __platform_register_drivers+0xa0/0x180
[9.636660]  host1x_drm_init+0x60/0x1000 [tegra_drm]
[9.641629]  do_one_initcall+0x54/0x2d0
[9.645490]  do_init_module+0x68/0x29c
[9.649244]  load_module+0x2178/0x26c0
[9.652997]  __do_sys_finit_module+0xb0/0x120
[9.657356]  __arm64_sys_finit_module+0x2c/0x40
[9.661902]  el0_svc_common.constprop.0+0x80/0x240
[9.666701]  do_el0_svc+0x30/0xa0
[9.670022]  el0_svc+0x18/0x50
[9.673081]  el0_sync_handler+0x90/0x318
[9.677006]  el0_sync+0x158/0x180
[9.680324] ---[ end trace 90f6c89d62d85ff6 ]---

Instead, let's register a callback that will disable the regulators
on teardown. This allows for the removal of the .remove callbacks,
which are not needed anymore.

Signed-off-by: Marc Zyngier 
---
 drivers/gpu/drm/tegra/sor.c | 59 +++--
 1 file changed, 24 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/tegra/sor.c b/drivers/gpu/drm/tegra/sor.c
index 45b5258c77a2..39e6b32f6c10 100644
--- a/drivers/gpu/drm/tegra/sor.c
+++ b/drivers/gpu/drm/tegra/sor.c
@@ -397,7 +397,6 @@ struct tegra_sor;
 struct tegra_sor_ops {
const char *name;
int (*probe)(struct tegra_sor *sor);
-   int (*remove)(struct tegra_sor *sor);
void (*audio_enable)(struct tegra_sor *sor);
void (*audio_disable)(struct tegra_sor *sor);
 };
@@ -2942,6 +2941,24 @@ static const struct drm_encoder_helper_funcs 
tegra_sor_dp_helpers = {
.atomic_check = tegra_sor_encoder_atomic_check,
 };
 
+static void tegra_sor_disable_regulator(void *data)
+{
+   struct regulator *reg = data;
+
+   regulator_disable(reg);
+}
+
+static int tegra_sor_enable_regulator(struct tegra_sor *sor, struct regulator 
*reg)
+{
+   int err;
+
+   err = regulator_enable(reg);
+   if (err)
+   return err;
+
+   return devm_add_action_or_reset(sor->dev, tegra_sor_disable_regulator, 
reg);
+}
+
 static int tegra_sor_hdmi_probe(struct tegra_sor *sor)
 {
int err;
@@ -2953,7 +2970,7 @@ static int tegra_sor_hdmi_probe(struct tegra_sor *sor)
return PTR_ERR(sor->avdd_io_supply);
}
 
-   err = regulator_enable(sor->avdd_

Re: [PATCH v2 2/2] irqchip/ti-sci-inta: Add support for unmapped event handling

2020-10-12 Thread Marc Zyngier


On 2020-10-09 09:58, Peter Ujfalusi wrote:

Marc,



[...]


The design of irqchip/irq-ti-sci-inta.c, soc/ti/ti_sci_inta_msi.c and
irqchip/irq-ti-sci-intr.c created to handle the interrupt needs present
in K3 devices with NAVSS.
DMSS of newer K3 devices extends and simplifies the NAVSS components 
and

a big part of that change was done with the INTA and DMAs.
System Firmware also changed to adopt to these changes.

As an example, let's assume that we want an interrupt from a ring.
The high level of the events in this case are:

NAVSS:
1.1 ring generates an internal signal (up or down)
1.2 the ringacc will send a (mapped) Global Event to INTA
1.3 When INTA receives the global event, it will signal it's outgoing
VINT to INTR
1.4 INTR will trigger a host interrupt.

DMSS
1.1 ring generates an internal signal (up or down)
1.2 the DMA (ring is now part of the DMA) will send an unmapped event 
to

INTA
1.3 When INTA receives the unmapped event, it will send a (mapped)
Global Event to itself
1.4 When INTA receives the global event, it will signal it's outgoing
VINT to INTR
1.5 INTR will trigger a host interrupt.

The API from sysfw is the same to configure the global events and VINT,
but we need to use the INTA's tisci device identification number to let
sysfw know that the Global event number can be programmed into INTA's
unmapped event steering register. The DMA no longer have this register,
it sends unmapped event to INTA.

The unmapped event number is fixed per sources, they will arrive at the
specific unmapped event configuration register of INTA.

INTA itself does not know which source are allocated to be used by
Linux, the allocation is for the DMA resources and only the DMA driver
knows which channels, rings, flows can be used and can ask the INTA MSI
domain to create interrupts for those.

By handling the ti,unmapped-event-sources the INTA driver can make
decision on the correct tisci dev_id to be used for the sysfw API to
where the global event must be configured and the client drivers does
not need to know how things are working under the hood.

There are components in DMSS which use is exactly how they worked 
within

NAVSS, they are not using unmapped events. Ringacc comes to mind first.

I can add a comment block to explain the nature of unmapped events and
the reason why we need to do what we do.

Would this be acceptable?


That'd be useful, as long as it is shorter than the above.

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH] arm64: KVM: marking pages as XN in Stage-2 does not care about CTR_EL0.DIC

2020-10-12 Thread Marc Zyngier


Hi Li,

On 2020-10-12 02:08, l00484210 wrote:

From: MingWang Li 

When testing the ARMv8.2-TTS2UXN feature, setting bits of XN is 
unavailable.

Because the control bit CTR_EL0.DIC is set by default on system.

But when CTR_EL0.DIC is set, software does not need to flush icache 
actively,

instead of clearing XN bits.The patch, the commit id of which
is 6ae4b6e0578886eb36cedbf99f04031d93f9e315, has implemented the 
function

of CTR_EL0.DIC.

Signed-off-by: MingWang Li 
Signed-off-by: Henglong Fan 
---
 arch/arm64/include/asm/pgtable-prot.h | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-prot.h
b/arch/arm64/include/asm/pgtable-prot.h
index 4d867c6446c4..5feb94882bf7 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -79,17 +79,7 @@ extern bool arm64_use_ng_mappings;
__val;  \
 })

-#define PAGE_S2_XN \
-   ({  \
-   u64 __val;  \
-   if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))   \
-   __val = 0;  \
-   else\
-   __val = PTE_S2_XN;  \
-   __val;  \
-   })
-
-#define PAGE_S2__pgprot(_PROT_DEFAULT | 
PAGE_S2_MEMATTR(NORMAL) |
PTE_S2_RDONLY | PAGE_S2_XN)
+#define PAGE_S2__pgprot(_PROT_DEFAULT | 
PAGE_S2_MEMATTR(NORMAL) |
PTE_S2_RDONLY | PTE_S2_XN)
 #define PAGE_S2_DEVICE __pgprot(_PROT_DEFAULT |
PAGE_S2_MEMATTR(DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_S2_XN)

 #define PAGE_NONE  __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) |
PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)


I don't understand what you are trying to achieve here.

This whole point of not setting XN in the page tables when DIC is 
present

is to avoid a pointless permission fault at run time. At you noticed
above, no icache invalidation is necessary. So why would you ever want
to take a fault on exec the first place?

M.
--
Jazz is not dead. It just smells funny...

[tip: irq/core] arm: Move ipi_teardown() to a CONFIG_HOTPLUG_CPU section

2020-10-11 Thread tip-bot2 for Marc Zyngier

The following commit has been merged into the irq/core branch of tip:

Commit-ID: ac15a54e03d13686d2fc016a88311801b0734046
Gitweb:
https://git.kernel.org/tip/ac15a54e03d13686d2fc016a88311801b0734046
Author:Marc Zyngier 
AuthorDate:Fri, 18 Sep 2020 17:19:46 +01:00
Committer: Marc Zyngier 
CommitterDate: Fri, 18 Sep 2020 17:40:48 +01:00

arm: Move ipi_teardown() to a CONFIG_HOTPLUG_CPU section

ipi_teardown() is only used when CONFIG_HOTPLUG_CPU is enabled.
Move the function to a location guarded by this config option.

Signed-off-by: Marc Zyngier 
---
 arch/arm/kernel/smp.c | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 00327fa..8425da5 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -85,7 +85,6 @@ static int nr_ipi __read_mostly = NR_IPI;
 static struct irq_desc *ipi_desc[MAX_IPI] __read_mostly;
 
 static void ipi_setup(int cpu);
-static void ipi_teardown(int cpu);
 
 static DECLARE_COMPLETION(cpu_running);
 
@@ -236,6 +235,17 @@ int platform_can_hotplug_cpu(unsigned int cpu)
return cpu != 0;
 }
 
+static void ipi_teardown(int cpu)
+{
+   int i;
+
+   if (WARN_ON_ONCE(!ipi_irq_base))
+   return;
+
+   for (i = 0; i < nr_ipi; i++)
+   disable_percpu_irq(ipi_irq_base + i);
+}
+
 /*
  * __cpu_disable runs on the processor to be shutdown.
  */
@@ -707,17 +717,6 @@ static void ipi_setup(int cpu)
enable_percpu_irq(ipi_irq_base + i, 0);
 }
 
-static void ipi_teardown(int cpu)
-{
-   int i;
-
-   if (WARN_ON_ONCE(!ipi_irq_base))
-   return;
-
-   for (i = 0; i < nr_ipi; i++)
-   disable_percpu_irq(ipi_irq_base + i);
-}
-
 void __init set_smp_ipi_range(int ipi_base, int n)
 {
int i;

[tip: irq/core] gpio: tegra186: Allow optional irq parent callbacks

2020-10-11 Thread tip-bot2 for Marc Zyngier

The following commit has been merged into the irq/core branch of tip:

Commit-ID: 986ec63d4482292570b579ac98b151acf8bdd1de
Gitweb:
https://git.kernel.org/tip/986ec63d4482292570b579ac98b151acf8bdd1de
Author:Marc Zyngier 
AuthorDate:Mon, 05 Oct 2020 10:27:27 +01:00
Committer: Marc Zyngier 
CommitterDate: Sat, 10 Oct 2020 12:12:10 +01:00

gpio: tegra186: Allow optional irq parent callbacks

Make the tegra186 GPIO driver resistent to variable depth
interrupt hierarchy, which we are about to introduce.

No functionnal change yet.

Signed-off-by: Marc Zyngier 
---
 drivers/gpio/gpio-tegra186.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpio/gpio-tegra186.c b/drivers/gpio/gpio-tegra186.c
index 178e912..9500074 100644
--- a/drivers/gpio/gpio-tegra186.c
+++ b/drivers/gpio/gpio-tegra186.c
@@ -430,7 +430,18 @@ static int tegra186_irq_set_type(struct irq_data *data, 
unsigned int type)
else
irq_set_handler_locked(data, handle_edge_irq);
 
-   return irq_chip_set_type_parent(data, type);
+   if (data->parent_data)
+   return irq_chip_set_type_parent(data, type);
+
+   return 0;
+}
+
+static int tegra186_irq_set_wake(struct irq_data *data, unsigned int on)
+{
+   if (data->parent_data)
+   return irq_chip_set_wake_parent(data, on);
+
+   return 0;
 }
 
 static void tegra186_gpio_irq(struct irq_desc *desc)
@@ -678,7 +689,7 @@ static int tegra186_gpio_probe(struct platform_device *pdev)
gpio->intc.irq_mask = tegra186_irq_mask;
gpio->intc.irq_unmask = tegra186_irq_unmask;
gpio->intc.irq_set_type = tegra186_irq_set_type;
-   gpio->intc.irq_set_wake = irq_chip_set_wake_parent;
+   gpio->intc.irq_set_wake = tegra186_irq_set_wake;
 
irq = >gpio.irq;
irq->chip = >intc;

[tip: irq/core] genirq/irqdomain: Allow partial trimming of irq_data hierarchy

2020-10-11 Thread tip-bot2 for Marc Zyngier

The following commit has been merged into the irq/core branch of tip:

Commit-ID: 55567976629e58fde28fb70612ca73228271eef2
Gitweb:
https://git.kernel.org/tip/55567976629e58fde28fb70612ca73228271eef2
Author:Marc Zyngier 
AuthorDate:Tue, 06 Oct 2020 10:10:20 +01:00
Committer: Marc Zyngier 
CommitterDate: Sat, 10 Oct 2020 12:12:10 +01:00

genirq/irqdomain: Allow partial trimming of irq_data hierarchy

It appears that some HW is ugly enough that not all the interrupts
connected to a particular interrupt controller end up with the same
hierarchy depth (some of them are terminated early). This leaves
the irqchip hacker with only two choices, both equally bad:

- create discrete domain chains, one for each "hierarchy depth",
  which is very hard to maintain

- create fake hierarchy levels for the shallow paths, leading
  to all kind of problems (what are the safe hwirq values for these
  fake levels?)

Implement the ability to cut short a single interrupt hierarchy
from a level marked as being disconnected by using the new
irq_domain_disconnect_hierarchy() helper.

The irqdomain allocation code will then perform the trimming

Signed-off-by: Marc Zyngier 
---
 include/linux/irqdomain.h |   3 +-
 kernel/irq/irqdomain.c|  99 +++--
 2 files changed, 98 insertions(+), 4 deletions(-)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index b37350c..a52b095 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -509,6 +509,9 @@ extern void irq_domain_free_irqs_parent(struct irq_domain 
*domain,
unsigned int irq_base,
unsigned int nr_irqs);
 
+extern int irq_domain_disconnect_hierarchy(struct irq_domain *domain,
+  unsigned int virq);
+
 static inline bool irq_domain_is_hierarchy(struct irq_domain *domain)
 {
return domain->flags & IRQ_DOMAIN_FLAG_HIERARCHY;
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 76cd7eb..cf8b374 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -1136,6 +1136,17 @@ static struct irq_data 
*irq_domain_insert_irq_data(struct irq_domain *domain,
return irq_data;
 }
 
+static void __irq_domain_free_hierarchy(struct irq_data *irq_data)
+{
+   struct irq_data *tmp;
+
+   while (irq_data) {
+   tmp = irq_data;
+   irq_data = irq_data->parent_data;
+   kfree(tmp);
+   }
+}
+
 static void irq_domain_free_irq_data(unsigned int virq, unsigned int nr_irqs)
 {
struct irq_data *irq_data, *tmp;
@@ -1147,12 +1158,83 @@ static void irq_domain_free_irq_data(unsigned int virq, 
unsigned int nr_irqs)
irq_data->parent_data = NULL;
irq_data->domain = NULL;
 
-   while (tmp) {
-   irq_data = tmp;
-   tmp = tmp->parent_data;
-   kfree(irq_data);
+   __irq_domain_free_hierarchy(tmp);
+   }
+}
+
+/**
+ * irq_domain_disconnect_hierarchy - Mark the first unused level of a hierarchy
+ * @domain:IRQ domain from which the hierarchy is to be disconnected
+ * @virq:  IRQ number where the hierarchy is to be trimmed
+ *
+ * Marks the @virq level belonging to @domain as disconnected.
+ * Returns -EINVAL if @virq doesn't have a valid irq_data pointing
+ * to @domain.
+ *
+ * Its only use is to be able to trim levels of hierarchy that do not
+ * have any real meaning for this interrupt, and that the driver marks
+ * as such from its .alloc() callback.
+ */
+int irq_domain_disconnect_hierarchy(struct irq_domain *domain,
+   unsigned int virq)
+{
+   struct irq_data *irqd;
+
+   irqd = irq_domain_get_irq_data(domain, virq);
+   if (!irqd)
+   return -EINVAL;
+
+   irqd->chip = ERR_PTR(-ENOTCONN);
+   return 0;
+}
+
+static int irq_domain_trim_hierarchy(unsigned int virq)
+{
+   struct irq_data *tail, *irqd, *irq_data;
+
+   irq_data = irq_get_irq_data(virq);
+   tail = NULL;
+
+   /* The first entry must have a valid irqchip */
+   if (!irq_data->chip || IS_ERR(irq_data->chip))
+   return -EINVAL;
+
+   /*
+* Validate that the irq_data chain is sane in the presence of
+* a hierarchy trimming marker.
+*/
+   for (irqd = irq_data->parent_data; irqd; irq_data = irqd, irqd = 
irqd->parent_data) {
+   /* Can't have a valid irqchip after a trim marker */
+   if (irqd->chip && tail)
+   return -EINVAL;
+
+   /* Can't have an empty irqchip before a trim marker */
+   if (!irqd->chip && !tail)
+   return -EINVAL;
+
+   if (IS_ERR(irqd->chip)) {
+   /* Only -ENOTCONN is a valid trim m

[tip: irq/core] irqchip/bcm2836: Configure mailbox interrupts as standard interrupts

2020-10-11 Thread tip-bot2 for Marc Zyngier

The following commit has been merged into the irq/core branch of tip:

Commit-ID: 0809ae724904c3c5dbdddf4169d48aac9c6fcdc8
Gitweb:
https://git.kernel.org/tip/0809ae724904c3c5dbdddf4169d48aac9c6fcdc8
Author:Marc Zyngier 
AuthorDate:Tue, 05 May 2020 12:59:04 +01:00
Committer: Marc Zyngier 
CommitterDate: Thu, 17 Sep 2020 16:37:27 +01:00

irqchip/bcm2836: Configure mailbox interrupts as standard interrupts

In order to switch the bcm2836 driver to privide standard interrupts
for IPIs, it first needs to stop lying about the way things work.

The mailbox interrupt is actually a multiplexer, with enough
bits to store 32 pending interrupts per CPU. So let's turn it
into a chained irqchip.

Once this is done, we can instanciate the corresponding IPIs,
and pass them to the architecture code.

Signed-off-by: Marc Zyngier 
---
 drivers/irqchip/irq-bcm2836.c | 151 +++--
 1 file changed, 125 insertions(+), 26 deletions(-)

diff --git a/drivers/irqchip/irq-bcm2836.c b/drivers/irqchip/irq-bcm2836.c
index 2038693..85df6dd 100644
--- a/drivers/irqchip/irq-bcm2836.c
+++ b/drivers/irqchip/irq-bcm2836.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -89,12 +90,24 @@ static struct irq_chip bcm2836_arm_irqchip_gpu = {
.irq_unmask = bcm2836_arm_irqchip_unmask_gpu_irq,
 };
 
+static void bcm2836_arm_irqchip_dummy_op(struct irq_data *d)
+{
+}
+
+static struct irq_chip bcm2836_arm_irqchip_dummy = {
+   .name   = "bcm2836-dummy",
+   .irq_eoi= bcm2836_arm_irqchip_dummy_op,
+};
+
 static int bcm2836_map(struct irq_domain *d, unsigned int irq,
   irq_hw_number_t hw)
 {
struct irq_chip *chip;
 
switch (hw) {
+   case LOCAL_IRQ_MAILBOX0:
+   chip = _arm_irqchip_dummy;
+   break;
case LOCAL_IRQ_CNTPSIRQ:
case LOCAL_IRQ_CNTPNSIRQ:
case LOCAL_IRQ_CNTHPIRQ:
@@ -127,17 +140,7 @@ __exception_irq_entry 
bcm2836_arm_irqchip_handle_irq(struct pt_regs *regs)
u32 stat;
 
stat = readl_relaxed(intc.base + LOCAL_IRQ_PENDING0 + 4 * cpu);
-   if (stat & BIT(LOCAL_IRQ_MAILBOX0)) {
-#ifdef CONFIG_SMP
-   void __iomem *mailbox0 = (intc.base +
- LOCAL_MAILBOX0_CLR0 + 16 * cpu);
-   u32 mbox_val = readl(mailbox0);
-   u32 ipi = ffs(mbox_val) - 1;
-
-   writel(1 << ipi, mailbox0);
-   handle_IPI(ipi, regs);
-#endif
-   } else if (stat) {
+   if (stat) {
u32 hwirq = ffs(stat) - 1;
 
handle_domain_irq(intc.domain, hwirq, regs);
@@ -145,8 +148,35 @@ __exception_irq_entry 
bcm2836_arm_irqchip_handle_irq(struct pt_regs *regs)
 }
 
 #ifdef CONFIG_SMP
-static void bcm2836_arm_irqchip_send_ipi(const struct cpumask *mask,
-unsigned int ipi)
+static struct irq_domain *ipi_domain;
+
+static void bcm2836_arm_irqchip_handle_ipi(struct irq_desc *desc)
+{
+   struct irq_chip *chip = irq_desc_get_chip(desc);
+   int cpu = smp_processor_id();
+   u32 mbox_val;
+
+   chained_irq_enter(chip, desc);
+
+   mbox_val = readl_relaxed(intc.base + LOCAL_MAILBOX0_CLR0 + 16 * cpu);
+   if (mbox_val) {
+   int hwirq = ffs(mbox_val) - 1;
+   generic_handle_irq(irq_find_mapping(ipi_domain, hwirq));
+   }
+
+   chained_irq_exit(chip, desc);
+}
+
+static void bcm2836_arm_irqchip_ipi_eoi(struct irq_data *d)
+{
+   int cpu = smp_processor_id();
+
+   writel_relaxed(BIT(d->hwirq),
+  intc.base + LOCAL_MAILBOX0_CLR0 + 16 * cpu);
+}
+
+static void bcm2836_arm_irqchip_ipi_send_mask(struct irq_data *d,
+ const struct cpumask *mask)
 {
int cpu;
void __iomem *mailbox0_base = intc.base + LOCAL_MAILBOX0_SET0;
@@ -157,11 +187,45 @@ static void bcm2836_arm_irqchip_send_ipi(const struct 
cpumask *mask,
 */
smp_wmb();
 
-   for_each_cpu(cpu, mask) {
-   writel(1 << ipi, mailbox0_base + 16 * cpu);
+   for_each_cpu(cpu, mask)
+   writel_relaxed(BIT(d->hwirq), mailbox0_base + 16 * cpu);
+}
+
+static struct irq_chip bcm2836_arm_irqchip_ipi = {
+   .name   = "IPI",
+   .irq_eoi= bcm2836_arm_irqchip_ipi_eoi,
+   .ipi_send_mask  = bcm2836_arm_irqchip_ipi_send_mask,
+};
+
+static int bcm2836_arm_irqchip_ipi_alloc(struct irq_domain *d,
+unsigned int virq,
+unsigned int nr_irqs, void *args)
+{
+   int i;
+
+   for (i = 0; i < nr_irqs; i++) {
+   irq_set_percpu_devid(virq + i);
+   irq_domain_set_info(d, virq + i, i, _arm_irqchip_ipi,
+   d->host_data,
+

[tip: irq/core] irqchip/gic-v3: Configure SGIs as standard interrupts

2020-10-11 Thread tip-bot2 for Marc Zyngier

The following commit has been merged into the irq/core branch of tip:

Commit-ID: 64b499d8df40dadb1818ad9f74c4546951b37a8f
Gitweb:
https://git.kernel.org/tip/64b499d8df40dadb1818ad9f74c4546951b37a8f
Author:Marc Zyngier 
AuthorDate:Sat, 25 Apr 2020 15:24:01 +01:00
Committer: Marc Zyngier 
CommitterDate: Thu, 17 Sep 2020 16:37:26 +01:00

irqchip/gic-v3: Configure SGIs as standard interrupts

Change the way we deal with GICv3 SGIs by turning them into proper
IRQs, and calling into the arch code to register the interrupt range
instead of a callback.

Reviewed-by: Valentin Schneider 
Signed-off-by: Marc Zyngier 
---
 drivers/irqchip/irq-gic-v3.c | 94 ++-
 1 file changed, 51 insertions(+), 43 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index f7c99a3..84ceb63 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -36,6 +36,8 @@
 #define FLAGS_WORKAROUND_GICR_WAKER_MSM8996(1ULL << 0)
 #define FLAGS_WORKAROUND_CAVIUM_ERRATUM_38539  (1ULL << 1)
 
+#define GIC_IRQ_TYPE_PARTITION (GIC_IRQ_TYPE_LPI + 1)
+
 struct redist_region {
void __iomem*redist_base;
phys_addr_t phys_base;
@@ -383,7 +385,7 @@ static int gic_irq_set_irqchip_state(struct irq_data *d,
 {
u32 reg;
 
-   if (d->hwirq >= 8192) /* PPI/SPI only */
+   if (d->hwirq >= 8192) /* SGI/PPI/SPI only */
return -EINVAL;
 
switch (which) {
@@ -550,12 +552,12 @@ static int gic_set_type(struct irq_data *d, unsigned int 
type)
u32 offset, index;
int ret;
 
-   /* Interrupt configuration for SGIs can't be changed */
-   if (irq < 16)
-   return -EINVAL;
-
range = get_intid_range(d);
 
+   /* Interrupt configuration for SGIs can't be changed */
+   if (range == SGI_RANGE)
+   return type != IRQ_TYPE_EDGE_RISING ? -EINVAL : 0;
+
/* SPIs have restrictions on the supported types */
if ((range == SPI_RANGE || range == ESPI_RANGE) &&
type != IRQ_TYPE_LEVEL_HIGH && type != IRQ_TYPE_EDGE_RISING)
@@ -583,6 +585,9 @@ static int gic_set_type(struct irq_data *d, unsigned int 
type)
 
 static int gic_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu)
 {
+   if (get_intid_range(d) == SGI_RANGE)
+   return -EINVAL;
+
if (vcpu)
irqd_set_forwarded_to_vcpu(d);
else
@@ -657,38 +662,14 @@ static asmlinkage void __exception_irq_entry 
gic_handle_irq(struct pt_regs *regs
if ((irqnr >= 1020 && irqnr <= 1023))
return;
 
-   /* Treat anything but SGIs in a uniform way */
-   if (likely(irqnr > 15)) {
-   int err;
-
-   if (static_branch_likely(_deactivate_key))
-   gic_write_eoir(irqnr);
-   else
-   isb();
-
-   err = handle_domain_irq(gic_data.domain, irqnr, regs);
-   if (err) {
-   WARN_ONCE(true, "Unexpected interrupt received!\n");
-   gic_deactivate_unhandled(irqnr);
-   }
-   return;
-   }
-   if (irqnr < 16) {
+   if (static_branch_likely(_deactivate_key))
gic_write_eoir(irqnr);
-   if (static_branch_likely(_deactivate_key))
-   gic_write_dir(irqnr);
-#ifdef CONFIG_SMP
-   /*
-* Unlike GICv2, we don't need an smp_rmb() here.
-* The control dependency from gic_read_iar to
-* the ISB in gic_write_eoir is enough to ensure
-* that any shared data read by handle_IPI will
-* be read after the ACK.
-*/
-   handle_IPI(irqnr, regs);
-#else
-   WARN_ONCE(true, "Unexpected SGI received!\n");
-#endif
+   else
+   isb();
+
+   if (handle_domain_irq(gic_data.domain, irqnr, regs)) {
+   WARN_ONCE(true, "Unexpected interrupt received!\n");
+   gic_deactivate_unhandled(irqnr);
}
 }
 
@@ -1136,11 +1117,11 @@ static void gic_send_sgi(u64 cluster_id, u16 tlist, 
unsigned int irq)
gic_write_sgi1r(val);
 }
 
-static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
+static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
 {
int cpu;
 
-   if (WARN_ON(irq >= 16))
+   if (WARN_ON(d->hwirq >= 16))
return;
 
/*
@@ -1154,7 +1135,7 @@ static void gic_raise_softirq(const struct cpumask *mask, 
unsigned int irq)
u16 tlist;
 
tlist = gic_compute_target_list(, mask, cluster_id);
-   gic_send_sgi(cluster_id, tlist, irq);
+   gic_send_sgi(cluster_id, tlist, d->hwirq);
}

[tip: irq/core] irqchip/gic: Cleanup Franken-GIC handling

2020-10-11 Thread tip-bot2 for Marc Zyngier

The following commit has been merged into the irq/core branch of tip:

Commit-ID: 8594c3b85171b6f68e34e07b533ec2f1bf7fb065
Gitweb:
https://git.kernel.org/tip/8594c3b85171b6f68e34e07b533ec2f1bf7fb065
Author:Marc Zyngier 
AuthorDate:Tue, 15 Sep 2020 14:03:51 +01:00
Committer: Marc Zyngier 
CommitterDate: Thu, 17 Sep 2020 16:37:29 +01:00

irqchip/gic: Cleanup Franken-GIC handling

Introduce a static key identifying Samsung's unique creation, allowing
to replace the indirect call to compute the base addresses with
a simple test on the static key.

Faster, cheaper, negative diffstat.

Tested-by: Marek Szyprowski 
Signed-off-by: Marc Zyngier 
---
 drivers/irqchip/irq-gic.c | 41 +++---
 1 file changed, 12 insertions(+), 29 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 66671e1..30edcca 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -83,9 +83,6 @@ struct gic_chip_data {
 #endif
struct irq_domain *domain;
unsigned int gic_irqs;
-#ifdef CONFIG_GIC_NON_BANKED
-   void __iomem *(*get_base)(union gic_base *);
-#endif
 };
 
 #ifdef CONFIG_BL_SWITCHER
@@ -127,35 +124,27 @@ static struct gic_kvm_info gic_v2_kvm_info;
 static DEFINE_PER_CPU(u32, sgi_intid);
 
 #ifdef CONFIG_GIC_NON_BANKED
-static void __iomem *gic_get_percpu_base(union gic_base *base)
-{
-   return raw_cpu_read(*base->percpu_base);
-}
+static DEFINE_STATIC_KEY_FALSE(frankengic_key);
 
-static void __iomem *gic_get_common_base(union gic_base *base)
+static void enable_frankengic(void)
 {
-   return base->common_base;
+   static_branch_enable(_key);
 }
 
-static inline void __iomem *gic_data_dist_base(struct gic_chip_data *data)
+static inline void __iomem *__get_base(union gic_base *base)
 {
-   return data->get_base(>dist_base);
-}
+   if (static_branch_unlikely(_key))
+   return raw_cpu_read(*base->percpu_base);
 
-static inline void __iomem *gic_data_cpu_base(struct gic_chip_data *data)
-{
-   return data->get_base(>cpu_base);
+   return base->common_base;
 }
 
-static inline void gic_set_base_accessor(struct gic_chip_data *data,
-void __iomem *(*f)(union gic_base *))
-{
-   data->get_base = f;
-}
+#define gic_data_dist_base(d)  __get_base(&(d)->dist_base)
+#define gic_data_cpu_base(d)   __get_base(&(d)->cpu_base)
 #else
 #define gic_data_dist_base(d)  ((d)->dist_base.common_base)
 #define gic_data_cpu_base(d)   ((d)->cpu_base.common_base)
-#define gic_set_base_accessor(d, f)
+#define enable_frankengic()do { } while(0)
 #endif
 
 static inline void __iomem *gic_dist_base(struct irq_data *d)
@@ -307,7 +296,7 @@ static int gic_set_type(struct irq_data *d, unsigned int 
type)
 
/* Interrupt configuration for SGIs can't be changed */
if (gicirq < 16)
-   return type == IRQ_TYPE_EDGE_RISING ? 0 : -EINVAL;
+   return type != IRQ_TYPE_EDGE_RISING ? -EINVAL : 0;
 
/* SPIs have restrictions on the supported types */
if (gicirq >= 32 && type != IRQ_TYPE_LEVEL_HIGH &&
@@ -720,11 +709,6 @@ static int gic_notifier(struct notifier_block *self, 
unsigned long cmd,void *v)
int i;
 
for (i = 0; i < CONFIG_ARM_GIC_MAX_NR; i++) {
-#ifdef CONFIG_GIC_NON_BANKED
-   /* Skip over unused GICs */
-   if (!gic_data[i].get_base)
-   continue;
-#endif
switch (cmd) {
case CPU_PM_ENTER:
gic_cpu_save(_data[i]);
@@ -1165,7 +1149,7 @@ static int gic_init_bases(struct gic_chip_data *gic,
gic->raw_cpu_base + offset;
}
 
-   gic_set_base_accessor(gic, gic_get_percpu_base);
+   enable_frankengic();
} else {
/* Normal, sane GIC... */
WARN(gic->percpu_offset,
@@ -1173,7 +1157,6 @@ static int gic_init_bases(struct gic_chip_data *gic,
 gic->percpu_offset);
gic->dist_base.common_base = gic->raw_dist_base;
gic->cpu_base.common_base = gic->raw_cpu_base;
-   gic_set_base_accessor(gic, gic_get_common_base);
}
 
/*

[tip: irq/core] arm64: Kill __smp_cross_call and co

2020-10-11 Thread tip-bot2 for Marc Zyngier

The following commit has been merged into the irq/core branch of tip:

Commit-ID: 5cebfd2d47c214f69d918e3d34ad183c061eddb2
Gitweb:
https://git.kernel.org/tip/5cebfd2d47c214f69d918e3d34ad183c061eddb2
Author:Marc Zyngier 
AuthorDate:Sat, 09 May 2020 14:00:23 +01:00
Committer: Marc Zyngier 
CommitterDate: Thu, 17 Sep 2020 16:37:28 +01:00

arm64: Kill __smp_cross_call and co

The old IPI registration interface is now unused on arm64, so let's
get rid of it.

Reviewed-by: Valentin Schneider 
Acked-by: Catalin Marinas 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/irq_work.h |  4 +---
 arch/arm64/include/asm/smp.h  | 12 +-
 arch/arm64/kernel/smp.c   | 38 +-
 3 files changed, 8 insertions(+), 46 deletions(-)

diff --git a/arch/arm64/include/asm/irq_work.h 
b/arch/arm64/include/asm/irq_work.h
index 8a1ef19..a102028 100644
--- a/arch/arm64/include/asm/irq_work.h
+++ b/arch/arm64/include/asm/irq_work.h
@@ -2,11 +2,9 @@
 #ifndef __ASM_IRQ_WORK_H
 #define __ASM_IRQ_WORK_H
 
-#include 
-
 static inline bool arch_irq_work_has_interrupt(void)
 {
-   return !!__smp_cross_call;
+   return true;
 }
 
 #endif /* __ASM_IRQ_WORK_H */
diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 57c5db1..c298ad0 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -61,24 +61,12 @@ struct seq_file;
 extern void show_ipi_list(struct seq_file *p, int prec);
 
 /*
- * Called from C code, this handles an IPI.
- */
-extern void handle_IPI(int ipinr, struct pt_regs *regs);
-
-/*
  * Discover the set of possible CPUs and determine their
  * SMP operations.
  */
 extern void smp_init_cpus(void);
 
 /*
- * Provide a function to raise an IPI cross call on CPUs in callmap.
- */
-extern void set_smp_cross_call(void (*)(const struct cpumask *, unsigned int));
-
-extern void (*__smp_cross_call)(const struct cpumask *, unsigned int);
-
-/*
  * Register IPI interrupts with the arch SMP code
  */
 extern void set_smp_ipi_range(int ipi_base, int nr_ipi);
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 00c9db1..58fb155 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -782,13 +782,6 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
}
 }
 
-void (*__smp_cross_call)(const struct cpumask *, unsigned int);
-
-void __init set_smp_cross_call(void (*fn)(const struct cpumask *, unsigned 
int))
-{
-   __smp_cross_call = fn;
-}
-
 static const char *ipi_types[NR_IPI] __tracepoint_string = {
 #define S(x,s) [x] = s
S(IPI_RESCHEDULE, "Rescheduling interrupts"),
@@ -800,11 +793,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = 
{
S(IPI_WAKEUP, "CPU wake-up interrupts"),
 };
 
-static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
-{
-   trace_ipi_raise(target, ipi_types[ipinr]);
-   __smp_cross_call(target, ipinr);
-}
+static void smp_cross_call(const struct cpumask *target, unsigned int ipinr);
 
 void show_ipi_list(struct seq_file *p, int prec)
 {
@@ -851,8 +840,7 @@ void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
 #ifdef CONFIG_IRQ_WORK
 void arch_irq_work_raise(void)
 {
-   if (__smp_cross_call)
-   smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK);
+   smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK);
 }
 #endif
 
@@ -959,34 +947,23 @@ static void do_handle_IPI(int ipinr)
trace_ipi_exit_rcuidle(ipi_types[ipinr]);
 }
 
-/* Legacy version, should go away once all irqchips have been converted */
-void handle_IPI(int ipinr, struct pt_regs *regs)
-{
-   struct pt_regs *old_regs = set_irq_regs(regs);
-
-   irq_enter();
-   do_handle_IPI(ipinr);
-   irq_exit();
-
-   set_irq_regs(old_regs);
-}
-
 static irqreturn_t ipi_handler(int irq, void *data)
 {
do_handle_IPI(irq - ipi_irq_base);
return IRQ_HANDLED;
 }
 
-static void ipi_send(const struct cpumask *target, unsigned int ipi)
+static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
 {
-   __ipi_send_mask(ipi_desc[ipi], target);
+   trace_ipi_raise(target, ipi_types[ipinr]);
+   __ipi_send_mask(ipi_desc[ipinr], target);
 }
 
 static void ipi_setup(int cpu)
 {
int i;
 
-   if (!ipi_irq_base)
+   if (WARN_ON_ONCE(!ipi_irq_base))
return;
 
for (i = 0; i < nr_ipi; i++)
@@ -997,7 +974,7 @@ static void ipi_teardown(int cpu)
 {
int i;
 
-   if (!ipi_irq_base)
+   if (WARN_ON_ONCE(!ipi_irq_base))
return;
 
for (i = 0; i < nr_ipi; i++)
@@ -1023,7 +1000,6 @@ void __init set_smp_ipi_range(int ipi_base, int n)
}
 
ipi_irq_base = ipi_base;
-   __smp_cross_call = ipi_send;
 
/* Setup the boot CPU immediately */
ipi_setup(smp_processor_id());

[tip: irq/core] ARM: Kill __smp_cross_call and co

2020-10-11 Thread tip-bot2 for Marc Zyngier

The following commit has been merged into the irq/core branch of tip:

Commit-ID: 8aa837cb7a032884c787b15de81f7d9de8af0869
Gitweb:
https://git.kernel.org/tip/8aa837cb7a032884c787b15de81f7d9de8af0869
Author:Marc Zyngier 
AuthorDate:Mon, 22 Jun 2020 22:15:54 +01:00
Committer: Marc Zyngier 
CommitterDate: Thu, 17 Sep 2020 16:37:28 +01:00

ARM: Kill __smp_cross_call and co

The old IPI registration interface is now unused on arm, so let's
get rid of it.

Reviewed-by: Valentin Schneider 
Signed-off-by: Marc Zyngier 
---
 arch/arm/include/asm/smp.h |  6 --
 arch/arm/kernel/smp.c  | 26 +++---
 2 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 0e29730..0ca55a6 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -39,12 +39,6 @@ void handle_IPI(int ipinr, struct pt_regs *regs);
  */
 extern void smp_init_cpus(void);
 
-
-/*
- * Provide a function to raise an IPI cross call on CPUs in callmap.
- */
-extern void set_smp_cross_call(void (*)(const struct cpumask *, unsigned int));
-
 /*
  * Register IPI interrupts with the arch SMP code
  */
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index f21f784..d51e649 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -511,14 +511,6 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
}
 }
 
-static void (*__smp_cross_call)(const struct cpumask *, unsigned int);
-
-void __init set_smp_cross_call(void (*fn)(const struct cpumask *, unsigned 
int))
-{
-   if (!__smp_cross_call)
-   __smp_cross_call = fn;
-}
-
 static const char *ipi_types[NR_IPI] __tracepoint_string = {
 #define S(x,s) [x] = s
S(IPI_WAKEUP, "CPU wakeup interrupts"),
@@ -530,11 +522,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = 
{
S(IPI_COMPLETION, "completion interrupts"),
 };
 
-static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
-{
-   trace_ipi_raise_rcuidle(target, ipi_types[ipinr]);
-   __smp_cross_call(target, ipinr);
-}
+static void smp_cross_call(const struct cpumask *target, unsigned int ipinr);
 
 void show_ipi_list(struct seq_file *p, int prec)
 {
@@ -713,16 +701,17 @@ static irqreturn_t ipi_handler(int irq, void *data)
return IRQ_HANDLED;
 }
 
-static void ipi_send(const struct cpumask *target, unsigned int ipi)
+static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
 {
-   __ipi_send_mask(ipi_desc[ipi], target);
+   trace_ipi_raise_rcuidle(target, ipi_types[ipinr]);
+   __ipi_send_mask(ipi_desc[ipinr], target);
 }
 
 static void ipi_setup(int cpu)
 {
int i;
 
-   if (!ipi_irq_base)
+   if (WARN_ON_ONCE(!ipi_irq_base))
return;
 
for (i = 0; i < nr_ipi; i++)
@@ -733,7 +722,7 @@ static void ipi_teardown(int cpu)
 {
int i;
 
-   if (!ipi_irq_base)
+   if (WARN_ON_ONCE(!ipi_irq_base))
return;
 
for (i = 0; i < nr_ipi; i++)
@@ -759,7 +748,6 @@ void __init set_smp_ipi_range(int ipi_base, int n)
}
 
ipi_irq_base = ipi_base;
-   set_smp_cross_call(ipi_send);
 
/* Setup the boot CPU immediately */
ipi_setup(smp_processor_id());
@@ -872,7 +860,7 @@ core_initcall(register_cpufreq_notifier);
 
 static void raise_nmi(cpumask_t *mask)
 {
-   __smp_cross_call(mask, IPI_CPU_BACKTRACE);
+   __ipi_send_mask(ipi_desc[IPI_CPU_BACKTRACE], mask);
 }
 
 void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)

< 4 5 6 7 8 9 10 11 12 13 >

801 - 900 of 9391 matches

Mail list logo