Re: [PATCH v2 4/8] KVM: arm64: Generate hyp relocation data
On Fri, 29 Jan 2021 21:43:25 +, Guenter Roeck wrote:
> Hi,
>
> On Tue, Jan 05, 2021 at 06:05:37PM +, David Brazdil wrote:
> > Add a post-processing step to compilation of KVM nVHE hyp code which
> > calls a custom host tool (gen-hyprel) on the partially linked object
> > file (hyp sections' names prefixed).
> >
> > The tool lists all R_AARCH64_ABS64 data relocations targeting hyp
> > sections and generates an assembly file that will form a new section
> > .hyp.reloc in the kernel binary. The new section contains an array of
> > 32-bit offsets to the positions targeted by these relocations.
> >
> > Since the addresses of those positions will not be determined until
> > linking of `vmlinux`, each 32-bit entry carries a R_AARCH64_PREL32
> > relocation with addend + . The linker of `vmlinux` will therefore
> > fill the slot accordingly.
> >
> > This relocation data will be used at runtime to convert the kernel VAs
> > at those positions to hyp VAs.
> >
> > Signed-off-by: David Brazdil
>
> This patch results in the following error for me.
>
> error: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o: assertion elf.ehdr->e_ident[5]
> == 1 failed (lhs=2, rhs=1, line=250)
>
> The problem is seen when trying to build aarch64 images in big endian
> mode.
>
> The script used to reproduce the problem as well as bisect results are
> attached.

I came up with the following patch, which allows the kernel to link and
boot. I don't have any BE userspace, so I didn't verify that I could
boot a guest (the hypervisor does correctly initialise though).

It's not exactly pretty, but it does the job...

Thanks,

	M.
>From d80ca05b2ed90fc30d328041692fa80f525c8d12 Mon Sep 17 00:00:00 2001
From: Marc Zyngier
Date: Sat, 30 Jan 2021 13:07:51 +
Subject: [PATCH] KVM: arm64: Make gen-hyprel endianness agnostic

gen-hyprel is, for better or worse, a native-endian program: it assumes
that the ELF data structures are in the host's endianness, and even
assumes that the compiled kernel is little-endian in one particular
case.

None of these assumptions hold true, though: people actually build
(use?) BE arm64 kernels, and seem to avoid doing so on BE hosts.
Madness!

In order to solve this, wrap each access to the ELF data structures
with the required byte-swapping magic. This requires obtaining the
kernel configuration (the code keys off CONFIG_CPU_*_ENDIAN), and
providing per-endianness wrappers. This results in a kernel that links
and even boots in a model.

Fixes: 8c49b5d43d4c ("KVM: arm64: Generate hyp relocation data")
Reported-by: Guenter Roeck
Signed-off-by: Marc Zyngier
---
 arch/arm64/kvm/hyp/nvhe/Makefile     |  1 +
 arch/arm64/kvm/hyp/nvhe/gen-hyprel.c | 57 
 2 files changed, 42 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 268be1376f74..09d04dd50eb8 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -7,6 +7,7 @@ asflags-y := -D__KVM_NVHE_HYPERVISOR__
 ccflags-y := -D__KVM_NVHE_HYPERVISOR__
 
 hostprogs := gen-hyprel
+HOST_EXTRACFLAGS += -I$(srctree)/include
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
	 hyp-main.o hyp-smp.o psci-relay.o
diff --git a/arch/arm64/kvm/hyp/nvhe/gen-hyprel.c b/arch/arm64/kvm/hyp/nvhe/gen-hyprel.c
index 58fe31fdba8e..ead02c6a7628 100644
--- a/arch/arm64/kvm/hyp/nvhe/gen-hyprel.c
+++ b/arch/arm64/kvm/hyp/nvhe/gen-hyprel.c
@@ -25,6 +25,7 @@
  */
 #include
+#include
 #include
 #include
 #include
@@ -36,6 +37,8 @@
 #include
 #include
 
+#include
+
 #define HYP_SECTION_PREFIX		".hyp"
 #define HYP_RELOC_SECTION		".hyp.reloc"
 #define HYP_SECTION_SYMBOL_PREFIX	"__hyp_section_"
@@ -121,6 +124,28 @@ static struct {
 	const char *sh_string;
 } elf;
 
+#if defined(CONFIG_CPU_LITTLE_ENDIAN)
+
+#define elf16toh(x)	le16toh(x)
+#define elf32toh(x)	le32toh(x)
+#define elf64toh(x)	le64toh(x)
+
+#define ELFENDIAN	ELFDATA2LSB
+
+#elif defined(CONFIG_CPU_BIG_ENDIAN)
+
+#define elf16toh(x)	be16toh(x)
+#define elf32toh(x)	be32toh(x)
+#define elf64toh(x)	be64toh(x)
+
+#define ELFENDIAN	ELFDATA2MSB
+
+#else
+
+#error PDP-endian sadly unsupported...
+
+#endif
+
 #define fatal_error(fmt, ...)					\
 ({								\
	fprintf(stderr, "error: %s: " fmt "\n",			\
@@ -162,12 +187,12 @@ static struct {
 /* Iterate over all sections in the ELF. */
 #define for_each_section(var) \
-	for (var = elf.sh_table; var < elf.sh_table + elf.ehdr->e_shnum; ++var)
+	for (var = elf.sh_table; var &
Re: [PATCH v2 4/8] KVM: arm64: Generate hyp relocation data
Hi Guenter,

Thanks a lot for the heads up.

On 2021-01-29 21:43, Guenter Roeck wrote:
> Hi,
>
> On Tue, Jan 05, 2021 at 06:05:37PM +, David Brazdil wrote:
> > Add a post-processing step to compilation of KVM nVHE hyp code which
> > calls a custom host tool (gen-hyprel) on the partially linked object
> > file (hyp sections' names prefixed).
> >
> > The tool lists all R_AARCH64_ABS64 data relocations targeting hyp
> > sections and generates an assembly file that will form a new section
> > .hyp.reloc in the kernel binary. The new section contains an array of
> > 32-bit offsets to the positions targeted by these relocations.
> >
> > Since the addresses of those positions will not be determined until
> > linking of `vmlinux`, each 32-bit entry carries a R_AARCH64_PREL32
> > relocation with addend + . The linker of `vmlinux` will therefore
> > fill the slot accordingly.
> >
> > This relocation data will be used at runtime to convert the kernel VAs
> > at those positions to hyp VAs.
> >
> > Signed-off-by: David Brazdil
>
> This patch results in the following error for me.
>
> error: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o: assertion elf.ehdr->e_ident[5]
> == 1 failed (lhs=2, rhs=1, line=250)
>
> The problem is seen when trying to build aarch64 images in big endian
> mode.

Ah, big-endian. Of course: the ELF header is in native endianness, and
the sanity checks explode (still much better than generating crap).

I'll have a look shortly. It shouldn't be too hard to fix, just a bit
invasive...

Thanks again,

	M.
-- 
Jazz is not dead. It just smells funny...
[tip: irq/urgent] genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
The following commit has been merged into the irq/urgent branch of tip:

Commit-ID:     4c457e8cb75eda91906a4f89fc39bde3f9a43922
Gitweb:        https://git.kernel.org/tip/4c457e8cb75eda91906a4f89fc39bde3f9a43922
Author:        Marc Zyngier
AuthorDate:    Sat, 23 Jan 2021 12:27:59 
Committer:     Thomas Gleixner
CommitterDate: Sat, 30 Jan 2021 01:22:31 +01:00

genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set

When MSI_FLAG_ACTIVATE_EARLY is set (which is the case for PCI),
__msi_domain_alloc_irqs() performs the activation of the interrupt
(which in the case of PCI results in the endpoint being programmed) as
soon as the interrupt is allocated.

But it appears that this is only done for the first vector, introducing
an inconsistent behaviour for PCI Multi-MSI.

Fix it by iterating over the number of vectors allocated to each MSI
descriptor. This is easily achieved by introducing a new
"for_each_msi_vector" iterator, together with a tiny bit of
refactoring.

Fixes: f3b0946d629c ("genirq/msi: Make sure PCI MSIs are activated early")
Reported-by: Shameer Kolothum
Signed-off-by: Marc Zyngier
Signed-off-by: Thomas Gleixner
Tested-by: Shameer Kolothum
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/20210123122759.1781359-1-...@kernel.org
---
 include/linux/msi.h |  6 ++-
 kernel/irq/msi.c    | 44 
 2 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/include/linux/msi.h b/include/linux/msi.h
index 360a0a7..aef35fd 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -178,6 +178,12 @@ struct msi_desc {
	list_for_each_entry((desc), dev_to_msi_list((dev)), list)
 #define for_each_msi_entry_safe(desc, tmp, dev)	\
	list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
+#define for_each_msi_vector(desc, __irq, dev)			\
+	for_each_msi_entry((desc), (dev))			\
+		if ((desc)->irq)				\
+			for (__irq = (desc)->irq;		\
+			     __irq < ((desc)->irq + (desc)->nvec_used); \
+			     __irq++)
 
 #ifdef CONFIG_IRQ_MSI_IOMMU
 static inline const void *msi_desc_get_iommu_cookie(struct msi_desc 
*desc) diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c index dc0e2d7..b338d62 100644 --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -436,22 +436,22 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, can_reserve = msi_check_reservation_mode(domain, info, dev); - for_each_msi_entry(desc, dev) { - virq = desc->irq; - if (desc->nvec_used == 1) - dev_dbg(dev, "irq %d for MSI\n", virq); - else + /* +* This flag is set by the PCI layer as we need to activate +* the MSI entries before the PCI layer enables MSI in the +* card. Otherwise the card latches a random msi message. +*/ + if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY)) + goto skip_activate; + + for_each_msi_vector(desc, i, dev) { + if (desc->irq == i) { + virq = desc->irq; dev_dbg(dev, "irq [%d-%d] for MSI\n", virq, virq + desc->nvec_used - 1); - /* -* This flag is set by the PCI layer as we need to activate -* the MSI entries before the PCI layer enables MSI in the -* card. Otherwise the card latches a random msi message. -*/ - if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY)) - continue; + } - irq_data = irq_domain_get_irq_data(domain, desc->irq); + irq_data = irq_domain_get_irq_data(domain, i); if (!can_reserve) { irqd_clr_can_reserve(irq_data); if (domain->flags & IRQ_DOMAIN_MSI_NOMASK_QUIRK) @@ -462,28 +462,24 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, goto cleanup; } +skip_activate: /* * If these interrupts use reservation mode, clear the activated bit * so request_irq() will assign the final vector. */ if (can_reserve) { - for_each_msi_entry(desc, dev) { - irq_data = irq_domain_get_irq_data(domain, desc->irq); + for_each_msi_vector(desc, i, dev) { + irq_data = irq_domain_get_irq_data(domain, i); irqd_clr_activated(irq_data); } } return 0; cleanup: - for_each_msi_entry(desc, dev) { - struct irq_data *irq
Re: [PATCH V3 10/14] arm64: nvhe: Allow TRBE access at EL1
On 2021-01-28 09:34, Suzuki K Poulose wrote: On 1/27/21 9:58 AM, Marc Zyngier wrote: On 2021-01-27 08:55, Anshuman Khandual wrote: From: Suzuki K Poulose When the kernel is booted at EL2 in a nvhe configuration, enable the TRBE access to the EL1. The EL1 still can't trace EL2, unless EL2 permits explicitly via TRFCR_EL2.E2TRE. Cc: Will Deacon Cc: Catalin Marinas Cc: Marc Zyngier Cc: Mark Rutland cc: Anshuman Khandual Signed-off-by: Suzuki K Poulose Signed-off-by: Anshuman Khandual Acked-by: Marc Zyngier One comment below, though: --- arch/arm64/include/asm/el2_setup.h | 19 +++ arch/arm64/include/asm/kvm_arm.h | 2 ++ 2 files changed, 21 insertions(+) diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index a7f5a1b..05ecce9 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -72,6 +72,25 @@ .endif 3: + +.ifeqs "\mode", "nvhe" + /* + * If the Trace Buffer is available, allow + * the EL1 to own it. Note that EL1 cannot + * trace the EL2, as it is prevented by + * TRFCR_EL2.E2TRE == 0. + */ + ubfx x0, x1, #ID_AA64DFR0_TRBE_SHIFT, #4 + cbz x0, 1f + + mrs_s x0, SYS_TRBIDR_EL1 + and x0, x0, TRBIDR_PROG + cbnz x0, 1f + mov x0, #(MDCR_EL2_E2TB_EL1_OWN << MDCR_EL2_E2TB_SHIFT) + orr x2, x2, x0 +.endif + +1: Note that this will (badly) conflict with the late-VHE patches[1], where this code path has been reworked. Thanks for the heads up. We will need to see how things get merged. Ideally this patch and the previous one (TRBE definitions could go via the arm64 tree / kvm tree), in which case we could rebase these two patches on the respective tree. I think the current plan of action is to go via the arm64 tree, given that there is nothing really KVM specific there. I'll respin the series one last (hopefully!) time on Monday. Let me know if you need a hand with the rebasing. Thanks, M. -- Jazz is not dead. It just smells funny...
Re: [v2] irqchip: ls-extirq: add flag IRQCHIP_SKIP_SET_WAKE to remove call trace
On 2021-01-28 02:37, Biwen Li (OSS) wrote:
>> -----Original Message-----
>> From: Marc Zyngier
>> Sent: 27 January 2021 19:38
>> To: Biwen Li (OSS)
>> Cc: mark.rutl...@arm.com; Leo Li ; t...@linutronix.de;
>> ja...@lakedaemon.net; linux-kernel@vger.kernel.org; Jiafei Pan ;
>> linux-arm-ker...@lists.infradead.org; Ran Wang ; Biwen Li
>> Subject: Re: [v2] irqchip: ls-extirq: add flag IRQCHIP_SKIP_SET_WAKE to
>> remove call trace
>>
>> On 2021-01-27 08:58, Biwen Li wrote:
>> > From: Biwen Li
>> >
>> > Add flag IRQCHIP_SKIP_SET_WAKE to remove call trace as follow,
>> > ...
>> > [ 45.605239] Unbalanced IRQ 120 wake disable
>> > [ 45.609445] WARNING: CPU: 0 PID: 1124 at kernel/irq/manage.c:800
>> > irq_set_irq_wake+0x154/0x1a0
>> > ...
>> > [ 45.645141] pstate: 6085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)
>> > [ 45.651144] pc : irq_set_irq_wake+0x154/0x1a0
>> > [ 45.655497] lr : irq_set_irq_wake+0x154/0x1a0
>> > ...
>> > [ 45.742825] Call trace:
>> > [ 45.745268] irq_set_irq_wake+0x154/0x1a0
>> > [ 45.749278] ds3232_resume+0x38/0x50
>> >
>> > On ls2088ardb:
>> > In suspend progress (# echo mem > /sys/power/state),
>> > pm_suspend()->suspend_devices_and_enter()->dpm_suspend()->device_suspend()
>> > ->ds3232_suspend()->enable_irq_wake()->irq_set_irq_wake()
>> > ->set_irq_wake_real() returns -ENXIO, so we get
>> > "Cannot set wakeup source" in ds3232_suspend().
>> >
>> > In resume progress (wakeup by flextimer),
>> > dpm_resume_end()->dpm_resume()->device_resume()->ds3232_resume()
>> > ->disable_irq_wake()->irq_set_irq_wake()->set_irq_wake_real()
>> > triggers the kernel call trace (Unbalanced IRQ 120 wake disable).
>>
>> This is again paraphrasing the stack trace instead of explaining the
>> problem it fixes. How about:
>>
>> "The ls-extirq driver doesn't implement the irq_set_wake() callback,
>> while being wake-up capable. This results in ugly behaviours across
>> suspend/resume cycles.
>>
>> Advertise this by adding IRQCHIP_SKIP_SET_WAKE to the irqchip flags"
>>
>> The subject line should be fixed along the same lines, and a Fixes:
>> tag added.
> Okay, got it. Thanks. Will update in v3.

... and v3 still doesn't have a Fixes: tag.
Frankly, if you can't be bothered to do this, why should I worry about your platform being broken? M. -- Jazz is not dead. It just smells funny...
Re: [v7,5/7] PCI: mediatek-gen3: Add MSI support
On 2021-01-27 12:31, Jianjun Wang wrote: On Tue, 2021-01-26 at 13:57 +, Marc Zyngier wrote: On 2021-01-13 11:39, Jianjun Wang wrote: > Add MSI support for MediaTek Gen3 PCIe controller. > > This PCIe controller supports up to 256 MSI vectors, the MSI hardware > block diagram is as follows: > > +-+ > | GIC | > +-+ > ^ > | > port->irq > | > +-+-+-+-+-+-+-+-+ > |0|1|2|3|4|5|6|7| (PCIe intc) > +-+-+-+-+-+-+-+-+ > ^ ^ ^ > | |...| > +---+ +--++---+ > ||| > +-+-+---+--+--+ +-+-+---+--+--+ +-+-+---+--+--+ > |0|1|...|30|31| |0|1|...|30|31| |0|1|...|30|31| (MSI sets) > +-+-+---+--+--+ +-+-+---+--+--+ +-+-+---+--+--+ > ^ ^ ^ ^^ ^ ^ ^^ ^ ^ ^ > | | | || | | || | | | (MSI vectors) > | | | || | | || | | | > > (MSI SET0) (MSI SET1) ... (MSI SET7) > > With 256 MSI vectors supported, the MSI vectors are composed of 8 sets, > each set has its own address for MSI message, and supports 32 MSI > vectors > to generate interrupt. > > Signed-off-by: Jianjun Wang > Acked-by: Ryder Lee > --- > drivers/pci/controller/pcie-mediatek-gen3.c | 261 > 1 file changed, 261 insertions(+) > > diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c > b/drivers/pci/controller/pcie-mediatek-gen3.c > index 7979a2856c35..471d97cd1ef9 100644 > --- a/drivers/pci/controller/pcie-mediatek-gen3.c > +++ b/drivers/pci/controller/pcie-mediatek-gen3.c > @@ -14,6 +14,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -52,11 +53,28 @@ > #define PCIE_LINK_STATUS_REG 0x154 > #define PCIE_PORT_LINKUP BIT(8) > > +#define PCIE_MSI_SET_NUM 8 > +#define PCIE_MSI_IRQS_PER_SET 32 > +#define PCIE_MSI_IRQS_NUM \ > + (PCIE_MSI_IRQS_PER_SET * (PCIE_MSI_SET_NUM)) Spurious inner bracketing. 
> + > #define PCIE_INT_ENABLE_REG 0x180 > +#define PCIE_MSI_ENABLE GENMASK(PCIE_MSI_SET_NUM + 8 - 1, 8) > +#define PCIE_MSI_SHIFT8 > #define PCIE_INTX_SHIFT 24 > #define PCIE_INTX_MASKGENMASK(27, 24) > > #define PCIE_INT_STATUS_REG 0x184 > +#define PCIE_MSI_SET_ENABLE_REG 0x190 > +#define PCIE_MSI_SET_ENABLE GENMASK(PCIE_MSI_SET_NUM - 1, 0) > + > +#define PCIE_MSI_SET_BASE_REG 0xc00 > +#define PCIE_MSI_SET_OFFSET 0x10 > +#define PCIE_MSI_SET_STATUS_OFFSET0x04 > +#define PCIE_MSI_SET_ENABLE_OFFSET0x08 > + > +#define PCIE_MSI_SET_ADDR_HI_BASE 0xc80 > +#define PCIE_MSI_SET_ADDR_HI_OFFSET 0x04 > > #define PCIE_TRANS_TABLE_BASE_REG 0x800 > #define PCIE_ATR_SRC_ADDR_MSB_OFFSET 0x4 > @@ -76,6 +94,18 @@ > #define PCIE_ATR_TLP_TYPE_MEM PCIE_ATR_TLP_TYPE(0) > #define PCIE_ATR_TLP_TYPE_IO PCIE_ATR_TLP_TYPE(2) > > +/** > + * struct mtk_pcie_msi - MSI information for each set > + * @dev: pointer to PCIe device > + * @base: IO mapped register base > + * @msg_addr: MSI message address > + */ > +struct mtk_msi_set { > + struct device *dev; > + void __iomem *base; > + phys_addr_t msg_addr; > +}; > + > /** > * struct mtk_pcie_port - PCIe port information > * @dev: pointer to PCIe device > @@ -88,6 +118,11 @@ > * @num_clks: PCIe clocks count for this port > * @irq: PCIe controller interrupt number > * @intx_domain: legacy INTx IRQ domain > + * @msi_domain: MSI IRQ domain > + * @msi_bottom_domain: MSI IRQ bottom domain > + * @msi_sets: MSI sets information > + * @lock: lock protecting IRQ bit map > + * @msi_irq_in_use: bit map for assigned MSI IRQ > */ > struct mtk_pcie_port { >struct device *dev; > @@ -101,6 +136,11 @@ struct mtk_pcie_port { > >int irq; >struct irq_domain *intx_domain; > + struct irq_domain *msi_domain; > + struct irq_domain *msi_bottom_domain; > + struct mtk_msi_set msi_sets[PCIE_MSI_SET_NUM]; > + struct mutex lock; > + DECLARE_BITMAP(msi_irq_in_use, PCIE_MSI_IRQS_NUM); > }; > > /** > @@ -243,6 +283,15 @@ static int mtk_pcie_startup_port(struct > mtk_pcie_port 
*port) >return err; >} > > + /* Enable MSI */ > + val = readl_relaxed(port->base + PCIE_MSI_SET_ENABLE_REG); > + val |= PCIE_MSI_SET_ENABLE;
Re: [v2] irqchip: ls-extirq: add flag IRQCHIP_SKIP_SET_WAKE to remove call trace
On 2021-01-27 08:58, Biwen Li wrote: From: Biwen Li Add flag IRQCHIP_SKIP_SET_WAKE to remove call trace as follow, ... [ 45.605239] Unbalanced IRQ 120 wake disable [ 45.609445] WARNING: CPU: 0 PID: 1124 at kernel/irq/manage.c:800 irq_set_irq_wake+0x154/0x1a0 ... [ 45.645141] pstate: 6085 (nZCv daIf -PAN -UAO -TCO BTYPE=--) [ 45.651144] pc : irq_set_irq_wake+0x154/0x1a0 [ 45.655497] lr : irq_set_irq_wake+0x154/0x1a0 ... [ 45.742825] Call trace: [ 45.745268] irq_set_irq_wake+0x154/0x1a0 [ 45.749278] ds3232_resume+0x38/0x50 On ls2088ardb: In suspend progress(# echo mem > /sys/power/state), pm_suspend()->suspend_devices_and_enter()->dpm_suspend()->device_suspend() ->ds3232_suspend()->enable_irq_wake()->irq_set_irq_wake() ->set_irq_wake_real(), return -ENXIO, there get "Cannot set wakeup source" in ds3232_suspend(). In resume progress(wakeup by flextimer) dpm_resume_end()->dpm_resume() ->device_resume()->ds3232_resume() ->disable_irq_wake()->irq_set_irq_wake() ->set_irq_wake_real(), there get kernel call trace(Unbalanced IRQ 120 wake disable) This is again paraphrasing the stack trace instead of explaining the problem it fixes. How about: "The ls-extirq driver doesn't implement the irq_set_wake() callback, while being wake-up capable. This results in ugly behaviours across suspend/resume cycles. Advertise this by adding IRQCHIP_SKIP_SET_WAKE to the irqchip flags" The subject line should be fixed along the same lines, and a Fixes: tag added. M. -- Jazz is not dead. It just smells funny...
Re: [PATCH V3 10/14] arm64: nvhe: Allow TRBE access at EL1
On 2021-01-27 08:55, Anshuman Khandual wrote: From: Suzuki K Poulose When the kernel is booted at EL2 in a nvhe configuration, enable the TRBE access to the EL1. The EL1 still can't trace EL2, unless EL2 permits explicitly via TRFCR_EL2.E2TRE. Cc: Will Deacon Cc: Catalin Marinas Cc: Marc Zyngier Cc: Mark Rutland cc: Anshuman Khandual Signed-off-by: Suzuki K Poulose Signed-off-by: Anshuman Khandual Acked-by: Marc Zyngier One comment below, though: --- arch/arm64/include/asm/el2_setup.h | 19 +++ arch/arm64/include/asm/kvm_arm.h | 2 ++ 2 files changed, 21 insertions(+) diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index a7f5a1b..05ecce9 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -72,6 +72,25 @@ .endif 3: + +.ifeqs "\mode", "nvhe" + /* +* If the Trace Buffer is available, allow +* the EL1 to own it. Note that EL1 cannot +* trace the EL2, as it is prevented by +* TRFCR_EL2.E2TRE == 0. +*/ + ubfxx0, x1, #ID_AA64DFR0_TRBE_SHIFT, #4 + cbz x0, 1f + + mrs_s x0, SYS_TRBIDR_EL1 + and x0, x0, TRBIDR_PROG + cbnzx0, 1f + mov x0, #(MDCR_EL2_E2TB_EL1_OWN << MDCR_EL2_E2TB_SHIFT) + orr x2, x2, x0 +.endif + +1: Note that this will (badly) conflict with the late-VHE patches[1], where this code path has been reworked. Thanks, M. [1] https://lore.kernel.org/r/20210125105019.2946057-1-...@kernel.org -- Jazz is not dead. It just smells funny...
[PATCH v5 06/21] arm64: Move VHE-specific SPE setup to mutate_to_vhe()
There isn't much that a VHE kernel needs on top of whatever has been done for nVHE, so let's move the little we need to the VHE stub (the SPE setup), and drop the init_el2_state macro. No expected functional change. Signed-off-by: Marc Zyngier Acked-by: David Brazdil --- arch/arm64/kernel/hyp-stub.S | 28 +--- 1 file changed, 25 insertions(+), 3 deletions(-) diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 373ed2213e1d..6b5c73cf9d52 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -92,9 +92,6 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) msr hcr_el2, x0 isb - // Doesn't do much on VHE, but still, worth a shot - init_el2_state vhe - // Use the EL1 allocated stack, per-cpu offset mrs x0, sp_el1 mov sp, x0 @@ -107,6 +104,31 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) mrs_s x0, SYS_VBAR_EL12 msr vbar_el1, x0 + // Fixup SPE configuration, if supported... + mrs x1, id_aa64dfr0_el1 + ubfxx1, x1, #ID_AA64DFR0_PMSVER_SHIFT, #4 + mov x2, xzr + cbz x1, skip_spe + + // ... and not owned by EL3 + mrs_s x0, SYS_PMBIDR_EL1 + and x0, x0, #(1 << SYS_PMBIDR_EL1_P_SHIFT) + cbnzx0, skip_spe + + // Let the SPE driver in control of the sampling + mrs_s x0, SYS_PMSCR_EL1 + bic x0, x0, #(1 << SYS_PMSCR_EL2_PCT_SHIFT) + bic x0, x0, #(1 << SYS_PMSCR_EL2_PA_SHIFT) + msr_s SYS_PMSCR_EL1, x0 + mov x2, #MDCR_EL2_TPMS + +skip_spe: + // For VHE, use EL2 translation and disable access from EL1 + mrs x0, mdcr_el2 + bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT) + orr x0, x0, x2 + msr mdcr_el2, x0 + // Transfer the MM state from EL1 to EL2 mrs_s x0, SYS_TCR_EL12 msr tcr_el1, x0 -- 2.29.2
[PATCH v5 09/21] arm64: cpufeature: Add global feature override facility
Add a facility to globally override a feature, no matter what the HW says. Yes, this sounds dangerous, but we do respect the "safe" value for a given feature. This doesn't mean the user doesn't need to know what they are doing. Nothing uses this yet, so we are pretty safe. For now. Signed-off-by: Marc Zyngier Reviewed-by: Suzuki K Poulose Acked-by: David Brazdil --- arch/arm64/include/asm/cpufeature.h | 6 + arch/arm64/kernel/cpufeature.c | 42 - 2 files changed, 42 insertions(+), 6 deletions(-) diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 9a555809b89c..fe469389068e 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -63,6 +63,11 @@ struct arm64_ftr_bits { s64 safe_val; /* safe value for FTR_EXACT features */ }; +struct arm64_ftr_override { + u64 val; + u64 mask; +}; + /* * @arm64_ftr_reg - Feature register * @strict_maskBits which should match across all CPUs for sanity. @@ -74,6 +79,7 @@ struct arm64_ftr_reg { u64 user_mask; u64 sys_val; u64 user_val; + struct arm64_ftr_override *override; const struct arm64_ftr_bits *ftr_bits; }; diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index e99eddec0a46..cb58c7c991ef 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -352,9 +352,12 @@ static const struct arm64_ftr_bits ftr_ctr[] = { ARM64_FTR_END, }; +static struct arm64_ftr_override no_override = { }; + struct arm64_ftr_reg arm64_ftr_reg_ctrel0 = { .name = "SYS_CTR_EL0", - .ftr_bits = ftr_ctr + .ftr_bits = ftr_ctr, + .override = &no_override, }; static const struct arm64_ftr_bits ftr_id_mmfr0[] = { @@ -544,13 +547,16 @@ static const struct arm64_ftr_bits ftr_raz[] = { ARM64_FTR_END, }; -#define ARM64_FTR_REG(id, table) { \ - .sys_id = id, \ - .reg = &(struct arm64_ftr_reg){\ - .name = #id,\ - .ftr_bits = &((table)[0]), \ +#define ARM64_FTR_REG_OVERRIDE(id, table, ovr) { \ + .sys_id = id, \ + .reg = &(struct 
arm64_ftr_reg){\ + .name = #id,\ + .override = (ovr), \ + .ftr_bits = &((table)[0]), \ }} +#define ARM64_FTR_REG(id, table) ARM64_FTR_REG_OVERRIDE(id, table, &no_override) + static const struct __ftr_reg_entry { u32 sys_id; struct arm64_ftr_reg*reg; @@ -770,6 +776,30 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new) for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) { u64 ftr_mask = arm64_ftr_mask(ftrp); s64 ftr_new = arm64_ftr_value(ftrp, new); + s64 ftr_ovr = arm64_ftr_value(ftrp, reg->override->val); + + if ((ftr_mask & reg->override->mask) == ftr_mask) { + s64 tmp = arm64_ftr_safe_value(ftrp, ftr_ovr, ftr_new); + char *str = NULL; + + if (ftr_ovr != tmp) { + /* Unsafe, remove the override */ + reg->override->mask &= ~ftr_mask; + reg->override->val &= ~ftr_mask; + tmp = ftr_ovr; + str = "ignoring override"; + } else if (ftr_new != tmp) { + /* Override was valid */ + ftr_new = tmp; + str = "forced"; + } + + if (str) + pr_warn("%s[%d:%d]: %s to %llx\n", + reg->name, + ftrp->shift + ftrp->width - 1, + ftrp->shift, str, tmp); + } val = arm64_ftr_set_value(ftrp, val, ftr_new); -- 2.29.2
[PATCH v5 03/21] arm64: Turn the MMU-on sequence into a macro
Turning the MMU on is a popular sport in the arm64 kernel, and we do it more than once, or even twice. As we are about to add even more, let's turn it into a macro. No expected functional change. Signed-off-by: Marc Zyngier Acked-by: Catalin Marinas Acked-by: David Brazdil --- arch/arm64/include/asm/assembler.h | 17 + arch/arm64/kernel/head.S | 19 --- arch/arm64/mm/proc.S | 12 +--- 3 files changed, 22 insertions(+), 26 deletions(-) diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index bf125c591116..8cded93f99c3 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -675,6 +675,23 @@ USER(\label, icivau, \tmp2)// invalidate I line PoU .endif .endm +/* + * Set SCTLR_EL1 to the passed value, and invalidate the local icache + * in the process. This is called when setting the MMU on. + */ +.macro set_sctlr_el1, reg + msr sctlr_el1, \reg + isb + /* +* Invalidate the local I-cache so that any instructions fetched +* speculatively from the PoC are discarded, since they may have +* been dynamically patched at the PoU. +*/ + ic iallu + dsb nsh + isb +.endm + /* * Check whether to yield to another runnable task from kernel mode NEON code * (which runs with preemption disabled). diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index a0dc987724ed..28e9735302df 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -703,16 +703,9 @@ SYM_FUNC_START(__enable_mmu) offset_ttbr1 x1, x3 msr ttbr1_el1, x1 // load TTBR1 isb - msr sctlr_el1, x0 - isb - /* -* Invalidate the local I-cache so that any instructions fetched -* speculatively from the PoC are discarded, since they may have -* been dynamically patched at the PoU. 
-*/ - ic iallu - dsb nsh - isb + + set_sctlr_el1 x0 + ret SYM_FUNC_END(__enable_mmu) @@ -883,11 +876,7 @@ SYM_FUNC_START_LOCAL(__primary_switch) tlbivmalle1 // Remove any stale TLB entries dsb nsh - msr sctlr_el1, x19 // re-enable the MMU - isb - ic iallu // flush instructions fetched - dsb nsh // via old mapping - isb + set_sctlr_el1 x19 // re-enable the MMU bl __relocate_kernel #endif diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S index ece785477bdc..c967bfd30d2b 100644 --- a/arch/arm64/mm/proc.S +++ b/arch/arm64/mm/proc.S @@ -291,17 +291,7 @@ skip_pgd: /* We're done: fire up the MMU again */ mrs x17, sctlr_el1 orr x17, x17, #SCTLR_ELx_M - msr sctlr_el1, x17 - isb - - /* -* Invalidate the local I-cache so that any instructions fetched -* speculatively from the PoC are discarded, since they may have -* been dynamically patched at the PoU. -*/ - ic iallu - dsb nsh - isb + set_sctlr_el1 x17 /* Set the flag to zero to indicate that we're all done */ str wzr, [flag_ptr] -- 2.29.2
[PATCH v5 04/21] arm64: Provide an 'upgrade to VHE' stub hypercall
As we are about to change the way a VHE system boots, let's provide the core helper, in the form of a stub hypercall that enables VHE and replicates the full EL1 context at EL2, thanks to EL1 and VHE-EL2 being extremely similar. On exception return, the kernel carries on at EL2. Fancy! Nothing calls this new hypercall yet, so no functional change. Signed-off-by: Marc Zyngier Acked-by: David Brazdil --- arch/arm64/include/asm/virt.h | 7 +++- arch/arm64/kernel/hyp-stub.S | 76 ++- 2 files changed, 80 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h index ee6a48df89d9..7379f35ae2c6 100644 --- a/arch/arm64/include/asm/virt.h +++ b/arch/arm64/include/asm/virt.h @@ -35,8 +35,13 @@ */ #define HVC_RESET_VECTORS 2 +/* + * HVC_VHE_RESTART - Upgrade the CPU from EL1 to EL2, if possible + */ +#define HVC_VHE_RESTART3 + /* Max number of HYP stub hypercalls */ -#define HVC_STUB_HCALL_NR 3 +#define HVC_STUB_HCALL_NR 4 /* Error returned when an invalid stub number is passed into x0 */ #define HVC_STUB_ERR 0xbadca11 diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 160f5881a0b7..3f3dbbe8914d 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -8,9 +8,9 @@ #include #include -#include #include +#include #include #include #include @@ -47,10 +47,13 @@ SYM_CODE_END(__hyp_stub_vectors) SYM_CODE_START_LOCAL(el1_sync) cmp x0, #HVC_SET_VECTORS - b.ne2f + b.ne1f msr vbar_el2, x1 b 9f +1: cmp x0, #HVC_VHE_RESTART + b.eqmutate_to_vhe + 2: cmp x0, #HVC_SOFT_RESTART b.ne3f mov x0, x2 @@ -70,6 +73,75 @@ SYM_CODE_START_LOCAL(el1_sync) eret SYM_CODE_END(el1_sync) +// nVHE? No way! Give me the real thing! 
+SYM_CODE_START_LOCAL(mutate_to_vhe) + // Be prepared to fail + mov_q x0, HVC_STUB_ERR + + // Sanity check: MMU *must* be off + mrs x1, sctlr_el2 + tbnzx1, #0, 1f + + // Needs to be VHE capable, obviously + mrs x1, id_aa64mmfr1_el1 + ubfxx1, x1, #ID_AA64MMFR1_VHE_SHIFT, #4 + cbz x1, 1f + + // Engage the VHE magic! + mov_q x0, HCR_HOST_VHE_FLAGS + msr hcr_el2, x0 + isb + + // Doesn't do much on VHE, but still, worth a shot + init_el2_state vhe + + // Use the EL1 allocated stack, per-cpu offset + mrs x0, sp_el1 + mov sp, x0 + mrs x0, tpidr_el1 + msr tpidr_el2, x0 + + // FP configuration, vectors + mrs_s x0, SYS_CPACR_EL12 + msr cpacr_el1, x0 + mrs_s x0, SYS_VBAR_EL12 + msr vbar_el1, x0 + + // Transfer the MM state from EL1 to EL2 + mrs_s x0, SYS_TCR_EL12 + msr tcr_el1, x0 + mrs_s x0, SYS_TTBR0_EL12 + msr ttbr0_el1, x0 + mrs_s x0, SYS_TTBR1_EL12 + msr ttbr1_el1, x0 + mrs_s x0, SYS_MAIR_EL12 + msr mair_el1, x0 + isb + + // Invalidate TLBs before enabling the MMU + tlbivmalle1 + dsb nsh + + // Enable the EL2 S1 MMU, as set up from EL1 + mrs_s x0, SYS_SCTLR_EL12 + set_sctlr_el1 x0 + + // Disable the EL1 S1 MMU for a good measure + mov_q x0, INIT_SCTLR_EL1_MMU_OFF + msr_s SYS_SCTLR_EL12, x0 + + // Hack the exception return to stay at EL2 + mrs x0, spsr_el1 + and x0, x0, #~PSR_MODE_MASK + mov x1, #PSR_MODE_EL2h + orr x0, x0, x1 + msr spsr_el1, x0 + + mov x0, xzr + +1: eret +SYM_CODE_END(mutate_to_vhe) + .macro invalid_vector label SYM_CODE_START_LOCAL(\label) b \label -- 2.29.2
[PATCH v5 00/21] arm64: Early CPU feature override, and applications to VHE, BTI and PAuth
It recently came to light that there is a need to be able to override some CPU features very early on, before the kernel is fully up and running. The reasons for this range from specific feature support (such as using Protected KVM on VHE HW, which is the main motivation for this work) to errata workarounds (a feature is broken on a CPU and needs to be turned off, or rather not enabled).

This series tries to offer a limited framework for this kind of problem, by allowing a set of options to be passed on the command-line and altering the feature set that the cpufeature subsystem exposes to the rest of the kernel. Note that this doesn't change anything for code that directly uses the CPU ID registers.

The series completely changes the way a VHE-capable system boots, by *always* booting non-VHE first, and then upgrading to VHE when deemed capable. Although it sounds scary, this is actually simple to implement (and I wish I had done that five years ago). The "upgrade to VHE" path is then conditioned on the VHE feature not being disabled from the command-line.

Said command-line parsing borrows a lot from the kaslr code, and subsequently allows the "nokaslr" option to be moved to the new infrastructure (though it all looks a bit... odd).

Further patches now add support for disabling BTI and PAuth, the latter being based on an initial series by Srinivas Ramana[0]. There are some ongoing discussions about being able to disable MTE, but no clear resolution on that subject yet.

This has been tested on multiple VHE and non-VHE systems.
* From v4 [4]:
  - Documentation fixes
  - Moved the val/mask pair into an arm64_ftr_override structure,
    leading to simpler code
  - All arm64_ftr_reg now have a default override, which simplifies
    the code a bit further
  - Dropped some of the "const" attributes
  - Renamed init_shadow_regs() to init_feature_override()
  - Renamed struct reg_desc to struct ftr_set_desc
  - Refactored command-line parsing
  - Simplified handling of VHE being disabled on the cmdline
  - Turn EL1 S1 MMU off on switch to VHE
  - HVC_VHE_RESTART now returns an error code on failure
  - Added missing asmlinkage and dummy prototypes
  - Collected Acks and RBs from David, Catalin and Suzuki

* From v3 [3]:
  - Fixed the VHE_RESTART stub (duh!)
  - Switched to using arm64_ftr_safe_value() instead of the user
    provided value
  - Per-feature override warning

* From v2 [2]:
  - Simplify the VHE_RESTART stub
  - Fixed a number of spelling mistakes, and hopefully introduced a
    few more
  - Override features in __read_sysreg_by_encoding()
  - Allow both BTI and PAuth to be overridden on the command line
  - Rebased on -rc3

* From v1 [1]:
  - Fix SPE init on VHE when EL2 doesn't own SPE
  - Fix re-init when KASLR is used
  - Handle the resume path
  - Rebased to 5.11-rc2

[0] https://lore.kernel.org/r/1610152163-16554-1-git-send-email-sram...@codeaurora.org
[1] https://lore.kernel.org/r/20201228104958.1848833-1-...@kernel.org
[2] https://lore.kernel.org/r/20210104135011.2063104-1-...@kernel.org
[3] https://lore.kernel.org/r/2021032811.2455113-1-...@kernel.org
[4] https://lore.kernel.org/r/20210118094533.2874082-1-...@kernel.org

Marc Zyngier (20):
  arm64: Fix labels in el2_setup macros
  arm64: Fix outdated TCR setup comment
  arm64: Turn the MMU-on sequence into a macro
  arm64: Provide an 'upgrade to VHE' stub hypercall
  arm64: Initialise as nVHE before switching to VHE
  arm64: Move VHE-specific SPE setup to mutate_to_vhe()
  arm64: Simplify init_el2_state to be non-VHE only
  arm64: Move SCTLR_EL1 initialisation to EL-agnostic code
  arm64: cpufeature: Add global feature override facility
  arm64: cpufeature: Use IDreg override in __read_sysreg_by_encoding()
  arm64: Extract early FDT mapping from kaslr_early_init()
  arm64: cpufeature: Add an early command-line cpufeature override facility
  arm64: Allow ID_AA64MMFR1_EL1.VH to be overridden from the command line
  arm64: Honor VHE being disabled from the command-line
  arm64: Add an aliasing facility for the idreg override
  arm64: Make kvm-arm.mode={nvhe, protected} an alias of id_aa64mmfr1.vh=0
  KVM: arm64: Document HVC_VHE_RESTART stub hypercall
  arm64: Move "nokaslr" over to the early cpufeature infrastructure
  arm64: cpufeatures: Allow disabling of BTI from the command-line
  arm64: cpufeatures: Allow disabling of Pointer Auth from the command-line

Srinivas Ramana (1):
  arm64: Defer enabling pointer authentication on boot core

 .../admin-guide/kernel-parameters.txt  |  9 +
 Documentation/virt/kvm/arm/hyp-abi.rst |  9 +
 arch/arm64/include/asm/assembler.h     | 17 ++
 arch/arm64/include/asm/cpufeature.h    | 11 +
 arch/arm64/include/asm/el2_setup.h     | 60 ++
 arch/arm64/include/asm/pointer_auth.h  | 10 +
 arch/arm64/include/asm/setup.h         | 11 +
 arch/arm64/include/asm/s
Re: [PATCH 1/2] [REPOST] dt-bindings: qcom,pdc: Add compatible for SM8250
On Fri, 15 Jan 2021 14:39:40 +0530, Vinod Koul wrote: > Add the compatible string for SM8250 SoC from Qualcomm. This compatible > is used already in DTS files but not documented yet Applied to irq/irqchip-5.12, thanks! [1/2] dt-bindings: qcom,pdc: Add compatible for SM8250 commit: e6f93c0115cb24ae4b473f28a27294e99faf129a [2/2] dt-bindings: qcom,pdc: Add compatible for SM8350 commit: 9eaad15e5a409f59660f9fdf867f7d3e6e3db15a Cheers, M. -- Without deviation from the norm, progress is not possible.
Re: [RFC PATCH v1 0/5] Enable CPU TTRem feature for stage-2
Hi Yanan,

On 2021-01-26 13:41, Yanan Wang wrote:

Hi all, This series enable CPU TTRem feature for stage-2 page table and a RFC is sent for some comments, thanks. The ARMv8.4 TTRem feature offers 3 levels of support when changing block size without changing any other parameters that are listed as requiring use of break-before-make. And I found that maybe we can use this feature to make some improvement for stage-2 page table and the following explains what TTRem exactly does for the improvement. If migration of a VM with hugepages is canceled midway, KVM will adjust the stage-2 table mappings back to block mappings. We currently use BBM to replace the table entry with a block entry. Take adjustment of 1G block mapping as an example, with BBM procedures, we have to invalidate the old table entry first, flush TLB and unmap the old table mappings, right before installing the new block entry.

In all honesty, I think the amount of work that is getting added to support this "migration cancelled mid-way" use case is getting out of control. This is adding complexity and corner cases for a use case that really shouldn't happen that often. And it is adding it at the worst possible place, where we really should keep things as straightforward as possible.

I would expect userspace to have a good enough knowledge of whether the migration is likely to succeed, and not to attempt it if it is likely to fail. And yes, it will fail sometimes. But it should be so rare that adding these various stages of BBM support shouldn't be that useful.

Or is there something else that I am missing?

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
Re: [PATCH] irqchip: ls-extirq: add flag IRQCHIP_SKIP_SET_WAKE to remove call trace
On 2021-01-26 11:00, Biwen Li wrote: From: Biwen Li Add flag IRQCHIP_SKIP_SET_WAKE to remove call trace as follow, [useless trace] More importantly, what is the bug that you are fixing? M. -- Jazz is not dead. It just smells funny...
Re: [v7,5/7] PCI: mediatek-gen3: Add MSI support
On 2021-01-13 11:39, Jianjun Wang wrote:

Add MSI support for MediaTek Gen3 PCIe controller.

This PCIe controller supports up to 256 MSI vectors, the MSI
hardware block diagram is as follows:

              +-----+
              | GIC |
              +-----+
                 ^
                 | port->irq
         +-+-+-+-+-+-+-+-+
         |0|1|2|3|4|5|6|7| (PCIe intc)
         +-+-+-+-+-+-+-+-+
          ^ ^           ^
          | |    ...    |
  +-------+ +------+    +-----------+
  |                |                |
+-+-+---+--+--+ +-+-+---+--+--+ +-+-+---+--+--+
|0|1|...|30|31| |0|1|...|30|31| |0|1|...|30|31| (MSI sets)
+-+-+---+--+--+ +-+-+---+--+--+ +-+-+---+--+--+
 ^ ^      ^  ^   ^ ^      ^  ^   ^ ^      ^  ^
 | |      |  |   | |      |  |   | |      |  |  (MSI vectors)
 | |      |  |   | |      |  |   | |      |  |

  (MSI SET0)       (MSI SET1)  ...   (MSI SET7)

With 256 MSI vectors supported, the MSI vectors are composed of 8
sets, each set has its own address for MSI message, and supports
32 MSI vectors to generate interrupt.

Signed-off-by: Jianjun Wang
Acked-by: Ryder Lee
---
 drivers/pci/controller/pcie-mediatek-gen3.c | 261 ++++++++++++++++++++
 1 file changed, 261 insertions(+)

diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c b/drivers/pci/controller/pcie-mediatek-gen3.c
index 7979a2856c35..471d97cd1ef9 100644
--- a/drivers/pci/controller/pcie-mediatek-gen3.c
+++ b/drivers/pci/controller/pcie-mediatek-gen3.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -52,11 +53,28 @@
 #define PCIE_LINK_STATUS_REG		0x154
 #define PCIE_PORT_LINKUP		BIT(8)

+#define PCIE_MSI_SET_NUM		8
+#define PCIE_MSI_IRQS_PER_SET		32
+#define PCIE_MSI_IRQS_NUM \
+	(PCIE_MSI_IRQS_PER_SET * (PCIE_MSI_SET_NUM))

Spurious inner bracketing.
+ #define PCIE_INT_ENABLE_REG0x180 +#define PCIE_MSI_ENABLEGENMASK(PCIE_MSI_SET_NUM + 8 - 1, 8) +#define PCIE_MSI_SHIFT 8 #define PCIE_INTX_SHIFT24 #define PCIE_INTX_MASK GENMASK(27, 24) #define PCIE_INT_STATUS_REG0x184 +#define PCIE_MSI_SET_ENABLE_REG0x190 +#define PCIE_MSI_SET_ENABLEGENMASK(PCIE_MSI_SET_NUM - 1, 0) + +#define PCIE_MSI_SET_BASE_REG 0xc00 +#define PCIE_MSI_SET_OFFSET0x10 +#define PCIE_MSI_SET_STATUS_OFFSET 0x04 +#define PCIE_MSI_SET_ENABLE_OFFSET 0x08 + +#define PCIE_MSI_SET_ADDR_HI_BASE 0xc80 +#define PCIE_MSI_SET_ADDR_HI_OFFSET0x04 #define PCIE_TRANS_TABLE_BASE_REG 0x800 #define PCIE_ATR_SRC_ADDR_MSB_OFFSET 0x4 @@ -76,6 +94,18 @@ #define PCIE_ATR_TLP_TYPE_MEM PCIE_ATR_TLP_TYPE(0) #define PCIE_ATR_TLP_TYPE_IO PCIE_ATR_TLP_TYPE(2) +/** + * struct mtk_pcie_msi - MSI information for each set + * @dev: pointer to PCIe device + * @base: IO mapped register base + * @msg_addr: MSI message address + */ +struct mtk_msi_set { + struct device *dev; + void __iomem *base; + phys_addr_t msg_addr; +}; + /** * struct mtk_pcie_port - PCIe port information * @dev: pointer to PCIe device @@ -88,6 +118,11 @@ * @num_clks: PCIe clocks count for this port * @irq: PCIe controller interrupt number * @intx_domain: legacy INTx IRQ domain + * @msi_domain: MSI IRQ domain + * @msi_bottom_domain: MSI IRQ bottom domain + * @msi_sets: MSI sets information + * @lock: lock protecting IRQ bit map + * @msi_irq_in_use: bit map for assigned MSI IRQ */ struct mtk_pcie_port { struct device *dev; @@ -101,6 +136,11 @@ struct mtk_pcie_port { int irq; struct irq_domain *intx_domain; + struct irq_domain *msi_domain; + struct irq_domain *msi_bottom_domain; + struct mtk_msi_set msi_sets[PCIE_MSI_SET_NUM]; + struct mutex lock; + DECLARE_BITMAP(msi_irq_in_use, PCIE_MSI_IRQS_NUM); }; /** @@ -243,6 +283,15 @@ static int mtk_pcie_startup_port(struct mtk_pcie_port *port) return err; } + /* Enable MSI */ + val = readl_relaxed(port->base + PCIE_MSI_SET_ENABLE_REG); + val |= PCIE_MSI_SET_ENABLE; + 
writel_relaxed(val, port->base + PCIE_MSI_SET_ENABLE_REG); + + val = readl_relaxed(port->base + PCIE_INT_ENABLE_REG); + val |= PCIE_MSI_ENABLE; + writel_relaxed(val, port->base + PCIE_INT_ENABLE_REG); + /* Set PCIe translation windows */ resource_list_for_each_entry(entry, &host->windows) { struct resource *res = entry->res; @@ -286,6 +335,129 @@ static int mtk_pcie_set_affinity(struct irq_data *data, return -EINVAL; } +static struct irq_chip mtk_msi_irq_chip = { + .name = "MSI", + .irq_ack = irq_chip_ack_parent, +}; + +static struct msi_domain_info mtk_msi_domain_info = { + .flags = (MS
Re: [v7,4/7] PCI: mediatek-gen3: Add INTx support
On 2021-01-13 11:39, Jianjun Wang wrote: Add INTx support for MediaTek Gen3 PCIe controller. Signed-off-by: Jianjun Wang Acked-by: Ryder Lee --- drivers/pci/controller/pcie-mediatek-gen3.c | 163 1 file changed, 163 insertions(+) diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c b/drivers/pci/controller/pcie-mediatek-gen3.c index c00ea7c167de..7979a2856c35 100644 --- a/drivers/pci/controller/pcie-mediatek-gen3.c +++ b/drivers/pci/controller/pcie-mediatek-gen3.c @@ -9,6 +9,9 @@ #include #include #include +#include +#include +#include #include #include #include @@ -49,6 +52,12 @@ #define PCIE_LINK_STATUS_REG 0x154 #define PCIE_PORT_LINKUP BIT(8) +#define PCIE_INT_ENABLE_REG0x180 +#define PCIE_INTX_SHIFT24 +#define PCIE_INTX_MASK GENMASK(27, 24) I guess this '24' is actually PCIE_INTX_SHIFT? In this case, please write it as GENMASK(PCIE_INTX_SHIFT + PCI_NUM_INTX - 1, PCIE_INTX_SHIFT) to make it clear that you are dealing with one bit per INTx. + +#define PCIE_INT_STATUS_REG0x184 + #define PCIE_TRANS_TABLE_BASE_REG 0x800 #define PCIE_ATR_SRC_ADDR_MSB_OFFSET 0x4 #define PCIE_ATR_TRSL_ADDR_LSB_OFFSET 0x8 @@ -77,6 +86,8 @@ * @phy: PHY controller block * @clks: PCIe clocks * @num_clks: PCIe clocks count for this port + * @irq: PCIe controller interrupt number + * @intx_domain: legacy INTx IRQ domain */ struct mtk_pcie_port { struct device *dev; @@ -87,6 +98,9 @@ struct mtk_pcie_port { struct phy *phy; struct clk_bulk_data *clks; int num_clks; + + int irq; + struct irq_domain *intx_domain; }; /** @@ -266,6 +280,149 @@ static int mtk_pcie_startup_port(struct mtk_pcie_port *port) return 0; } +static int mtk_pcie_set_affinity(struct irq_data *data, +const struct cpumask *mask, bool force) +{ + return -EINVAL; +} + +static void mtk_intx_mask(struct irq_data *data) +{ + struct mtk_pcie_port *port = irq_data_get_irq_chip_data(data); + u32 val; + + val = readl_relaxed(port->base + PCIE_INT_ENABLE_REG); + val &= ~BIT(data->hwirq + PCIE_INTX_SHIFT); + writel_relaxed(val, 
port->base + PCIE_INT_ENABLE_REG); This is missing some locking. Otherwise, two concurrent mask/unmask for different interrupts will corrupt each other's state. +} + +static void mtk_intx_unmask(struct irq_data *data) +{ + struct mtk_pcie_port *port = irq_data_get_irq_chip_data(data); + u32 val; + + val = readl_relaxed(port->base + PCIE_INT_ENABLE_REG); + val |= BIT(data->hwirq + PCIE_INTX_SHIFT); + writel_relaxed(val, port->base + PCIE_INT_ENABLE_REG); Same thing here. +} + +/** + * mtk_intx_eoi + * @data: pointer to chip specific data + * + * As an emulated level IRQ, its interrupt status will remain + * until the corresponding de-assert message is received; hence that + * the status can only be cleared when the interrupt has been serviced. + */ +static void mtk_intx_eoi(struct irq_data *data) +{ + struct mtk_pcie_port *port = irq_data_get_irq_chip_data(data); + unsigned long hwirq; + + hwirq = data->hwirq + PCIE_INTX_SHIFT; + writel_relaxed(BIT(hwirq), port->base + PCIE_INT_STATUS_REG); +} + +static struct irq_chip mtk_intx_irq_chip = { + .irq_mask = mtk_intx_mask, + .irq_unmask = mtk_intx_unmask, + .irq_eoi= mtk_intx_eoi, + .irq_set_affinity = mtk_pcie_set_affinity, + .name = "PCIe", nit: "PCIe" is not really descriptive. "INTx" would be a bit better. +}; + +static int mtk_pcie_intx_map(struct irq_domain *domain, unsigned int irq, +irq_hw_number_t hwirq) +{ + irq_set_chip_and_handler_name(irq, &mtk_intx_irq_chip, + handle_fasteoi_irq, "INTx"); + irq_set_chip_data(irq, domain->host_data); You probably want to set the chip_data *before* wiring the handler, as otherwise you could end-up with a NULL pointer in any of the callbacks if the interrupt fires between the two. 
+ + return 0; +} + +static const struct irq_domain_ops intx_domain_ops = { + .map = mtk_pcie_intx_map, +}; + +static int mtk_pcie_init_irq_domains(struct mtk_pcie_port *port, +struct device_node *node) +{ + struct device *dev = port->dev; + struct device_node *intc_node; + + /* Setup INTx */ + intc_node = of_get_child_by_name(node, "interrupt-controller"); + if (!intc_node) { + dev_err(dev, "missing PCIe Intc node\n"); + return -ENODEV; + } + + port->intx_domain = irq_domain_add_linear(intc_node, PCI_NUM_INTX, + &intx_domain_ops, port); + if (!port->intx_domain) { + dev_err(dev, "f
Re: [PATCH v3 0/3] Some optimization for stage-2 translation
On Thu, 14 Jan 2021 20:13:47 +0800, Yanan Wang wrote: > This patch series(v3) make some optimization for stage-2 translation. > > About patch-1: > Procedures of hyp stage-1 map and guest stage-2 map are quite different, > but they are now tied closely by function kvm_set_valid_leaf_pte(). > So adjust the relative code for ease of code maintenance in the future. > > [...] Applied to kvm-arm64/concurrent-translation-fault, thanks! [1/3] KVM: arm64: Adjust partial code of hyp stage-1 map and guest stage-2 map commit: 8ed80051c8c31d1587722fdb3af16677eba9d693 [2/3] KVM: arm64: Filter out the case of only changing permissions from stage-2 map path commit: 694d071f8d85d504055540a27f0dbe9dbf44584e [3/3] KVM: arm64: Mark the page dirty only if the fault is handled successfully commit: 509552e65ae8287178a5cdea2d734dcd2d6380ab Cheers, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH] genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
On 2021-01-25 14:39, Shameerali Kolothum Thodi wrote: -Original Message- From: Marc Zyngier [mailto:m...@kernel.org] Sent: 23 January 2021 12:28 To: linux-kernel@vger.kernel.org Cc: Thomas Gleixner ; Bjorn Helgaas ; Shameerali Kolothum Thodi ; sta...@vger.kernel.org Subject: [PATCH] genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set When MSI_FLAG_ACTIVATE_EARLY is set (which is the case for PCI), we perform the activation of the interrupt (which in the case of PCI results in the endpoint being programmed) as soon as the interrupt is allocated. But it appears that this is only done for the first vector, introducing an inconsistent behaviour for PCI Multi-MSI. Fix it by iterating over the number of vectors allocated to each MSI descriptor. This is easily achieved by introducing a new "for_each_msi_vector" iterator, together with a tiny bit of refactoring. Fixes: f3b0946d629c ("genirq/msi: Make sure PCI MSIs are activated early") Reported-by: Shameer Kolothum Signed-off-by: Marc Zyngier Cc: sta...@vger.kernel.org --- include/linux/msi.h | 6 ++ kernel/irq/msi.c| 44 2 files changed, 26 insertions(+), 24 deletions(-) diff --git a/include/linux/msi.h b/include/linux/msi.h index 360a0a7e7341..aef35fd1cf11 100644 --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -178,6 +178,12 @@ struct msi_desc { list_for_each_entry((desc), dev_to_msi_list((dev)), list) #define for_each_msi_entry_safe(desc, tmp, dev)\ list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list) +#define for_each_msi_vector(desc, __irq, dev) \ + for_each_msi_entry((desc), (dev)) \ + if ((desc)->irq) \ + for (__irq = (desc)->irq;\ +__irq < ((desc)->irq + (desc)->nvec_used);\ +__irq++) #ifdef CONFIG_IRQ_MSI_IOMMU static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc) diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c index 2c0c4d6d0f83..d924676c8781 100644 --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -436,22 +436,22 @@ int 
__msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, can_reserve = msi_check_reservation_mode(domain, info, dev); - for_each_msi_entry(desc, dev) { - virq = desc->irq; - if (desc->nvec_used == 1) - dev_dbg(dev, "irq %d for MSI\n", virq); - else + /* +* This flag is set by the PCI layer as we need to activate +* the MSI entries before the PCI layer enables MSI in the +* card. Otherwise the card latches a random msi message. +*/ + if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY)) + goto skip_activate; This will change the dbg print behavior. From the commit f3b0946d629c, it looks like the below dev_dbg() code was there for !MSI_FLAG_ACTIVATE_EARLY case as well. Not sure how much this matters though. I'm not sure this matters either. We may have relied on these statements some 6/7 years ago, as the whole hierarchy stuff was brand new, but we now have a much better debug infrastructure thanks to Thomas. I'd be totally in favour of dropping it. Thanks, M. -- Jazz is not dead. It just smells funny...
Re: [PATCH v5 18/21] arm64: Move "nokaslr" over to the early cpufeature infrastructure
On 2021-01-25 14:19, Ard Biesheuvel wrote:

On Mon, 25 Jan 2021 at 14:54, Marc Zyngier wrote:

On 2021-01-25 12:54, Ard Biesheuvel wrote:

[...]

> This struct now takes up
> - ~100 bytes for the characters themselves (which btw are not emitted
> into __initdata or __initconst)
> - 6x8 bytes for the char pointers
> - 6x24 bytes for the RELA relocations that annotate these pointers as
> quantities that need to be relocated at boot (on a kernel built with
> KASLR)
>
> I know it's only a drop in the ocean, but in this case, where the
> struct is statically declared and defined only once, and in the same
> place, we could easily turn this into
>
> static const struct {
>     char alias[24];
>     char param[20];
> };
>
> and get rid of all the overhead. The only slightly annoying thing is
> that the array sizes need to be kept in sync with the largest instance
> appearing in the array, but this is easy when the struct type is
> declared in the same place where its only instance is defined.

Fair enough. I personally find the result butt-ugly, but I agree that it certainly saves some memory. Does the following work for you? I can even give symbolic names to the various constants (how generous of me! ;-).

To be honest, I was anticipating more of a discussion, but this looks reasonable to me.

It looked like a reasonable ask: all the strings are completely useless once the kernel has booted, and I'm the first to moan that I can't boot an arm64 kernel with less than 60MB of RAM (OK, it's a pretty bloated kernel...).

Does 'char feature[80];' really need 80 bytes though?

It really needs 75 bytes, because of this:

	{ "arm64.nopauth",
	  "id_aa64isar1.gpi=0 id_aa64isar1.gpa=0 "
	  "id_aa64isar1.api=0 id_aa64isar1.apa=0"},

80 is a round enough number.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
Re: [PATCH v5 18/21] arm64: Move "nokaslr" over to the early cpufeature infrastructure
On 2021-01-25 12:54, Ard Biesheuvel wrote: On Mon, 25 Jan 2021 at 11:53, Marc Zyngier wrote: Given that the early cpufeature infrastructure has borrowed quite a lot of code from the kaslr implementation, let's reimplement the matching of the "nokaslr" option with it. Signed-off-by: Marc Zyngier Acked-by: Catalin Marinas Acked-by: David Brazdil --- arch/arm64/kernel/idreg-override.c | 15 + arch/arm64/kernel/kaslr.c | 36 ++ 2 files changed, 17 insertions(+), 34 deletions(-) diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index cbb8eaa48742..3ccf51b84ba4 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -31,8 +31,22 @@ static const struct ftr_set_desc mmfr1 __initdata = { }, }; +extern struct arm64_ftr_override kaslr_feature_override; + +static const struct ftr_set_desc kaslr __initdata = { This should be __initconst not __initdata (below too) + .name = "kaslr", +#ifdef CONFIG_RANDOMIZE_BASE + .override = &kaslr_feature_override, +#endif + .fields = { + { "disabled", 0 }, + {} + }, +}; + static const struct ftr_set_desc * const regs[] __initdata = { &mmfr1, + &kaslr, }; static const struct { @@ -41,6 +55,7 @@ static const struct { } aliases[] __initdata = { { "kvm-arm.mode=nvhe", "id_aa64mmfr1.vh=0" }, { "kvm-arm.mode=protected", "id_aa64mmfr1.vh=0" }, + { "nokaslr","kaslr.disabled=1" }, }; This struct now takes up - ~100 bytes for the characters themselves (which btw are not emitted into __initdata or __initconst) - 6x8 bytes for the char pointers - 6x24 bytes for the RELA relocations that annotate these pointers as quantities that need to be relocated at boot (on a kernel built with KASLR) I know it's only a drop in the ocean, but in this case, where the struct is statically declared and defined only once, and in the same place, we could easily turn this into static const struct { char alias[24]; char param[20]; }; and get rid of all the overhead. 
The only slightly annoying thing is that the array sizes need to be kept in sync with the largest instance appearing in the array, but this is easy when the struct type is declared in the same place where its only instance is defined. Fair enough. I personally find the result butt-ugly, but I agree that it certainly saves some memory. Does the following work for you? I can even give symbolic names to the various constants (how generous of me! ;-). Thanks, M. diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index d1310438d95c..9e7043bdc808 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -14,15 +14,15 @@ #include struct ftr_set_desc { - const char *name; + charname[20]; struct arm64_ftr_override *override; struct { - const char *name; + charname[20]; u8 shift; } fields[]; }; -static const struct ftr_set_desc mmfr1 __initdata = { +static const struct ftr_set_desc mmfr1 __initconst = { .name = "id_aa64mmfr1", .override = &id_aa64mmfr1_override, .fields = { @@ -31,7 +31,7 @@ static const struct ftr_set_desc mmfr1 __initdata = { }, }; -static const struct ftr_set_desc pfr1 __initdata = { +static const struct ftr_set_desc pfr1 __initconst = { .name = "id_aa64pfr1", .override = &id_aa64pfr1_override, .fields = { @@ -40,7 +40,7 @@ static const struct ftr_set_desc pfr1 __initdata = { }, }; -static const struct ftr_set_desc isar1 __initdata = { +static const struct ftr_set_desc isar1 __initconst = { .name = "id_aa64isar1", .override = &id_aa64isar1_override, .fields = { @@ -54,7 +54,7 @@ static const struct ftr_set_desc isar1 __initdata = { extern struct arm64_ftr_override kaslr_feature_override; -static const struct ftr_set_desc kaslr __initdata = { +static const struct ftr_set_desc kaslr __initconst = { .name = "kaslr", #ifdef CONFIG_RANDOMIZE_BASE .override = &kaslr_feature_override, @@ -65,7 +65,7 @@ static const struct ftr_set_desc kaslr __initdata = { }, }; -static const struct ftr_set_desc 
* const regs[] __initdata = { +static const struct ftr_set_desc * const regs[] __initconst = { &mmfr1, &pfr1, &isar1, @@ -73,9 +73,9 @@ st
[PATCH v5 01/21] arm64: Fix labels in el2_setup macros
If someone happens to write the following code:

	b	1f
	init_el2_state	vhe
1:
	[...]

they will be in for a long debugging session, as the label "1f" will be
resolved *inside* the init_el2_state macro instead of after it. Not
really what one expects.

Instead, rewrite the EL2 setup macros to use unambiguous labels, thanks
to the usual macro counter trick.

Acked-by: Catalin Marinas
Signed-off-by: Marc Zyngier
Acked-by: David Brazdil
---
 arch/arm64/include/asm/el2_setup.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index a7f5a1bbc8ac..540116de80bf 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -45,24 +45,24 @@
 	mrs	x1, id_aa64dfr0_el1
 	sbfx	x0, x1, #ID_AA64DFR0_PMUVER_SHIFT, #4
 	cmp	x0, #1
-	b.lt	1f				// Skip if no PMU present
+	b.lt	.Lskip_pmu_\@			// Skip if no PMU present
 	mrs	x0, pmcr_el0			// Disable debug access traps
 	ubfx	x0, x0, #11, #5			// to EL2 and allow access to
-1:
+.Lskip_pmu_\@:
 	csel	x2, xzr, x0, lt			// all PMU counters from EL1

 	/* Statistical profiling */
 	ubfx	x0, x1, #ID_AA64DFR0_PMSVER_SHIFT, #4
-	cbz	x0, 3f				// Skip if SPE not present
+	cbz	x0, .Lskip_spe_\@		// Skip if SPE not present

 .ifeqs "\mode", "nvhe"
 	mrs_s	x0, SYS_PMBIDR_EL1		// If SPE available at EL2,
 	and	x0, x0, #(1 << SYS_PMBIDR_EL1_P_SHIFT)
-	cbnz	x0, 2f				// then permit sampling of physical
+	cbnz	x0, .Lskip_spe_el2_\@		// then permit sampling of physical
 	mov	x0, #(1 << SYS_PMSCR_EL2_PCT_SHIFT | \
 		      1 << SYS_PMSCR_EL2_PA_SHIFT)
 	msr_s	SYS_PMSCR_EL2, x0		// addresses and physical counter
-2:
+.Lskip_spe_el2_\@:
 	mov	x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
 	orr	x2, x2, x0			// If we don't have VHE, then
 						// use EL1&0 translation.
@@ -71,7 +71,7 @@
 						// and disable access from EL1
 .endif

-3:
+.Lskip_spe_\@:
 	msr	mdcr_el2, x2			// Configure debug traps
 .endm

@@ -79,9 +79,9 @@
 .macro __init_el2_lor
 	mrs	x1, id_aa64mmfr1_el1
 	ubfx	x0, x1, #ID_AA64MMFR1_LOR_SHIFT, 4
-	cbz	x0, 1f
+	cbz	x0, .Lskip_lor_\@
 	msr_s	SYS_LORC_EL1, xzr
-1:
+.Lskip_lor_\@:
 .endm

 /* Stage-2 translation */
@@ -93,7 +93,7 @@
 .macro __init_el2_gicv3
 	mrs	x0, id_aa64pfr0_el1
 	ubfx	x0, x0, #ID_AA64PFR0_GIC_SHIFT, #4
-	cbz	x0, 1f
+	cbz	x0, .Lskip_gicv3_\@

 	mrs_s	x0, SYS_ICC_SRE_EL2
 	orr	x0, x0, #ICC_SRE_EL2_SRE	// Set ICC_SRE_EL2.SRE==1
@@ -103,7 +103,7 @@
 	mrs_s	x0, SYS_ICC_SRE_EL2		// Read SRE back,
 	tbz	x0, #0, 1f			// and check that it sticks
 	msr_s	SYS_ICH_HCR_EL2, xzr		// Reset ICC_HCR_EL2 to defaults
-1:
+.Lskip_gicv3_\@:
 .endm

 .macro __init_el2_hstr
@@ -128,14 +128,14 @@
 .macro __init_el2_nvhe_sve
 	mrs	x1, id_aa64pfr0_el1
 	ubfx	x1, x1, #ID_AA64PFR0_SVE_SHIFT, #4
-	cbz	x1, 1f
+	cbz	x1, .Lskip_sve_\@
 	bic	x0, x0, #CPTR_EL2_TZ		// Also disable SVE traps
 	msr	cptr_el2, x0			// Disable copro. traps to EL2
 	isb
 	mov	x1, #ZCR_ELx_LEN_MASK		// SVE: Enable full vector
 	msr_s	SYS_ZCR_EL2, x1			// length for EL1.
-1:
+.Lskip_sve_\@:
 .endm

 .macro __init_el2_nvhe_prepare_eret
-- 
2.29.2
[PATCH v5 02/21] arm64: Fix outdated TCR setup comment
The arm64 kernel has long been able to use more than 39bit VAs.
Since day one, actually. Let's rewrite the offending comment.

Signed-off-by: Marc Zyngier
Acked-by: Catalin Marinas
Acked-by: David Brazdil
---
 arch/arm64/mm/proc.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 1f7ee8c8b7b8..ece785477bdc 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -464,8 +464,8 @@ SYM_FUNC_START(__cpu_setup)
 #endif
 	msr	mair_el1, x5
 	/*
-	 * Set/prepare TCR and TTBR. We use 512GB (39-bit) address range for
-	 * both user and kernel.
+	 * Set/prepare TCR and TTBR. TCR_EL1.T1SZ gets further
+	 * adjusted if the kernel is compiled with 52bit VA support.
 	 */
 	mov_q	x10, TCR_TxSZ(VA_BITS) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \
 			TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
-- 
2.29.2
Re: [PATCH v5 13/21] arm64: Allow ID_AA64MMFR1_EL1.VH to be overridden from the command line
On 2021-01-25 13:15, Suzuki K Poulose wrote: On 1/25/21 10:50 AM, Marc Zyngier wrote: As we want to be able to disable VHE at runtime, let's match "id_aa64mmfr1.vh=" from the command line as an override. This doesn't have much effect yet as our boot code doesn't look at the cpufeature, but only at the HW registers. Signed-off-by: Marc Zyngier Acked-by: David Brazdil --- arch/arm64/include/asm/cpufeature.h | 2 ++ arch/arm64/kernel/cpufeature.c | 5 - arch/arm64/kernel/idreg-override.c | 11 +++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 4179cfc8ed57..b0ed37da4067 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -818,6 +818,8 @@ static inline unsigned int get_vmid_bits(u64 mmfr1) return 8; } +extern struct arm64_ftr_override id_aa64mmfr1_override; + u32 get_kvm_ipa_limit(void); void dump_cpu_features(void); diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 4b84a1e1dc51..c1d6712c4249 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -557,6 +557,8 @@ static const struct arm64_ftr_bits ftr_raz[] = { #define ARM64_FTR_REG(id, table) ARM64_FTR_REG_OVERRIDE(id, table, &no_override) +struct arm64_ftr_override id_aa64mmfr1_override; Does this need to be ro_after_init ? Could do, together with the other two override targeting system registers. Otherwise, looks good to me: Acked-by: Suzuki K Poulose Thanks, M. -- Jazz is not dead. It just smells funny...
Re: (subset) [PATCH v6 0/5] ARM: arm64: Add SMCCC TRNG entropy service
On Wed, 6 Jan 2021 10:34:48 +, Andre Przywara wrote: > a fix to v5, now *really* fixing the wrong priority of SMCCC vs. RNDR > in arch_get_random_seed_long_early(). Apologies for messing this up > in v5 and thanks to broonie for being on the watch! > > Will, Catalin: it would be much appreciated if you could consider taking > patch 1/5. This contains the common definitions, and is a prerequisite > for every other patch, although they are somewhat independent and likely > will need to go through different subsystems. > > [...] Applied to kvm-arm64/rng-5.12, thanks! [5/5] KVM: arm64: implement the TRNG hypervisor call commit: a8e190cdae1bf8e9e490776b8179babc1962bb25 Cheers, M. -- Without deviation from the norm, progress is not possible.
Re: [BUG]: KVM: arm64: Fix symbol dependency in __hyp_call_panic_nvhe
Hi all, On Mon, 25 Jan 2021 20:07:56 +, Oliver Upton wrote: > > > That means we have two options: > > (a) define __hyp_panic_string in a different .c file in all pre-5.9 > > branches, or > > (b) revert the backported patch. > > > > The patch was needed in 5.9 and should stay there. It wasn't needed in > > earlier > > versions because the symbol was being kept alive by another user. It did > > "fix" > > the inline asm semantics, but the problem was never triggered in pre-5.9. > > > > Sasha, with this and the GCC bug in mind, would you agree that (b) is the > > better course of action? > > Sasha, > > Any chance we can get this patch reverted as David has suggested? It > was backported to 5.4 LTS in commit 653ae33b030b ("KVM: arm64: Fix > symbol dependency in __hyp_call_panic_nvhe") and is causing build > issues with a 4.9.4 vintage of GCC. Absolutely. This issue has been lingering for some time now, and needs addressing. I'm definitely in favour of reverting this patch, although there are two possible alternatives: - Cherry-pick 9fd339a45be5 ("arm64: Work around broken GCC 4.9 handling of "S" constraint"), which works around this particular GCC bug - Cherry-pick dca5244d2f5b ("compiler.h: Raise minimum version of GCC to 5.1 for arm64"), which forbids GCC 4.9 as it has been demonstrated to mis-compile the kernel (and this patch is targeting stable anyway) Thanks, M. -- Without deviation from the norm, progress is not possible.
[PATCH v5 13/21] arm64: Allow ID_AA64MMFR1_EL1.VH to be overridden from the command line
As we want to be able to disable VHE at runtime, let's match "id_aa64mmfr1.vh=" from the command line as an override. This doesn't have much effect yet as our boot code doesn't look at the cpufeature, but only at the HW registers. Signed-off-by: Marc Zyngier Acked-by: David Brazdil --- arch/arm64/include/asm/cpufeature.h | 2 ++ arch/arm64/kernel/cpufeature.c | 5 - arch/arm64/kernel/idreg-override.c | 11 +++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 4179cfc8ed57..b0ed37da4067 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -818,6 +818,8 @@ static inline unsigned int get_vmid_bits(u64 mmfr1) return 8; } +extern struct arm64_ftr_override id_aa64mmfr1_override; + u32 get_kvm_ipa_limit(void); void dump_cpu_features(void); diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 4b84a1e1dc51..c1d6712c4249 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -557,6 +557,8 @@ static const struct arm64_ftr_bits ftr_raz[] = { #define ARM64_FTR_REG(id, table) ARM64_FTR_REG_OVERRIDE(id, table, &no_override) +struct arm64_ftr_override id_aa64mmfr1_override; + static const struct __ftr_reg_entry { u32 sys_id; struct arm64_ftr_reg*reg; @@ -604,7 +606,8 @@ static const struct __ftr_reg_entry { /* Op1 = 0, CRn = 0, CRm = 7 */ ARM64_FTR_REG(SYS_ID_AA64MMFR0_EL1, ftr_id_aa64mmfr0), - ARM64_FTR_REG(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1), + ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1, + &id_aa64mmfr1_override), ARM64_FTR_REG(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2), /* Op1 = 0, CRn = 1, CRm = 2 */ diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index 2d184d6674fd..489aa274e3ec 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -10,6 +10,7 @@ #include #include +#include #include struct 
ftr_set_desc { @@ -21,7 +22,17 @@ struct ftr_set_desc { } fields[]; }; +static const struct ftr_set_desc mmfr1 __initdata = { + .name = "id_aa64mmfr1", + .override = &id_aa64mmfr1_override, + .fields = { + { "vh", ID_AA64MMFR1_VHE_SHIFT }, + {} + }, +}; + static const struct ftr_set_desc * const regs[] __initdata = { + &mmfr1, }; static char *cmdline_contains_option(const char *cmdline, const char *option) -- 2.29.2
[PATCH v5 20/21] arm64: Defer enabling pointer authentication on boot core
From: Srinivas Ramana Defer enabling pointer authentication on the boot core until after it's required to be enabled by the cpufeature framework. This will help in controlling the feature dynamically with a boot parameter. Signed-off-by: Ajay Patil Signed-off-by: Prasad Sodagudi Signed-off-by: Srinivas Ramana Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/1610152163-16554-2-git-send-email-sram...@codeaurora.org Reviewed-by: Catalin Marinas Acked-by: David Brazdil --- arch/arm64/include/asm/pointer_auth.h | 10 ++ arch/arm64/include/asm/stackprotector.h | 1 + arch/arm64/kernel/head.S| 4 3 files changed, 11 insertions(+), 4 deletions(-) diff --git a/arch/arm64/include/asm/pointer_auth.h b/arch/arm64/include/asm/pointer_auth.h index c6b4f0603024..b112a11e9302 100644 --- a/arch/arm64/include/asm/pointer_auth.h +++ b/arch/arm64/include/asm/pointer_auth.h @@ -76,6 +76,15 @@ static inline unsigned long ptrauth_strip_insn_pac(unsigned long ptr) return ptrauth_clear_pac(ptr); } +static __always_inline void ptrauth_enable(void) +{ + if (!system_supports_address_auth()) + return; + sysreg_clear_set(sctlr_el1, 0, (SCTLR_ELx_ENIA | SCTLR_ELx_ENIB | + SCTLR_ELx_ENDA | SCTLR_ELx_ENDB)); + isb(); +} + #define ptrauth_thread_init_user(tsk) \ ptrauth_keys_init_user(&(tsk)->thread.keys_user) #define ptrauth_thread_init_kernel(tsk) \ @@ -84,6 +93,7 @@ static inline unsigned long ptrauth_strip_insn_pac(unsigned long ptr) ptrauth_keys_switch_kernel(&(tsk)->thread.keys_kernel) #else /* CONFIG_ARM64_PTR_AUTH */ +#define ptrauth_enable() #define ptrauth_prctl_reset_keys(tsk, arg) (-EINVAL) #define ptrauth_strip_insn_pac(lr) (lr) #define ptrauth_thread_init_user(tsk) diff --git a/arch/arm64/include/asm/stackprotector.h b/arch/arm64/include/asm/stackprotector.h index 7263e0bac680..33f1bb453150 100644 --- a/arch/arm64/include/asm/stackprotector.h +++ b/arch/arm64/include/asm/stackprotector.h @@ -41,6 +41,7 @@ static __always_inline void boot_init_stack_canary(void) #endif
ptrauth_thread_init_kernel(current); ptrauth_thread_switch_kernel(current); + ptrauth_enable(); } #endif /* _ASM_STACKPROTECTOR_H */ diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index 3243e3ae9bd8..2e116ef255e1 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -404,10 +404,6 @@ SYM_FUNC_START_LOCAL(__primary_switched) adr_l x5, init_task msr sp_el0, x5 // Save thread_info -#ifdef CONFIG_ARM64_PTR_AUTH - __ptrauth_keys_init_cpu x5, x6, x7, x8 -#endif - adr_l x8, vectors // load VBAR_EL1 with virtual msr vbar_el1, x8// vector table address isb -- 2.29.2
[PATCH v5 16/21] arm64: Make kvm-arm.mode={nvhe, protected} an alias of id_aa64mmfr1.vh=0
Admittedly, passing id_aa64mmfr1.vh=0 on the command-line isn't that easy to understand, and it is likely that users would much prefer to write "kvm-arm.mode=nvhe", or "...=protected". So here you go. This has the added advantage that we can now always honor the "kvm-arm.mode=protected" option, even when booting on a VHE system. Signed-off-by: Marc Zyngier Acked-by: Catalin Marinas Acked-by: David Brazdil --- Documentation/admin-guide/kernel-parameters.txt | 3 +++ arch/arm64/kernel/idreg-override.c | 2 ++ arch/arm64/kvm/arm.c| 3 +++ 3 files changed, 8 insertions(+) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 9e3cdb271d06..2786fd39a047 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2257,6 +2257,9 @@ kvm-arm.mode= [KVM,ARM] Select one of KVM/arm64's modes of operation. + nvhe: Standard nVHE-based mode, without support for + protected guests. + protected: nVHE-based mode with support for guests whose state is kept private from the host. Not valid if the kernel is running in EL2. 
diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index 4fad3fc4d104..cbb8eaa48742 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -39,6 +39,8 @@ static const struct { const char *alias; const char *feature; } aliases[] __initdata = { + { "kvm-arm.mode=nvhe", "id_aa64mmfr1.vh=0" }, + { "kvm-arm.mode=protected", "id_aa64mmfr1.vh=0" }, }; static char *cmdline_contains_option(const char *cmdline, const char *option) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 04c44853b103..597565a65ca2 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1966,6 +1966,9 @@ static int __init early_kvm_mode_cfg(char *arg) return 0; } + if (strcmp(arg, "nvhe") == 0 && !WARN_ON(is_kernel_in_hyp_mode())) + return 0; + return -EINVAL; } early_param("kvm-arm.mode", early_kvm_mode_cfg); -- 2.29.2
[PATCH v5 15/21] arm64: Add an aliasing facility for the idreg override
In order to map the override of idregs to options that a user can easily understand, let's introduce yet another option array, which maps an option to the corresponding idreg options. Signed-off-by: Marc Zyngier Reviewed-by: Catalin Marinas Acked-by: David Brazdil --- arch/arm64/kernel/idreg-override.c | 17 + 1 file changed, 17 insertions(+) diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index 489aa274e3ec..4fad3fc4d104 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -35,6 +35,12 @@ static const struct ftr_set_desc * const regs[] __initdata = { &mmfr1, }; +static const struct { + const char *alias; + const char *feature; +} aliases[] __initdata = { +}; + static char *cmdline_contains_option(const char *cmdline, const char *option) { char *str = strstr(cmdline, option); @@ -88,6 +94,15 @@ static void __init match_options(const char *cmdline) } } +static __init void match_aliases(const char *cmdline) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(aliases); i++) + if (cmdline_contains_option(cmdline, aliases[i].alias)) + match_options(aliases[i].feature); +} + static __init void parse_cmdline(void) { if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) { @@ -108,6 +123,7 @@ static __init void parse_cmdline(void) goto out; match_options(prop); + match_aliases(prop); if (!IS_ENABLED(CONFIG_CMDLINE_EXTEND)) return; @@ -115,6 +131,7 @@ static __init void parse_cmdline(void) out: match_options(CONFIG_CMDLINE); + match_aliases(CONFIG_CMDLINE); } /* Keep checkers quiet */ -- 2.29.2
[PATCH v5 14/21] arm64: Honor VHE being disabled from the command-line
Finally we can check whether VHE is disabled on the command line, and not enable it if that's the user's wish. Signed-off-by: Marc Zyngier Acked-by: David Brazdil --- arch/arm64/kernel/asm-offsets.c | 3 +++ arch/arm64/kernel/hyp-stub.S| 11 +++ 2 files changed, 14 insertions(+) diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index f42fd9e33981..1add0f21bffe 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -99,6 +99,9 @@ int main(void) DEFINE(CPU_BOOT_STACK, offsetof(struct secondary_data, stack)); DEFINE(CPU_BOOT_TASK,offsetof(struct secondary_data, task)); BLANK(); + DEFINE(FTR_OVR_VAL_OFFSET, offsetof(struct arm64_ftr_override, val)); + DEFINE(FTR_OVR_MASK_OFFSET, offsetof(struct arm64_ftr_override, mask)); + BLANK(); #ifdef CONFIG_KVM DEFINE(VCPU_CONTEXT, offsetof(struct kvm_vcpu, arch.ctxt)); DEFINE(VCPU_FAULT_DISR, offsetof(struct kvm_vcpu, arch.fault.disr_el1)); diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 6b5c73cf9d52..aa7e8d592295 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -87,6 +87,17 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) ubfxx1, x1, #ID_AA64MMFR1_VHE_SHIFT, #4 cbz x1, 1f + // Check whether VHE is disabled from the command line + adr_l x1, id_aa64mmfr1_override + ldr x2, [x1, FTR_OVR_VAL_OFFSET] + ldr x1, [x1, FTR_OVR_MASK_OFFSET] + ubfxx2, x2, #ID_AA64MMFR1_VHE_SHIFT, #4 + ubfxx1, x1, #ID_AA64MMFR1_VHE_SHIFT, #4 + cmp x1, xzr + and x2, x2, x1 + csinv x2, x2, xzr, ne + cbz x2, 1f + // Engage the VHE magic! mov_q x0, HCR_HOST_VHE_FLAGS msr hcr_el2, x0 -- 2.29.2
[PATCH v5 21/21] arm64: cpufeatures: Allow disabling of Pointer Auth from the command-line
In order to be able to disable Pointer Authentication at runtime, whether it is for testing purposes, or to work around HW issues, let's add support for overriding the ID_AA64ISAR1_EL1.{GPI,GPA,API,APA} fields. This is further mapped on the arm64.nopauth command-line alias. Signed-off-by: Marc Zyngier Reviewed-by: Catalin Marinas Acked-by: David Brazdil --- Documentation/admin-guide/kernel-parameters.txt | 3 +++ arch/arm64/include/asm/cpufeature.h | 1 + arch/arm64/kernel/cpufeature.c | 4 +++- arch/arm64/kernel/idreg-override.c | 16 4 files changed, 23 insertions(+), 1 deletion(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 7599fd0f1ad7..f9cb28a39bd0 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -376,6 +376,9 @@ arm64.nobti [ARM64] Unconditionally disable Branch Target Identification support + arm64.nopauth [ARM64] Unconditionally disable Pointer Authentication + support + ataflop=[HW,M68k] atarimouse= [HW,MOUSE] Atari Mouse diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 4e2f2de9d0d7..ec6311903ad4 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -820,6 +820,7 @@ static inline unsigned int get_vmid_bits(u64 mmfr1) extern struct arm64_ftr_override id_aa64mmfr1_override; extern struct arm64_ftr_override id_aa64pfr1_override; +extern struct arm64_ftr_override id_aa64isar1_override; u32 get_kvm_ipa_limit(void); void dump_cpu_features(void); diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index bb99ddb212b5..954a2b7e5d45 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -559,6 +559,7 @@ static const struct arm64_ftr_bits ftr_raz[] = { struct arm64_ftr_override id_aa64mmfr1_override; struct arm64_ftr_override id_aa64pfr1_override; +struct arm64_ftr_override id_aa64isar1_override; 
static const struct __ftr_reg_entry { u32 sys_id; @@ -604,7 +605,8 @@ static const struct __ftr_reg_entry { /* Op1 = 0, CRn = 0, CRm = 6 */ ARM64_FTR_REG(SYS_ID_AA64ISAR0_EL1, ftr_id_aa64isar0), - ARM64_FTR_REG(SYS_ID_AA64ISAR1_EL1, ftr_id_aa64isar1), + ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64ISAR1_EL1, ftr_id_aa64isar1, + &id_aa64isar1_override), /* Op1 = 0, CRn = 0, CRm = 7 */ ARM64_FTR_REG(SYS_ID_AA64MMFR0_EL1, ftr_id_aa64mmfr0), diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index 71349b644246..d1310438d95c 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -40,6 +40,18 @@ static const struct ftr_set_desc pfr1 __initdata = { }, }; +static const struct ftr_set_desc isar1 __initdata = { + .name = "id_aa64isar1", + .override = &id_aa64isar1_override, + .fields = { + { "gpi", ID_AA64ISAR1_GPI_SHIFT }, + { "gpa", ID_AA64ISAR1_GPA_SHIFT }, + { "api", ID_AA64ISAR1_API_SHIFT }, + { "apa", ID_AA64ISAR1_APA_SHIFT }, + {} + }, +}; + extern struct arm64_ftr_override kaslr_feature_override; static const struct ftr_set_desc kaslr __initdata = { @@ -56,6 +68,7 @@ static const struct ftr_set_desc kaslr __initdata = { static const struct ftr_set_desc * const regs[] __initdata = { &mmfr1, &pfr1, + &isar1, &kaslr, }; @@ -66,6 +79,9 @@ static const struct { { "kvm-arm.mode=nvhe", "id_aa64mmfr1.vh=0" }, { "kvm-arm.mode=protected", "id_aa64mmfr1.vh=0" }, { "arm64.nobti","id_aa64pfr1.bt=0" }, + { "arm64.nopauth", + "id_aa64isar1.gpi=0 id_aa64isar1.gpa=0 " + "id_aa64isar1.api=0 id_aa64isar1.apa=0" }, { "nokaslr","kaslr.disabled=1" }, }; -- 2.29.2
[PATCH v5 11/21] arm64: Extract early FDT mapping from kaslr_early_init()
As we want to parse more options very early in the kernel lifetime, let's always map the FDT early. This is achieved by moving that code out of kaslr_early_init(). No functional change expected. Signed-off-by: Marc Zyngier Acked-by: Catalin Marinas Acked-by: David Brazdil --- arch/arm64/include/asm/setup.h | 11 +++ arch/arm64/kernel/head.S | 3 ++- arch/arm64/kernel/kaslr.c | 7 +++ arch/arm64/kernel/setup.c | 15 +++ 4 files changed, 31 insertions(+), 5 deletions(-) create mode 100644 arch/arm64/include/asm/setup.h diff --git a/arch/arm64/include/asm/setup.h b/arch/arm64/include/asm/setup.h new file mode 100644 index ..d3320618ed14 --- /dev/null +++ b/arch/arm64/include/asm/setup.h @@ -0,0 +1,11 @@ +// SPDX-License-Identifier: GPL-2.0 + +#ifndef __ARM64_ASM_SETUP_H +#define __ARM64_ASM_SETUP_H + +#include + +void *get_early_fdt_ptr(void); +void early_fdt_map(u64 dt_phys); + +#endif diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index b425d2587cdb..d74e5f84042e 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -433,6 +433,8 @@ SYM_FUNC_START_LOCAL(__primary_switched) bl __pi_memset dsb ishst // Make zero page visible to PTW + mov x0, x21 // pass FDT address in x0 + bl early_fdt_map // Try mapping the FDT early bl switch_to_vhe #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) bl kasan_early_init @@ -440,7 +442,6 @@ SYM_FUNC_START_LOCAL(__primary_switched) #ifdef CONFIG_RANDOMIZE_BASE tst x23, ~(MIN_KIMG_ALIGN - 1) // already running randomized? b.ne0f - mov x0, x21 // pass FDT address in x0 bl kaslr_early_init// parse FDT for KASLR options cbz x0, 0f // KASLR disabled? 
just proceed orr x23, x23, x0// record KASLR offset diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c index 1c74c45b9494..5fc86e7d01a1 100644 --- a/arch/arm64/kernel/kaslr.c +++ b/arch/arm64/kernel/kaslr.c @@ -19,6 +19,7 @@ #include #include #include +#include enum kaslr_status { KASLR_ENABLED, @@ -92,12 +93,11 @@ static __init bool is_kaslr_disabled_cmdline(void *fdt) * containing function pointers) to be reinitialized, and zero-initialized * .bss variables will be reset to 0. */ -u64 __init kaslr_early_init(u64 dt_phys) +u64 __init kaslr_early_init(void) { void *fdt; u64 seed, offset, mask, module_range; unsigned long raw; - int size; /* * Set a reasonable default for module_alloc_base in case @@ -111,8 +111,7 @@ u64 __init kaslr_early_init(u64 dt_phys) * and proceed with KASLR disabled. We will make another * attempt at mapping the FDT in setup_machine() */ - early_fixmap_init(); - fdt = fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL); + fdt = get_early_fdt_ptr(); if (!fdt) { kaslr_status = KASLR_DISABLED_FDT_REMAP; return 0; diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index c18aacde8bb0..61845c0821d9 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -168,6 +168,21 @@ static void __init smp_build_mpidr_hash(void) pr_warn("Large number of MPIDR hash buckets detected\n"); } +static void *early_fdt_ptr __initdata; + +void __init *get_early_fdt_ptr(void) +{ + return early_fdt_ptr; +} + +asmlinkage void __init early_fdt_map(u64 dt_phys) +{ + int fdt_size; + + early_fixmap_init(); + early_fdt_ptr = fixmap_remap_fdt(dt_phys, &fdt_size, PAGE_KERNEL); +} + static void __init setup_machine_fdt(phys_addr_t dt_phys) { int size; -- 2.29.2
[PATCH v5 19/21] arm64: cpufeatures: Allow disabling of BTI from the command-line
In order to be able to disable BTI at runtime, whether it is for testing purposes, or to work around HW issues, let's add support for overriding the ID_AA64PFR1_EL1.BTI field. This is further mapped on the arm64.nobti command-line alias. Signed-off-by: Marc Zyngier Reviewed-by: Catalin Marinas Acked-by: David Brazdil --- Documentation/admin-guide/kernel-parameters.txt | 3 +++ arch/arm64/include/asm/cpufeature.h | 1 + arch/arm64/kernel/cpufeature.c | 4 +++- arch/arm64/kernel/idreg-override.c | 11 +++ arch/arm64/mm/mmu.c | 2 +- 5 files changed, 19 insertions(+), 2 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 2786fd39a047..7599fd0f1ad7 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -373,6 +373,9 @@ arcrimi=[HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards Format: ,, + arm64.nobti [ARM64] Unconditionally disable Branch Target + Identification support + ataflop=[HW,M68k] atarimouse= [HW,MOUSE] Atari Mouse diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index b0ed37da4067..4e2f2de9d0d7 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -819,6 +819,7 @@ static inline unsigned int get_vmid_bits(u64 mmfr1) } extern struct arm64_ftr_override id_aa64mmfr1_override; +extern struct arm64_ftr_override id_aa64pfr1_override; u32 get_kvm_ipa_limit(void); void dump_cpu_features(void); diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index c1d6712c4249..bb99ddb212b5 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -558,6 +558,7 @@ static const struct arm64_ftr_bits ftr_raz[] = { #define ARM64_FTR_REG(id, table) ARM64_FTR_REG_OVERRIDE(id, table, &no_override) struct arm64_ftr_override id_aa64mmfr1_override; +struct arm64_ftr_override id_aa64pfr1_override; static const struct 
__ftr_reg_entry { u32 sys_id; @@ -593,7 +594,8 @@ static const struct __ftr_reg_entry { /* Op1 = 0, CRn = 0, CRm = 4 */ ARM64_FTR_REG(SYS_ID_AA64PFR0_EL1, ftr_id_aa64pfr0), - ARM64_FTR_REG(SYS_ID_AA64PFR1_EL1, ftr_id_aa64pfr1), + ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64PFR1_EL1, ftr_id_aa64pfr1, + &id_aa64pfr1_override), ARM64_FTR_REG(SYS_ID_AA64ZFR0_EL1, ftr_id_aa64zfr0), /* Op1 = 0, CRn = 0, CRm = 5 */ diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index 3ccf51b84ba4..71349b644246 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -31,6 +31,15 @@ static const struct ftr_set_desc mmfr1 __initdata = { }, }; +static const struct ftr_set_desc pfr1 __initdata = { + .name = "id_aa64pfr1", + .override = &id_aa64pfr1_override, + .fields = { + { "bt", ID_AA64PFR1_BT_SHIFT }, + {} + }, +}; + extern struct arm64_ftr_override kaslr_feature_override; static const struct ftr_set_desc kaslr __initdata = { @@ -46,6 +55,7 @@ static const struct ftr_set_desc kaslr __initdata = { static const struct ftr_set_desc * const regs[] __initdata = { &mmfr1, + &pfr1, &kaslr, }; @@ -55,6 +65,7 @@ static const struct { } aliases[] __initdata = { { "kvm-arm.mode=nvhe", "id_aa64mmfr1.vh=0" }, { "kvm-arm.mode=protected", "id_aa64mmfr1.vh=0" }, + { "arm64.nobti","id_aa64pfr1.bt=0" }, { "nokaslr","kaslr.disabled=1" }, }; diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index ae0c3d023824..617e704c980b 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -628,7 +628,7 @@ static bool arm64_early_this_cpu_has_bti(void) if (!IS_ENABLED(CONFIG_ARM64_BTI_KERNEL)) return false; - pfr1 = read_sysreg_s(SYS_ID_AA64PFR1_EL1); + pfr1 = __read_sysreg_by_encoding(SYS_ID_AA64PFR1_EL1); return cpuid_feature_extract_unsigned_field(pfr1, ID_AA64PFR1_BT_SHIFT); } -- 2.29.2
[PATCH v5 10/21] arm64: cpufeature: Use IDreg override in __read_sysreg_by_encoding()
__read_sysreg_by_encoding() is used by a bunch of cpufeature helpers, which should take the feature override into account. Let's do that. For good measure (and because we are likely to need it further down the line), make this helper available to the rest of the non-modular kernel. Code that needs to know the *real* features of a CPU can still use read_sysreg_s(), and find the bare, ugly truth. Signed-off-by: Marc Zyngier Reviewed-by: Suzuki K Poulose Acked-by: David Brazdil --- arch/arm64/include/asm/cpufeature.h | 1 + arch/arm64/kernel/cpufeature.c | 15 +-- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index fe469389068e..4179cfc8ed57 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -606,6 +606,7 @@ void __init setup_cpu_features(void); void check_local_cpu_capabilities(void); u64 read_sanitised_ftr_reg(u32 id); +u64 __read_sysreg_by_encoding(u32 sys_id); static inline bool cpu_supports_mixed_endian_el0(void) { diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index cb58c7c991ef..4b84a1e1dc51 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1145,14 +1145,17 @@ u64 read_sanitised_ftr_reg(u32 id) EXPORT_SYMBOL_GPL(read_sanitised_ftr_reg); #define read_sysreg_case(r)\ - case r: val = read_sysreg_s(r); break; /* * __read_sysreg_by_encoding() - Used by a STARTING cpu before cpuinfo is populated. 
* Read the system register on the current CPU */ -static u64 __read_sysreg_by_encoding(u32 sys_id) +u64 __read_sysreg_by_encoding(u32 sys_id) { + struct arm64_ftr_reg *regp; + u64 val; + switch (sys_id) { read_sysreg_case(SYS_ID_PFR0_EL1); read_sysreg_case(SYS_ID_PFR1_EL1); @@ -1195,6 +1198,14 @@ static u64 __read_sysreg_by_encoding(u32 sys_id) BUG(); return 0; } + + regp = get_arm64_ftr_reg(sys_id); + if (regp) { + val &= ~regp->override->mask; + val |= (regp->override->val & regp->override->mask); + } + + return val; } #include -- 2.29.2
[PATCH v5 17/21] KVM: arm64: Document HVC_VHE_RESTART stub hypercall
For completeness, let's document the HVC_VHE_RESTART stub. Signed-off-by: Marc Zyngier Acked-by: David Brazdil --- Documentation/virt/kvm/arm/hyp-abi.rst | 9 + 1 file changed, 9 insertions(+) diff --git a/Documentation/virt/kvm/arm/hyp-abi.rst b/Documentation/virt/kvm/arm/hyp-abi.rst index 83cadd8186fa..4d43fbc25195 100644 --- a/Documentation/virt/kvm/arm/hyp-abi.rst +++ b/Documentation/virt/kvm/arm/hyp-abi.rst @@ -58,6 +58,15 @@ these functions (see arch/arm{,64}/include/asm/virt.h): into place (arm64 only), and jump to the restart address while at HYP/EL2. This hypercall is not expected to return to its caller. +* :: + +x0 = HVC_VHE_RESTART (arm64 only) + + Attempt to upgrade the kernel's exception level from EL1 to EL2 by enabling + the VHE mode. This is conditioned by the CPU supporting VHE, the EL2 MMU + being off, and VHE not being disabled by any other means (command line + option, for example). + Any other value of r0/x0 triggers a hypervisor-specific handling, which is not documented here. -- 2.29.2
[PATCH v5 18/21] arm64: Move "nokaslr" over to the early cpufeature infrastructure
Given that the early cpufeature infrastructure has borrowed quite a lot of code from the kaslr implementation, let's reimplement the matching of the "nokaslr" option with it. Signed-off-by: Marc Zyngier Acked-by: Catalin Marinas Acked-by: David Brazdil --- arch/arm64/kernel/idreg-override.c | 15 + arch/arm64/kernel/kaslr.c | 36 ++ 2 files changed, 17 insertions(+), 34 deletions(-) diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index cbb8eaa48742..3ccf51b84ba4 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -31,8 +31,22 @@ static const struct ftr_set_desc mmfr1 __initdata = { }, }; +extern struct arm64_ftr_override kaslr_feature_override; + +static const struct ftr_set_desc kaslr __initdata = { + .name = "kaslr", +#ifdef CONFIG_RANDOMIZE_BASE + .override = &kaslr_feature_override, +#endif + .fields = { + { "disabled", 0 }, + {} + }, +}; + static const struct ftr_set_desc * const regs[] __initdata = { &mmfr1, + &kaslr, }; static const struct { @@ -41,6 +55,7 @@ static const struct { } aliases[] __initdata = { { "kvm-arm.mode=nvhe", "id_aa64mmfr1.vh=0" }, { "kvm-arm.mode=protected", "id_aa64mmfr1.vh=0" }, + { "nokaslr","kaslr.disabled=1" }, }; static char *cmdline_contains_option(const char *cmdline, const char *option) diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c index 5fc86e7d01a1..27f8939deb1b 100644 --- a/arch/arm64/kernel/kaslr.c +++ b/arch/arm64/kernel/kaslr.c @@ -51,39 +51,7 @@ static __init u64 get_kaslr_seed(void *fdt) return ret; } -static __init bool cmdline_contains_nokaslr(const u8 *cmdline) -{ - const u8 *str; - - str = strstr(cmdline, "nokaslr"); - return str == cmdline || (str > cmdline && *(str - 1) == ' '); -} - -static __init bool is_kaslr_disabled_cmdline(void *fdt) -{ - if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) { - int node; - const u8 *prop; - - node = fdt_path_offset(fdt, "/chosen"); - if (node < 0) - goto out; - - prop = fdt_getprop(fdt, node, 
"bootargs", NULL); - if (!prop) - goto out; - - if (cmdline_contains_nokaslr(prop)) - return true; - - if (IS_ENABLED(CONFIG_CMDLINE_EXTEND)) - goto out; - - return false; - } -out: - return cmdline_contains_nokaslr(CONFIG_CMDLINE); -} +struct arm64_ftr_override kaslr_feature_override __initdata; /* * This routine will be executed with the kernel mapped at its default virtual @@ -126,7 +94,7 @@ u64 __init kaslr_early_init(void) * Check if 'nokaslr' appears on the command line, and * return 0 if that is the case. */ - if (is_kaslr_disabled_cmdline(fdt)) { + if (kaslr_feature_override.val & kaslr_feature_override.mask & 0xf) { kaslr_status = KASLR_DISABLED_CMDLINE; return 0; } -- 2.29.2
[PATCH v5 12/21] arm64: cpufeature: Add an early command-line cpufeature override facility
In order to be able to override CPU features at boot time, let's add a command line parser that matches options of the form "cpureg.feature=value", and store the corresponding value into the override val/mask pair. No features are currently defined, so no expected change in functionality. Signed-off-by: Marc Zyngier Acked-by: David Brazdil --- arch/arm64/kernel/Makefile | 2 +- arch/arm64/kernel/head.S | 1 + arch/arm64/kernel/idreg-override.c | 130 + 3 files changed, 132 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/kernel/idreg-override.c diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 86364ab6f13f..2262f0392857 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -17,7 +17,7 @@ obj-y := debug-monitors.o entry.o irq.o fpsimd.o \ return_address.o cpuinfo.o cpu_errata.o \ cpufeature.o alternative.o cacheinfo.o \ smp.o smp_spin_table.o topology.o smccc-call.o \ - syscall.o proton-pack.o + syscall.o proton-pack.o idreg-override.o targets+= efi-entry.o diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index d74e5f84042e..3243e3ae9bd8 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -435,6 +435,7 @@ SYM_FUNC_START_LOCAL(__primary_switched) mov x0, x21 // pass FDT address in x0 bl early_fdt_map // Try mapping the FDT early + bl init_feature_override bl switch_to_vhe #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) bl kasan_early_init diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c new file mode 100644 index ..2d184d6674fd --- /dev/null +++ b/arch/arm64/kernel/idreg-override.c @@ -0,0 +1,130 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Early cpufeature override framework + * + * Copyright (C) 2020 Google LLC + * Author: Marc Zyngier + */ + +#include +#include + +#include +#include + +struct ftr_set_desc { + const char *name; + struct arm64_ftr_override *override; + struct { + const char *name; + u8 shift; + } 
fields[]; +}; + +static const struct ftr_set_desc * const regs[] __initdata = { +}; + +static char *cmdline_contains_option(const char *cmdline, const char *option) +{ + char *str = strstr(cmdline, option); + + if ((str == cmdline || (str > cmdline && *(str - 1) == ' '))) + return str; + + return NULL; +} + +static int __init find_field(const char *cmdline, +const struct ftr_set_desc *reg, int f, u64 *v) +{ + char buf[256], *str; + size_t len; + + snprintf(buf, ARRAY_SIZE(buf), "%s.%s=", reg->name, reg->fields[f].name); + + str = cmdline_contains_option(cmdline, buf); + if (!str) + return -1; + + str += strlen(buf); + len = strcspn(str, " "); + len = min(len, ARRAY_SIZE(buf) - 1); + strncpy(buf, str, len); + buf[len] = 0; + + return kstrtou64(buf, 0, v); +} + +static void __init match_options(const char *cmdline) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(regs); i++) { + int f; + + if (!regs[i]->override) + continue; + + for (f = 0; regs[i]->fields[f].name; f++) { + u64 v; + + if (find_field(cmdline, regs[i], f, &v)) + continue; + + regs[i]->override->val |= (v & 0xf) << regs[i]->fields[f].shift; + regs[i]->override->mask |= 0xfUL << regs[i]->fields[f].shift; + } + } +} + +static __init void parse_cmdline(void) +{ + if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) { + const u8 *prop; + void *fdt; + int node; + + fdt = get_early_fdt_ptr(); + if (!fdt) + goto out; + + node = fdt_path_offset(fdt, "/chosen"); + if (node < 0) + goto out; + + prop = fdt_getprop(fdt, node, "bootargs", NULL); + if (!prop) + goto out; + + match_options(prop); + + if (!IS_ENABLED(CONFIG_CMDLINE_EXTEND)) + return; + } + +out: + match_options(CONFIG_CMDLINE); +} + +/* Keep checkers quiet */ +void init_feature_o
[PATCH v5 07/21] arm64: Simplify init_el2_state to be non-VHE only
As init_el2_state is now nVHE only, let's simplify it and drop the VHE setup. Signed-off-by: Marc Zyngier Acked-by: David Brazdil --- arch/arm64/include/asm/el2_setup.h | 36 +++--- arch/arm64/kernel/head.S | 2 +- arch/arm64/kvm/hyp/nvhe/hyp-init.S | 2 +- 3 files changed, 10 insertions(+), 30 deletions(-) diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index 540116de80bf..d77d358f9395 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -32,16 +32,14 @@ * to transparently mess with the EL0 bits via CNTKCTL_EL1 access in * EL2. */ -.macro __init_el2_timers mode -.ifeqs "\mode", "nvhe" +.macro __init_el2_timers mrs x0, cnthctl_el2 orr x0, x0, #3 // Enable EL1 physical timers msr cnthctl_el2, x0 -.endif msr cntvoff_el2, xzr// Clear virtual offset .endm -.macro __init_el2_debug mode +.macro __init_el2_debug mrs x1, id_aa64dfr0_el1 sbfxx0, x1, #ID_AA64DFR0_PMUVER_SHIFT, #4 cmp x0, #1 @@ -55,7 +53,6 @@ ubfxx0, x1, #ID_AA64DFR0_PMSVER_SHIFT, #4 cbz x0, .Lskip_spe_\@ // Skip if SPE not present -.ifeqs "\mode", "nvhe" mrs_s x0, SYS_PMBIDR_EL1 // If SPE available at EL2, and x0, x0, #(1 << SYS_PMBIDR_EL1_P_SHIFT) cbnzx0, .Lskip_spe_el2_\@ // then permit sampling of physical @@ -66,10 +63,6 @@ mov x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT) orr x2, x2, x0 // If we don't have VHE, then // use EL1&0 translation. -.else - orr x2, x2, #MDCR_EL2_TPMS // For VHE, use EL2 translation - // and disable access from EL1 -.endif .Lskip_spe_\@: msr mdcr_el2, x2// Configure debug traps @@ -145,37 +138,24 @@ /** * Initialize EL2 registers to sane values. This should be called early on all - * cores that were booted in EL2. + * cores that were booted in EL2. Note that everything gets initialised as + * if VHE was not available. The kernel context will be upgraded to VHE + * if possible later on in the boot process. * * Regs: x0, x1 and x2 are clobbered.
*/ -.macro init_el2_state mode -.ifnes "\mode", "vhe" -.ifnes "\mode", "nvhe" -.error "Invalid 'mode' argument" -.endif -.endif - +.macro init_el2_state __init_el2_sctlr - __init_el2_timers \mode - __init_el2_debug \mode + __init_el2_timers + __init_el2_debug __init_el2_lor __init_el2_stage2 __init_el2_gicv3 __init_el2_hstr - - /* -* When VHE is not in use, early init of EL2 needs to be done here. -* When VHE _is_ in use, EL1 will not be used in the host and -* requires no configuration, and all non-hyp-specific EL2 setup -* will be done via the _EL1 system register aliases in __cpu_setup. -*/ -.ifeqs "\mode", "nvhe" __init_el2_nvhe_idregs __init_el2_nvhe_cptr __init_el2_nvhe_sve __init_el2_nvhe_prepare_eret -.endif .endm #endif /* __ARM_KVM_INIT_H__ */ diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index 07445fd976ef..36212c05df42 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -501,7 +501,7 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL) msr hcr_el2, x0 isb - init_el2_state nvhe + init_el2_state /* Hypervisor stub */ adr_l x0, __hyp_stub_vectors diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S index 31b060a44045..222cfc3e7190 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S +++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S @@ -189,7 +189,7 @@ SYM_CODE_START_LOCAL(__kvm_hyp_init_cpu) 2: msr SPsel, #1 // We want to use SP_EL{1,2} /* Initialize EL2 CPU state to sane values. */ - init_el2_state nvhe // Clobbers x0..x2 + init_el2_state // Clobbers x0..x2 /* Enable MMU, set vectors and stack. */ mov x0, x28 -- 2.29.2
[PATCH v5 05/21] arm64: Initialise as nVHE before switching to VHE
As we are aiming to be able to control whether we enable VHE or not, let's always drop down to EL1 first, and only then upgrade to VHE if at all possible. This means that if the kernel is booted at EL2, we always start with a nVHE init, drop to EL1 to initialise the kernel, and only then upgrade the kernel EL to EL2 if possible (the process is obviously shortened for secondary CPUs). The resume path is handled similarly to a secondary CPU boot. Signed-off-by: Marc Zyngier Acked-by: David Brazdil --- arch/arm64/kernel/head.S | 38 ++-- arch/arm64/kernel/hyp-stub.S | 24 +++ arch/arm64/kernel/sleep.S| 1 + 3 files changed, 27 insertions(+), 36 deletions(-) diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index 28e9735302df..07445fd976ef 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -433,6 +433,7 @@ SYM_FUNC_START_LOCAL(__primary_switched) bl __pi_memset dsb ishst // Make zero page visible to PTW + bl switch_to_vhe #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) bl kasan_early_init #endif @@ -493,42 +494,6 @@ SYM_INNER_LABEL(init_el1, SYM_L_LOCAL) eret SYM_INNER_LABEL(init_el2, SYM_L_LOCAL) -#ifdef CONFIG_ARM64_VHE - /* -* Check for VHE being present. x2 being non-zero indicates that we -* do have VHE, and that the kernel is intended to run at EL2. -*/ - mrs x2, id_aa64mmfr1_el1 - ubfxx2, x2, #ID_AA64MMFR1_VHE_SHIFT, #4 -#else - mov x2, xzr -#endif - cbz x2, init_el2_nvhe - - /* -* When VHE _is_ in use, EL1 will not be used in the host and -* requires no configuration, and all non-hyp-specific EL2 setup -* will be done via the _EL1 system register aliases in __cpu_setup. -*/ - mov_q x0, HCR_HOST_VHE_FLAGS - msr hcr_el2, x0 - isb - - init_el2_state vhe - - isb - - mov_q x0, INIT_PSTATE_EL2 - msr spsr_el2, x0 - msr elr_el2, lr - mov w0, #BOOT_CPU_MODE_EL2 - eret - -SYM_INNER_LABEL(init_el2_nvhe, SYM_L_LOCAL) - /* -* When VHE is not in use, early init of EL2 and EL1 needs to be -* done here.
-*/ mov_q x0, INIT_SCTLR_EL1_MMU_OFF msr sctlr_el1, x0 @@ -623,6 +588,7 @@ SYM_FUNC_START_LOCAL(secondary_startup) /* * Common entry point for secondary CPUs. */ + bl switch_to_vhe bl __cpu_secondary_check52bitva bl __cpu_setup // initialise processor adrpx1, swapper_pg_dir diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 3f3dbbe8914d..373ed2213e1d 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -190,3 +190,27 @@ SYM_FUNC_START(__hyp_reset_vectors) hvc #0 ret SYM_FUNC_END(__hyp_reset_vectors) + +/* + * Entry point to switch to VHE if deemed capable + */ +SYM_FUNC_START(switch_to_vhe) +#ifdef CONFIG_ARM64_VHE + // Need to have booted at EL2 + adr_l x1, __boot_cpu_mode + ldr w0, [x1] + cmp w0, #BOOT_CPU_MODE_EL2 + b.ne1f + + // and still be at EL1 + mrs x0, CurrentEL + cmp x0, #CurrentEL_EL1 + b.ne1f + + // Turn the world upside down + mov x0, #HVC_VHE_RESTART + hvc #0 +1: +#endif + ret +SYM_FUNC_END(switch_to_vhe) diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S index 6bdef7362c0e..5bfd9b87f85d 100644 --- a/arch/arm64/kernel/sleep.S +++ b/arch/arm64/kernel/sleep.S @@ -100,6 +100,7 @@ SYM_FUNC_END(__cpu_suspend_enter) .pushsection ".idmap.text", "awx" SYM_CODE_START(cpu_resume) bl init_kernel_el + bl switch_to_vhe bl __cpu_setup /* enable the MMU early - so we can access sleep_save_stash by va */ adrpx1, swapper_pg_dir -- 2.29.2
[PATCH v5 08/21] arm64: Move SCTLR_EL1 initialisation to EL-agnostic code
We can now move the initial SCTLR_EL1 setup to be used for both EL1 and EL2 setup. Signed-off-by: Marc Zyngier Acked-by: Catalin Marinas Acked-by: David Brazdil --- arch/arm64/kernel/head.S | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index 36212c05df42..b425d2587cdb 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -479,13 +479,14 @@ EXPORT_SYMBOL(kimage_vaddr) * booted in EL1 or EL2 respectively. */ SYM_FUNC_START(init_kernel_el) + mov_q x0, INIT_SCTLR_EL1_MMU_OFF + msr sctlr_el1, x0 + mrs x0, CurrentEL cmp x0, #CurrentEL_EL2 b.eqinit_el2 SYM_INNER_LABEL(init_el1, SYM_L_LOCAL) - mov_q x0, INIT_SCTLR_EL1_MMU_OFF - msr sctlr_el1, x0 isb mov_q x0, INIT_PSTATE_EL1 msr spsr_el1, x0 @@ -494,9 +495,6 @@ SYM_INNER_LABEL(init_el1, SYM_L_LOCAL) eret SYM_INNER_LABEL(init_el2, SYM_L_LOCAL) - mov_q x0, INIT_SCTLR_EL1_MMU_OFF - msr sctlr_el1, x0 - mov_q x0, HCR_HOST_NVHE_FLAGS msr hcr_el2, x0 isb -- 2.29.2
Re: [PATCH v4 15/21] arm64: Add an aliasing facility for the idreg override
On Mon, 18 Jan 2021 13:18:39 +, David Brazdil wrote: > > On Mon, Jan 18, 2021 at 09:45:27AM +, Marc Zyngier wrote: > > In order to map the override of idregs to options that a user > > can easily understand, let's introduce yet another option > > array, which maps an option to the corresponding idreg options. > > > > Signed-off-by: Marc Zyngier > Acked-by: David Brazdil > > > --- > > arch/arm64/kernel/idreg-override.c | 20 > > 1 file changed, 20 insertions(+) > > > > diff --git a/arch/arm64/kernel/idreg-override.c > > b/arch/arm64/kernel/idreg-override.c > > index 75d9845f489b..16bc8b3b93ae 100644 > > --- a/arch/arm64/kernel/idreg-override.c > > +++ b/arch/arm64/kernel/idreg-override.c > > @@ -37,6 +37,12 @@ static const struct reg_desc * const regs[] __initdata = > > { > > &mmfr1, > > }; > > > > +static const struct { > > + const char * const alias; > > + const char * const feature; > > +} aliases[] __initdata = { > > +}; > > + > > static int __init find_field(const char *cmdline, const struct reg_desc > > *reg, > > int f, u64 *v) > > { > > @@ -80,6 +86,18 @@ static void __init match_options(const char *cmdline) > > } > > } > > > > +static __init void match_aliases(const char *cmdline) > > +{ > > + int i; > > + > > + for (i = 0; i < ARRAY_SIZE(aliases); i++) { > > + char *str = strstr(cmdline, aliases[i].alias); > > + > > + if ((str == cmdline || (str > cmdline && *(str - 1) == ' '))) > > nit: Extract to a 'cmdline_contains' helper? Took me a good few seconds to > parse this in the previous patch. Giving it a name would help, and now it's > also shared. Good point. Adopted! Thanks, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH v4 04/21] arm64: Provide an 'upgrade to VHE' stub hypercall
On Mon, 18 Jan 2021 11:25:16 +, David Brazdil wrote: > > On Mon, Jan 18, 2021 at 09:45:16AM +, Marc Zyngier wrote: > > As we are about to change the way a VHE system boots, let's > > provide the core helper, in the form of a stub hypercall that > > enables VHE and replicates the full EL1 context at EL2, thanks > > to EL1 and VHE-EL2 being extremely similar. > > > > On exception return, the kernel carries on at EL2. Fancy! > > > > Nothing calls this new hypercall yet, so no functional change. > > > > Signed-off-by: Marc Zyngier > > --- > > arch/arm64/include/asm/virt.h | 7 +++- > > arch/arm64/kernel/hyp-stub.S | 67 +-- > > 2 files changed, 71 insertions(+), 3 deletions(-) > > > > diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h > > index ee6a48df89d9..7379f35ae2c6 100644 > > --- a/arch/arm64/include/asm/virt.h > > +++ b/arch/arm64/include/asm/virt.h > > @@ -35,8 +35,13 @@ > > */ > > #define HVC_RESET_VECTORS 2 > > > > +/* > > + * HVC_VHE_RESTART - Upgrade the CPU from EL1 to EL2, if possible > > + */ > > +#define HVC_VHE_RESTART3 > > + > > /* Max number of HYP stub hypercalls */ > > -#define HVC_STUB_HCALL_NR 3 > > +#define HVC_STUB_HCALL_NR 4 > > > > /* Error returned when an invalid stub number is passed into x0 */ > > #define HVC_STUB_ERR 0xbadca11 > > diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S > > index 160f5881a0b7..fb12398b5c28 100644 > > --- a/arch/arm64/kernel/hyp-stub.S > > +++ b/arch/arm64/kernel/hyp-stub.S > > @@ -8,9 +8,9 @@ > > > > #include > > #include > > -#include > > > > #include > > +#include > > #include > > #include > > #include > > @@ -47,10 +47,13 @@ SYM_CODE_END(__hyp_stub_vectors) > > > > SYM_CODE_START_LOCAL(el1_sync) > > cmp x0, #HVC_SET_VECTORS > > - b.ne2f > > + b.ne1f > > msr vbar_el2, x1 > > b 9f > > > > +1: cmp x0, #HVC_VHE_RESTART > > + b.eqmutate_to_vhe > > + > > 2: cmp x0, #HVC_SOFT_RESTART > > b.ne3f > > mov x0, x2 > > @@ -70,6 +73,66 @@ SYM_CODE_START_LOCAL(el1_sync) > > eret 
> > SYM_CODE_END(el1_sync) > > > > +// nVHE? No way! Give me the real thing! > > +SYM_CODE_START_LOCAL(mutate_to_vhe) > > + // Sanity check: MMU *must* be off > > + mrs x0, sctlr_el2 > > + tbnzx0, #0, 1f > > + > > + // Needs to be VHE capable, obviously > > + mrs x0, id_aa64mmfr1_el1 > > + ubfxx0, x0, #ID_AA64MMFR1_VHE_SHIFT, #4 > > + cbz x0, 1f > > nit: There is a HVC_STUB_ERR that you could return if these sanity > checks fail. The documentation also states that it should be > returned on error. Good point. I've now added it, but how the error can be handled is still up in the air. For now, I've decided to let the kernel continue its (probably doomed) course. Thanks, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH v4 18/21] arm64: Move "nokaslr" over to the early cpufeature infrastructure
On Mon, 18 Jan 2021 14:46:36 +, David Brazdil wrote: > > On Mon, Jan 18, 2021 at 09:45:30AM +, Marc Zyngier wrote: > > Given that the early cpufeature infrastructure has borrowed quite > > a lot of code from the kaslr implementation, let's reimplement > > the matching of the "nokaslr" option with it. > > > > Signed-off-by: Marc Zyngier > Acked-by: David Brazdil [...] > > @@ -126,7 +95,7 @@ u64 __init kaslr_early_init(void) > > * Check if 'nokaslr' appears on the command line, and > > * return 0 if that is the case. > > */ > > - if (is_kaslr_disabled_cmdline(fdt)) { > > + if (kaslr_feature_val & kaslr_feature_mask & 0xf) { > > nit: Isn't the 0xf redundant here? You don't re-mask for VH either. Actually, I do. See the two back to back ubfx that extract both the mask and the feature. The "& 0xf" here serves the same purpose. Is it redundant? At the moment, quite possibly. But since we have space for 16 "features", this is an indication that we are only using the first one. I expect that eventually, we'll use it for other things. Thanks, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH v4 12/21] arm64: cpufeature: Add an early command-line cpufeature override facility
On Sat, 23 Jan 2021 13:43:52 +, Catalin Marinas wrote: > > On Mon, Jan 18, 2021 at 09:45:24AM +, Marc Zyngier wrote: > > +struct reg_desc { > > + const char * const name; > > + u64 * const val; > > + u64 * const mask; > > + struct { > > + const char * const name; > > + u8 shift; > > + } fields[]; > > +}; > > Sorry, I didn't see this earlier. Do we need to add all these consts > here? So you want the pointers to be const but why is 'shift' special > and not a const then? Is it modified later? > > Would this not work: > > struct reg_desc { > const char *name; > u64 *val; > u64 *mask; > struct { > const char *name; > u8 shift; > } fields[]; > }; > > > +static const struct reg_desc * const regs[] __initdata = { > > as we already declare the whole struct reg_desc pointers here as const. > I may have confused myself... It definitely is better. Especially given that we throw all of this away right after boot, there is no harm in keeping it simple. I've also renamed "reg_desc" to "ftr_set_desc", because we don't always describe a register (like for kaslr). Thanks, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH v4 14/21] arm64: Honor VHE being disabled from the command-line
On Sat, 23 Jan 2021 14:07:53 +, Catalin Marinas wrote: > > On Mon, Jan 18, 2021 at 09:45:26AM +, Marc Zyngier wrote: > > diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S > > index 59820f9b8522..bbab2148a2a2 100644 > > --- a/arch/arm64/kernel/hyp-stub.S > > +++ b/arch/arm64/kernel/hyp-stub.S > > @@ -77,13 +77,24 @@ SYM_CODE_END(el1_sync) > > SYM_CODE_START_LOCAL(mutate_to_vhe) > > // Sanity check: MMU *must* be off > > mrs x0, sctlr_el2 > > - tbnzx0, #0, 1f > > + tbnzx0, #0, 2f > > > > // Needs to be VHE capable, obviously > > mrs x0, id_aa64mmfr1_el1 > > ubfxx0, x0, #ID_AA64MMFR1_VHE_SHIFT, #4 > > - cbz x0, 1f > > + cbz x0, 2f > > > > + // Check whether VHE is disabled from the command line > > + adr_l x1, id_aa64mmfr1_val > > + ldr x0, [x1] > > + adr_l x1, id_aa64mmfr1_mask > > + ldr x1, [x1] > > + ubfxx0, x0, #ID_AA64MMFR1_VHE_SHIFT, #4 > > + ubfxx1, x1, #ID_AA64MMFR1_VHE_SHIFT, #4 > > + cbz x1, 1f > > + and x0, x0, x1 > > + cbz x0, 2f > > +1: > > I can see the advantage here in separate id_aa64mmfr1_val/mask but we > could use some asm offsets here and keep the pointer indirection simpler > in C code. You'd just need something like 'adr_l mmfr1_ovrd + VAL_OFFSET'. > > Anyway, if you have a strong preference for the current approach, leave > it as is. I've now moved over to a structure containing both val/mask, meaning that we only need to keep a single pointer around in the various feature descriptors. It certainly looks better. Thanks, M. -- Without deviation from the norm, progress is not possible.
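As an aside for readers following the thread: the single val/mask structure Marc describes can be illustrated with a small C sketch. The struct name matches the kernel's arm64_ftr_override, but everything else here is a simplified illustration. Each overridden 4-bit ID register field sets its nibble in mask, and applying the override replaces only the masked bits of the sanitised register value:

```c
#include <stdint.h>

/* Matches the shape discussed in the thread: one val/mask pair per
 * overridden register.  Purely illustrative, not the kernel code. */
struct arm64_ftr_override {
	uint64_t val;
	uint64_t mask;
};

/* Record a user override of one 4-bit ID register field */
static void set_field(struct arm64_ftr_override *o, unsigned int shift,
		      uint64_t v)
{
	o->val  |= (v & 0xf) << shift;
	o->mask |= 0xfULL << shift;
}

/* Apply the override: only the masked bits are replaced */
static uint64_t apply_override(uint64_t reg,
			       const struct arm64_ftr_override *o)
{
	return (reg & ~o->mask) | (o->val & o->mask);
}
```

Keeping val and mask side by side is what lets assembly code (as in the hyp-stub hunk above) reach both with a single pointer.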
Re: [PATCH v2 0/8] arm64: Relocate absolute hyp VAs
On Tue, 5 Jan 2021 18:05:33 +, David Brazdil wrote: > nVHE hyp code is linked into the same kernel binary but executes under > different memory mappings. If the compiler of hyp code chooses absolute > addressing for accessing a symbol, the kernel linker will relocate that > address to a kernel image virtual address, causing a runtime exception. > > So far the strategy has been to force PC-relative addressing by wrapping > all symbol references with the hyp_symbol_addr macro. This is error > prone and developer unfriendly. > > [...] Applied to kvm-arm64/hyp-reloc, thanks! [1/8] KVM: arm64: Rename .idmap.text in hyp linker script commit: eceaf38f521982bad6dbac1c02becdd80fd6af7c [2/8] KVM: arm64: Set up .hyp.rodata ELF section commit: 16174eea2e4fe8247e04c17da682f2034fec0369 [3/8] KVM: arm64: Add symbol at the beginning of each hyp section commit: f7a4825d9569593b9a81f0768313b86175691ef1 [4/8] KVM: arm64: Generate hyp relocation data commit: 8c49b5d43d4c45ca0bb0d1faa23feef2e76e89fa [5/8] KVM: arm64: Apply hyp relocations at runtime commit: 6ec6259d7084ed32e164c9f7b69049464dd90fa5 [6/8] KVM: arm64: Fix constant-pool users in hyp commit: 97cbd2fc0257c6af7036a9a6415ca8ad43535d6b [7/8] KVM: arm64: Remove patching of fn pointers in hyp commit: 537db4af26e3f2e0f304f2032bc593f7e2a54938 [8/8] KVM: arm64: Remove hyp_symbol_addr commit: 247bc166e6b3b1e4068f120f55582a3aa210cc2d Cheers, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH v2 8/8] KVM: arm64: Remove hyp_symbol_addr
On Tue, 05 Jan 2021 18:05:41 +, David Brazdil wrote: > > Hyp code used the hyp_symbol_addr helper to force PC-relative addressing > because absolute addressing results in kernel VAs due to the way hyp > code is linked. This is not true anymore, so remove the helper and > update all of its users. > > Acked-by: Ard Biesheuvel > Signed-off-by: David Brazdil > --- > arch/arm64/include/asm/kvm_asm.h | 26 > arch/arm64/kvm/hyp/include/hyp/switch.h | 4 ++-- > arch/arm64/kvm/hyp/nvhe/hyp-smp.c| 4 ++-- > arch/arm64/kvm/hyp/nvhe/psci-relay.c | 24 +++--- > arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c | 2 +- > 5 files changed, 17 insertions(+), 43 deletions(-) > > diff --git a/arch/arm64/include/asm/kvm_asm.h > b/arch/arm64/include/asm/kvm_asm.h > index 8a33d83ea843..22d933e9b59e 100644 > --- a/arch/arm64/include/asm/kvm_asm.h > +++ b/arch/arm64/include/asm/kvm_asm.h > @@ -199,32 +199,6 @@ extern void __vgic_v3_init_lrs(void); > > extern u32 __kvm_get_mdcr_el2(void); > > -#if defined(GCC_VERSION) && GCC_VERSION < 5 > -#define SYM_CONSTRAINT "i" > -#else > -#define SYM_CONSTRAINT "S" > -#endif > - > -/* > - * Obtain the PC-relative address of a kernel symbol > - * s: symbol > - * > - * The goal of this macro is to return a symbol's address based on a > - * PC-relative computation, as opposed to a loading the VA from a > - * constant pool or something similar. This works well for HYP, as an > - * absolute VA is guaranteed to be wrong. Only use this if trying to > - * obtain the address of a symbol (i.e. not something you obtained by > - * following a pointer). > - */ > -#define hyp_symbol_addr(s) \ > - ({ \ > - typeof(s) *addr;\ > - asm("adrp %0, %1\n" \ > - "add%0, %0, :lo12:%1\n" \ > - : "=r" (addr) : SYM_CONSTRAINT (&s)); \ > - addr; \ > - }) > - This hunk is going to conflict in a fairly benign way with the removal of the GCC workaround which I think Will queued for 5.12. I'll work something out... Thanks, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH v2 3/8] KVM: arm64: Add symbol at the beginning of each hyp section
On Tue, 05 Jan 2021 18:05:36 +, David Brazdil wrote: > > Generating hyp relocations will require referencing positions at a given > offset from the beginning of hyp sections. Since the final layout will > not be determined until the linking of `vmlinux`, modify the hyp linker > script to insert a symbol at the first byte of each hyp section to use > as an anchor. The linker of `vmlinux` will place the symbols together > with the sections. > > Signed-off-by: David Brazdil > --- > arch/arm64/include/asm/hyp_image.h | 29 +++-- > arch/arm64/kvm/hyp/nvhe/hyp.lds.S | 4 ++-- > 2 files changed, 29 insertions(+), 4 deletions(-) > > diff --git a/arch/arm64/include/asm/hyp_image.h > b/arch/arm64/include/asm/hyp_image.h > index daa1a1da539e..f97b774b58f4 100644 > --- a/arch/arm64/include/asm/hyp_image.h > +++ b/arch/arm64/include/asm/hyp_image.h > @@ -7,6 +7,9 @@ > #ifndef __ARM64_HYP_IMAGE_H__ > #define __ARM64_HYP_IMAGE_H__ > > +#define HYP_CONCAT(a, b) __HYP_CONCAT(a, b) > +#define __HYP_CONCAT(a, b) a ## b > + OK, this may seem like a completely pointless comment, but I have a total mental block when I see macros written upside down. I'll fix that myself, no need to resend just for that. Thanks, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH] genirq/msi: Make sure early activation of all PCI MSIs
Hi Shameer, On 2021-01-22 09:21, Shameerali Kolothum Thodi wrote: Hi Marc, [...] I find this pretty complicated, and I'd like to avoid injecting the PCI MSI-vs-MSI-X concept in something that is supposed to be bus-agnostic. Agree. That’s mainly because I was very skeptical (TBH, very limited in my knowledge of this msi core code) about changing the MSI-X activation logic here and just thought of limiting the impact to the MSI case as a first attempt. What's wrong with the following (untested) patch, which looks much simpler? Yes, I had tried this as one of the early fixes, but as said above was not very sure of the impact on other platforms. Tested this again and it works. Please send it. Actually, there is a better way, which is to have a proper iterator on vectors instead of descriptors. With that and a bit of cleanup, the patch[1] looks pretty neat. Please give it a go. Thanks, M. [1] https://lore.kernel.org/r/20210123122759.1781359-1-...@kernel.org -- Jazz is not dead. It just smells funny...
[PATCH] genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
When MSI_FLAG_ACTIVATE_EARLY is set (which is the case for PCI), we perform the activation of the interrupt (which in the case of PCI results in the endpoint being programmed) as soon as the interrupt is allocated. But it appears that this is only done for the first vector, introducing an inconsistent behaviour for PCI Multi-MSI. Fix it by iterating over the number of vectors allocated to each MSI descriptor. This is easily achieved by introducing a new "for_each_msi_vector" iterator, together with a tiny bit of refactoring. Fixes: f3b0946d629c ("genirq/msi: Make sure PCI MSIs are activated early") Reported-by: Shameer Kolothum Signed-off-by: Marc Zyngier Cc: sta...@vger.kernel.org --- include/linux/msi.h | 6 ++ kernel/irq/msi.c| 44 2 files changed, 26 insertions(+), 24 deletions(-) diff --git a/include/linux/msi.h b/include/linux/msi.h index 360a0a7e7341..aef35fd1cf11 100644 --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -178,6 +178,12 @@ struct msi_desc { list_for_each_entry((desc), dev_to_msi_list((dev)), list) #define for_each_msi_entry_safe(desc, tmp, dev)\ list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list) +#define for_each_msi_vector(desc, __irq, dev) \ + for_each_msi_entry((desc), (dev)) \ + if ((desc)->irq)\ + for (__irq = (desc)->irq; \ +__irq < ((desc)->irq + (desc)->nvec_used); \ +__irq++) #ifdef CONFIG_IRQ_MSI_IOMMU static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc) diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c index 2c0c4d6d0f83..d924676c8781 100644 --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -436,22 +436,22 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, can_reserve = msi_check_reservation_mode(domain, info, dev); - for_each_msi_entry(desc, dev) { - virq = desc->irq; - if (desc->nvec_used == 1) - dev_dbg(dev, "irq %d for MSI\n", virq); - else + /* +* This flag is set by the PCI layer as we need to activate +* the MSI entries before the PCI layer enables MSI in 
the +* card. Otherwise the card latches a random msi message. +*/ + if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY)) + goto skip_activate; + + for_each_msi_vector(desc, i, dev) { + if (desc->irq == i) { + virq = desc->irq; dev_dbg(dev, "irq [%d-%d] for MSI\n", virq, virq + desc->nvec_used - 1); - /* -* This flag is set by the PCI layer as we need to activate -* the MSI entries before the PCI layer enables MSI in the -* card. Otherwise the card latches a random msi message. -*/ - if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY)) - continue; + } - irq_data = irq_domain_get_irq_data(domain, desc->irq); + irq_data = irq_domain_get_irq_data(domain, i); if (!can_reserve) { irqd_clr_can_reserve(irq_data); if (domain->flags & IRQ_DOMAIN_MSI_NOMASK_QUIRK) @@ -462,28 +462,24 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, goto cleanup; } +skip_activate: /* * If these interrupts use reservation mode, clear the activated bit * so request_irq() will assign the final vector. */ if (can_reserve) { - for_each_msi_entry(desc, dev) { - irq_data = irq_domain_get_irq_data(domain, desc->irq); + for_each_msi_vector(desc, i, dev) { + irq_data = irq_domain_get_irq_data(domain, i); irqd_clr_activated(irq_data); } } return 0; cleanup: - for_each_msi_entry(desc, dev) { - struct irq_data *irqd; - - if (desc->irq == virq) - break; - - irqd = irq_domain_get_irq_data(domain, desc->irq); - if (irqd_is_activated(irqd)) - irq_domain_deactivate_irq(irqd); + for_each_msi_vector(desc, i, dev) { + irq_data = irq_domain_get_irq_data(domain, i); + if (irqd_is_activated(irq_data)) + irq_domain_deactivate_irq(irq_data); } msi_domain_free_irqs(domain, dev); return ret; -- 2.29.2
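For readers: the behaviour the new iterator fixes is easy to demonstrate with a toy model. The old loop visited one irq per msi_desc, but a Multi-MSI descriptor stands for nvec_used consecutive irqs. The sketch below mimics the iterator's shape; the kernel's version walks dev_to_msi_list(dev), whereas a plain linked list is assumed here for illustration:

```c
#include <stddef.h>

/* Minimal stand-in for struct msi_desc: with Multi-MSI, one descriptor
 * covers nvec_used consecutive Linux irq numbers starting at irq.
 * The list representation is simplified for illustration. */
struct msi_desc {
	int irq;		/* first irq of the block, 0 if unallocated */
	int nvec_used;		/* number of consecutive vectors */
	struct msi_desc *next;
};

/* Same shape as the kernel's for_each_msi_vector: walk every
 * descriptor, and for each descriptor every vector it covers */
#define for_each_msi_vector(desc, __irq, head)			\
	for ((desc) = (head); (desc); (desc) = (desc)->next)	\
		if ((desc)->irq)				\
			for (__irq = (desc)->irq;		\
			     __irq < (desc)->irq + (desc)->nvec_used; \
			     __irq++)

/* Count the vectors the iterator visits */
static int count_vectors(struct msi_desc *head)
{
	struct msi_desc *desc;
	int irq, n = 0;

	for_each_msi_vector(desc, irq, head)
		n++;
	return n;
}
```

A descriptor with nvec_used == 4 contributes four vectors, so activation (and cleanup on failure) now covers every vector rather than just the base irq of each descriptor.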
Re: [RFC PATCH v4 0/2] Some optimization for stage-2 translation
Hi Yanan, On 2021-01-22 10:13, Yanan Wang wrote: Hi, Will, Marc, Is there any further comment on the v3 series I posted previously? None, I was planning to queue them for 5.12 over the weekend. If they are not fine with you, then I think maybe we should just turn back to the original solution in v1, where I suggested filtering out the case of only updating access permissions in the map handler and handling it right there. Here are the reasons for my current opinion: With an errno returned from the map handler for this single case, there will be one more vcpu exit from the guest and we also have to consider the spurious dirty pages. Besides, it seems that the EAGAIN errno has been chosen specially for this case and cannot be used elsewhere for other reasons, as we will change this errno to zero at the end of the function. The v1 solution looks more concise in the end, so I refined the diff and posted the v4 with two patches here, just for a contrast. Which solution will you prefer now? Could you please let me know. I'm still very much opposed to mixing mapping and permission changes. How bad is the spurious return to a vcpu? Thanks, M. -- Jazz is not dead. It just smells funny...
Re: [RFC PATCH] kvm: arm64: Try stage2 block mapping for host device MMIO
On 2021-01-22 08:36, Keqian Zhu wrote: The MMIO region of a device may be huge (GB level), so try to use block mapping in stage2 to speed up both map and unmap. Especially for unmap, it performs a TLBI right after each invalidation of a PTE. If all mappings are of PAGE_SIZE, it takes much time to handle a GB-level range. This is only on VM teardown, right? Or do you unmap the device more often? Can you please quantify the speedup and the conditions this occurs in? I have the feeling that we are just circling around another problem, which is that we could rely on a VM-wide TLBI when tearing down the guest. I worked on something like that[1] a long while ago, and parked it for some reason. Maybe it is worth reviving. [1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/elide-cmo-tlbi Signed-off-by: Keqian Zhu --- arch/arm64/include/asm/kvm_pgtable.h | 11 +++ arch/arm64/kvm/hyp/pgtable.c | 15 +++ arch/arm64/kvm/mmu.c | 12 3 files changed, 34 insertions(+), 4 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 52ab38db04c7..2266ac45f10c 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -82,6 +82,17 @@ struct kvm_pgtable_walker { const enum kvm_pgtable_walk_flags flags; }; +/** + * kvm_supported_pgsize() - Get the max supported page size of a mapping. + * @pgt: Initialised page-table structure. + * @addr: Virtual address at which to place the mapping. + * @end: End virtual address of the mapping. + * @phys: Physical address of the memory to map. + * + * The smallest return value is PAGE_SIZE. + */ +u64 kvm_supported_pgsize(struct kvm_pgtable *pgt, u64 addr, u64 end, u64 phys); + /** * kvm_pgtable_hyp_init() - Initialise a hypervisor stage-1 page-table. * @pgt: Uninitialised page-table structure to initialise.
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index bdf8e55ed308..ab11609b9b13 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -81,6 +81,21 @@ static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level) return IS_ALIGNED(addr, granule) && IS_ALIGNED(phys, granule); } +u64 kvm_supported_pgsize(struct kvm_pgtable *pgt, u64 addr, u64 end, u64 phys) +{ + u32 lvl; + u64 pgsize = PAGE_SIZE; + + for (lvl = pgt->start_level; lvl < KVM_PGTABLE_MAX_LEVELS; lvl++) { + if (kvm_block_mapping_supported(addr, end, phys, lvl)) { + pgsize = kvm_granule_size(lvl); + break; + } + } + + return pgsize; +} + static u32 kvm_pgtable_idx(struct kvm_pgtable_walk_data *data, u32 level) { u64 shift = kvm_granule_shift(level); diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 7d2257cc5438..80b403fc8e64 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -499,7 +499,8 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu) int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, phys_addr_t pa, unsigned long size, bool writable) { - phys_addr_t addr; + phys_addr_t addr, end; + unsigned long pgsize; int ret = 0; struct kvm_mmu_memory_cache cache = { 0, __GFP_ZERO, NULL, }; struct kvm_pgtable *pgt = kvm->arch.mmu.pgt; @@ -509,21 +510,24 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, size += offset_in_page(guest_ipa); guest_ipa &= PAGE_MASK; + end = guest_ipa + size; - for (addr = guest_ipa; addr < guest_ipa + size; addr += PAGE_SIZE) { + for (addr = guest_ipa; addr < end; addr += pgsize) { ret = kvm_mmu_topup_memory_cache(&cache, kvm_mmu_cache_min_pages(kvm)); if (ret) break; + pgsize = kvm_supported_pgsize(pgt, addr, end, pa); + spin_lock(&kvm->mmu_lock); - ret = kvm_pgtable_stage2_map(pgt, addr, PAGE_SIZE, pa, prot, + ret = kvm_pgtable_stage2_map(pgt, addr, pgsize, pa, prot, &cache); spin_unlock(&kvm->mmu_lock); if (ret) break; - pa += PAGE_SIZE; + pa += pgsize; } 
kvm_mmu_free_memory_cache(&cache); This otherwise looks neat enough. Thanks, M. -- Jazz is not dead. It just smells funny...
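For context on what kvm_supported_pgsize() computes: with 4KiB granules, stage-2 levels 1-3 map 1GiB, 2MiB and 4KiB blocks, and a block can be used only if the IPA, the PA and the remaining size all satisfy its alignment. A hedged stand-alone sketch of that selection follows; the constants assume a 4KiB granule with start level 1, whereas the real code derives them from the page-table configuration:

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumed: 4KiB granule */

/* Block size mapped at a given level: shift = (3 - level) * 9 + 12,
 * i.e. 1GiB at level 1, 2MiB at level 2, 4KiB at level 3 */
static uint64_t granule_size(unsigned int level)
{
	return 1ULL << ((3 - level) * 9 + PAGE_SHIFT);
}

/* Largest block usable for [addr, end) -> phys, as in the patch:
 * both addresses must be block-aligned and the range must cover
 * a whole block.  Falls back to PAGE_SIZE. */
static uint64_t supported_pgsize(uint64_t addr, uint64_t end, uint64_t phys)
{
	unsigned int level;

	for (level = 1; level <= 3; level++) {
		uint64_t granule = granule_size(level);

		if (addr % granule == 0 && phys % granule == 0 &&
		    granule <= end - addr)
			return granule;
	}
	return 1ULL << PAGE_SHIFT;
}
```

This is why a GB-scale, naturally aligned MMIO window collapses from hundreds of thousands of 4KiB mappings to a handful of block mappings.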
Re: [PATCH] genirq/msi: Make sure early activation of all PCI MSIs
Hi Shameer, On Thu, 21 Jan 2021 11:02:47 +, Shameer Kolothum wrote: > > We currently do early activation of MSI irqs for PCI/MSI based on > the MSI_FLAG_ACTIVATE_EARLY flag. Though this activates all the > allocated MSIs in the case of MSI-X, it only does so for the > base irq in the case of MSI. This is because, for MSI, there > is only one msi_desc entry for all the 32 irqs it can support > and the current implementation iterates over the msi entries and > ends up activating the base irq only. > > The above creates an issue on platforms where the msi controller > supports direct injection of vLPIs(eg: ARM GICv4 ITS). On these > platforms, upon irq activation, ITS driver maps the event to an > ITT entry. And for Guest pass-through to work, early mapping of > all the dev MSI vectors is required. Otherwise, the vfio irq > bypass manager registration will fail. eg, On a HiSilicon D06 > platform with GICv4 enabled, Guest boot with zip dev pass-through > reports, > > "vfio-pci :75:00.1: irq bypass producer (token 06e5176a) > registration fails: 66311" > > and Guest boot fails. > > This is traced to, > kvm_arch_irq_bypass_add_producer > kvm_vgic_v4_set_forwarding > vgic_its_resolve_lpi --> returns E_ITS_INT_UNMAPPED_INTERRUPT > > Hence make sure we activate all the irqs for both MSI and MSI-x cases. > > Signed-off-by: Shameer Kolothum > --- > It is unclear to me whether not performing the early activation of all > MSI irqs was deliberate and has consequences on any other platforms. > Please let me know. Probably just an oversight. 
> > Thanks, > Shameer > --- > kernel/irq/msi.c | 114 +-- > 1 file changed, 90 insertions(+), 24 deletions(-) > > diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c > index 2c0c4d6d0f83..eec187fc32a9 100644 > --- a/kernel/irq/msi.c > +++ b/kernel/irq/msi.c > @@ -395,6 +395,78 @@ static bool msi_check_reservation_mode(struct irq_domain > *domain, > return desc->msi_attrib.is_msix || desc->msi_attrib.maskbit; > } > > +static void msi_domain_deactivate_irq(struct irq_domain *domain, int irq) > +{ > + struct irq_data *irqd; > + > + irqd = irq_domain_get_irq_data(domain, irq); > + if (irqd_is_activated(irqd)) > + irq_domain_deactivate_irq(irqd); > +} > + > +static int msi_domain_activate_irq(struct irq_domain *domain, > +int irq, bool can_reserve) > +{ > + struct irq_data *irqd; > + > + irqd = irq_domain_get_irq_data(domain, irq); > + if (!can_reserve) { > + irqd_clr_can_reserve(irqd); > + if (domain->flags & IRQ_DOMAIN_MSI_NOMASK_QUIRK) > + irqd_set_msi_nomask_quirk(irqd); > + } > + return irq_domain_activate_irq(irqd, can_reserve); > +} > + > +static int msi_domain_activate_msix_irqs(struct irq_domain *domain, > + struct device *dev, bool can_reserve) > +{ > + struct msi_desc *desc; > + int ret, irq; > + > + for_each_msi_entry(desc, dev) { > + irq = desc->irq; > + ret = msi_domain_activate_irq(domain, irq, can_reserve); > + if (ret) > + goto out; > + } > + return 0; > + > +out: > + for_each_msi_entry(desc, dev) { > + if (irq == desc->irq) > + break; > + msi_domain_deactivate_irq(domain, desc->irq); > + } > + return ret; > +} > + > +static int msi_domain_activate_msi_irqs(struct irq_domain *domain, > + struct device *dev, bool can_reserve) > +{ > + struct msi_desc *desc; > + int i, ret, base, irq; > + > + desc = first_msi_entry(dev); > + base = desc->irq; > + > + for (i = 0; i < desc->nvec_used; i++) { > + irq = base + i; > + ret = msi_domain_activate_irq(domain, irq, can_reserve); > + if (ret) > + goto out; > + } > + return 0; > + > +out: > + for (i = 0; i < 
desc->nvec_used; i++) { > + if (irq == base + i) > + break; > + msi_domain_deactivate_irq(domain, base + i); > + } > + return ret; > +} > + > int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, > int nvec) > { > @@ -443,21 +515,25 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, > struct device *dev, > else > dev_dbg(dev, "irq [%d-%d] for MSI\n", > virq, virq + desc->nvec_used - 1); > - /* > - * This flag is set by the PCI layer as we need to activate > - * the MSI entries before the PCI layer enables MSI in the > - * card. Otherwise the card latches a random msi message. > - */ > - if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY)) > - continue; > + } > > -
Re: [PATCH 0/2] irqchip: Remove obsolete drivers
On Wed, 20 Jan 2021 14:30:06 +0100, Arnd Bergmann wrote: > A few Arm platforms are getting removed in v5.12, this removes > the corresponding irqchip drivers. > > Link: > https://lore.kernel.org/linux-arm-kernel/20210120124812.2800027-1-a...@kernel.org/T/ > > > Arnd Bergmann (2): > irqchip: remove sigma tango driver > irqchip: remove sirfsoc driver > > [...] Applied to irq/irqchip-5.12, thanks! [1/2] irqchip: remove sigma tango driver commit: 00e772c4929257b11b51d47e4645f67826ded0fc [2/2] irqchip: remove sirfsoc driver commit: 5c1ea0d842b1e73ae04870527ec29d5479c35041 Cheers, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH v5 00/10] sunxi: Support IRQ wakeup from deep sleep
On Sun, 17 Jan 2021 23:50:30 -0600, Samuel Holland wrote: > Allwinner sun6i/sun8i/sun50i SoCs (A31 and newer) have two interrupt > controllers: GIC and R_INTC. GIC does not support wakeup. R_INTC handles > the external NMI pin, and provides 32+ IRQs to the ARISC. The first 16 > of these correspond 1:1 to a block of GIC IRQs starting with the NMI. > The last 13-16 multiplex the first (up to) 128 GIC SPIs. > > This series replaces the existing chained irqchip driver that could only > control the NMI, with a stacked irqchip driver that also provides wakeup > capability for those multiplexed SPI IRQs. The idea is to preconfigure > the ARISC's IRQ controller, and then the ARISC firmware knows to wake up > as soon as it receives an IRQ. It can also decide how deep it can > suspend based on the enabled wakeup IRQs. > > [...] Applied to irq/irqchip-5.12, thanks! [01/10] dt-bindings: irq: sun6i-r: Split the binding from sun7i-nmi commit: ad6b47cdef760410311f41876b21eb0c6fda4717 [02/10] dt-bindings: irq: sun6i-r: Add a compatible for the H3 commit: 6436eb4417094ea3308b33d8392fc02a1068dc78 [03/10] irqchip/sun6i-r: Use a stacked irqchip driver commit: 4e34614636b31747b190488240a95647c227021f [04/10] irqchip/sun6i-r: Add wakeup support commit: 7ab365f6cd6de1e2b0cb1e1e3873dbf68e6f1003 Please route the dts patches via the soc tree. Also, I had to manually fix the first patch as it wouldn't apply on top of 5.11-rc4 (which tree has it been diffed against?). Please check that the resolution is correct. Cheers, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH v3 2/2] irqchip: Add support for Realtek RTL838x/RTL839x interrupt controller
On Wed, 20 Jan 2021 10:10:18 +, Bert Vermeulen wrote: > > Signed-off-by: Bert Vermeulen Please write a decent commit message. > --- > drivers/irqchip/Makefile | 1 + > drivers/irqchip/irq-realtek-rtl.c | 180 ++ > 2 files changed, 181 insertions(+) > create mode 100644 drivers/irqchip/irq-realtek-rtl.c > > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile > index 0ac93bfaec61..4fc1086bed7e 100644 > --- a/drivers/irqchip/Makefile > +++ b/drivers/irqchip/Makefile > @@ -113,3 +113,4 @@ obj-$(CONFIG_LOONGSON_PCH_PIC)+= > irq-loongson-pch-pic.o > obj-$(CONFIG_LOONGSON_PCH_MSI) += irq-loongson-pch-msi.o > obj-$(CONFIG_MST_IRQ)+= irq-mst-intc.o > obj-$(CONFIG_SL28CPLD_INTC) += irq-sl28cpld.o > +obj-$(CONFIG_MACH_REALTEK_RTL) += irq-realtek-rtl.o > diff --git a/drivers/irqchip/irq-realtek-rtl.c > b/drivers/irqchip/irq-realtek-rtl.c > new file mode 100644 > index ..bafe9ee4a85a > --- /dev/null > +++ b/drivers/irqchip/irq-realtek-rtl.c > @@ -0,0 +1,180 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Copyright (C) 2006-2012 Tony Wu > + * Copyright (C) 2020 Birger Koblitz > + * Copyright (C) 2020 Bert Vermeulen > + * Copyright (C) 2020 John Crispin Given the list of copyright owners, shouldn't this be reflected in the SoB chain one way or another? 
> + */ > + > +#include > +#include > +#include > +#include > +#include > + > +/* Global Interrupt Mask Register */ > +#define RTL_ICTL_GIMR0x00 > +/* Global Interrupt Status Register */ > +#define RTL_ICTL_GISR0x04 > +/* Interrupt Routing Registers */ > +#define RTL_ICTL_IRR00x08 > +#define RTL_ICTL_IRR10x0c > +#define RTL_ICTL_IRR20x10 > +#define RTL_ICTL_IRR30x14 > + > +#define REG(x) (realtek_ictl_base + x) > + > +static DEFINE_RAW_SPINLOCK(irq_lock); > +static void __iomem *realtek_ictl_base; > + > +static void realtek_ictl_unmask_irq(struct irq_data *i) > +{ > + unsigned long flags; > + u32 value; > + > + raw_spin_lock_irqsave(&irq_lock, flags); > + > + value = readl(REG(RTL_ICTL_GIMR)); > + value |= BIT(i->hwirq); > + writel(value, REG(RTL_ICTL_GIMR)); > + > + raw_spin_unlock_irqrestore(&irq_lock, flags); > +} > + > +static void realtek_ictl_mask_irq(struct irq_data *i) > +{ > + unsigned long flags; > + u32 value; > + > + raw_spin_lock_irqsave(&irq_lock, flags); > + > + value = readl(REG(RTL_ICTL_GIMR)); > + value &= ~BIT(i->hwirq); > + writel(value, REG(RTL_ICTL_GIMR)); > + > + raw_spin_unlock_irqrestore(&irq_lock, flags); > +} > + > +static struct irq_chip realtek_ictl_irq = { > + .name = "realtek-rtl-intc", > + .irq_mask = realtek_ictl_mask_irq, > + .irq_unmask = realtek_ictl_unmask_irq, > +}; > + > +static int intc_map(struct irq_domain *d, unsigned int irq, irq_hw_number_t > hw) > +{ > + irq_set_chip_and_handler(hw, &realtek_ictl_irq, handle_level_irq); > + > + return 0; > +} > + > +static const struct irq_domain_ops irq_domain_ops = { > + .map = intc_map, > + .xlate = irq_domain_xlate_onecell, > +}; > + > +static void realtek_irq_dispatch(struct irq_desc *desc) > +{ > + struct irq_chip *chip = irq_desc_get_chip(desc); > + struct irq_domain *domain; > + unsigned int pending; > + > + chained_irq_enter(chip, desc); > + pending = readl(REG(RTL_ICTL_GIMR)) & readl(REG(RTL_ICTL_GISR)); > + if (unlikely(!pending)) { > + spurious_interrupt(); > + goto out; > + 
} > + domain = irq_desc_get_handler_data(desc); > + generic_handle_irq(irq_find_mapping(domain, __ffs(pending))); > + > +out: > + chained_irq_exit(chip, desc); > +} > + > +/* > + * SoC interrupts are cascaded to MIPS CPU interrupts according to the > + * interrupt-map in the device tree. Each SoC interrupt gets 4 bits for > + * the CPU interrupt in an Interrupt Routing Register. Max 32 SoC interrupts > + * thus go into 4 IRRs. > + */ > +static int __init map_interrupts(struct device_node *node) > +{ > + struct device_node *cpu_ictl; > + const __be32 *imap; > + u32 imaplen, soc_int, cpu_int, tmp, regs[4]; > + int ret, i, irr_regs[] = { > + RTL_ICTL_IRR3, > + RTL_ICTL_IRR2, > + RTL_ICTL_IRR1, > + RTL_ICTL_IRR0, > + }; > + > + ret = of_property_read_u32(node, "#address-cells", &tmp); > + if (ret || tmp) > + return -EINVAL; > + > + imap = of_get_property(node, "interrupt-map", &imaplen); > + if (!imap || imaplen % 3) > + return -EINVAL; > + > + memset(regs, 0, sizeof(regs)); > + for (i = 0; i < imaplen; i += 3 * sizeof(u32)) { > + soc_int = be32_to_cpup(imap); > + if (soc_int > 31) > + return -EINVAL; > + > + cpu_ictl = o
Re: [RFC PATCH 3/7] arm64: mm: use nGnRnE instead of nGnRE on Apple processors
On 2021-01-21 17:55, Will Deacon wrote:
> On Thu, Jan 21, 2021 at 04:25:54PM +, Marc Zyngier wrote:
>> On 2021-01-21 15:12, Mohamed Mediouni wrote:
>>> Please ignore that patch.
>>>
>>> It turns out that the PCIe controller on Apple M1 expects posted
>>> writes and so the memory range for it ought to be set nGnRE.
>>> So, we need to use nGnRnE for on-chip MMIO and nGnRE for PCIe BARs.
>>>
>>> The MAIR approach isn’t adequate for such a thing, so we’ll have to
>>> look elsewhere.
>>
>> Well, there aren't many alternatives to having a memory type defined
>> in MAIR if you want to access your PCIe devices with specific
>> semantics. It probably means defining a memory type for PCI only, but:
>>
>> - we only have a single free MT entry, and I'm not sure we can afford
>>   to waste this on a specific platform (can we re-purpose GRE instead?),
>
> We already have an nGnRnE MAIR for config space accesses.

I'm confused. If M1 needs nGnRE for PCI, and overrides nGnRE to nE for
its in-SoC accesses, where does nGnRE go? Or do you propose that it is
the page tables that get a different MT index?

M.

--
Jazz is not dead. It just smells funny...
Re: [RFC PATCH 3/7] arm64: mm: use nGnRnE instead of nGnRE on Apple processors
On 2021-01-21 15:12, Mohamed Mediouni wrote:
> Please ignore that patch.
>
> It turns out that the PCIe controller on Apple M1 expects posted
> writes and so the memory range for it ought to be set nGnRE.
> So, we need to use nGnRnE for on-chip MMIO and nGnRE for PCIe BARs.
>
> The MAIR approach isn’t adequate for such a thing, so we’ll have to
> look elsewhere.

Well, there aren't many alternatives to having a memory type defined
in MAIR if you want to access your PCIe devices with specific
semantics. It probably means defining a memory type for PCI only, but:

- we only have a single free MT entry, and I'm not sure we can afford
  to waste this on a specific platform (can we re-purpose GRE instead?),
- we'd need to teach the PCI code to use this...

Thanks,

M.

--
Jazz is not dead. It just smells funny...
Re: [RFC PATCH 7/7] irqchip/apple-aic: add SMP support to the Apple AIC driver.
On 2021-01-21 13:34, Mohamed Mediouni wrote: On 21 Jan 2021, at 14:22, Marc Zyngier wrote: On 2021-01-21 12:50, Mohamed Mediouni wrote: On 21 Jan 2021, at 13:44, Arnd Bergmann wrote: On Wed, Jan 20, 2021 at 2:27 PM Mohamed Mediouni [...] + aic.fast_ipi = of_property_read_bool(node, "fast-ipi"); Where is this property documented, and what decides which one to use? It’s getting documented in the next patch set. This property is there to enable support for older iPhone processors later on, some of which do not have fast IPI support. On Apple M1, fast-ipi is always on. Then please focus on a single implementation. Additional features can always be merged later once something is up and running. Also, there sysregs can be detected by matching the MIDR, so I don't think we need a DT property for that. Thanks, Because UART access adapters for the new M1 Macs aren’t plentiful at all, I actually use this for development, with iPhones which have much more easy to buy Lightning-to-UART adapters. (That’s why the old implementation is there too) Might be worth splitting the new one to a new commit though... This series is supposed to cover M1 only, and adding extra support as part of it is only likely to make the code harder to review. I'd rather you focus on a single IPI interface (fast or slow, I don't really care). Extra features can come in later. Thanks, M. -- Jazz is not dead. It just smells funny...
Re: [RFC PATCH 7/7] irqchip/apple-aic: add SMP support to the Apple AIC driver.
On 2021-01-21 13:32, Mark Rutland wrote: On Thu, Jan 21, 2021 at 01:22:37PM +, Marc Zyngier wrote: On 2021-01-21 12:50, Mohamed Mediouni wrote: > > On 21 Jan 2021, at 13:44, Arnd Bergmann wrote: > > > > On Wed, Jan 20, 2021 at 2:27 PM Mohamed Mediouni [...] > > > + aic.fast_ipi = of_property_read_bool(node, "fast-ipi"); > > > > Where is this property documented, and what decides which one to use? > It’s getting documented in the next patch set. > > This property is there to enable support for older iPhone processors > later on, some of which do not have fast IPI support. > > On Apple M1, fast-ipi is always on. Then please focus on a single implementation. Additional features can always be merged later once something is up and running. Also, there sysregs can be detected by matching the MIDR, so I don't think we need a DT property for that. Generally we do not detect IMP-DEF sysregs based on MIDR because they won't necessarily be exposed to a VM, so I suspect that we do need DT properties to describe that IMP-DEF sysregs are accessible, and should not rely on the MIDR alone. Maybe that's implicit in another property, but worth bearing in mind. Hmm. That's a good point. I think this could be keyed off the compatible property, which should accurately reflect the version of the interrupt controller. Thanks, M. -- Jazz is not dead. It just smells funny...
Re: [RFC PATCH 7/7] irqchip/apple-aic: add SMP support to the Apple AIC driver.
On 2021-01-21 12:50, Mohamed Mediouni wrote:
>> On 21 Jan 2021, at 13:44, Arnd Bergmann wrote:
>>
>> On Wed, Jan 20, 2021 at 2:27 PM Mohamed Mediouni [...]
>>
>>> +	aic.fast_ipi = of_property_read_bool(node, "fast-ipi");
>>
>> Where is this property documented, and what decides which one to use?
>
> It’s getting documented in the next patch set.
>
> This property is there to enable support for older iPhone processors
> later on, some of which do not have fast IPI support.
>
> On Apple M1, fast-ipi is always on.

Then please focus on a single implementation. Additional features can
always be merged later once something is up and running.

Also, these sysregs can be detected by matching the MIDR, so I don't
think we need a DT property for that.

Thanks,

M.

--
Jazz is not dead. It just smells funny...
Re: [RFC PATCH 3/7] arm64: mm: use nGnRnE instead of nGnRE on Apple processors
On 2021-01-21 11:27, Will Deacon wrote:
> On Wed, Jan 20, 2021 at 02:27:13PM +0100, Mohamed Mediouni wrote:
>> Use nGnRnE instead of nGnRE on Apple SoCs to workaround a serious
>> hardware quirk.
>>
>> On Apple processors, writes using the nGnRE device memory type get
>> dropped in flight, getting to nowhere.
>>
>> Signed-off-by: Stan Skowronek
>> Signed-off-by: Mohamed Mediouni
>> ---
>>  arch/arm64/mm/proc.S | 26 ++
>>  1 file changed, 26 insertions(+)
>>
>> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
>> index 1f7ee8c8b7b8..06436916f137 100644
>> --- a/arch/arm64/mm/proc.S
>> +++ b/arch/arm64/mm/proc.S
>> @@ -51,6 +51,25 @@
>>  #define TCR_KASAN_HW_FLAGS 0
>>  #endif
>>
>> +#ifdef CONFIG_ARCH_APPLE
>> +
>> +/*
>> + * Apple cores appear to black-hole writes done with nGnRE.
>> + * We settled on a work-around that uses MAIR vs changing every single
>> + * user of nGnRE across the arm64 code.
>> + */
>> +
>> +#define MAIR_EL1_SET_APPLE \
>> +	(MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRnE, MT_DEVICE_nGnRnE) |	\
>> +	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRnE, MT_DEVICE_nGnRE) |	\
>> +	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_GRE, MT_DEVICE_GRE) |		\
>> +	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_NC, MT_NORMAL_NC) |		\
>> +	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL, MT_NORMAL) |			\
>> +	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_WT, MT_NORMAL_WT) |		\
>> +	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL, MT_NORMAL_TAGGED))
>> +
>> +#endif
>> +
>>  /*
>>   * Default MAIR_EL1. MT_NORMAL_TAGGED is initially mapped as Normal memory and
>>   * changed during __cpu_setup to Normal Tagged if the system supports MTE.
>> @@ -432,6 +451,13 @@ SYM_FUNC_START(__cpu_setup)
>>  	 * Memory region attributes
>>  	 */
>>  	mov_q	x5, MAIR_EL1_SET
>> +#ifdef CONFIG_ARCH_APPLE
>> +	mrs	x0, MIDR_EL1
>> +	lsr	w0, w0, #24
>> +	mov_q	x1, MAIR_EL1_SET_APPLE
>> +	cmp	x0, #0x61		// 0x61 = Implementer: Apple
>> +	csel	x5, x1, x5, eq
>
> Why does this need to be done so early? It would be a lot cleaner if we
> could detect this in a similar fashion to other errata and update the
> MAIR appropriately.
>
> If that's not possible because of early IO mappings (which ones?), then
> we could instead initialise to nGnRnE unconditionally, but relax it to
> nGnRE if we detect that we _don't_ have the erratum.

Would that imply another round-trip into the idmap, much like we do
when we switch to non-global mappings? Or do you expect that we can
change the MAIR with live mappings?

M.

--
Jazz is not dead. It just smells funny...
Re: [PATCH v2] gpiolib: Bind gpio_device to a driver to enable fw_devlink=on by default
On 2021-01-20 15:48, Greg Kroah-Hartman wrote: On Wed, Jan 20, 2021 at 03:39:30PM +, Marc Zyngier wrote: Anyway, I said what I had to say. If platforms break with this change, I'll expect it to be disabled in 5.12. I'm thinking we can not change the default and will probably revert this patch "soon". I think there is a lot of value in keeping this enabled for a bit, so that we can work out what breaks, and find solutions that scale a bit better. Thanks, M. -- Jazz is not dead. It just smells funny...
Re: [PATCH v2] gpiolib: Bind gpio_device to a driver to enable fw_devlink=on by default
On 2021-01-18 20:38, Saravana Kannan wrote: On Sun, Jan 17, 2021 at 4:02 AM Marc Zyngier wrote: Hi Saravana, Thanks for posting this, much appreciated. On Sat, 16 Jan 2021 01:14:11 +, Saravana Kannan wrote: > > There are multiple instances of GPIO devictree nodes of the form: > > foo { > compatible = "acme,foo"; > ... > > gpio0: gpio0@ { > compatible = "acme,bar"; > ... > gpio-controller; > }; > > gpio1: gpio1@ { > compatible = "acme,bar"; > ... > gpio-controller; > }; > > ... > } > > bazz { > my-gpios = <&gpio0 ...>; > } > > Case 1: The driver for "foo" populates struct device for these gpio* > nodes and then probes them using a driver that binds with "acme,bar". > This lines up with how DT nodes with the "compatible" property are > generally converted to struct devices and then registered with driver > core to probe them. This also allows the gpio* devices to hook into all > the driver core capabilities like runtime PM, probe deferral, > suspend/resume ordering, device links, etc. > > Case 2: The driver for "foo" doesn't populate its child device nodes > with "compatible" property and instead just loops through its child > nodes and directly registers the GPIOs with gpiolib without ever > populating a struct device or binding a driver to it. That's not quite an accurate description. The gpiolib subsystem does create a struct device. It doesn't register a driver though, which is what causes the issue with fr_devlink (more on that below). The devices created by gpiolib care are created for case 1 and case 2. They are just devices gpiolib uses to represent a virtual software device to hook different attributes to and expose them in sysfs. I'm not talking about those devices here. The devices I'm referring to are devices that represent the actual HW IP -- so what I'm saying is accurate. Please read what you wrote : "without ever populating a struct device". I stand by the "not quite accurate". > > Drivers that follow the case 2 cause problems with fw_devlink=on. 
This > is because fw_devlink will prevent bazz from probing until there's a > struct device that has gpio0 as its fwnode (because bazz lists gpio0 as > a GPIO supplier). Once the struct device is available, fw_devlink will > create a device link between with gpio0 as the supplier and bazz as the > consumer. After this point, the device link will prevent bazz from > probing until its supplier (the gpio0 device) has bound to a driver. > Once the supplier is bound to a driver, the probe of bazz is triggered > automatically. > > Finding and refactoring all the instances of drivers that follow case 2 > will cause a lot of code churn and it not something that can be done in > one shot. Examples of such instances are [1] [2]. > > This patch works around this problem and avoids all the code churn by > simply creating a stub driver to bind to the gpio_device. Since the > gpio_device already points to the GPIO device tree node, this allows all > the consumers to continue probing when the driver follows case 2. > > If/when all the old drivers are refactored, we can revert this > patch. My personal gripe with this approach is that it is an abrupt change in the way DT and device model are mapped onto each other. As far as I know (and someone please correct me if I am wrong), there is zero expectation that a device/subdevice/random IP block described by a node with a "compatible" property will end-up being managed by a driver that is bound to that particular node. The node/subnode division is a good way to express some HW boundaries, but doesn't say anything about the way this should be handled in the kernel. Assuming that everything containing a "compatible" string will eventually be bound to a driver is IMO pretty limiting. The default behavior of of_platform_populate() is to create a struct device for every node with "compatible" property. That's how top level devices (or children of simple-bus devices) are populated. IIRC, that's what a lot of other busses do too. 
So, if anything, not having a struct device for nodes with "compatible" property is an exception. Again, that's not a description of the reality. The case we are talking about here does have a struct device. The misconception you keep repeating is that binding it to a driver is the only valid approach. Honestly, if one has a driver that supports a HW IP, I don't see any reason it should operate outside of the device-driver model support
Re: [PATCH v6 0/5] ARM: arm64: Add SMCCC TRNG entropy service
On 2021-01-20 13:49, Will Deacon wrote: On Wed, Jan 20, 2021 at 01:45:24PM +, Andre Przywara wrote: On Wed, 20 Jan 2021 13:26:26 + Marc Zyngier wrote: Hi, > On 2021-01-20 13:01, Will Deacon wrote: > > On Wed, 6 Jan 2021 10:34:48 +, Andre Przywara wrote: > >> a fix to v5, now *really* fixing the wrong priority of SMCCC vs. > >> RNDR in arch_get_random_seed_long_early(). Apologies for messing > >> this up in v5 and thanks to broonie for being on the watch! > >> > >> Will, Catalin: it would be much appreciated if you could consider > >> taking > >> patch 1/5. This contains the common definitions, and is a > >> prerequisite for every other patch, although they are somewhat > >> independent and likely > >> will need to go through different subsystems. > >> > >> [...] > > > > Applied the first patch only to arm64 (for-next/rng), thanks! > > > > [1/5] firmware: smccc: Add SMCCC TRNG function call IDs > > https://git.kernel.org/arm64/c/67c6bb56b649 > > I can't see how the rest of the patches can go via any other tree > if all the definitions are in the first one. > > Andre, can you explain what your plan is? Well, I don't really have a great solution for that, other than hoping that 1/5 makes it into Linus' master at some point. I see that it's a stretch, but pulling 1/5 into 5.11 now would prepare the stage for the others to go via any tree, into 5.12-rc1? Or you could maybe take both 1/5 and 5/5 into your kvm-arm tree, and would hope that a git rebase later would sort this out for you? But I think you are much more experienced in those kind of issues, so happy to hear about any other solutions. for-next/rng is a stable branch, so anybody who wants the first patch can just pull it (without anything I queue on top). OK. I'll pull that branch and stash the KVM stuff on top. Thanks, M. -- Jazz is not dead. It just smells funny...
Re: [PATCH v6 0/5] ARM: arm64: Add SMCCC TRNG entropy service
On 2021-01-20 13:01, Will Deacon wrote: On Wed, 6 Jan 2021 10:34:48 +, Andre Przywara wrote: a fix to v5, now *really* fixing the wrong priority of SMCCC vs. RNDR in arch_get_random_seed_long_early(). Apologies for messing this up in v5 and thanks to broonie for being on the watch! Will, Catalin: it would be much appreciated if you could consider taking patch 1/5. This contains the common definitions, and is a prerequisite for every other patch, although they are somewhat independent and likely will need to go through different subsystems. [...] Applied the first patch only to arm64 (for-next/rng), thanks! [1/5] firmware: smccc: Add SMCCC TRNG function call IDs https://git.kernel.org/arm64/c/67c6bb56b649 I can't see how the rest of the patches can go via any other tree if all the definitions are in the first one. Andre, can you explain what your plan is? Thanks, M. -- Jazz is not dead. It just smells funny...
Re: [PATCH 1/3] arm64/kernel: FIQ support
Hi Mohamed, On 2021-01-20 11:36, Mohamed Mediouni wrote: From: Stan Skowronek On Apple processors, the timer is wired through FIQ. Which timer? There are at least 3, potentially 4 timers per CPU that can fire. As such, add FIQ support to the kernel. Signed-off-by: Stan Skowronek Missing SoB from the sender. --- arch/arm64/include/asm/arch_gicv3.h | 2 +- arch/arm64/include/asm/assembler.h | 8 ++-- arch/arm64/include/asm/daifflags.h | 4 +- arch/arm64/include/asm/irq.h| 4 ++ arch/arm64/include/asm/irqflags.h | 6 +-- arch/arm64/kernel/entry.S | 74 ++--- arch/arm64/kernel/irq.c | 14 ++ arch/arm64/kernel/process.c | 2 +- 8 files changed, 97 insertions(+), 17 deletions(-) diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h index 880b9054d75c..934b9be582d2 100644 --- a/arch/arm64/include/asm/arch_gicv3.h +++ b/arch/arm64/include/asm/arch_gicv3.h @@ -173,7 +173,7 @@ static inline void gic_pmr_mask_irqs(void) static inline void gic_arch_enable_irqs(void) { - asm volatile ("msr daifclr, #2" : : : "memory"); + asm volatile ("msr daifclr, #3" : : : "memory"); If I trust the persistent rumour, this system doesn't have a GIC. Why this change? } #endif /* __ASSEMBLY__ */ diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index bf125c591116..6fe55713dfe0 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -40,9 +40,9 @@ msr daif, \flags .endm - /* IRQ is the lowest priority flag, unconditionally unmask the rest. */ - .macro enable_da_f - msr daifclr, #(8 | 4 | 1) + /* IRQ/FIQ is the lowest priority flag, unconditionally unmask the rest. */ + .macro enable_da + msr daifclr, #(8 | 4) This cannot be unconditional. This potentially changes existing behaviours, and I'd feel a lot safer if FIQ was only messed with on that specific HW. I have the feeling that this should be detected on the boot CPU and patched before any interrupt can fire. 
.endm /* @@ -50,7 +50,7 @@ */ .macro save_and_disable_irq, flags mrs \flags, daif - msr daifset, #2 + msr daifset, #3 .endm .macro restore_irq, flags diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h index 1c26d7baa67f..44de96c7fb1a 100644 --- a/arch/arm64/include/asm/daifflags.h +++ b/arch/arm64/include/asm/daifflags.h @@ -13,8 +13,8 @@ #include #define DAIF_PROCCTX 0 -#define DAIF_PROCCTX_NOIRQ PSR_I_BIT -#define DAIF_ERRCTX(PSR_I_BIT | PSR_A_BIT) +#define DAIF_PROCCTX_NOIRQ (PSR_I_BIT | PSR_F_BIT) +#define DAIF_ERRCTX(PSR_I_BIT | PSR_F_BIT | PSR_A_BIT) #define DAIF_MASK (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT) diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h index b2b0c6405eb0..2d1537d3a245 100644 --- a/arch/arm64/include/asm/irq.h +++ b/arch/arm64/include/asm/irq.h @@ -13,5 +13,9 @@ static inline int nr_legacy_irqs(void) return 0; } +int set_handle_fiq(void (*handle_fiq)(struct pt_regs *)); + +extern void (*handle_arch_fiq)(struct pt_regs *) __ro_after_init; I guess this is set from the root interrupt controller, which also will set handle_arch_irq? Why do we need two entry points? We have ISR_EL1 to find out what is pending. Isn't that enough? 
+ #endif /* !__ASSEMBLER__ */ #endif diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h index ff328e5bbb75..26d7f378113e 100644 --- a/arch/arm64/include/asm/irqflags.h +++ b/arch/arm64/include/asm/irqflags.h @@ -35,7 +35,7 @@ static inline void arch_local_irq_enable(void) } asm volatile(ALTERNATIVE( - "msr daifclr, #2 // arch_local_irq_enable", + "msr daifclr, #3 // arch_local_irq_enable", __msr_s(SYS_ICC_PMR_EL1, "%0"), ARM64_HAS_IRQ_PRIO_MASKING) : @@ -54,7 +54,7 @@ static inline void arch_local_irq_disable(void) } asm volatile(ALTERNATIVE( - "msr daifset, #2 // arch_local_irq_disable", + "msr daifset, #3 // arch_local_irq_disable", __msr_s(SYS_ICC_PMR_EL1, "%0"), ARM64_HAS_IRQ_PRIO_MASKING) : @@ -85,7 +85,7 @@ static inline int arch_irqs_disabled_flags(unsigned long flags) int res; asm volatile(ALTERNATIVE( - "and %w0, %w1, #" __stringify(PSR_I_BIT), + "and %w0, %w1, #" __stringify(PSR_I_BIT | PSR_F_BIT), "eor %w0, %w1, #" __stringify(GIC_PRIO_IRQON), ARM64_HAS_IRQ_PRIO_MASKING) : "=&r" (res) diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/ker
Re: [RFC PATCH 0/2] Avoid booting stall caused by idmap_kpti_install_ng_mappings
Hi Justin, On 2021-01-20 04:51, Justin He wrote: Hi, Kindly ping 😊 -Original Message- From: Jia He Sent: Wednesday, January 13, 2021 9:41 AM To: Catalin Marinas ; Will Deacon ; linux-arm-ker...@lists.infradead.org; linux- ker...@vger.kernel.org Cc: Anshuman Khandual ; Suzuki Poulose ; Justin He ; Mark Rutland ; Gustavo A. R. Silva ; Richard Henderson ; Dave P Martin ; Steven Price ; Andrew Morton ; Mike Rapoport ; Ard Biesheuvel ; Gavin Shan ; Kefeng Wang ; Mark Brown ; Marc Zyngier ; Cristian Marussi Subject: [RFC PATCH 0/2] Avoid booting stall caused by There is a 10s stall in idmap_kpti_install_ng_mappings when kernel boots on a Ampere EMAG server. Commit f992b4dfd58b ("arm64: kpti: Add ->enable callback to remap swapper using nG mappings") updates the nG bit runtime if kpti is required. But things get worse if rodata=full in map_mem(). NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS is required when creating pagetable mapping. Hence all ptes are fully mapped in this case. On a Ampere EMAG server with 256G memory(pagesize=4k), it causes the 10s stall. After moving init_cpu_features() ahead of early_fixmap_init(), we can use cpu_have_const_cap earlier than before. Hence we can avoid this stall by updating arm64_use_ng_mappings. After this patch series, it reduces the kernel boot time from 14.7s to 4.1s: Before: [ 14.757569] Freeing initrd memory: 60752K After: [4.138819] Freeing initrd memory: 60752K Set it as RFC because I want to resolve any other points which I have misconerned. But you don't really explain *why* having the CPU Feature discovery early helps at all. Is that so that you can bypass the idmap mapping? I'd expect something that explain the problem instead of paraphrasing the patches. Another thing is whether you have tested this on some ThunderX HW (the first version, not TX2), as this is the whole reason for this code... Thanks, M. -- Jazz is not dead. It just smells funny...
Re: [PATCH] of: property: Add device link support for interrupts
On 2021-01-20 09:53, Geert Uytterhoeven wrote: Hi Saravana, On Fri, Dec 18, 2020 at 10:11 PM Saravana Kannan wrote: Add support for creating device links out of the interrupts property. Cc: Marc Zyngier Cc: Kevin Hilman Signed-off-by: Saravana Kannan Thanks for your patch! This does not seem to add all links. I see links being created to the secondary interrupt controller (e61c "renesas,irqc"), but not to the primary interrupt controller (GIC) Which is good, as the GIC driver is not a platform_driver, and thus creating links would break everything ;-) BTW, I'd _love_ the GIC driver to be a platform_driver, as the GIC is part of a power and/or clock domain on Renesas SoCs... The trouble is that we need the GIC much earlier than the device model is available, because the timer needs to be available super early. This isn't specific to the GIC, by the way, but also applies to all root interrupt controllers that end up controlling the per-CPU interrupts. If you try to relax this constraint, you'll end up observing some of the very weird dependencies between sysfs, sched, and the device model (I went there a few years back, wasted a week on it, did a reset --hard on the branch, and swore never to look at this again! ;-) But for a start, I'd like the ITS code to be turned into a platform driver, as this would potentially allow for the various domains to be instantiated dynamically. One day. M. -- Jazz is not dead. It just smells funny...
[PATCH v4 16/21] arm64: Make kvm-arm.mode={nvhe,protected} an alias of id_aa64mmfr1.vh=0
Admittedly, passing id_aa64mmfr1.vh=0 on the command-line isn't that easy to understand, and it is likely that users would much prefer to write "kvm-arm.mode=nvhe", or "...=protected". So here you go. This has the added advantage that we can now always honor the "kvm-arm.mode=protected" option, even when booting on a VHE system. Signed-off-by: Marc Zyngier --- Documentation/admin-guide/kernel-parameters.txt | 3 +++ arch/arm64/kernel/idreg-override.c | 2 ++ arch/arm64/kvm/arm.c| 3 +++ 3 files changed, 8 insertions(+) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 9e3cdb271d06..2786fd39a047 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2257,6 +2257,9 @@ kvm-arm.mode= [KVM,ARM] Select one of KVM/arm64's modes of operation. + nvhe: Standard nVHE-based mode, without support for + protected guests. + protected: nVHE-based mode with support for guests whose state is kept private from the host. Not valid if the kernel is running in EL2. diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index 16bc8b3b93ae..1db54878b2c4 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -41,6 +41,8 @@ static const struct { const char * const alias; const char * const feature; } aliases[] __initdata = { + { "kvm-arm.mode=nvhe", "id_aa64mmfr1.vh=0" }, + { "kvm-arm.mode=protected", "id_aa64mmfr1.vh=0" }, }; static int __init find_field(const char *cmdline, const struct reg_desc *reg, diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 04c44853b103..597565a65ca2 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1966,6 +1966,9 @@ static int __init early_kvm_mode_cfg(char *arg) return 0; } + if (strcmp(arg, "nvhe") == 0 && !WARN_ON(is_kernel_in_hyp_mode())) + return 0; + return -EINVAL; } early_param("kvm-arm.mode", early_kvm_mode_cfg); -- 2.29.2
[PATCH v4 13/21] arm64: Allow ID_AA64MMFR1_EL1.VH to be overridden from the command line
As we want to be able to disable VHE at runtime, let's match "id_aa64mmfr1.vh=" from the command line as an override. This doesn't have much effect yet as our boot code doesn't look at the cpufeature, but only at the HW registers. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/cpufeature.h | 3 +++ arch/arm64/kernel/cpufeature.c | 6 +- arch/arm64/kernel/idreg-override.c | 12 3 files changed, 20 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index fe0130d6c0ff..80a5f423444e 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -814,6 +814,9 @@ static inline unsigned int get_vmid_bits(u64 mmfr1) return 8; } +extern u64 id_aa64mmfr1_val; +extern u64 id_aa64mmfr1_mask; + u32 get_kvm_ipa_limit(void); void dump_cpu_features(void); diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 48a011935d8c..5b9343d2e9f0 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -555,6 +555,9 @@ static const struct arm64_ftr_bits ftr_raz[] = { #define ARM64_FTR_REG(id, table) ARM64_FTR_REG_OVERRIDE(id, table, NULL, NULL) +u64 id_aa64mmfr1_val; +u64 id_aa64mmfr1_mask; + static const struct __ftr_reg_entry { u32 sys_id; struct arm64_ftr_reg*reg; @@ -602,7 +605,8 @@ static const struct __ftr_reg_entry { /* Op1 = 0, CRn = 0, CRm = 7 */ ARM64_FTR_REG(SYS_ID_AA64MMFR0_EL1, ftr_id_aa64mmfr0), - ARM64_FTR_REG(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1), + ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1, + &id_aa64mmfr1_val, &id_aa64mmfr1_mask), ARM64_FTR_REG(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2), /* Op1 = 0, CRn = 1, CRm = 2 */ diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index 392f93b67103..75d9845f489b 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -10,6 +10,7 @@ #include #include +#include #include struct reg_desc { @@ -22,7 
+23,18 @@ struct reg_desc { } fields[]; }; +static const struct reg_desc mmfr1 __initdata = { + .name = "id_aa64mmfr1", + .val= &id_aa64mmfr1_val, + .mask = &id_aa64mmfr1_mask, + .fields = { + { "vh", ID_AA64MMFR1_VHE_SHIFT }, + {} + }, +}; + static const struct reg_desc * const regs[] __initdata = { + &mmfr1, }; static int __init find_field(const char *cmdline, const struct reg_desc *reg, -- 2.29.2
[PATCH v4 20/21] arm64: Defer enabling pointer authentication on boot core
From: Srinivas Ramana Defer enabling pointer authentication on the boot core until it is required to be enabled by the cpufeature framework. This will help in controlling the feature dynamically with a boot parameter. Signed-off-by: Ajay Patil Signed-off-by: Prasad Sodagudi Signed-off-by: Srinivas Ramana Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/1610152163-16554-2-git-send-email-sram...@codeaurora.org --- arch/arm64/include/asm/pointer_auth.h | 10 ++ arch/arm64/include/asm/stackprotector.h | 1 + arch/arm64/kernel/head.S| 4 3 files changed, 11 insertions(+), 4 deletions(-) diff --git a/arch/arm64/include/asm/pointer_auth.h b/arch/arm64/include/asm/pointer_auth.h index c6b4f0603024..b112a11e9302 100644 --- a/arch/arm64/include/asm/pointer_auth.h +++ b/arch/arm64/include/asm/pointer_auth.h @@ -76,6 +76,15 @@ static inline unsigned long ptrauth_strip_insn_pac(unsigned long ptr) return ptrauth_clear_pac(ptr); } +static __always_inline void ptrauth_enable(void) +{ + if (!system_supports_address_auth()) + return; + sysreg_clear_set(sctlr_el1, 0, (SCTLR_ELx_ENIA | SCTLR_ELx_ENIB | + SCTLR_ELx_ENDA | SCTLR_ELx_ENDB)); + isb(); +} + #define ptrauth_thread_init_user(tsk) \ ptrauth_keys_init_user(&(tsk)->thread.keys_user) #define ptrauth_thread_init_kernel(tsk) \ @@ -84,6 +93,7 @@ static inline unsigned long ptrauth_strip_insn_pac(unsigned long ptr) ptrauth_keys_switch_kernel(&(tsk)->thread.keys_kernel) #else /* CONFIG_ARM64_PTR_AUTH */ +#define ptrauth_enable() #define ptrauth_prctl_reset_keys(tsk, arg) (-EINVAL) #define ptrauth_strip_insn_pac(lr) (lr) #define ptrauth_thread_init_user(tsk) diff --git a/arch/arm64/include/asm/stackprotector.h b/arch/arm64/include/asm/stackprotector.h index 7263e0bac680..33f1bb453150 100644 --- a/arch/arm64/include/asm/stackprotector.h +++ b/arch/arm64/include/asm/stackprotector.h @@ -41,6 +41,7 @@ static __always_inline void boot_init_stack_canary(void) #endif ptrauth_thread_init_kernel(current);
ptrauth_thread_switch_kernel(current); + ptrauth_enable(); } #endif /* _ASM_STACKPROTECTOR_H */ diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index b3c4dd04f74b..2a152d96d767 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -404,10 +404,6 @@ SYM_FUNC_START_LOCAL(__primary_switched) adr_l x5, init_task msr sp_el0, x5 // Save thread_info -#ifdef CONFIG_ARM64_PTR_AUTH - __ptrauth_keys_init_cpu x5, x6, x7, x8 -#endif - adr_l x8, vectors // load VBAR_EL1 with virtual msr vbar_el1, x8// vector table address isb -- 2.29.2
[PATCH v4 21/21] arm64: cpufeatures: Allow disabling of Pointer Auth from the command-line
In order to be able to disable Pointer Authentication at runtime, whether it is for testing purposes, or to work around HW issues, let's add support for overriding the ID_AA64ISAR1_EL1.{GPI,GPA,API,APA} fields. This is further mapped on the arm64.nopauth command-line alias. Signed-off-by: Marc Zyngier --- Documentation/admin-guide/kernel-parameters.txt | 3 +++ arch/arm64/include/asm/cpufeature.h | 2 ++ arch/arm64/kernel/cpufeature.c | 5 - arch/arm64/kernel/idreg-override.c | 17 + 4 files changed, 26 insertions(+), 1 deletion(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 7599fd0f1ad7..f9cb28a39bd0 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -376,6 +376,9 @@ arm64.nobti [ARM64] Unconditionally disable Branch Target Identification support + arm64.nopauth [ARM64] Unconditionally disable Pointer Authentication + support + ataflop=[HW,M68k] atarimouse= [HW,MOUSE] Atari Mouse diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index d3e0f6dd43c4..9d8dcf380ad5 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -818,6 +818,8 @@ extern u64 id_aa64mmfr1_val; extern u64 id_aa64mmfr1_mask; extern u64 id_aa64pfr1_val; extern u64 id_aa64pfr1_mask; +extern u64 id_aa64isar1_val; +extern u64 id_aa64isar1_mask; u32 get_kvm_ipa_limit(void); void dump_cpu_features(void); diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index f223171a7c34..f5ba7fd615b5 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -559,6 +559,8 @@ u64 id_aa64mmfr1_val; u64 id_aa64mmfr1_mask; u64 id_aa64pfr1_val; u64 id_aa64pfr1_mask; +u64 id_aa64isar1_val; +u64 id_aa64isar1_mask; static const struct __ftr_reg_entry { u32 sys_id; @@ -604,7 +606,8 @@ static const struct __ftr_reg_entry { /* Op1 = 0, CRn = 0, CRm = 6 */ 
ARM64_FTR_REG(SYS_ID_AA64ISAR0_EL1, ftr_id_aa64isar0), - ARM64_FTR_REG(SYS_ID_AA64ISAR1_EL1, ftr_id_aa64isar1), + ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64ISAR1_EL1, ftr_id_aa64isar1, + &id_aa64isar1_val, &id_aa64isar1_mask), /* Op1 = 0, CRn = 0, CRm = 7 */ ARM64_FTR_REG(SYS_ID_AA64MMFR0_EL1, ftr_id_aa64mmfr0), diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index a9e3ed193fd4..7037c9b214d0 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -43,6 +43,19 @@ static const struct reg_desc pfr1 __initdata = { }, }; +static const struct reg_desc isar1 __initdata = { + .name = "id_aa64isar1", + .val= &id_aa64isar1_val, + .mask = &id_aa64isar1_mask, + .fields = { + { "gpi", ID_AA64ISAR1_GPI_SHIFT }, + { "gpa", ID_AA64ISAR1_GPA_SHIFT }, + { "api", ID_AA64ISAR1_API_SHIFT }, + { "apa", ID_AA64ISAR1_APA_SHIFT }, + {} + }, +}; + extern u64 kaslr_feature_val; extern u64 kaslr_feature_mask; @@ -61,6 +74,7 @@ static const struct reg_desc kaslr __initdata = { static const struct reg_desc * const regs[] __initdata = { &mmfr1, &pfr1, + &isar1, &kaslr, }; @@ -71,6 +85,9 @@ static const struct { { "kvm-arm.mode=nvhe", "id_aa64mmfr1.vh=0" }, { "kvm-arm.mode=protected", "id_aa64mmfr1.vh=0" }, { "arm64.nobti","id_aa64pfr1.bt=0" }, + { "arm64.nopauth", + "id_aa64isar1.gpi=0 id_aa64isar1.gpa=0 " + "id_aa64isar1.api=0 id_aa64isar1.apa=0" }, { "nokaslr","kaslr.disabled=1" }, }; -- 2.29.2
[PATCH v4 17/21] KVM: arm64: Document HVC_VHE_RESTART stub hypercall
For completeness, let's document the HVC_VHE_RESTART stub. Signed-off-by: Marc Zyngier --- Documentation/virt/kvm/arm/hyp-abi.rst | 9 + 1 file changed, 9 insertions(+) diff --git a/Documentation/virt/kvm/arm/hyp-abi.rst b/Documentation/virt/kvm/arm/hyp-abi.rst index 83cadd8186fa..1ba628baf11b 100644 --- a/Documentation/virt/kvm/arm/hyp-abi.rst +++ b/Documentation/virt/kvm/arm/hyp-abi.rst @@ -58,6 +58,15 @@ these functions (see arch/arm{,64}/include/asm/virt.h): into place (arm64 only), and jump to the restart address while at HYP/EL2. This hypercall is not expected to return to its caller. +* :: + +x0 = HVC_VHE_RESTART (arm64 only) + + Attempt to upgrade the kernel's exception level from EL1 to EL2 by enabling + the VHE mode. This is conditioned by the CPU supporting VHE, the EL2 MMU + being off, and VHE not being disabled by any other means (command line option, + for example). + Any other value of r0/x0 triggers a hypervisor-specific handling, which is not documented here. -- 2.29.2
[PATCH v4 01/21] arm64: Fix labels in el2_setup macros
If someone happens to write the following code: b 1f init_el2_state vhe 1: [...] they will be in for a long debugging session, as the label "1f" will be resolved *inside* the init_el2_state macro instead of after it. Not really what one expects. Instead, rewrite the EL2 setup macros to use unambiguous labels, thanks to the usual macro counter trick. Acked-by: Catalin Marinas Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/el2_setup.h | 24 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index a7f5a1bbc8ac..540116de80bf 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -45,24 +45,24 @@ mrs x1, id_aa64dfr0_el1 sbfxx0, x1, #ID_AA64DFR0_PMUVER_SHIFT, #4 cmp x0, #1 - b.lt1f // Skip if no PMU present + b.lt.Lskip_pmu_\@ // Skip if no PMU present mrs x0, pmcr_el0// Disable debug access traps ubfxx0, x0, #11, #5 // to EL2 and allow access to -1: +.Lskip_pmu_\@: cselx2, xzr, x0, lt // all PMU counters from EL1 /* Statistical profiling */ ubfxx0, x1, #ID_AA64DFR0_PMSVER_SHIFT, #4 - cbz x0, 3f // Skip if SPE not present + cbz x0, .Lskip_spe_\@ // Skip if SPE not present .ifeqs "\mode", "nvhe" mrs_s x0, SYS_PMBIDR_EL1 // If SPE available at EL2, and x0, x0, #(1 << SYS_PMBIDR_EL1_P_SHIFT) - cbnzx0, 2f // then permit sampling of physical + cbnzx0, .Lskip_spe_el2_\@ // then permit sampling of physical mov x0, #(1 << SYS_PMSCR_EL2_PCT_SHIFT | \ 1 << SYS_PMSCR_EL2_PA_SHIFT) msr_s SYS_PMSCR_EL2, x0 // addresses and physical counter -2: +.Lskip_spe_el2_\@: mov x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT) orr x2, x2, x0 // If we don't have VHE, then // use EL1&0 translation.
@@ -71,7 +71,7 @@ // and disable access from EL1 .endif -3: +.Lskip_spe_\@: msr mdcr_el2, x2// Configure debug traps .endm @@ -79,9 +79,9 @@ .macro __init_el2_lor mrs x1, id_aa64mmfr1_el1 ubfxx0, x1, #ID_AA64MMFR1_LOR_SHIFT, 4 - cbz x0, 1f + cbz x0, .Lskip_lor_\@ msr_s SYS_LORC_EL1, xzr -1: +.Lskip_lor_\@: .endm /* Stage-2 translation */ @@ -93,7 +93,7 @@ .macro __init_el2_gicv3 mrs x0, id_aa64pfr0_el1 ubfxx0, x0, #ID_AA64PFR0_GIC_SHIFT, #4 - cbz x0, 1f + cbz x0, .Lskip_gicv3_\@ mrs_s x0, SYS_ICC_SRE_EL2 orr x0, x0, #ICC_SRE_EL2_SRE// Set ICC_SRE_EL2.SRE==1 @@ -103,7 +103,7 @@ mrs_s x0, SYS_ICC_SRE_EL2 // Read SRE back, tbz x0, #0, 1f // and check that it sticks msr_s SYS_ICH_HCR_EL2, xzr// Reset ICC_HCR_EL2 to defaults -1: +.Lskip_gicv3_\@: .endm .macro __init_el2_hstr @@ -128,14 +128,14 @@ .macro __init_el2_nvhe_sve mrs x1, id_aa64pfr0_el1 ubfxx1, x1, #ID_AA64PFR0_SVE_SHIFT, #4 - cbz x1, 1f + cbz x1, .Lskip_sve_\@ bic x0, x0, #CPTR_EL2_TZ// Also disable SVE traps msr cptr_el2, x0// Disable copro. traps to EL2 isb mov x1, #ZCR_ELx_LEN_MASK // SVE: Enable full vector msr_s SYS_ZCR_EL2, x1 // length for EL1. -1: +.Lskip_sve_\@: .endm .macro __init_el2_nvhe_prepare_eret -- 2.29.2
[PATCH v4 12/21] arm64: cpufeature: Add an early command-line cpufeature override facility
In order to be able to override CPU features at boot time, let's add a command line parser that matches options of the form "cpureg.feature=value", and store the corresponding value into the override val/mask pair. No features are currently defined, so no expected change in functionality. Signed-off-by: Marc Zyngier --- arch/arm64/kernel/Makefile | 2 +- arch/arm64/kernel/head.S | 1 + arch/arm64/kernel/idreg-override.c | 119 + 3 files changed, 121 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/kernel/idreg-override.c diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 86364ab6f13f..2262f0392857 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -17,7 +17,7 @@ obj-y := debug-monitors.o entry.o irq.o fpsimd.o \ return_address.o cpuinfo.o cpu_errata.o \ cpufeature.o alternative.o cacheinfo.o \ smp.o smp_spin_table.o topology.o smccc-call.o \ - syscall.o proton-pack.o + syscall.o proton-pack.o idreg-override.o targets+= efi-entry.o diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index d74e5f84042e..b3c4dd04f74b 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -435,6 +435,7 @@ SYM_FUNC_START_LOCAL(__primary_switched) mov x0, x21 // pass FDT address in x0 bl early_fdt_map // Try mapping the FDT early + bl init_shadow_regs bl switch_to_vhe #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) bl kasan_early_init diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c new file mode 100644 index ..392f93b67103 --- /dev/null +++ b/arch/arm64/kernel/idreg-override.c @@ -0,0 +1,119 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Early cpufeature override framework + * + * Copyright (C) 2020 Google LLC + * Author: Marc Zyngier + */ + +#include +#include + +#include +#include + +struct reg_desc { + const char * const name; + u64 * const val; + u64 * const mask; + struct { + const char * const name; + u8 shift; + } fields[]; +}; + +static 
const struct reg_desc * const regs[] __initdata = { +}; + +static int __init find_field(const char *cmdline, const struct reg_desc *reg, +int f, u64 *v) +{ + char buf[256], *str; + size_t len; + + snprintf(buf, ARRAY_SIZE(buf), "%s.%s=", reg->name, reg->fields[f].name); + + str = strstr(cmdline, buf); + if (!(str == cmdline || (str > cmdline && *(str - 1) == ' '))) + return -1; + + str += strlen(buf); + len = strcspn(str, " "); + len = min(len, ARRAY_SIZE(buf) - 1); + strncpy(buf, str, len); + buf[len] = 0; + + return kstrtou64(buf, 0, v); +} + +static void __init match_options(const char *cmdline) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(regs); i++) { + int f; + + if (!regs[i]->val || !regs[i]->mask) + continue; + + for (f = 0; regs[i]->fields[f].name; f++) { + u64 v; + + if (find_field(cmdline, regs[i], f, &v)) + continue; + + *regs[i]->val |= (v & 0xf) << regs[i]->fields[f].shift; + *regs[i]->mask |= 0xfUL << regs[i]->fields[f].shift; + } + } +} + +static __init void parse_cmdline(void) +{ + if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) { + const u8 *prop; + void *fdt; + int node; + + fdt = get_early_fdt_ptr(); + if (!fdt) + goto out; + + node = fdt_path_offset(fdt, "/chosen"); + if (node < 0) + goto out; + + prop = fdt_getprop(fdt, node, "bootargs", NULL); + if (!prop) + goto out; + + match_options(prop); + + if (!IS_ENABLED(CONFIG_CMDLINE_EXTEND)) + return; + } + +out: + match_options(CONFIG_CMDLINE); +} + +void __init init_shadow_regs(void) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(regs); i++) { + if (regs[i]->val) + *regs[i]->val = 0; + if (regs[i]->mask) + *regs[i]->mask = 0; + } + + parse_cmdline(
[PATCH v4 10/21] arm64: cpufeature: Use IDreg override in __read_sysreg_by_encoding()
__read_sysreg_by_encoding() is used by a bunch of cpufeature helpers, which should take the feature override into account. Let's do that. For good measure (and because we are likely to need it further down the line), make this helper available to the rest of the non-modular kernel. Code that needs to know the *real* features of a CPU can still use read_sysreg_s(), and find the bare, ugly truth. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/cpufeature.h | 1 + arch/arm64/kernel/cpufeature.c | 15 +-- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 465d2cb63bfc..fe0130d6c0ff 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -602,6 +602,7 @@ void __init setup_cpu_features(void); void check_local_cpu_capabilities(void); u64 read_sanitised_ftr_reg(u32 id); +u64 __read_sysreg_by_encoding(u32 sys_id); static inline bool cpu_supports_mixed_endian_el0(void) { diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index aaa075c6f029..48a011935d8c 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1149,14 +1149,17 @@ u64 read_sanitised_ftr_reg(u32 id) EXPORT_SYMBOL_GPL(read_sanitised_ftr_reg); #define read_sysreg_case(r)\ - case r: return read_sysreg_s(r) + case r: val = read_sysreg_s(r); break; /* * __read_sysreg_by_encoding() - Used by a STARTING cpu before cpuinfo is populated.
* Read the system register on the current CPU */ -static u64 __read_sysreg_by_encoding(u32 sys_id) +u64 __read_sysreg_by_encoding(u32 sys_id) { + struct arm64_ftr_reg *regp; + u64 val; + switch (sys_id) { read_sysreg_case(SYS_ID_PFR0_EL1); read_sysreg_case(SYS_ID_PFR1_EL1); @@ -1199,6 +1202,14 @@ static u64 __read_sysreg_by_encoding(u32 sys_id) BUG(); return 0; } + + regp = get_arm64_ftr_reg(sys_id); + if (regp && regp->override_mask && regp->override_val) { + val &= ~*regp->override_mask; + val |= (*regp->override_val & *regp->override_mask); + } + + return val; } #include -- 2.29.2
[PATCH v4 14/21] arm64: Honor VHE being disabled from the command-line
Finally we can check whether VHE is disabled on the command line, and not enable it if that's the user's wish. Signed-off-by: Marc Zyngier --- arch/arm64/kernel/hyp-stub.S | 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 59820f9b8522..bbab2148a2a2 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -77,13 +77,24 @@ SYM_CODE_END(el1_sync) SYM_CODE_START_LOCAL(mutate_to_vhe) // Sanity check: MMU *must* be off mrs x0, sctlr_el2 - tbnzx0, #0, 1f + tbnzx0, #0, 2f // Needs to be VHE capable, obviously mrs x0, id_aa64mmfr1_el1 ubfxx0, x0, #ID_AA64MMFR1_VHE_SHIFT, #4 - cbz x0, 1f + cbz x0, 2f + // Check whether VHE is disabled from the command line + adr_l x1, id_aa64mmfr1_val + ldr x0, [x1] + adr_l x1, id_aa64mmfr1_mask + ldr x1, [x1] + ubfxx0, x0, #ID_AA64MMFR1_VHE_SHIFT, #4 + ubfxx1, x1, #ID_AA64MMFR1_VHE_SHIFT, #4 + cbz x1, 1f + and x0, x0, x1 + cbz x0, 2f +1: // Engage the VHE magic! mov_q x0, HCR_HOST_VHE_FLAGS msr hcr_el2, x0 @@ -152,7 +163,7 @@ skip_spe: orr x0, x0, x1 msr spsr_el1, x0 -1: eret +2: eret SYM_CODE_END(mutate_to_vhe) .macro invalid_vector label -- 2.29.2
[PATCH v4 18/21] arm64: Move "nokaslr" over to the early cpufeature infrastructure
Given that the early cpufeature infrastructure has borrowed quite a lot of code from the kaslr implementation, let's reimplement the matching of the "nokaslr" option with it. Signed-off-by: Marc Zyngier --- arch/arm64/kernel/idreg-override.c | 17 ++ arch/arm64/kernel/kaslr.c | 37 +++--- 2 files changed, 20 insertions(+), 34 deletions(-) diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index 1db54878b2c4..143fe7b8e3ce 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -33,8 +33,24 @@ static const struct reg_desc mmfr1 __initdata = { }, }; +extern u64 kaslr_feature_val; +extern u64 kaslr_feature_mask; + +static const struct reg_desc kaslr __initdata = { + .name = "kaslr", +#ifdef CONFIG_RANDOMIZE_BASE + .val= &kaslr_feature_val, + .mask = &kaslr_feature_mask, +#endif + .fields = { + { "disabled", 0 }, + {} + }, +}; + static const struct reg_desc * const regs[] __initdata = { &mmfr1, + &kaslr, }; static const struct { @@ -43,6 +59,7 @@ static const struct { } aliases[] __initdata = { { "kvm-arm.mode=nvhe", "id_aa64mmfr1.vh=0" }, { "kvm-arm.mode=protected", "id_aa64mmfr1.vh=0" }, + { "nokaslr","kaslr.disabled=1" }, }; static int __init find_field(const char *cmdline, const struct reg_desc *reg, diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c index 5fc86e7d01a1..dcc4a5aadbe2 100644 --- a/arch/arm64/kernel/kaslr.c +++ b/arch/arm64/kernel/kaslr.c @@ -51,39 +51,8 @@ static __init u64 get_kaslr_seed(void *fdt) return ret; } -static __init bool cmdline_contains_nokaslr(const u8 *cmdline) -{ - const u8 *str; - - str = strstr(cmdline, "nokaslr"); - return str == cmdline || (str > cmdline && *(str - 1) == ' '); -} - -static __init bool is_kaslr_disabled_cmdline(void *fdt) -{ - if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) { - int node; - const u8 *prop; - - node = fdt_path_offset(fdt, "/chosen"); - if (node < 0) - goto out; - - prop = fdt_getprop(fdt, node, "bootargs", NULL); - if (!prop) - 
goto out; - - if (cmdline_contains_nokaslr(prop)) - return true; - - if (IS_ENABLED(CONFIG_CMDLINE_EXTEND)) - goto out; - - return false; - } -out: - return cmdline_contains_nokaslr(CONFIG_CMDLINE); -} +u64 kaslr_feature_val __initdata; +u64 kaslr_feature_mask __initdata; /* * This routine will be executed with the kernel mapped at its default virtual @@ -126,7 +95,7 @@ u64 __init kaslr_early_init(void) * Check if 'nokaslr' appears on the command line, and * return 0 if that is the case. */ - if (is_kaslr_disabled_cmdline(fdt)) { + if (kaslr_feature_val & kaslr_feature_mask & 0xf) { kaslr_status = KASLR_DISABLED_CMDLINE; return 0; } -- 2.29.2
[PATCH v4 11/21] arm64: Extract early FDT mapping from kaslr_early_init()
As we want to parse more options very early in the kernel lifetime, let's always map the FDT early. This is achieved by moving that code out of kaslr_early_init(). No functional change expected. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/setup.h | 11 +++ arch/arm64/kernel/head.S | 3 ++- arch/arm64/kernel/kaslr.c | 7 +++ arch/arm64/kernel/setup.c | 15 +++ 4 files changed, 31 insertions(+), 5 deletions(-) create mode 100644 arch/arm64/include/asm/setup.h diff --git a/arch/arm64/include/asm/setup.h b/arch/arm64/include/asm/setup.h new file mode 100644 index ..d3320618ed14 --- /dev/null +++ b/arch/arm64/include/asm/setup.h @@ -0,0 +1,11 @@ +// SPDX-License-Identifier: GPL-2.0 + +#ifndef __ARM64_ASM_SETUP_H +#define __ARM64_ASM_SETUP_H + +#include + +void *get_early_fdt_ptr(void); +void early_fdt_map(u64 dt_phys); + +#endif diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index b425d2587cdb..d74e5f84042e 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -433,6 +433,8 @@ SYM_FUNC_START_LOCAL(__primary_switched) bl __pi_memset dsb ishst // Make zero page visible to PTW + mov x0, x21 // pass FDT address in x0 + bl early_fdt_map // Try mapping the FDT early bl switch_to_vhe #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) bl kasan_early_init @@ -440,7 +442,6 @@ SYM_FUNC_START_LOCAL(__primary_switched) #ifdef CONFIG_RANDOMIZE_BASE tst x23, ~(MIN_KIMG_ALIGN - 1) // already running randomized? b.ne0f - mov x0, x21 // pass FDT address in x0 bl kaslr_early_init// parse FDT for KASLR options cbz x0, 0f // KASLR disabled?
just proceed orr x23, x23, x0// record KASLR offset diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c index 1c74c45b9494..5fc86e7d01a1 100644 --- a/arch/arm64/kernel/kaslr.c +++ b/arch/arm64/kernel/kaslr.c @@ -19,6 +19,7 @@ #include #include #include +#include enum kaslr_status { KASLR_ENABLED, @@ -92,12 +93,11 @@ static __init bool is_kaslr_disabled_cmdline(void *fdt) * containing function pointers) to be reinitialized, and zero-initialized * .bss variables will be reset to 0. */ -u64 __init kaslr_early_init(u64 dt_phys) +u64 __init kaslr_early_init(void) { void *fdt; u64 seed, offset, mask, module_range; unsigned long raw; - int size; /* * Set a reasonable default for module_alloc_base in case @@ -111,8 +111,7 @@ u64 __init kaslr_early_init(u64 dt_phys) * and proceed with KASLR disabled. We will make another * attempt at mapping the FDT in setup_machine() */ - early_fixmap_init(); - fdt = fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL); + fdt = get_early_fdt_ptr(); if (!fdt) { kaslr_status = KASLR_DISABLED_FDT_REMAP; return 0; diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index c18aacde8bb0..01a994730754 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -168,6 +168,21 @@ static void __init smp_build_mpidr_hash(void) pr_warn("Large number of MPIDR hash buckets detected\n"); } +static void *early_fdt_ptr __initdata; + +void __init *get_early_fdt_ptr(void) +{ + return early_fdt_ptr; +} + +void __init early_fdt_map(u64 dt_phys) +{ + int fdt_size; + + early_fixmap_init(); + early_fdt_ptr = fixmap_remap_fdt(dt_phys, &fdt_size, PAGE_KERNEL); +} + static void __init setup_machine_fdt(phys_addr_t dt_phys) { int size; -- 2.29.2
[PATCH v4 15/21] arm64: Add an aliasing facility for the idreg override
In order to map the override of idregs to options that a user can easily understand, let's introduce yet another option array, which maps an option to the corresponding idreg options. Signed-off-by: Marc Zyngier --- arch/arm64/kernel/idreg-override.c | 20 1 file changed, 20 insertions(+) diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index 75d9845f489b..16bc8b3b93ae 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -37,6 +37,12 @@ static const struct reg_desc * const regs[] __initdata = { &mmfr1, }; +static const struct { + const char * const alias; + const char * const feature; +} aliases[] __initdata = { +}; + static int __init find_field(const char *cmdline, const struct reg_desc *reg, int f, u64 *v) { @@ -80,6 +86,18 @@ static void __init match_options(const char *cmdline) } } +static __init void match_aliases(const char *cmdline) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(aliases); i++) { + char *str = strstr(cmdline, aliases[i].alias); + + if ((str == cmdline || (str > cmdline && *(str - 1) == ' '))) + match_options(aliases[i].feature); + } +} + static __init void parse_cmdline(void) { if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) { @@ -100,6 +118,7 @@ static __init void parse_cmdline(void) goto out; match_options(prop); + match_aliases(prop); if (!IS_ENABLED(CONFIG_CMDLINE_EXTEND)) return; @@ -107,6 +126,7 @@ static __init void parse_cmdline(void) out: match_options(CONFIG_CMDLINE); + match_aliases(CONFIG_CMDLINE); } void __init init_shadow_regs(void) -- 2.29.2
Re: [PATCH v1 5/5] driver core: Set fw_devlink=on by default
On 2021-01-18 19:16, Geert Uytterhoeven wrote: Hi Marc, On Mon, Jan 18, 2021 at 6:59 PM Marc Zyngier wrote: On 2021-01-18 17:39, Geert Uytterhoeven wrote: > On Fri, Dec 18, 2020 at 4:34 AM Saravana Kannan > wrote: >> Cyclic dependencies in some firmware was one of the last remaining >> reasons fw_devlink=on couldn't be set by default. Now that cyclic >> dependencies don't block probing, set fw_devlink=on by default. >> >> Setting fw_devlink=on by default brings a bunch of benefits >> (currently, >> only for systems with device tree firmware): >> * Significantly cuts down deferred probes. >> * Device probe is effectively attempted in graph order. >> * Makes it much easier to load drivers as modules without having to >> worry about functional dependencies between modules (depmod is still >> needed for symbol dependencies). >> >> If this patch prevents some devices from probing, it's very likely due >> to the system having one or more device drivers that "probe"/set up a >> device (DT node with compatible property) without creating a struct >> device for it. If we hit such cases, the device drivers need to be >> fixed so that they populate struct devices and probe them like normal >> device drivers so that the driver core is aware of the devices and >> their >> status. See [1] for an example of such a case. >> >> [1] - >> https://lore.kernel.org/lkml/CAGETcx9PiX==mlxb9po8myyk6u2vhpvwtmsa5nkd-ywh5xh...@mail.gmail.com/ >> Signed-off-by: Saravana Kannan > > Shimoda-san reported that next-20210111 and later fail to boot > on Renesas R-Car Gen3 platforms. No output is seen, unless earlycon > is enabled. > > I have bisected this to commit e590474768f1cc04 ("driver core: Set > fw_devlink=on by default"). There is a tentative patch from Saravana here[1], which works around some issues on my RK3399 platform, and it'd be interesting to find out whether that helps on your system. Thanks, M. 
[1] https://lore.kernel.org/r/20210116011412.3211292-1-sarava...@google.com Thanks for the suggestion, but given no devices probe (incl. GPIO providers), I'm afraid it won't help. [testing] Indeed. With the debug prints in device_links_check_suppliers enabled, and some postprocessing, I get: 255 supplier e618.system-controller not ready 9 supplier fe99.iommu not ready 9 supplier fe98.iommu not ready 6 supplier febd.iommu not ready 6 supplier ec67.iommu not ready 3 supplier febe.iommu not ready 3 supplier e774.iommu not ready 3 supplier e674.iommu not ready 3 supplier e65ee000.usb-phy not ready 3 supplier e657.iommu not ready 3 supplier e6054000.gpio not ready 3 supplier e6053000.gpio not ready As everything is part of a PM Domain, the (lack of the) system controller must be the culprit. What's wrong with it? It is registered very early in the boot: [0.142096] rcar_sysc_pd_init:442: of_genpd_add_provider_onecell() returned 0 Yeah, this looks like the exact same problem. The devlink stuff assumes that because there is a "compatible" property, there will be a driver directly associated with the node containing this property. If any other node has a reference to that first node, the dependency will only get resolved if/when that first node is bound to a driver. Trouble is, there are *tons* of code in the tree that invalidate this heuristic, and for each occurrence of this we get another failure. The patch I referred to papers over it by registering a dummy driver, but that doesn't scale easily... M. -- Jazz is not dead. It just smells funny...
Re: [PATCH v1 5/5] driver core: Set fw_devlink=on by default
Hi Geert, On 2021-01-18 17:39, Geert Uytterhoeven wrote: Hi Saravana, On Fri, Dec 18, 2020 at 4:34 AM Saravana Kannan wrote: Cyclic dependencies in some firmware was one of the last remaining reasons fw_devlink=on couldn't be set by default. Now that cyclic dependencies don't block probing, set fw_devlink=on by default. Setting fw_devlink=on by default brings a bunch of benefits (currently, only for systems with device tree firmware): * Significantly cuts down deferred probes. * Device probe is effectively attempted in graph order. * Makes it much easier to load drivers as modules without having to worry about functional dependencies between modules (depmod is still needed for symbol dependencies). If this patch prevents some devices from probing, it's very likely due to the system having one or more device drivers that "probe"/set up a device (DT node with compatible property) without creating a struct device for it. If we hit such cases, the device drivers need to be fixed so that they populate struct devices and probe them like normal device drivers so that the driver core is aware of the devices and their status. See [1] for an example of such a case. [1] - https://lore.kernel.org/lkml/CAGETcx9PiX==mlxb9po8myyk6u2vhpvwtmsa5nkd-ywh5xh...@mail.gmail.com/ Signed-off-by: Saravana Kannan Shimoda-san reported that next-20210111 and later fail to boot on Renesas R-Car Gen3 platforms. No output is seen, unless earlycon is enabled. I have bisected this to commit e590474768f1cc04 ("driver core: Set fw_devlink=on by default"). There is a tentative patch from Saravana here[1], which works around some issues on my RK3399 platform, and it'd be interesting to find out whether that helps on your system. Thanks, M. [1] https://lore.kernel.org/r/20210116011412.3211292-1-sarava...@google.com -- Jazz is not dead. It just smells funny...
Re: [PATCH] KVM: arm64: Fix the return value of smp_call_function_single()
On 2021-01-18 09:31, Yejune Deng wrote: In smp_call_function_single(), the 3rd parameter isn't the return value and it's always positive. But it may return a negative value. So 'ret' should be the return value of the smp_call_function_single(). In check_kvm_target_cpu(), 'phys_target' is more readable than 'ret'. Signed-off-by: Yejune Deng --- arch/arm64/kvm/arm.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 04c44853b103..5fa5c04106de 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1815,9 +1815,9 @@ static int init_hyp_mode(void) return err; } -static void check_kvm_target_cpu(void *ret) +static void check_kvm_target_cpu(void *phys_target) { - *(int *)ret = kvm_target_cpu(); + *(int *)phys_target = kvm_target_cpu(); } struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr) @@ -1879,7 +1879,7 @@ void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *cons) int kvm_arch_init(void *opaque) { int err; - int ret, cpu; + int ret, cpu, phys_target; bool in_hyp_mode; if (!is_hyp_mode_available()) { @@ -1900,7 +1900,7 @@ int kvm_arch_init(void *opaque) "Only trusted guests should be used on this system.\n"); for_each_online_cpu(cpu) { - smp_call_function_single(cpu, check_kvm_target_cpu, &ret, 1); + ret = smp_call_function_single(cpu, check_kvm_target_cpu, &phys_target, 1); if (ret < 0) { kvm_err("Error, CPU %d not supported!\n", cpu); return -ENODEV; That's not the purpose of this code. We check the target CPU so that we can decide to *fail* the KVM init if there is a CPU we do not support (we definitely used to do that with 32bit), and the error message clearly states this. So if you want to handle a cross-call failure, please do that. But don't change the existing semantics of this code. M. -- Jazz is not dead. It just smells funny...
[PATCH v4 19/21] arm64: cpufeatures: Allow disabling of BTI from the command-line
In order to be able to disable BTI at runtime, whether it is for testing purposes, or to work around HW issues, let's add support for overriding the ID_AA64PFR1_EL1.BTI field. This is further mapped on the arm64.nobti command-line alias. Signed-off-by: Marc Zyngier --- Documentation/admin-guide/kernel-parameters.txt | 3 +++ arch/arm64/include/asm/cpufeature.h | 2 ++ arch/arm64/kernel/cpufeature.c | 5 - arch/arm64/kernel/idreg-override.c | 12 arch/arm64/mm/mmu.c | 2 +- 5 files changed, 22 insertions(+), 2 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 2786fd39a047..7599fd0f1ad7 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -373,6 +373,9 @@ arcrimi=[HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards Format: ,, + arm64.nobti [ARM64] Unconditionally disable Branch Target + Identification support + ataflop=[HW,M68k] atarimouse= [HW,MOUSE] Atari Mouse diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 80a5f423444e..d3e0f6dd43c4 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -816,6 +816,8 @@ static inline unsigned int get_vmid_bits(u64 mmfr1) extern u64 id_aa64mmfr1_val; extern u64 id_aa64mmfr1_mask; +extern u64 id_aa64pfr1_val; +extern u64 id_aa64pfr1_mask; u32 get_kvm_ipa_limit(void); void dump_cpu_features(void); diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 5b9343d2e9f0..f223171a7c34 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -557,6 +557,8 @@ static const struct arm64_ftr_bits ftr_raz[] = { u64 id_aa64mmfr1_val; u64 id_aa64mmfr1_mask; +u64 id_aa64pfr1_val; +u64 id_aa64pfr1_mask; static const struct __ftr_reg_entry { u32 sys_id; @@ -592,7 +594,8 @@ static const struct __ftr_reg_entry { /* Op1 = 0, CRn = 0, CRm = 4 */ ARM64_FTR_REG(SYS_ID_AA64PFR0_EL1, 
ftr_id_aa64pfr0), - ARM64_FTR_REG(SYS_ID_AA64PFR1_EL1, ftr_id_aa64pfr1), + ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64PFR1_EL1, ftr_id_aa64pfr1, + &id_aa64pfr1_val, &id_aa64pfr1_mask), ARM64_FTR_REG(SYS_ID_AA64ZFR0_EL1, ftr_id_aa64zfr0), /* Op1 = 0, CRn = 0, CRm = 5 */ diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c index 143fe7b8e3ce..a9e3ed193fd4 100644 --- a/arch/arm64/kernel/idreg-override.c +++ b/arch/arm64/kernel/idreg-override.c @@ -33,6 +33,16 @@ static const struct reg_desc mmfr1 __initdata = { }, }; +static const struct reg_desc pfr1 __initdata = { + .name = "id_aa64pfr1", + .val= &id_aa64pfr1_val, + .mask = &id_aa64pfr1_mask, + .fields = { + { "bt", ID_AA64PFR1_BT_SHIFT }, + {} + }, +}; + extern u64 kaslr_feature_val; extern u64 kaslr_feature_mask; @@ -50,6 +60,7 @@ static const struct reg_desc kaslr __initdata = { static const struct reg_desc * const regs[] __initdata = { &mmfr1, + &pfr1, &kaslr, }; @@ -59,6 +70,7 @@ static const struct { } aliases[] __initdata = { { "kvm-arm.mode=nvhe", "id_aa64mmfr1.vh=0" }, { "kvm-arm.mode=protected", "id_aa64mmfr1.vh=0" }, + { "arm64.nobti","id_aa64pfr1.bt=0" }, { "nokaslr","kaslr.disabled=1" }, }; diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index ae0c3d023824..617e704c980b 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -628,7 +628,7 @@ static bool arm64_early_this_cpu_has_bti(void) if (!IS_ENABLED(CONFIG_ARM64_BTI_KERNEL)) return false; - pfr1 = read_sysreg_s(SYS_ID_AA64PFR1_EL1); + pfr1 = __read_sysreg_by_encoding(SYS_ID_AA64PFR1_EL1); return cpuid_feature_extract_unsigned_field(pfr1, ID_AA64PFR1_BT_SHIFT); } -- 2.29.2
[PATCH v4 04/21] arm64: Provide an 'upgrade to VHE' stub hypercall
As we are about to change the way a VHE system boots, let's provide the core helper, in the form of a stub hypercall that enables VHE and replicates the full EL1 context at EL2, thanks to EL1 and VHE-EL2 being extremely similar. On exception return, the kernel carries on at EL2. Fancy! Nothing calls this new hypercall yet, so no functional change. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/virt.h | 7 +++- arch/arm64/kernel/hyp-stub.S | 67 +-- 2 files changed, 71 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h index ee6a48df89d9..7379f35ae2c6 100644 --- a/arch/arm64/include/asm/virt.h +++ b/arch/arm64/include/asm/virt.h @@ -35,8 +35,13 @@ */ #define HVC_RESET_VECTORS 2 +/* + * HVC_VHE_RESTART - Upgrade the CPU from EL1 to EL2, if possible + */ +#define HVC_VHE_RESTART3 + /* Max number of HYP stub hypercalls */ -#define HVC_STUB_HCALL_NR 3 +#define HVC_STUB_HCALL_NR 4 /* Error returned when an invalid stub number is passed into x0 */ #define HVC_STUB_ERR 0xbadca11 diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 160f5881a0b7..fb12398b5c28 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -8,9 +8,9 @@ #include #include -#include #include +#include #include #include #include @@ -47,10 +47,13 @@ SYM_CODE_END(__hyp_stub_vectors) SYM_CODE_START_LOCAL(el1_sync) cmp x0, #HVC_SET_VECTORS - b.ne2f + b.ne1f msr vbar_el2, x1 b 9f +1: cmp x0, #HVC_VHE_RESTART + b.eqmutate_to_vhe + 2: cmp x0, #HVC_SOFT_RESTART b.ne3f mov x0, x2 @@ -70,6 +73,66 @@ SYM_CODE_START_LOCAL(el1_sync) eret SYM_CODE_END(el1_sync) +// nVHE? No way! Give me the real thing! +SYM_CODE_START_LOCAL(mutate_to_vhe) + // Sanity check: MMU *must* be off + mrs x0, sctlr_el2 + tbnzx0, #0, 1f + + // Needs to be VHE capable, obviously + mrs x0, id_aa64mmfr1_el1 + ubfxx0, x0, #ID_AA64MMFR1_VHE_SHIFT, #4 + cbz x0, 1f + + // Engage the VHE magic! 
+ mov_q x0, HCR_HOST_VHE_FLAGS + msr hcr_el2, x0 + isb + + // Doesn't do much on VHE, but still, worth a shot + init_el2_state vhe + + // Use the EL1 allocated stack, per-cpu offset + mrs x0, sp_el1 + mov sp, x0 + mrs x0, tpidr_el1 + msr tpidr_el2, x0 + + // FP configuration, vectors + mrs_s x0, SYS_CPACR_EL12 + msr cpacr_el1, x0 + mrs_s x0, SYS_VBAR_EL12 + msr vbar_el1, x0 + + // Transfer the MM state from EL1 to EL2 + mrs_s x0, SYS_TCR_EL12 + msr tcr_el1, x0 + mrs_s x0, SYS_TTBR0_EL12 + msr ttbr0_el1, x0 + mrs_s x0, SYS_TTBR1_EL12 + msr ttbr1_el1, x0 + mrs_s x0, SYS_MAIR_EL12 + msr mair_el1, x0 + isb + + // Invalidate TLBs before enabling the MMU + tlbivmalle1 + dsb nsh + + // Enable the EL2 S1 MMU, as set up from EL1 + mrs_s x0, SYS_SCTLR_EL12 + set_sctlr_el1 x0 + + // Hack the exception return to stay at EL2 + mrs x0, spsr_el1 + and x0, x0, #~PSR_MODE_MASK + mov x1, #PSR_MODE_EL2h + orr x0, x0, x1 + msr spsr_el1, x0 + +1: eret +SYM_CODE_END(mutate_to_vhe) + .macro invalid_vector label SYM_CODE_START_LOCAL(\label) b \label -- 2.29.2
[PATCH v4 03/21] arm64: Turn the MMU-on sequence into a macro
Turning the MMU on is a popular sport in the arm64 kernel, and we do it more than once, or even twice. As we are about to add even more, let's turn it into a macro. No expected functional change. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/assembler.h | 17 + arch/arm64/kernel/head.S | 19 --- arch/arm64/mm/proc.S | 12 +--- 3 files changed, 22 insertions(+), 26 deletions(-) diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index bf125c591116..8cded93f99c3 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -675,6 +675,23 @@ USER(\label, icivau, \tmp2)// invalidate I line PoU .endif .endm +/* + * Set SCTLR_EL1 to the passed value, and invalidate the local icache + * in the process. This is called when setting the MMU on. + */ +.macro set_sctlr_el1, reg + msr sctlr_el1, \reg + isb + /* +* Invalidate the local I-cache so that any instructions fetched +* speculatively from the PoC are discarded, since they may have +* been dynamically patched at the PoU. +*/ + ic iallu + dsb nsh + isb +.endm + /* * Check whether to yield to another runnable task from kernel mode NEON code * (which runs with preemption disabled). diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index a0dc987724ed..28e9735302df 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -703,16 +703,9 @@ SYM_FUNC_START(__enable_mmu) offset_ttbr1 x1, x3 msr ttbr1_el1, x1 // load TTBR1 isb - msr sctlr_el1, x0 - isb - /* -* Invalidate the local I-cache so that any instructions fetched -* speculatively from the PoC are discarded, since they may have -* been dynamically patched at the PoU. 
-*/ - ic iallu - dsb nsh - isb + + set_sctlr_el1 x0 + ret SYM_FUNC_END(__enable_mmu) @@ -883,11 +876,7 @@ SYM_FUNC_START_LOCAL(__primary_switch) tlbivmalle1 // Remove any stale TLB entries dsb nsh - msr sctlr_el1, x19 // re-enable the MMU - isb - ic iallu // flush instructions fetched - dsb nsh // via old mapping - isb + set_sctlr_el1 x19 // re-enable the MMU bl __relocate_kernel #endif diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S index ece785477bdc..c967bfd30d2b 100644 --- a/arch/arm64/mm/proc.S +++ b/arch/arm64/mm/proc.S @@ -291,17 +291,7 @@ skip_pgd: /* We're done: fire up the MMU again */ mrs x17, sctlr_el1 orr x17, x17, #SCTLR_ELx_M - msr sctlr_el1, x17 - isb - - /* -* Invalidate the local I-cache so that any instructions fetched -* speculatively from the PoC are discarded, since they may have -* been dynamically patched at the PoU. -*/ - ic iallu - dsb nsh - isb + set_sctlr_el1 x17 /* Set the flag to zero to indicate that we're all done */ str wzr, [flag_ptr] -- 2.29.2
[PATCH v4 02/21] arm64: Fix outdated TCR setup comment
The arm64 kernel has long been able to use more than 39bit VAs. Since day one, actually. Let's rewrite the offending comment. Signed-off-by: Marc Zyngier --- arch/arm64/mm/proc.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S index 1f7ee8c8b7b8..ece785477bdc 100644 --- a/arch/arm64/mm/proc.S +++ b/arch/arm64/mm/proc.S @@ -464,8 +464,8 @@ SYM_FUNC_START(__cpu_setup) #endif msr mair_el1, x5 /* -* Set/prepare TCR and TTBR. We use 512GB (39-bit) address range for -* both user and kernel. +* Set/prepare TCR and TTBR. TCR_EL1.T1SZ gets further +* adjusted if the kernel is compiled with 52bit VA support. */ mov_q x10, TCR_TxSZ(VA_BITS) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \ TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \ -- 2.29.2
[PATCH v4 07/21] arm64: Simplify init_el2_state to be non-VHE only
As init_el2_state is now nVHE only, let's simplify it and drop the VHE setup. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/el2_setup.h | 36 +++--- arch/arm64/kernel/head.S | 2 +- arch/arm64/kvm/hyp/nvhe/hyp-init.S | 2 +- 3 files changed, 10 insertions(+), 30 deletions(-) diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h index 540116de80bf..d77d358f9395 100644 --- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -32,16 +32,14 @@ * to transparently mess with the EL0 bits via CNTKCTL_EL1 access in * EL2. */ -.macro __init_el2_timers mode -.ifeqs "\mode", "nvhe" +.macro __init_el2_timers mrs x0, cnthctl_el2 orr x0, x0, #3 // Enable EL1 physical timers msr cnthctl_el2, x0 -.endif msr cntvoff_el2, xzr// Clear virtual offset .endm -.macro __init_el2_debug mode +.macro __init_el2_debug mrs x1, id_aa64dfr0_el1 sbfxx0, x1, #ID_AA64DFR0_PMUVER_SHIFT, #4 cmp x0, #1 @@ -55,7 +53,6 @@ ubfxx0, x1, #ID_AA64DFR0_PMSVER_SHIFT, #4 cbz x0, .Lskip_spe_\@ // Skip if SPE not present -.ifeqs "\mode", "nvhe" mrs_s x0, SYS_PMBIDR_EL1 // If SPE available at EL2, and x0, x0, #(1 << SYS_PMBIDR_EL1_P_SHIFT) cbnzx0, .Lskip_spe_el2_\@ // then permit sampling of physical @@ -66,10 +63,6 @@ mov x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT) orr x2, x2, x0 // If we don't have VHE, then // use EL1&0 translation. -.else - orr x2, x2, #MDCR_EL2_TPMS // For VHE, use EL2 translation - // and disable access from EL1 -.endif .Lskip_spe_\@: msr mdcr_el2, x2// Configure debug traps @@ -145,37 +138,24 @@ /** * Initialize EL2 registers to sane values. This should be called early on all - * cores that were booted in EL2. + * cores that were booted in EL2. Note that everything gets initialised as + * if VHE was not available. The kernel context will be upgraded to VHE + * if possible later on in the boot process * * Regs: x0, x1 and x2 are clobbered. 
*/ -.macro init_el2_state mode -.ifnes "\mode", "vhe" -.ifnes "\mode", "nvhe" -.error "Invalid 'mode' argument" -.endif -.endif - +.macro init_el2_state __init_el2_sctlr - __init_el2_timers \mode - __init_el2_debug \mode + __init_el2_timers + __init_el2_debug __init_el2_lor __init_el2_stage2 __init_el2_gicv3 __init_el2_hstr - - /* -* When VHE is not in use, early init of EL2 needs to be done here. -* When VHE _is_ in use, EL1 will not be used in the host and -* requires no configuration, and all non-hyp-specific EL2 setup -* will be done via the _EL1 system register aliases in __cpu_setup. -*/ -.ifeqs "\mode", "nvhe" __init_el2_nvhe_idregs __init_el2_nvhe_cptr __init_el2_nvhe_sve __init_el2_nvhe_prepare_eret -.endif .endm #endif /* __ARM_KVM_INIT_H__ */ diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index 07445fd976ef..36212c05df42 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -501,7 +501,7 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL) msr hcr_el2, x0 isb - init_el2_state nvhe + init_el2_state /* Hypervisor stub */ adr_l x0, __hyp_stub_vectors diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S index 31b060a44045..222cfc3e7190 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S +++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S @@ -189,7 +189,7 @@ SYM_CODE_START_LOCAL(__kvm_hyp_init_cpu) 2: msr SPsel, #1 // We want to use SP_EL{1,2} /* Initialize EL2 CPU state to sane values. */ - init_el2_state nvhe // Clobbers x0..x2 + init_el2_state // Clobbers x0..x2 /* Enable MMU, set vectors and stack. */ mov x0, x28 -- 2.29.2
[PATCH v4 08/21] arm64: Move SCTLR_EL1 initialisation to EL-agnostic code
We can now move the initial SCTLR_EL1 setup to be used for both EL1 and EL2 setup. Signed-off-by: Marc Zyngier --- arch/arm64/kernel/head.S | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index 36212c05df42..b425d2587cdb 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -479,13 +479,14 @@ EXPORT_SYMBOL(kimage_vaddr) * booted in EL1 or EL2 respectively. */ SYM_FUNC_START(init_kernel_el) + mov_q x0, INIT_SCTLR_EL1_MMU_OFF + msr sctlr_el1, x0 + mrs x0, CurrentEL cmp x0, #CurrentEL_EL2 b.eqinit_el2 SYM_INNER_LABEL(init_el1, SYM_L_LOCAL) - mov_q x0, INIT_SCTLR_EL1_MMU_OFF - msr sctlr_el1, x0 isb mov_q x0, INIT_PSTATE_EL1 msr spsr_el1, x0 @@ -494,9 +495,6 @@ SYM_INNER_LABEL(init_el1, SYM_L_LOCAL) eret SYM_INNER_LABEL(init_el2, SYM_L_LOCAL) - mov_q x0, INIT_SCTLR_EL1_MMU_OFF - msr sctlr_el1, x0 - mov_q x0, HCR_HOST_NVHE_FLAGS msr hcr_el2, x0 isb -- 2.29.2
[PATCH v4 05/21] arm64: Initialise as nVHE before switching to VHE
As we are aiming to be able to control whether we enable VHE or not, let's always drop down to EL1 first, and only then upgrade to VHE if at all possible. This means that if the kernel is booted at EL2, we always start with a nVHE init, drop to EL1 to initialise the kernel, and only then upgrade the kernel EL to EL2 if possible (the process is obviously shortened for secondary CPUs). The resume path is handled similarly to a secondary CPU boot. Signed-off-by: Marc Zyngier --- arch/arm64/kernel/head.S | 38 ++-- arch/arm64/kernel/hyp-stub.S | 24 +++ arch/arm64/kernel/sleep.S| 1 + 3 files changed, 27 insertions(+), 36 deletions(-) diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index 28e9735302df..07445fd976ef 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -433,6 +433,7 @@ SYM_FUNC_START_LOCAL(__primary_switched) bl __pi_memset dsb ishst // Make zero page visible to PTW + bl switch_to_vhe #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) bl kasan_early_init #endif @@ -493,42 +494,6 @@ SYM_INNER_LABEL(init_el1, SYM_L_LOCAL) eret SYM_INNER_LABEL(init_el2, SYM_L_LOCAL) -#ifdef CONFIG_ARM64_VHE - /* -* Check for VHE being present. x2 being non-zero indicates that we -* do have VHE, and that the kernel is intended to run at EL2. -*/ - mrs x2, id_aa64mmfr1_el1 - ubfxx2, x2, #ID_AA64MMFR1_VHE_SHIFT, #4 -#else - mov x2, xzr -#endif - cbz x2, init_el2_nvhe - - /* -* When VHE _is_ in use, EL1 will not be used in the host and -* requires no configuration, and all non-hyp-specific EL2 setup -* will be done via the _EL1 system register aliases in __cpu_setup. -*/ - mov_q x0, HCR_HOST_VHE_FLAGS - msr hcr_el2, x0 - isb - - init_el2_state vhe - - isb - - mov_q x0, INIT_PSTATE_EL2 - msr spsr_el2, x0 - msr elr_el2, lr - mov w0, #BOOT_CPU_MODE_EL2 - eret - -SYM_INNER_LABEL(init_el2_nvhe, SYM_L_LOCAL) - /* -* When VHE is not in use, early init of EL2 and EL1 needs to be -* done here. 
-*/ mov_q x0, INIT_SCTLR_EL1_MMU_OFF msr sctlr_el1, x0 @@ -623,6 +588,7 @@ SYM_FUNC_START_LOCAL(secondary_startup) /* * Common entry point for secondary CPUs. */ + bl switch_to_vhe bl __cpu_secondary_check52bitva bl __cpu_setup // initialise processor adrpx1, swapper_pg_dir diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index fb12398b5c28..52b29c54a9f7 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -181,3 +181,27 @@ SYM_FUNC_START(__hyp_reset_vectors) hvc #0 ret SYM_FUNC_END(__hyp_reset_vectors) + +/* + * Entry point to switch to VHE if deemed capable + */ +SYM_FUNC_START(switch_to_vhe) +#ifdef CONFIG_ARM64_VHE + // Need to have booted at EL2 + adr_l x1, __boot_cpu_mode + ldr w0, [x1] + cmp w0, #BOOT_CPU_MODE_EL2 + b.ne1f + + // and still be at EL1 + mrs x0, CurrentEL + cmp x0, #CurrentEL_EL1 + b.ne1f + + // Turn the world upside down + mov x0, #HVC_VHE_RESTART + hvc #0 +1: +#endif + ret +SYM_FUNC_END(switch_to_vhe) diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S index 6bdef7362c0e..5bfd9b87f85d 100644 --- a/arch/arm64/kernel/sleep.S +++ b/arch/arm64/kernel/sleep.S @@ -100,6 +100,7 @@ SYM_FUNC_END(__cpu_suspend_enter) .pushsection ".idmap.text", "awx" SYM_CODE_START(cpu_resume) bl init_kernel_el + bl switch_to_vhe bl __cpu_setup /* enable the MMU early - so we can access sleep_save_stash by va */ adrpx1, swapper_pg_dir -- 2.29.2
[PATCH v4 06/21] arm64: Move VHE-specific SPE setup to mutate_to_vhe()
There isn't much that a VHE kernel needs on top of whatever has been done for nVHE, so let's move the little we need to the VHE stub (the SPE setup), and drop the init_el2_state macro. No expected functional change. Signed-off-by: Marc Zyngier --- arch/arm64/kernel/hyp-stub.S | 28 +--- 1 file changed, 25 insertions(+), 3 deletions(-) diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 52b29c54a9f7..59820f9b8522 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -89,9 +89,6 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) msr hcr_el2, x0 isb - // Doesn't do much on VHE, but still, worth a shot - init_el2_state vhe - // Use the EL1 allocated stack, per-cpu offset mrs x0, sp_el1 mov sp, x0 @@ -104,6 +101,31 @@ SYM_CODE_START_LOCAL(mutate_to_vhe) mrs_s x0, SYS_VBAR_EL12 msr vbar_el1, x0 + // Fixup SPE configuration, if supported... + mrs x1, id_aa64dfr0_el1 + ubfxx1, x1, #ID_AA64DFR0_PMSVER_SHIFT, #4 + mov x2, xzr + cbz x1, skip_spe + + // ... and not owned by EL3 + mrs_s x0, SYS_PMBIDR_EL1 + and x0, x0, #(1 << SYS_PMBIDR_EL1_P_SHIFT) + cbnzx0, skip_spe + + // Let the SPE driver in control of the sampling + mrs_s x0, SYS_PMSCR_EL1 + bic x0, x0, #(1 << SYS_PMSCR_EL2_PCT_SHIFT) + bic x0, x0, #(1 << SYS_PMSCR_EL2_PA_SHIFT) + msr_s SYS_PMSCR_EL1, x0 + mov x2, #MDCR_EL2_TPMS + +skip_spe: + // For VHE, use EL2 translation and disable access from EL1 + mrs x0, mdcr_el2 + bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT) + orr x0, x0, x2 + msr mdcr_el2, x0 + // Transfer the MM state from EL1 to EL2 mrs_s x0, SYS_TCR_EL12 msr tcr_el1, x0 -- 2.29.2
[PATCH v4 09/21] arm64: cpufeature: Add global feature override facility
Add a facility to globally override a feature, no matter what the HW says. Yes, this sounds dangerous, but we do respect the "safe" value for a given feature. This doesn't mean the user doesn't need to know what they are doing. Nothing uses this yet, so we are pretty safe. For now. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/cpufeature.h | 2 ++ arch/arm64/kernel/cpufeature.c | 44 + 2 files changed, 41 insertions(+), 5 deletions(-) diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 9a555809b89c..465d2cb63bfc 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -75,6 +75,8 @@ struct arm64_ftr_reg { u64 sys_val; u64 user_val; const struct arm64_ftr_bits *ftr_bits; + u64 *override_val; + u64 *override_mask; }; extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0; diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index e99eddec0a46..aaa075c6f029 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -544,13 +544,17 @@ static const struct arm64_ftr_bits ftr_raz[] = { ARM64_FTR_END, }; -#define ARM64_FTR_REG(id, table) { \ - .sys_id = id, \ - .reg = &(struct arm64_ftr_reg){\ - .name = #id,\ - .ftr_bits = &((table)[0]), \ +#define ARM64_FTR_REG_OVERRIDE(id, table, v, m) { \ + .sys_id = id, \ + .reg = &(struct arm64_ftr_reg){\ + .name = #id,\ + .ftr_bits = &((table)[0]), \ + .override_val = v, \ + .override_mask = m, \ }} +#define ARM64_FTR_REG(id, table) ARM64_FTR_REG_OVERRIDE(id, table, NULL, NULL) + static const struct __ftr_reg_entry { u32 sys_id; struct arm64_ftr_reg*reg; @@ -760,6 +764,7 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new) u64 strict_mask = ~0x0ULL; u64 user_mask = 0; u64 valid_mask = 0; + u64 override_val = 0, override_mask = 0; const struct arm64_ftr_bits *ftrp; struct arm64_ftr_reg *reg = get_arm64_ftr_reg(sys_reg); @@ -767,9 +772,38 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new) if 
(!reg) return; + if (reg->override_mask && reg->override_val) { + override_mask = *reg->override_mask; + override_val = *reg->override_val; + } + for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) { u64 ftr_mask = arm64_ftr_mask(ftrp); s64 ftr_new = arm64_ftr_value(ftrp, new); + s64 ftr_ovr = arm64_ftr_value(ftrp, override_val); + + if ((ftr_mask & override_mask) == ftr_mask) { + s64 tmp = arm64_ftr_safe_value(ftrp, ftr_ovr, ftr_new); + char *str = NULL; + + if (ftr_ovr != tmp) { + /* Unsafe, remove the override */ + *reg->override_mask &= ~ftr_mask; + *reg->override_val &= ~ftr_mask; + tmp = ftr_ovr; + str = "ignoring override"; + } else if (ftr_new != tmp) { + /* Override was valid */ + ftr_new = tmp; + str = "forced"; + } + + if (str) + pr_warn("%s[%d:%d]: %s to %llx\n", + reg->name, + ftrp->shift + ftrp->width - 1, + ftrp->shift, str, tmp); + } val = arm64_ftr_set_value(ftrp, val, ftr_new); -- 2.29.2
[PATCH v4 00/21] arm64: Early CPU feature override, and applications to VHE, BTI and PAuth
It recently came to light that there is a need to be able to override some CPU features very early on, before the kernel is fully up and running. The reasons for this range from specific feature support (such as using Protected KVM on VHE HW, which is the main motivation for this work) to errata workarounds (a feature is broken on a CPU and needs to be turned off, or rather not enabled). This series tries to offer a limited framework for this kind of problem, by allowing a set of options to be passed on the command-line and altering the feature set that the cpufeature subsystem exposes to the rest of the kernel. Note that this doesn't change anything for code that directly uses the CPU ID registers. The series completely changes the way a VHE-capable system boots, by *always* booting non-VHE first, and then upgrading to VHE when deemed capable. Although it sounds scary, this is actually simple to implement (and I wish I had done that five years ago). The "upgrade to VHE" path is then conditioned on the VHE feature not being disabled from the command-line. Said command-line parsing borrows a lot from the kaslr code, and subsequently allows the "nokaslr" option to be moved to the new infrastructure (though it all looks a bit... odd). Further patches now add support for disabling BTI and PAuth, the latter being based on an initial series by Srinivas Ramana[0]. There are some ongoing discussions about being able to disable MTE, but no clear resolution on that subject yet. This has been tested on multiple VHE and non-VHE systems. * From v3 [3]: - Fixed the VHE_RESTART stub (duh!) 
- Switched to using arm64_ftr_safe_value() instead of the user provided value
- Per-feature override warning

* From v2 [2]:
- Simplify the VHE_RESTART stub
- Fixed a number of spelling mistakes, and hopefully introduced a few more
- Override features in __read_sysreg_by_encoding()
- Allow both BTI and PAuth to be overridden on the command line
- Rebased on -rc3

* From v1 [1]:
- Fix SPE init on VHE when EL2 doesn't own SPE
- Fix re-init when KASLR is used
- Handle the resume path
- Rebased to 5.11-rc2

[0] https://lore.kernel.org/r/1610152163-16554-1-git-send-email-sram...@codeaurora.org
[1] https://lore.kernel.org/r/20201228104958.1848833-1-...@kernel.org
[2] https://lore.kernel.org/r/20210104135011.2063104-1-...@kernel.org
[3] https://lore.kernel.org/r/2021032811.2455113-1-...@kernel.org

Marc Zyngier (20):
  arm64: Fix labels in el2_setup macros
  arm64: Fix outdated TCR setup comment
  arm64: Turn the MMU-on sequence into a macro
  arm64: Provide an 'upgrade to VHE' stub hypercall
  arm64: Initialise as nVHE before switching to VHE
  arm64: Move VHE-specific SPE setup to mutate_to_vhe()
  arm64: Simplify init_el2_state to be non-VHE only
  arm64: Move SCTLR_EL1 initialisation to EL-agnostic code
  arm64: cpufeature: Add global feature override facility
  arm64: cpufeature: Use IDreg override in __read_sysreg_by_encoding()
  arm64: Extract early FDT mapping from kaslr_early_init()
  arm64: cpufeature: Add an early command-line cpufeature override facility
  arm64: Allow ID_AA64MMFR1_EL1.VH to be overridden from the command line
  arm64: Honor VHE being disabled from the command-line
  arm64: Add an aliasing facility for the idreg override
  arm64: Make kvm-arm.mode={nvhe,protected} an alias of id_aa64mmfr1.vh=0
  KVM: arm64: Document HVC_VHE_RESTART stub hypercall
  arm64: Move "nokaslr" over to the early cpufeature infrastructure
  arm64: cpufeatures: Allow disabling of BTI from the command-line
  arm64: cpufeatures: Allow disabling of Pointer Auth from the command-line

Srinivas Ramana (1):
  arm64: Defer enabling pointer authentication on boot core

 .../admin-guide/kernel-parameters.txt   |   9 +
 Documentation/virt/kvm/arm/hyp-abi.rst  |   9 +
 arch/arm64/include/asm/assembler.h      |  17 ++
 arch/arm64/include/asm/cpufeature.h     |  10 +
 arch/arm64/include/asm/el2_setup.h      |  60 ++
 arch/arm64/include/asm/pointer_auth.h   |  10 +
 arch/arm64/include/asm/setup.h          |  11 +
 arch/arm64/include/asm/stackprotector.h |   1 +
 arch/arm64/include/asm/virt.h           |   7 +-
 arch/arm64/kernel/Makefile              |   2 +-
 arch/arm64/kernel/cpufeature.c          |  75 ++-
 arch/arm64/kernel/head.S                |  75 ++-
 arch/arm64/kernel/hyp-stub.S            | 124 ++-
 arch/arm64/kernel/idreg-override.c      | 199 ++
 arch/arm64/kernel/kaslr.c               |  44 +---
 arch/arm64/kernel/setup.c               |  15 ++
 arch/arm64/kernel/sleep.S               |   1 +
 arch/arm64/kvm/arm.c                    |   3 +
 arch/arm64/kvm/hyp/nvhe/hyp-init.S      |   2 +-
 arch/arm64/mm/mmu.c                     |   2 +-
 arch/arm64/mm/proc.S
Re: [PATCH v2] gpiolib: Bind gpio_device to a driver to enable fw_devlink=on by default
Hi Saravana,

Thanks for posting this, much appreciated.

On Sat, 16 Jan 2021 01:14:11 +, Saravana Kannan wrote:
>
> There are multiple instances of GPIO devicetree nodes of the form:
>
> foo {
> 	compatible = "acme,foo";
> 	...
>
> 	gpio0: gpio0@ {
> 		compatible = "acme,bar";
> 		...
> 		gpio-controller;
> 	};
>
> 	gpio1: gpio1@ {
> 		compatible = "acme,bar";
> 		...
> 		gpio-controller;
> 	};
>
> 	...
> }
>
> bazz {
> 	my-gpios = <&gpio0 ...>;
> }
>
> Case 1: The driver for "foo" populates struct device for these gpio*
> nodes and then probes them using a driver that binds with "acme,bar".
> This lines up with how DT nodes with the "compatible" property are
> generally converted to struct devices and then registered with driver
> core to probe them. This also allows the gpio* devices to hook into all
> the driver core capabilities like runtime PM, probe deferral,
> suspend/resume ordering, device links, etc.
>
> Case 2: The driver for "foo" doesn't populate its child device nodes
> with a "compatible" property and instead just loops through its child
> nodes and directly registers the GPIOs with gpiolib without ever
> populating a struct device or binding a driver to it.

That's not quite an accurate description. The gpiolib subsystem does
create a struct device. It doesn't register a driver though, which is
what causes the issue with fw_devlink (more on that below).

>
> Drivers that follow case 2 cause problems with fw_devlink=on. This
> is because fw_devlink will prevent bazz from probing until there's a
> struct device that has gpio0 as its fwnode (because bazz lists gpio0 as
> a GPIO supplier). Once the struct device is available, fw_devlink will
> create a device link with gpio0 as the supplier and bazz as the
> consumer. After this point, the device link will prevent bazz from
> probing until its supplier (the gpio0 device) has bound to a driver.
> Once the supplier is bound to a driver, the probe of bazz is triggered
> automatically.
>
> Finding and refactoring all the instances of drivers that follow case 2
> will cause a lot of code churn and is not something that can be done in
> one shot. Examples of such instances are [1] [2].
>
> This patch works around this problem and avoids all the code churn by
> simply creating a stub driver to bind to the gpio_device. Since the
> gpio_device already points to the GPIO device tree node, this allows all
> the consumers to continue probing when the driver follows case 2.
>
> If/when all the old drivers are refactored, we can revert this
> patch.

My personal gripe with this approach is that it is an abrupt change in
the way DT and the device model are mapped onto each other.

As far as I know (and someone please correct me if I am wrong), there
is zero expectation that a device/subdevice/random IP block described
by a node with a "compatible" property will end up being managed by a
driver that is bound to that particular node. The node/subnode division
is a good way to express some HW boundaries, but doesn't say anything
about the way this should be handled in the kernel.

Assuming that everything containing a "compatible" string will
eventually be bound to a driver is IMO pretty limiting.
>
> [1] - https://lore.kernel.org/lkml/20201014191235.7f71fcb4@xhacker.debian/
> [2] - https://lore.kernel.org/lkml/e28e1f38d87c12a3c714a6573beba...@kernel.org/
>
> Cc: Marc Zyngier
> Cc: Jisheng Zhang
> Cc: Kever Yang
> Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
> Signed-off-by: Saravana Kannan
> ---
> v1 -> v2:
> - Fixed up compilation errors that were introduced accidentally
> - Fixed a missing put_device()
>
> drivers/gpio/gpiolib.c | 37 +
> 1 file changed, 37 insertions(+)
>
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index b02cc2abd3b6..12c579a953b0 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -574,6 +574,9 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
>  	unsigned	i;
>  	int base = gc->base;
>  	struct gpio_device *gdev;
> +	struct device_node *of_node;
> +	struct fwnode_handle *fwnode;
> +	struct device *fwnode_dev;
>
>  	/*
>  	 * First: allocate and populate the internal stat container, and
> @@ -596,6 +599,22 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
>
Re: [PATCH] KVM: arm64: Allow PSCI SYSTEM_OFF/RESET to return
On Tue, 29 Dec 2020 16:00:59 +, David Brazdil wrote:
> The KVM/arm64 PSCI relay assumes that SYSTEM_OFF and SYSTEM_RESET should
> not return, as dictated by the PSCI spec. However, there is firmware out
> there which breaks this assumption, leading to a hyp panic. Make KVM
> more robust to broken firmware by allowing these to return.

Applied to next, thanks!

[1/1] KVM: arm64: Allow PSCI SYSTEM_OFF/RESET to return
      commit: 2c91ef39216149df6703c3fa6a47dd9a1e6091c1

Cheers,

M.
--
Without deviation from the norm, progress is not possible.
Re: [PATCH v4 03/10] irqchip/sun6i-r: Use a stacked irqchip driver
On 2021-01-15 04:01, Samuel Holland wrote:
> Hello,
>
> On 1/14/21 3:06 PM, Marc Zyngier wrote:
>> Hi Samuel,
>>
>> On 2021-01-12 05:59, Samuel Holland wrote:

[...]

>>> +static void sun6i_r_intc_ack_nmi(void)
>>> +{
>>> +	writel(SUN6I_NMI_BIT, base + SUN6I_IRQ_PENDING(0));
>>
>> writel_relaxed()
>
> irq_chip_unmask_parent(), which calls gic_unmask_irq(), is called
> immediately after this in .irq_unmask. Since gic_unmask_irq() also
> uses writel_relaxed(), the GIC write could be ordered before the
> write here.

That's odd. writel() places a barrier *before* the actual write,
ensuring that this write is ordered w.r.t. previous accesses. If you
are trying to ensure ordering with what follows, you need an explicit
barrier after this access.

I guess that in the end, you may need both, as what you have orders
the access to GICC_AIR to take place before the write to this pending
register, and you also need to provide the ordering you just described.

> I was getting occasional spurious interrupts (1 out of each 20-25)
> when using a level trigger, which were resolved by switching to
> writel() here. I mentioned this in the changelog, but it probably
> deserves a comment in the code as well. Or maybe I should use an
> explicit barrier somewhere?

Please document it in the code. This is subtle enough to warrant a
good description.

>>> +}
>>> +
>>> +static void sun6i_r_intc_nmi_ack(struct irq_data *data)
>>> +{
>>> +	if (irqd_get_trigger_type(data) & IRQ_TYPE_EDGE_BOTH)
>>> +		sun6i_r_intc_ack_nmi();
>>> +	else
>>> +		data->chip_data = SUN6I_NMI_NEEDS_ACK;
>>> +}
>>> +
>>> +static void sun6i_r_intc_nmi_eoi(struct irq_data *data)
>>> +{
>>> +	/* For oneshot IRQs, delay the ack until the IRQ is unmasked. */
>>> +	if (data->chip_data == SUN6I_NMI_NEEDS_ACK && !irqd_irq_masked(data)) {
>>> +		sun6i_r_intc_ack_nmi();
>>> +		data->chip_data = 0;
>>
>> nit: NULL rather than 0?
>
> NULL seemed less appropriate since I'm not using the field as a
> pointer, but I don't have a strong opinion about it.

chip_data *is* a pointer, which is why we conventionally use NULL
rather than an integer value. Up to you.

[...]
>>> +static struct irq_chip sun6i_r_intc_nmi_chip = {
>>> +	.name			= "sun6i-r-intc",
>>> +	.irq_ack		= sun6i_r_intc_nmi_ack,
>>> +	.irq_mask		= irq_chip_mask_parent,
>>> +	.irq_unmask		= sun6i_r_intc_nmi_unmask,
>>> +	.irq_eoi		= sun6i_r_intc_nmi_eoi,
>>> +	.irq_set_affinity	= irq_chip_set_affinity_parent,
>>> +	.irq_set_type		= sun6i_r_intc_nmi_set_type,
>>> +	.irq_set_irqchip_state	= sun6i_r_intc_nmi_set_irqchip_state,
>>
>> You probably also want to wire irq_get_irqchip_state(), while you're
>> at it.
>
> I thought if the interrupt was pending here, it would necessarily
> also be pending at the GIC, so adding a separate layer would be
> redundant.
>
> irq_set_vcpu_affinity(), __irq_get_irqchip_state(), and
> irq_set_irqchip_state() [the functions, not the callbacks] have the
> interesting property that they search up the irqdomain hierarchy for
> the first irqdomain with the callback. So if all the callback would
> do is defer to its parent, it doesn't need to be provided at all*.

Ah, of course... I even wrote that code!

Thanks,

M.
--
Jazz is not dead. It just smells funny...
Re: [PATCH v4 04/10] irqchip/sun6i-r: Add wakeup support
On Tue, 12 Jan 2021 05:59:44 +, Samuel Holland wrote:
>
> Maintain bitmaps of wake-enabled IRQs and mux inputs, and program them
> to the hardware during the syscore phase of suspend and shutdown. Then
> restore the original set of enabled IRQs (only the NMI) during resume.
>
> This serves two purposes. First, it lets power management firmware
> running on the ARISC coprocessor know which wakeup sources Linux wants
> to have enabled. That way, it can avoid turning them off when it shuts
> down the remainder of the clock tree. Second, it preconfigures the
> coprocessor's interrupt controller, so the firmware's wakeup logic
> is as simple as waiting for an interrupt to arrive.
>
> The suspend/resume logic is not conditional on PM_SLEEP because it is
> identical to the init/shutdown logic. Wake IRQs may be enabled during
> shutdown to allow powering the board back on. As an example, see
> commit a5c5e50cce9d ("Input: gpio-keys - add shutdown callback").
>
> Signed-off-by: Samuel Holland
> ---
> drivers/irqchip/irq-sun6i-r.c | 107 --
> 1 file changed, 101 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/irqchip/irq-sun6i-r.c b/drivers/irqchip/irq-sun6i-r.c
> index d04d067423f4..a1b58c98d6ca 100644
> --- a/drivers/irqchip/irq-sun6i-r.c
> +++ b/drivers/irqchip/irq-sun6i-r.c
> @@ -39,6 +39,7 @@
>   * set of 128 mux bits. This requires a second set of top-level registers.
>   */
>
> +#include
>  #include
>  #include
>  #include
> @@ -46,6 +47,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>
> @@ -67,8 +69,17 @@
>  #define SUN6I_NR_DIRECT_IRQS		16
>  #define SUN6I_NR_MUX_BITS		128
>
> +struct sun6i_r_intc_variant {
> +	u32		first_mux_irq;
> +	u32		nr_mux_irqs;
> +	u32		mux_valid[BITS_TO_U32(SUN6I_NR_MUX_BITS)];
> +};
> +
>  static void __iomem *base;
>  static irq_hw_number_t nmi_hwirq;
> +static DECLARE_BITMAP(wake_irq_enabled, SUN6I_NR_TOP_LEVEL_IRQS);
> +static DECLARE_BITMAP(wake_mux_enabled, SUN6I_NR_MUX_BITS);
> +static DECLARE_BITMAP(wake_mux_valid, SUN6I_NR_MUX_BITS);
>
>  static void sun6i_r_intc_ack_nmi(void)
>  {
> @@ -145,6 +156,21 @@ static int sun6i_r_intc_nmi_set_irqchip_state(struct irq_data *data,
>  	return irq_chip_set_parent_state(data, which, state);
>  }
>
> +static int sun6i_r_intc_irq_set_wake(struct irq_data *data, unsigned int on)
> +{
> +	unsigned long offset_from_nmi = data->hwirq - nmi_hwirq;
> +
> +	if (offset_from_nmi < SUN6I_NR_DIRECT_IRQS)
> +		assign_bit(offset_from_nmi, wake_irq_enabled, on);
> +	else if (test_bit(data->hwirq, wake_mux_valid))
> +		assign_bit(data->hwirq, wake_mux_enabled, on);
> +	else
> +		/* Not wakeup capable. */
> +		return -EPERM;
> +
> +	return 0;
> +}
> +
>  static struct irq_chip sun6i_r_intc_nmi_chip = {
>  	.name			= "sun6i-r-intc",
>  	.irq_ack		= sun6i_r_intc_nmi_ack,
> @@ -154,8 +180,19 @@ static struct irq_chip sun6i_r_intc_nmi_chip = {
>  	.irq_set_affinity	= irq_chip_set_affinity_parent,
>  	.irq_set_type		= sun6i_r_intc_nmi_set_type,
>  	.irq_set_irqchip_state	= sun6i_r_intc_nmi_set_irqchip_state,
> -	.flags			= IRQCHIP_SET_TYPE_MASKED |
> -				  IRQCHIP_SKIP_SET_WAKE,
> +	.irq_set_wake		= sun6i_r_intc_irq_set_wake,
> +	.flags			= IRQCHIP_SET_TYPE_MASKED,
>  };
> +
> +static struct irq_chip sun6i_r_intc_wakeup_chip = {
> +	.name			= "sun6i-r-intc",
> +	.irq_mask		= irq_chip_mask_parent,
> +	.irq_unmask		= irq_chip_unmask_parent,
> +	.irq_eoi		= irq_chip_eoi_parent,
> +	.irq_set_affinity	= irq_chip_set_affinity_parent,
> +	.irq_set_type		= irq_chip_set_type_parent,
> +	.irq_set_wake		= sun6i_r_intc_irq_set_wake,
> +	.flags			= IRQCHIP_SET_TYPE_MASKED,

Worth implementing irq_get/set_irqchip_state() using the _parent
helper, I guess.

Thanks,

M.
--
Without deviation from the norm, progress is not possible.