At the moment the KVM VGICv3 only supports a single redistributor
region (whose base address is set through the GICv3 kvm device
KVM_DEV_ARM_VGIC_GRP_ADDR/KVM_VGIC_V3_ADDR_TYPE_REDIST). There,
all the redistributors are laid out contiguously. The size of this
single redistributor region is not set
in case kvm_vgic_map_resources() fails, typically if the vgic
distributor is not defined, __kvm_vgic_destroy will be called
several times. Indeed kvm_vgic_map_resources() is called on
first vcpu run. As a result dist->spis is freed more than once
and on the second time it causes a "kernel BUG at m
We introduce a new KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION attribute in
KVM_DEV_ARM_VGIC_GRP_ADDR group. It allows userspace to provide the
base address and size of a redistributor region
Compared to KVM_VGIC_V3_ADDR_TYPE_REDIST, this new attribute allows
userspace to declare several separate redistributor regi
At the moment KVM supports a single rdist region. We want to
support several separate rdist regions so let's introduce a list
of them. This patch currently only cares about a single
entry in this list as the functionality to register several redist
regions is not yet there. So this only translates
We introduce vgic_v3_rdist_free_slot to help identify
where we can place a new 2x64KB redistributor.
Signed-off-by: Eric Auger
Reviewed-by: Christoffer Dall
---
v3 -> v4:
- add details to vgic_v3_rdist_free_slot kernel doc comment
- Added Christoffer's R-b
---
virt/kvm/arm/vgic/vgic-mmio-v
The TYPER of a redistributor reflects whether the rdist is
the last one of the redistributor region. Let's compare the TYPER
GPA against the address of the last occupied slot within the
redistributor region.
Signed-off-by: Eric Auger
Reviewed-by: Christoffer Dall
---
v3 -> v4:
- added Christo
vgic_v3_check_base() currently only handles the case of a unique
legacy redistributor region whose size is not explicitly set but
inferred, instead, from the number of online vcpus.
We adapt it to handle the case of multiple redistributor regions
with explicitly defined size. We rely on two new he
kvm_vgic_vcpu_early_init gets called after kvm_vgic_vcpu_init which
is confusing. The call path is as follows:
kvm_vm_ioctl_create_vcpu
|_ kvm_arch_cpu_create
   |_ kvm_vcpu_init
      |_ kvm_arch_vcpu_init
         |_ kvm_vgic_vcpu_init
|_ kvm_arch_vcpu_postcreate
   |_ kvm_vgic_vcpu_early_init
St
We introduce a new helper that creates and inserts a new redistributor
region into the rdist region list. This helper both handles the case
where the redistributor region size is known at registration time
and the legacy case where it is not (possibly depending on the number
of online vcpus). Dep
As we are going to register several redist regions,
vgic_register_all_redist_iodevs() may be called several times. We need
to register a redist_iodev for a given vcpu only once. So let's
check if the base address has already been set. Initialize the latter
in kvm_vgic_vcpu_init().
Signed-off-by:
This new attribute allows userspace to set the base address
of a redistributor region, relaxing the constraint of having all
consecutive redistributor frames contiguous.
Signed-off-by: Eric Auger
Acked-by: Christoffer Dall
---
v3 -> v4:
- keep previous indentation for existing attributes
- a
On first vcpu run, we finally know the actual number of vcpus.
This is a synchronization point to check that all redistributors
were assigned. In kvm_vgic_map_resources() we check both dist and
redist were set, and check for potential base address inconsistencies.
Signed-off-by: Eric Auger
Revie
Now all the internals are ready to handle multiple redistributor
regions, let's allow the userspace to register them.
Signed-off-by: Eric Auger
Reviewed-by: Christoffer Dall
---
v5 -> v6:
- added Christoffer's R-b
v4 -> v5:
- s/uint_t/u
- fix KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION read
- fix read
Let's raise the number of supported vcpus along with
vgic v3, now that HW with more physical CPUs is looming.
Signed-off-by: Eric Auger
Acked-by: Christoffer Dall
---
v4 -> v5:
- added Christoffer's A-b
---
include/kvm/arm_vgic.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --g
On Mon, May 21, 2018 at 08:21:58PM -0700, Florian Fainelli wrote:
>
>
> On 05/21/2018 04:44 AM, Russell King wrote:
> > Harden the branch predictor against Spectre v2 attacks on context
> > switches for ARMv7 and later CPUs. We do this by:
> >
> > Cortex A9, A12, A17, A73, A75: invalidating the
Hi Russell,
On Wed, May 16, 2018 at 1:01 PM, Russell King
wrote:
> When the branch predictor hardening is enabled, firmware must have set
> the IBE bit in the auxiliary control register. If this bit has not
> been set, the Spectre workarounds will not be functional.
>
> Add validation that this
On Tue, May 22, 2018 at 12:38:05PM +0200, Geert Uytterhoeven wrote:
> Hi Russell,
>
> On Wed, May 16, 2018 at 1:01 PM, Russell King
> wrote:
> > When the branch predictor hardening is enabled, firmware must have set
> > the IBE bit in the auxiliary control register. If this bit has not
> > been
On Sun, May 20, 2018 at 02:14:41PM +0100, Marc Zyngier wrote:
> On Wed, 16 May 2018 11:49:42 +0100
> Dave Martin wrote:
>
> Hi Dave,
>
> > Hi Marc,
> >
> > This is a trivial update to the previously posted v7 [1]. The only
> > changes are a couple of minor cosmetic changes requested by reviewe
url: https://github.com/0day-ci/linux/commits/Eric-Auger/KVM-arm-arm64-Allow-multiple-GICv3-redistributor-regions/20180522-004717
base: https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next
config: arm-axm55xx_defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0
reproduce
Hi all,
This patch series implements the Linux kernel side of the "Spectre-v4"
(CVE-2018-3639) mitigation known as "Speculative Store Bypass Disable"
(SSBD).
More information can be found at:
https://bugs.chromium.org/p/project-zero/issues/detail?id=1528
https://developer.arm.com/support/ar
We've so far used the PSCI return codes for SMCCC because they
were extremely similar. But with the new ARM DEN 0070A specification,
"NOT_REQUIRED" (-2) is clashing with PSCI's "PSCI_RET_INVALID_PARAMS".
Let's bite the bullet and add SMCCC specific return codes. Users
can be repainted as and when
In a heterogeneous system, we can end up with both affected and
unaffected CPUs. Let's check their status before calling into the
firmware.
Signed-off-by: Marc Zyngier
---
arch/arm64/kernel/cpu_errata.c | 2 ++
arch/arm64/kernel/entry.S | 11 +++
2 files changed, 9 insertions(+), 4
In order for the kernel to protect itself, let's call the SSBD mitigation
implemented by the higher exception level (either hypervisor or firmware)
on each transition between userspace and kernel.
We must take the PSCI conduit into account in order to target the
right exception level, hence the in
As for Spectre variant-2, we rely on SMCCC 1.1 to provide the
discovery mechanism for detecting the SSBD mitigation.
A new capability is also allocated for that purpose, and a
config option.
Signed-off-by: Marc Zyngier
---
arch/arm64/Kconfig | 9 ++
arch/arm64/include/asm/cpu
On a system where the firmware implements ARCH_WORKAROUND_2,
it may be useful to either permanently enable or disable the
workaround for cases where the user decides that they'd rather
not get a trap overhead, and keep the mitigation permanently
on or off instead of switching it on exception entry/
We're about to need the mitigation state in various parts of the
kernel in order to do the right thing for userspace and guests.
Let's expose an accessor that will let other subsystems know
about the state.
Signed-off-by: Marc Zyngier
---
arch/arm64/include/asm/cpufeature.h | 10 ++
1 f
In order to avoid checking arm64_ssbd_callback_required on each
kernel entry/exit even if no mitigation is required, let's
add yet another alternative that by default jumps over the mitigation,
and that gets nop'ed out if we're doing dynamic mitigation.
Think of it as a poor man's static key...
S
On a system where firmware can dynamically change the state of the
mitigation, the CPU will always come up with the mitigation enabled,
including when coming back from suspend.
If the user has requested "no mitigation" via a command line option,
let's enforce it by calling into the firmware again
In order to allow userspace to be mitigated on demand, let's
introduce a new thread flag that prevents the mitigation from
being turned off when exiting to userspace, and doesn't turn
it on on entry into the kernel (with the assumption that the
mitigation is always enabled in the kernel itself).
Th
If running on a system that performs dynamic SSBD mitigation, allow
userspace to request the mitigation for itself. This is implemented
as a prctl call, allowing the mitigation to be enabled or disabled at
will for this particular thread.
Signed-off-by: Marc Zyngier
---
arch/arm64/kernel/Makefil
As we're going to require to access per-cpu variables at EL2,
let's craft the minimum set of accessors required to implement
reading a per-cpu variable, relying on tpidr_el2 to contain the
per-cpu offset.
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
---
arch/arm64/include/asm/kvm_a
In order to offer ARCH_WORKAROUND_2 support to guests, we need
a bit of infrastructure.
Let's add a flag indicating whether or not the guest uses
SSBD mitigation. Depending on the state of this flag, allow
KVM to disable ARCH_WORKAROUND_2 before entering the guest,
and enable it when exiting it.
In order to forward the guest's ARCH_WORKAROUND_2 calls to EL3,
add a small(-ish) sequence to handle it at EL2. Special care must
be taken to track the state of the guest itself by updating the
workaround flags. We also rely on patching to enable calls into
the firmware.
Note that since we need to
Now that all our infrastructure is in place, let's expose the
availability of ARCH_WORKAROUND_2 to guests. We take this opportunity
to tidy up a couple of SMCCC constants.
Acked-by: Christoffer Dall
Signed-off-by: Marc Zyngier
---
arch/arm/include/asm/kvm_host.h | 12
arch/arm64/
On 05/22/2018 08:06 AM, Marc Zyngier wrote:
> On a system where the firmware implements ARCH_WORKAROUND_2,
> it may be useful to either permanently enable or disable the
> workaround for cases where the user decides that they'd rather
> not get a trap overhead, and keep the mitigation permanently
>
Note: Most of these patches are Arm-specific. People not Cc'd on the
whole series can find it in the linux-arm-kernel archive [2].
This series aims to improve the way FPSIMD context is handled by KVM.
Changes since the previous v9 [1] are mostly minor, but there are some
fixes worthy of closer at
From: Christoffer Dall
KVM/ARM differs from other architectures in having to maintain an
additional virtual address space from that of the host and the
guest, because we split the execution of KVM across both EL1 and
EL2.
This results in a need to explicitly map data structures into EL2
(hyp) wh
This patch uses the new update_thread_flag() helpers to simplify a
couple of if () set; else clear; constructs.
No functional change.
Signed-off-by: Dave Martin
Acked-by: Marc Zyngier
Acked-by: Catalin Marinas
Cc: Will Deacon
---
arch/arm64/kernel/fpsimd.c | 18 --
1 file cha
There are a number of bits of code sprinkled around the kernel to
set a thread flag if a certain condition is true, and clear it
otherwise.
To help make those call sites terser and less cumbersome, this
patch adds a new family of thread flag manipulators
update*_thread_flag([...,] flag, c
fpsimd_last_state.st is set to NULL as a way of indicating that
current's FPSIMD registers are no longer loaded in the cpu. In
particular, this is done when the kernel temporarily uses or
clobbers the FPSIMD registers for its own purposes, as in CPU PM or
kernel-mode NEON, resulting in them being
To make the lazy FPSIMD context switch trap code easier to hack on,
this patch converts it to C.
This is not amazingly efficient, but the trap should typically only
be taken once per host context switch.
Signed-off-by: Dave Martin
Reviewed-by: Marc Zyngier
---
arch/arm64/kvm/hyp/entry.S | 57
In preparation for optimising the way KVM manages switching the
guest and host FPSIMD state, it is necessary to provide a means for
code outside arch/arm64/kernel/fpsimd.c to restore the user trap
configuration for SVE correctly for the current task.
Rather than requiring external code to duplicat
In struct vcpu_arch, the debug_flags field is used to store
debug-related flags about the vcpu state.
Since we are about to add some more flags related to FPSIMD and
SVE, it makes sense to add them to the existing flags field rather
than adding new fields. Since there is only one debug_flags flag
sve_pffr(), which is used to derive the base address used for
low-level SVE save/restore routines, currently takes the relevant
task_struct as an argument.
The only accessed fields are actually part of thread_struct, so
this patch changes the argument type accordingly. This is done in
preparation
Having read_zcr_features() inline in cpufeature.h results in that
header requiring #includes which make it hard to include
<asm/cpufeature.h> elsewhere without triggering header inclusion
cycles.
This is not a hot-path function and arguably should not be in
cpufeature.h in the first place, so this patch moves it to
In order to make sve_save_state()/sve_load_state() more easily
reusable and to get rid of a potential branch on context switch
critical paths, this patch makes sve_pffr() inline and moves it to
fpsimd.h.
<asm/processor.h> must be included in fpsimd.h in order to make
this work, and this creates an #include cycle t
This patch adds SVE context saving to the hyp FPSIMD context switch
path. This means that it is no longer necessary to save the host
SVE state in advance of entering the guest, when in use.
In order to avoid adding pointless complexity to the code, VHE is
assumed if SVE is in use. VHE is an arch
This patch refactors KVM to align the host and guest FPSIMD
save/restore logic with each other for arm64. This reduces the
number of redundant save/restore operations that must occur, and
reduces the common-case IRQ blackout time during guest exit storms
by saving the host state lazily and optimis
Currently the FPSIMD handling code uses the condition task->mm ==
NULL as a hint that task has no FPSIMD register context.
The ->mm check is only there to filter out tasks that cannot
possibly have FPSIMD context loaded, for optimisation purposes.
Also, TIF_FOREIGN_FPSTATE must always be checked a
Now that the host SVE context can be saved on demand from Hyp,
there is no longer any need to save this state in advance before
entering the guest.
This patch removes the relevant call to
kvm_fpsimd_flush_cpu_state().
Since the problem that function was intended to solve now no longer
exists, the
In preparation for allowing non-task (i.e., KVM vcpu) FPSIMD
contexts to be handled by the fpsimd common code, this patch adapts
task_fpsimd_save() to save back the currently loaded context,
removing the explicit dependency on current.
The relevant storage to write back to in memory is now found b
In fixup_guest_exit(), there are a couple of cases where after
checking what the exit code was, we assign it explicitly with the
value it already had.
Assuming this is not indicative of a bug, these assignments are not
needed.
This patch removes the redundant assignments, and simplifies some
if-n
The entire tail of fixup_guest_exit() is contained in if statements
of the form if (x && *exit_code == ARM_EXCEPTION_TRAP). As a result,
we can check just once and bail out of the function early, allowing
the remaining if conditions to be simplified.
The only awkward case is where *exit_code is c
The conversion of the FPSIMD context switch trap code to C has added
some overhead to calling it, due to the need to save registers that
the procedure call standard defines as caller-saved.
So, perhaps it is no longer worth invoking this trap handler quite
so early.
Instead, we can invoke it from
On Tue, 22 May 2018 16:48:42 +0100,
Dominik Brodowski wrote:
>
>
> On Tue, May 22, 2018 at 04:06:44PM +0100, Marc Zyngier wrote:
> > If running on a system that performs dynamic SSBD mitigation, allow
> > userspace to request the mitigation for itself. This is implemented
> > as a prctl call, all
On Mon, May 14, 2018 at 6:24 PM, Nick Desaulniers
wrote:
> On Fri, Apr 20, 2018 at 7:59 AM Andrey Konovalov
> wrote:
>> On Fri, Apr 20, 2018 at 10:13 AM, Marc Zyngier
> wrote:
>> >> The issue is that
>> >> clang doesn't know about the "S" asm constraint. I reported this to
>> >> clang [2], and h
On 21/05/18 12:45, Russell King wrote:
> In order to prevent aliasing attacks on the branch predictor,
> invalidate the BTB or instruction cache on CPUs that are known to be
> > affected when taking an abort on an address that is outside of a user
> task limit:
>
> Cortex A8, A9, A12, A17, A73, A75:
On 21/05/18 12:45, Russell King wrote:
> Add PSCI based hardening for cores that require more complex handling in
> firmware.
>
> Signed-off-by: Russell King
> Acked-by: Marc Zyngier
> ---
> arch/arm/mm/proc-v7-bugs.c | 50
> ++
> arch/arm/mm/proc-v7
On Tue, May 22, 2018 at 06:15:02PM +0100, Marc Zyngier wrote:
> On 21/05/18 12:45, Russell King wrote:
> > In order to prevent aliasing attacks on the branch predictor,
> > invalidate the BTB or instruction cache on CPUs that are known to be
> > affected when taking an abort on an address that is ou
On Tue, May 22, 2018 at 06:24:13PM +0100, Marc Zyngier wrote:
> On 21/05/18 12:45, Russell King wrote:
> > Add PSCI based hardening for cores that require more complex handling in
> > firmware.
> >
> > Signed-off-by: Russell King
> > Acked-by: Marc Zyngier
> > ---
> > arch/arm/mm/proc-v7-bugs.c
On Sat, May 19, 2018 at 12:44 PM, Marc Zyngier wrote:
> That would definitely be the right thing to do. Make sure you (or
> Andrey tests with the latest released mainline kernel (4.16 for now)
> or (even better) the tip of Linus' tree.
Hi!
I can confirm that after applying this patch onto 4.17-r
On Tue, May 22, 2018 at 06:56:03PM +0100, Russell King - ARM Linux wrote:
> On Tue, May 22, 2018 at 06:15:02PM +0100, Marc Zyngier wrote:
> > On 21/05/18 12:45, Russell King wrote:
> > > In order to prevent aliasing attacks on the branch predictor,
> > > invalidate the BTB or instruction cache on C
On 05/22/2018 11:12 AM, Russell King - ARM Linux wrote:
> On Tue, May 22, 2018 at 06:56:03PM +0100, Russell King - ARM Linux wrote:
>> On Tue, May 22, 2018 at 06:15:02PM +0100, Marc Zyngier wrote:
>>> On 21/05/18 12:45, Russell King wrote:
In order to prevent aliasing attacks on the branch pre
On Tue, May 22, 2018 at 04:06:44PM +0100, Marc Zyngier wrote:
> If running on a system that performs dynamic SSBD mitigation, allow
> userspace to request the mitigation for itself. This is implemented
> as a prctl call, allowing the mitigation to be enabled or disabled at
> will for this particul
* Russell King [180521 12:06]:
> Harden the branch predictor against Spectre v2 attacks on context
> switches for ARMv7 and later CPUs. We do this by:
>
> Cortex A9, A12, A17, A73, A75: invalidating the BTB.
> Cortex A15, Brahma B15: invalidating the instruction cache.
>
> Cortex A57 and Cortex
* Russell King [180521 12:09]:
> When the branch predictor hardening is enabled, firmware must have set
> the IBE bit in the auxiliary control register. If this bit has not
> been set, the Spectre workarounds will not be functional.
>
> Add validation that this bit is set, and print a warning at
On Fri, May 18, 2018 at 11:13 AM Marc Zyngier wrote:
> > - you have checked that with a released version of the compiler, you
On Tue, May 22, 2018 at 10:58 AM Andrey Konovalov
wrote:
> Tested-by: Andrey Konovalov
Hi Andrey,
Thank you very much for this report. Can you confirm as well the vers
On 11/27/2017 11:38 AM, Mark Rutland wrote:
> This patch adds basic support for pointer authentication, allowing
> userspace to make use of APIAKey. The kernel maintains an APIAKey value
> for each process (shared by all threads within), which is initialised to
> a random value at exec() time.
>
>
On 5/3/2018 9:20 AM, Mark Rutland wrote:
> This patch adds basic support for pointer authentication, allowing
> userspace to make use of APIAKey. The kernel maintains an APIAKey value
> for each process (shared by all threads within), which is initialised to
> a random value at exec() time.
>
> To
On Tue, May 22, 2018 at 06:15:02PM +0100, Marc Zyngier wrote:
> On 21/05/18 12:45, Russell King wrote:
> > + switch (read_cpuid_part()) {
> > + case ARM_CPU_PART_CORTEX_A8:
> > + case ARM_CPU_PART_CORTEX_A9:
> > + case ARM_CPU_PART_CORTEX_A12:
> > + case ARM_CPU_PART_CORTEX_A17:
> > + c