In ACPI systems, the OS can direct power management, as opposed to the
firmware. This OS-directed Power Management is called OSPM. Part of
telling the firmware that the OS is going to direct power management is
making ACPI "_PDC" (Processor Driver Capabilities) calls. These _PDC
methods must be
So that it doesn't break the build when CONFIG_XEN or CONFIG_XEN_DOM0
are not enabled.
The current header is only included when CONFIG_XEN_DOM0 is selected,
so instead place it in the top level xen.h header, and also use the
dummy helper when CONFIG_X86 is not selected, as the current
The handling of the MSI-X table accesses by Xen requires that any
pages part of the MSI-X related tables are not mapped into the domain
physmap. As a result, any device registers in the same pages as the
start or the end of the MSI-X or PBA tables are not currently
accessible, as the accesses are
Slightly change the meaning of the command line
gnttab_max_{maptrack_,}frames: do not use them as upper bounds for the
passed values at domain creation, instead just use them as defaults
in the absence of any provided value.
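The difference between treating the option as an upper bound and treating it purely as a default can be sketched as follows. This is only an illustration: both function names are hypothetical, and the convention that a negative request means "no value provided at domain creation" is an assumption made for the sketch, not something taken from the patch.

```c
/*
 * Sketch only: "opt" stands for the value of a gnttab_max_*frames
 * command line option. A negative "requested" is a hypothetical way of
 * saying that no value was provided at domain creation.
 */

/* Previous meaning: the option is both the default and an upper bound. */
static unsigned int limit_as_upper_bound(int requested, unsigned int opt)
{
    if (requested < 0)
        return opt;                      /* nothing provided: use default */
    return (unsigned int)requested > opt ? opt : (unsigned int)requested;
}

/* New meaning: the option is only a default for missing values. */
static unsigned int limit_as_default_only(int requested, unsigned int opt)
{
    return requested < 0 ? opt : (unsigned int)requested;
}
```

With an option value of 64 and a request of 128, the first helper yields 64 while the second yields 128. Both yield 64 when no value is provided.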
It's not very useful for the options to be used both as defaults and
as
The handling of the MSI-X table accesses by Xen requires that any
pages part of the MSI-X table are not mapped into the domain physmap.
As a result, any device registers in the same pages as the start or
the end of the MSI-X table are not currently accessible, as the accesses
are just dropped.
Note
Introduce an install target, as is done by other tests. This
allows running the test on the installed systems, which is easier than
running it during the build phase when dealing with automated testing.
Strictly speaking the vpci test doesn't need to be run on a Xen
host currently, but
Under certain conditions guests can get the CPU stuck in an unbounded
loop without the possibility of an interrupt window occurring on an
instruction boundary. This was the case with the scenarios described
in XSA-156.
Make use of the Notify VM Exit mechanism, that will trigger a VM Exit
if no
Add support for enabling guest Bus Lock Detection on Intel systems.
Such detection works by triggering a vmexit, which ought to be enough
of a pause to prevent a guest from abusing the Bus Lock.
Add an extra Xen perf counter to track the number of Bus Locks detected.
This is done because Bus
Introduce a small helper to OR VMX_INTR_SHADOW_NMI in
GUEST_INTERRUPTIBILITY_INFO in order to help dealing with the NMI
unblocked by IRET case. Replace the existing usage in handling
EXIT_REASON_EXCEPTION_NMI and also add such handling to EPT violations
and page-modification log-full events.
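A minimal sketch of what such a helper could look like is shown below. The VMCS field is mocked with a plain variable so the example is self-contained, and the accessor and helper names are hypothetical. The 0x8 value assumes the "blocking by NMI" bit is bit 3 of the interruptibility state. Real hypervisor code would use VMREAD and VMWRITE on the current VMCS instead.

```c
#include <stdint.h>

/* Assumed value: bit 3 of the interruptibility state, blocking by NMI. */
#define VMX_INTR_SHADOW_NMI 0x8u

/* Stand-in for the GUEST_INTERRUPTIBILITY_INFO VMCS field. */
static uint32_t mock_guest_interruptibility_info;

/* Hypothetical accessors replacing VMREAD/VMWRITE in this sketch. */
static uint32_t mock_vmread_intr_info(void)
{
    return mock_guest_interruptibility_info;
}

static void mock_vmwrite_intr_info(uint32_t val)
{
    mock_guest_interruptibility_info = val;
}

/*
 * Hypothetical helper: re-establish NMI blocking after a VM exit that
 * interrupted an IRET which had already unblocked NMIs. The other
 * interruptibility bits are preserved.
 */
static void restore_nmi_blocking(void)
{
    mock_vmwrite_intr_info(mock_vmread_intr_info() | VMX_INTR_SHADOW_NMI);
}
```

Keeping the read-modify-write in one place means each exit handler that needs it (exceptions/NMI, EPT violations, page-modification log-full) makes a single call rather than open coding the sequence.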
Hello,
Following series implements support for bus lock and notify VM exit.
Patches are not really dependent, but I've developed them together by
virtue of both features being in Intel Instructions Set Extensions PR
Chapter 9.
Thanks, Roger.
Roger Pau Monne (3):
x86/vmx: implement VMExit
The currently lockless access to the xen console list in
vtermno_to_xencons() is incorrect, as additions and removals from the
list can happen anytime, and as such the traversal of the list to get
the private console data for a given termno needs to happen with the
lock held. Note users that
The hvc machinery registers both a console and a tty device based on
the hv ops provided by the specific implementation. Those two
interfaces however have different locks, and there's no single lock
that's shared between the tty and the console implementations, hence
the driver needs to protect
Currently the vga command line gfx- option is ignored when booted
using multiboot2 and EFI, as the setting of the GOP mode is done way
before the command line is processed.
Add support for parsing the vga gfx- selection if present in order to
set the selected GOP mode.
Signed-off-by: Roger Pau
Only set the GOP mode if vga is selected in the console option,
otherwise just fetch the information from the current mode in order to
make it available to dom0.
Introduce support for passing the command line to the efi_multiboot2()
helper, and parse the console= option if present.
.
Marek: after this series using console= without the vga option should
result in Xen not attempting to touch the selected GOP mode and the
screen not getting cleared.
Thanks, Roger.
Roger Pau Monne (5):
x86/platform: introduce hypercall to get initial video console
settings
efi: only set
Modify efi_find_gop_mode() so that passing cols or rows as 0 is
interpreted as a request to attempt to keep the currently set mode,
and do so if the mode query for information is successful and the depth
is supported.
Signed-off-by: Roger Pau Monné
---
xen/common/efi/boot.c | 20
Do not unconditionally set a mode in efi_console_set_mode(), do so
only if the currently set mode is not valid.
Signed-off-by: Roger Pau Monné
---
xen/common/efi/boot.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index
This is required so PVH dom0 can get the initial video console state
as handled by Xen. PV dom0 will get this as part of the start_info,
but it doesn't seem necessary to place such information in the
HVM start info.
Signed-off-by: Roger Pau Monné
---
xen/arch/x86/platform_hypercall.c | 11
When running as a PVH dom0 the ACPI MADT is crafted by Xen in order to
report the correct numbers of vCPUs that dom0 has, so the host MADT is
not provided to dom0. This creates issues when parsing the power and
performance related data from ACPI dynamic tables, as the ACPI
Processor UIDs found on
The Processor _PDC buffer bits notify ACPI of the OS capabilities, and
so ACPI can adjust the return of other Processor methods taking the OS
capabilities into account.
When Linux is running as a Xen dom0, the hypervisor is the entity
in charge of processor power management, and hence Xen needs
When running as a Xen dom0 the number of CPUs available to Linux can
be different from the number of CPUs present on the system, but in
order to properly fetch processor performance related data _PDC must
be executed on all the physical CPUs online on the system.
The current checks in
ght be better to just execute _PDC from
that same Xen ACPI Processor driver instead of polluting the generic
ACPI Processor driver.
The series should be taken as a RFC partially, due to my own doubts
about whether the current implementation is indeed the right one moving
forward.
Thanks, Roger.
Roger
On one of my boxes when the HDMI cable is not plugged in the
FrameBufferBase of the EFI_GRAPHICS_OUTPUT_PROTOCOL_MODE structure is
set to 0 by the firmware (while some of the other fields look
plausible).
Such a bogus address ends up mapped in vesa_init(), and since it
overlaps with a RAM
Currently Xen will pass through any Local APIC NMI Structure found in
the native ACPI MADT table to a PVH dom0. This is wrong because PVH
doesn't have access to the physical local APIC, and instead gets a
local APIC emulated by Xen, which doesn't have the LINT0 or LINT1
pins wired to anything.
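One way to avoid handing such entries to dom0 is to skip them while copying MADT interrupt controller structures, sketched below. This is not the code from the patch: the function name and error convention are hypothetical. The type values (0x04 for Local APIC NMI, 0x0a for Local x2APIC NMI) and the type/length byte layout of each entry follow the ACPI MADT definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* MADT interrupt controller structure types (ACPI specification). */
#define MADT_TYPE_LOCAL_APIC_NMI   0x04
#define MADT_TYPE_LOCAL_X2APIC_NMI 0x0a

/* Hypothetical error value for malformed input. */
#define MADT_FILTER_ERROR ((size_t)-1)

/*
 * Copy the MADT entries in src[0..len) to dst, dropping the local APIC
 * NMI entries. dst must have room for len bytes. Returns the number of
 * bytes written, or MADT_FILTER_ERROR if an entry length is invalid.
 */
static size_t filter_madt_entries(const uint8_t *src, size_t len,
                                  uint8_t *dst)
{
    size_t in = 0, out = 0;

    assert(src != NULL && dst != NULL);

    while (in + 2 <= len) {
        uint8_t type = src[in];      /* byte 0: entry type */
        uint8_t elen = src[in + 1];  /* byte 1: entry length in bytes */

        if (elen < 2 || elen > len - in)
            return MADT_FILTER_ERROR;

        if (type != MADT_TYPE_LOCAL_APIC_NMI &&
            type != MADT_TYPE_LOCAL_X2APIC_NMI) {
            memcpy(dst + out, src + in, elen);
            out += elen;
        }
        in += elen;
    }

    return out;
}
```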
Current code in _clear_irq_vector() will mark the irq as unused before
doing the cleanup required when move_in_progress is true.
This can lead to races in create_irq() if the function picks an irq
desc that's been marked as unused but has move_in_progress set, as the
call to assign_irq_vector()
Since the VIRT_SPEC_CTRL.SSBD selection is no longer context switched
on vm{entry,exit} there's no need to use a synthetic feature bit for
it anymore.
Remove the bit and instead use a global variable.
No functional change intended.
Signed-off-by: Roger Pau Monné
Reviewed-by: Jan Beulich
This fixes an issue with running C code in a GIF=0 region, that's
problematic when using UBSAN or other instrumentation techniques.
The current logic for AMD SSBD context switches it on every
vm{entry,exit} if the Xen and guest selections don't match. This is
expensive when not using SPEC_CTRL,
privately with patch 2/2, but
I'm still sending it so that comments can be made publicly (or the patch
applied).
Thanks, Roger.
Roger Pau Monne (2):
amd/virt_ssbd: set SSBD at vCPU context switch
amd: remove VIRT_SC_MSR_HVM synthetic feature
docs/misc/xen-command-line.pandoc | 10
The current logic in the Intel PMC driver will forcefully attach it
when detecting any CPU on the intel_pmc_core_platform_ids array,
even if the matching ACPI device is not present.
There's no checking in pmc_core_probe() to assert that the PMC device
is present, and hence on virtualized
The VFCT ACPI table is used by AMD GPUs to expose the vbios ROM image
from the firmware instead of doing it on the PCI ROM on the physical
device.
As such, this needs to be available for PVH dom0 to access, or else
the GPU won't work.
Reported-by: Huang Rui
Signed-off-by: Roger Pau Monné
---
Like on the Arm side, return -EINVAL when attempting to do a p2m
operation on dying domains.
The current logic returns 0 and leaves the domctl parameter
uninitialized for any parameter fetching operations (like the
GET_ALLOCATION operation), which is not helpful from a toolstack point
of view,
The current reporting of the hardware assisted APIC options is done by
checking "virtualize APIC accesses" which is not very helpful, as that
feature doesn't avoid a vmexit, instead it does provide some help in
order to detect APIC MMIO accesses in vmexit processing.
Repurpose the current
This reverts commit adb715db698bc8ec3b88c24eb88b21e9da5b6c07.
The dumping of stacks for HVM guests is problematic, since it requires
taking the p2m lock in order to walk the guest page tables and the
p2m.
The suggested solution to the issue is to introduce and use a lockless
p2m walker, that
the
flags so we don't release a version of Xen with a set of flags that we
will then either remove or use to report different hardware features.
Thanks, Roger.
Roger Pau Monne (2):
viridian: suggest MSR APIC accesses if MSR accesses are accelerated
hvm/apic: repurpose the reporting of the APIC
The "APIC register virtualization" Intel hardware feature applies to
both MMIO or MSR APIC accesses depending on whether "virtualize x2APIC
mode" is also available.
As such also suggest MSR APIC accesses if both "APIC register
virtualization" and "virtualize x2APIC mode" features are available.
The current logic for AMD SSBD context switches it on every
vm{entry,exit} if the Xen and guest selections don't match. This is
expensive when not using SPEC_CTRL, and hence should be avoided as
much as possible.
When SSBD is not being set from SPEC_CTRL on AMD don't context switch
at
Hello,
Just two patches remaining, and the last one is already Acked.
First patch deals with moving the switching of SSBD from guest
vm{entry,exit} to vCPU context switch, and lets Xen run with the guest
SSBD selection under some circumstances by default.
Thanks, Roger.
Roger Pau Monne (2
Add MSR_VIRT_SPEC_CTRL to the list of MSRs handled by
hvm_load_cpu_msrs(), or else it would be lost.
Fixes: 8ffd5496f4 ('amd/msr: implement VIRT_SPEC_CTRL for HVM guests on top of
SPEC_CTRL')
Signed-off-by: Roger Pau Monné
---
I'm confused as to why we have two different lists of MSRs to send and
on a
platform that exposes VIRT_SSBD itself. I think the path is
sufficiently similar to the legacy one.
Currently running a gitlab CI loop in order to check everything is OK.
Roger Pau Monne (3):
hvm/msr: load VIRT_SPEC_CTRL
amd/virt_ssbd: set SSBD at vCPU context switch
amd: remove
Commit 75cc460a1b added checks to ensure the position of the BARs from
PCI devices don't overlap with regions defined on the memory map.
When there's a collision memory decoding is left disabled for the
device, assuming that dom0 will reposition the BAR if necessary and
enable memory decoding.
Writes to the BARs are ignored if memory decoding is enabled for the
device, and the same happens with ROM BARs if the write is an attempt
to change the position of the BAR without disabling it first.
The reason for ignoring such writes is a limitation in Xen, as it would
need to unmap the BAR,
Hello,
Just two patches left, but likely the ones with more meat in them.
Previous series was release-acked by Henry, but I haven't kept the acks
in case there's delay in getting them reviewed at which point the
release-ack would expire.
Thanks, Roger.
Roger Pau Monne (2):
pci: do
This is done to shorten line length in the function in preparation for
adding further usages of the vpci_bar data structure.
No functional change.
Signed-off-by: Roger Pau Monné
Reviewed-by: Jan Beulich
---
xen/drivers/vpci/header.c | 14 --
1 file changed, 8 insertions(+), 6
Teardown of MSIX vPCI related data doesn't currently remove the MSIX
device data from the list of MSIX tables handled by the domain,
leading to a use-after-free of the data in the msix structure.
Remove the structure from the list before freeing in order to solve
it.
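The ordering issue can be illustrated with a generic singly linked list. The structure and function names below are hypothetical and are not the vPCI data structures. The point is only that the element is unlinked before it is freed, so no list walker can reach freed memory.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-device data tracked on a per-domain list. */
struct tracked_entry {
    struct tracked_entry *next;
    int id;
};

static void entry_list_add(struct tracked_entry **head,
                           struct tracked_entry *e)
{
    e->next = *head;
    *head = e;
}

/* Unlink e from the list if present; the entry itself is left intact. */
static void entry_list_del(struct tracked_entry **head,
                           struct tracked_entry *e)
{
    for (struct tracked_entry **pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == e) {
            *pp = e->next;
            e->next = NULL;
            return;
        }
    }
}

static size_t entry_list_len(const struct tracked_entry *head)
{
    size_t n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* Teardown: remove from the list first, only then release the memory. */
static void entry_teardown(struct tracked_entry **head,
                           struct tracked_entry *e)
{
    entry_list_del(head, e);
    free(e);
}
```

Calling free() without the preceding entry_list_del() would leave a dangling pointer on the list, which is the use-after-free pattern described above.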
Reported-by: Jan Beulich
is not fixing a
regression (since vPCI code has always behaved this way).
Thanks, Roger.
Roger Pau Monne (5):
vpci: don't assume that vpci per-device data exists unconditionally
vpci/msix: remove from table list on detach
vpci: introduce a local vpci_bar variable to modify_decoding()
pci
It's possible for a device to be assigned to a domain but have no
vpci structure if vpci_process_pending() failed and called
vpci_remove_device() as a result. The unconditional accesses done by
vpci_{read,write}() and vpci_remove_device() to pdev->vpci would
then trigger a NULL pointer
CONFIG_HAS_PCI is not defined for the tools build, and as a result the
vpci harness would never get built. Fix this by building it
unconditionally, there's nothing arch specific in it.
Reported-by: Andrew Cooper
Signed-off-by: Roger Pau Monné
---
While not strictly a bugfix, I think it's worth
This is done to shorten line length in the function in preparation for
adding further usages of the vpci_bar data structure.
No functional change.
Signed-off-by: Roger Pau Monné
---
xen/drivers/vpci/header.c | 14 --
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git
Some vpci functions got the cfcheck attribute added, but that's not
defined in the user-space test harness, so add a dummy define in order
for the harness to build.
Fixes: 4ed7d5525f ('xen/vpci: CFI hardening')
Signed-off-by: Roger Pau Monné
---
tools/tests/vpci/emul.h | 1 +
1 file changed, 1
with memory decoding enabled.
I consider all of them bug fixes, albeit the last patch is not fixing a
regression (since vPCI code has always behaved this way).
Thanks, Roger.
Roger Pau Monne (6):
test/vpci: add dummy cfcheck define
test/vpci: fix vPCI test harness to provide pci_get_pdev()
vpci
Instead of pci_get_pdev_by_domain(), which is no longer present in the
hypervisor.
While there add parentheses around the define value.
Fixes: a37f9ea7a6 ('PCI: fold pci_get_pdev{,_by_domain}()')
Signed-off-by: Roger Pau Monné
---
tools/tests/vpci/emul.h | 2 +-
1 file changed, 1 insertion(+),
Hardware that exposes SSB_NO can implement the setting of SSBD as a
no-op because it's not affected by SSB.
Take advantage of that and allow exposing VIRT_SPEC_CTRL.SSBD to guests
running on hardware that has SSB_NO. Only set VIRT_SSBD on the max
policy though, as the feature is only intended to
itself. I think the path is
sufficiently similar to the legacy one.
Currently running a gitlab CI loop in order to check everything is OK.
Roger Pau Monne (4):
amd/virt_ssbd: set SSBD at vCPU context switch
amd: remove VIRT_SC_MSR_HVM synthetic feature
amd/ssbd: remove hypervisor SSBD
Like on Intel AMD guests are now capable of setting SSBD on their own,
either from SPEC_CTRL or from VIRT_SPEC_CTRL. As a result the
unconditional setting of SSBD from Xen in order to cope with the bit
not being exposed to guests is no longer needed.
Remove the Xen command line `spec-ctrl=ssbd`
Since the VIRT_SPEC_CTRL.SSBD selection is no longer context switched
on vm{entry,exit} there's no need to use a synthetic feature bit for
it anymore.
Remove the bit and instead use a global variable.
No functional change intended.
Signed-off-by: Roger Pau Monné
---
xen/arch/x86/cpu/amd.c
The EFI memory map contains two memory types (EfiMemoryMappedIO and
EfiMemoryMappedIOPortSpace) used to describe IO memory areas used by
EFI firmware.
The current parsing of the EFI memory map is translating
EfiMemoryMappedIO to E820_RESERVED on x86. This causes issues on some
boxes as the
The EFI memory map contains two memory types (EfiMemoryMappedIO and
EfiMemoryMappedIOPortSpace) used to describe IO memory areas of
devices used by EFI.
The current parsing of the EFI memory map was translating
EfiMemoryMappedIO and EfiMemoryMappedIOPortSpace to E820_RESERVED on
x86. This is an
While correct from a code point of view, the usage of the const
attribute for the domain parameter of gic_iomem_deny_access() is at
least partially bogus. Contents of the domain structure (the iomem
rangeset) is modified by the function. Such modifications succeed
because right now the iomem
memory_type_changed() is currently only implemented for Intel EPT, and
results in the invalidation of EMT attributes on all the entries in
the EPT page tables. Such invalidation causes EPT_MISCONFIG vmexits
when the guest tries to access any gfns for the first time, which
results in the
.
Roger Pau Monne (2):
arm/vgic: drop const attribute from gic_iomem_deny_access()
x86/ept: limit calls to memory_type_changed()
xen/arch/arm/gic-v2.c| 2 +-
xen/arch/arm/gic-v3.c| 2 +-
xen/arch/arm/gic.c | 2 +-
xen/arch/arm/include/asm/gic.h | 4
The current way to detect whether a page handed to
epte_get_entry_emt() is special and needs a forced write-back cache
attribute involves iterating over all the smaller 4K pages for
superpages.
Such a loop consumes a high amount of CPU time for 1GiB pages (order
18): on a Xeon® Silver 4216
Current usage of Werror=switch-enum by default for libvirt builds out
of the git tree causes issues when new items are added to libxl public
API enums if those are used in a switch statement in libvirt code.
This leads to libvirt build failures for seemingly unrelated libxl
changes.
In order to
The current logic in epte_get_entry_emt() will split any page marked
as special with order greater than zero, without checking whether the
super page is all special.
Fix this by splitting the page only if it's not all marked as
special, in order to prevent unneeded super page shattering.
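The decision described above can be sketched as a small classification step. All names here are hypothetical, and the per-page "special" flags are passed in as a plain array to keep the example self-contained. It is not the epte_get_entry_emt() implementation, which looks the state up per page.

```c
#include <stdbool.h>
#include <stddef.h>

/* How the 4K pages covered by a (super)page are marked. */
enum special_state {
    SPECIAL_NONE,   /* no page is special */
    SPECIAL_ALL,    /* every page is special */
    SPECIAL_MIXED,  /* some are special, some are not */
};

/* is_special must hold (1 << order) entries, one per 4K page. */
static enum special_state classify_pages(const bool *is_special,
                                         unsigned int order)
{
    size_t nr = (size_t)1 << order;
    bool first = is_special[0];

    for (size_t i = 1; i < nr; i++)
        if (is_special[i] != first)
            return SPECIAL_MIXED;

    return first ? SPECIAL_ALL : SPECIAL_NONE;
}

/*
 * Only a superpage with a mix of special and non-special pages needs to
 * be split; an all-special superpage can keep a single attribute.
 */
static bool needs_split(const bool *is_special, unsigned int order)
{
    return order > 0 && classify_pages(is_special, order) == SPECIAL_MIXED;
}
```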
When using an APIC do not set nr_irqs based on a factor of nr_irqs_gsi
(currently x8), and instead do so exclusively based on the amount of
available vectors on the system.
There's no point in setting nr_irqs to a value higher than the
available set of vectors, as vector allocation will fail
Using nr_irqs minus nr_irqs_gsi is misleading, as GSI interrupts are
not allocated unless requested by the hardware domain, so a hardware
domain could not use any GSI (or just one for the ACPI SCI), and hence
(almost) all nr_irqs will be available for MSI(-X) usage.
No functional difference, just
Current code to calculate nr_irqs assumes the APIC destination mode to
be physical, so all vectors on each possible CPU are available for use
by a different interrupt source. This is not true when using Logical
(Cluster) destination mode, where CPUs in the same cluster share the
vector space.
Fix
Logic in ioapic_init() that sets the number of available vectors for
external interrupts requires knowing the x2APIC Destination Mode. As
such move the call after x2APIC BSP setup.
Do it as part of init_irq_data(), which is called just after x2APIC
BSP init and also makes use of nr_irqs itself.
be set from Kconfig, and
will default to phys in order to reliably boot on all boxes.
Further patches are a bit of cleanup related to the interrupt limits
reported at boot, and making those values more realistic.
Thanks, Roger.
Roger Pau Monne (6):
x86/Kconfig: add option for default x2APIC
Using cluster mode by default greatly limits the amount of vectors
available, as then vector space is shared amongst all the CPUs in the
logical cluster.
This can lead to vector shortage issues on boxes with not a huge
amount of CPUs but with a non-trivial amount of devices. There are
reports of
Allow setting the default x2APIC destination mode from Kconfig to
Physical.
Note the default destination mode is still Logical (Cluster) mode.
Signed-off-by: Roger Pau Monné
---
Changes since v1:
- Use a boolean rather than a choice.
- Expand to X2APIC_PHYSICAL.
---
TBH I wasn't sure whether
Take the opportunity to convert the variable to read-only after init.
No functional change intended.
Signed-off-by: Roger Pau Monné
---
Changes since v1:
- Fix help message about rounded boundary, do not round up the
default value (will be done at runtime).
- Use kiB instead of KB.
---
-b173-e497-5b8a-5e0bb6d94...@suse.com/
Thanks, Roger.
Roger Pau Monne (2):
console/serial: set the default transmit buffer size in Kconfig
console/serial: bump buffer from 16K to 32K
xen/drivers/char/Kconfig | 10 ++
xen/drivers/char/serial.c | 2 +-
2 files changed, 11 insertions