Re: [PATCH 1/3] xen/vioapic: add support for the extended destination ID field
On Wed, 2022-01-26 at 13:47 +0100, Jan Beulich wrote: > On 25.01.2022 16:13, Roger Pau Monné wrote: > > On Mon, Jan 24, 2022 at 02:20:47PM +0100, Jan Beulich wrote: > > > On 20.01.2022 16:23, Roger Pau Monne wrote: > > > > Such field uses bits 55:48, but for the purposes the register > > > > will be > > > > used use bits 55:49 instead. Bit 48 is used to signal an RTE > > > > entry is > > > > in remappable format which is not supported by the vIO-APIC. > > > > > > Neither here nor in the cover letter you point at a formal > > > specification > > > of this mode of operation. > > > > I'm not aware of any formal specification of this mode, apart from > > the > > work done to introduce support in Linux and QEMU: > > > > https://lore.kernel.org/all/20201009104616.1314746-1-dw...@infradead.org/ > > > > https://git.qemu.org/?p=qemu.git;a=commitdiff;h=c1bb5418e > > > > > > Adding David in case there's some kind of specification somewhere > > I'm > > not aware of. > > > > > What I'm aware of are vague indications of > > > this mode's existence in some of Intel's chipset data sheets. Yet that > > > leaves open, for example, whether indeed bit 48 cannot be used here. > > > > Bit 48 cannot be used because it's already used to signal an RTE is in > > remappable format. We still want to differentiate an RTE entry in > > remappable format, as it should be possible to expose both the > > extended ID support and an emulated IOMMU. > > I think I did say so on irc already: There's not really a problem like > this. For one I wouldn't expect an OS to use this extended ID at the > same time as having an IOMMU to deal with the width restriction. And > then, even if they wanted to use both at the same time, they'd simply > need to care about the specific meaning of this bit themselves: When > the bit is set, it would be unavoidable to have it (perhaps identity-) > remapped by the IOMMU. As you later said, it's too late for bikeshedding that decision. But I stand by it regardless of the time. Even by the time *I* made that choice, it was long since established by Intel. You could make the same argument about their original hardware design, that the format bit is pointless and that if an OS enables interrupt remapping, it knows full well when it's going to use it. It can even be configured in the IOMMU per PCI function. There is benefit to having a very clear and unambiguous difference between the MSI formats that isn't entirely dependent on the IOMMU being configured correctly. And in my case there is *definitely* benefit to following the precedent already set by Intel in the real hardware. For me, those outweighed the marginal additional benefit of going from 15 to 16 bits of APIC ID in the MSI. > > > > --- a/xen/arch/x86/hvm/vioapic.c > > > > +++ b/xen/arch/x86/hvm/vioapic.c > > > > @@ -412,7 +412,8 @@ static void ioapic_inj_irq( > > > > > > > > static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int > > > > pin) > > > > { > > > > -uint16_t dest = vioapic->redirtbl[pin].fields.dest_id; > > > > +uint16_t dest = vioapic->redirtbl[pin].fields.dest_id | > > > > +(vioapic->redirtbl[pin].fields.ext_dest_id << 8); > > > > > > What if an existing guest has been writing non-zero in these bits? Can > > > you really use them here without any further indication by the guest? > > > > Those bits where reserved previously, so no OS should have used them. > > There are hypervisors already in the field (QEMU/KVM and HyperV) using > > this mode. > > > > We could add a per-domain option to disable extended ID mode if we are > > really worried about OSes having used those bits for some reason. > > Generally I think previously ignored bits need to be handled with care. > If there was a specification, what is being said there might serve as > a guideline for us. Even if there was just a proper description of the > EDID field found in recent Intel chipset spec, this might already help > determining whether we want/need an enable (or disable). But there's > not even a bit announcing the functionality in, say, the version > register. It's not very verbose, but the Extended Destination ID in the I/OAPIC is at least mentioned in the RTE documentation in the 82806AA datasheet https://datasheet.octopart.com/FW82806AA-SL3VZ-Intel-datasheet-13695406.pdf See page 47, §2.4.10 "Redirection Table High DWord". The rest you have to kind of piece together from the later documentation once they actually started *using* it for IRQ remapping. I think it may also have been used on IA64? The realisation that we didn't need to have special different code to compose RTE entries for Compatibility Format vs. Remappable Format, and that we could just allow the 'upstream' APIC code to compose the MSI message and then swizzle the bits into the RTE... was rather slow to come. https://lore.kernel.org/all/20201024213535.443185-22-dw...@infradead.org/ smime.p7s Description: S/MIME cryptographic signature
Re: [PATCH 1/3] xen/vioapic: add support for the extended destination ID field
On 26.01.2022 14:52, David Woodhouse wrote: > On Tue, 2022-01-25 at 16:13 +0100, Roger Pau Monné wrote: >> On Mon, Jan 24, 2022 at 02:20:47PM +0100, Jan Beulich wrote: >>> On 20.01.2022 16:23, Roger Pau Monne wrote: Such field uses bits 55:48, but for the purposes the register will be used use bits 55:49 instead. Bit 48 is used to signal an RTE entry is in remappable format which is not supported by the vIO-APIC. >>> >>> Neither here nor in the cover letter you point at a formal specification >>> of this mode of operation. >> >> I'm not aware of any formal specification of this mode, apart from the >> work done to introduce support in Linux and QEMU: >> >> https://lore.kernel.org/all/20201009104616.1314746-1-dw...@infradead.org/ >> >> https://git.qemu.org/?p=qemu.git;a=commitdiff;h=c1bb5418e >> >> >> Adding David in case there's some kind of specification somewhere I'm >> not aware of. > > Indeed there is no formal specification that I am aware of, although > it's vaguely possible that Microsoft wrote something up when they added > it to Hyper-V. > > https://lore.kernel.org/all/20201103011136.59108-1-de...@microsoft.com/ > > I had an internal doc which looks like I can clean it up a tiny bit > and then share at http://david.woodhou.se/15-bit-msi.pdf if that helps? Thanks, this at least puts us on common grounds. >>> What I'm aware of are vague indications of >>> this mode's existence in some of Intel's chipset data sheets. Yet that >>> leaves open, for example, whether indeed bit 48 cannot be used here. >> >> Bit 48 cannot be used because it's already used to signal an RTE is in >> remappable format. We still want to differentiate an RTE entry in >> remappable format, as it should be possible to expose both the >> extended ID support and an emulated IOMMU. > > Right. I chose not to use the low bit of the existing Extended > Destination ID because that's the one Intel used to indicate Remappable > Format. This means we can still expose an IOMMU to guests and easily > distinguish between Compatibility Format and Remappable Format MSIs > just as real hardware does. Well, with the defacto standard of using only 7 of the bits we will have to follow suit of course. Jan
Re: [PATCH 1/3] xen/vioapic: add support for the extended destination ID field
On Tue, 2022-01-25 at 16:13 +0100, Roger Pau Monné wrote: > On Mon, Jan 24, 2022 at 02:20:47PM +0100, Jan Beulich wrote: > > On 20.01.2022 16:23, Roger Pau Monne wrote: > > > Such field uses bits 55:48, but for the purposes the register will be > > > used use bits 55:49 instead. Bit 48 is used to signal an RTE entry is > > > in remappable format which is not supported by the vIO-APIC. > > > > Neither here nor in the cover letter you point at a formal specification > > of this mode of operation. > > I'm not aware of any formal specification of this mode, apart from the > work done to introduce support in Linux and QEMU: > > https://lore.kernel.org/all/20201009104616.1314746-1-dw...@infradead.org/ > > https://git.qemu.org/?p=qemu.git;a=commitdiff;h=c1bb5418e > > > Adding David in case there's some kind of specification somewhere I'm > not aware of. Indeed there is no formal specification that I am aware of, although it's vaguely possible that Microsoft wrote something up when they added it to Hyper-V. https://lore.kernel.org/all/20201103011136.59108-1-de...@microsoft.com/ I had an internal doc which looks like I can clean it up a tiny bit and then share at http://david.woodhou.se/15-bit-msi.pdf if that helps? > > What I'm aware of are vague indications of > > this mode's existence in some of Intel's chipset data sheets. Yet that > > leaves open, for example, whether indeed bit 48 cannot be used here. > > Bit 48 cannot be used because it's already used to signal an RTE is in > remappable format. We still want to differentiate an RTE entry in > remappable format, as it should be possible to expose both the > extended ID support and an emulated IOMMU. Right. I chose not to use the low bit of the existing Extended Destination ID because that's the one Intel used to indicate Remappable Format. This means we can still expose an IOMMU to guests and easily distinguish between Compatibility Format and Remappable Format MSIs just as real hardware does. > > > --- a/xen/arch/x86/hvm/vioapic.c > > > +++ b/xen/arch/x86/hvm/vioapic.c > > > @@ -412,7 +412,8 @@ static void ioapic_inj_irq( > > > > > > static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int > > > pin) > > > { > > > -uint16_t dest = vioapic->redirtbl[pin].fields.dest_id; > > > +uint16_t dest = vioapic->redirtbl[pin].fields.dest_id | > > > +(vioapic->redirtbl[pin].fields.ext_dest_id << 8); > > > > What if an existing guest has been writing non-zero in these bits? Can > > you really use them here without any further indication by the guest? > > Those bits where reserved previously, so no OS should have used them. > There are hypervisors already in the field (QEMU/KVM and HyperV) using > this mode. > > We could add a per-domain option to disable extended ID mode if we are > really worried about OSes having used those bits for some reason. Note that I didn't even have to touch this part in qemu; the swizzling of those 8 'Extended Destination ID' bits from the RTE into the corresponding bits of the MSI message was already being done — without which, IRQ remapping wouldn't have worked anyway. smime.p7s Description: S/MIME cryptographic signature
Re: [PATCH 1/3] xen/vioapic: add support for the extended destination ID field
On 25.01.2022 16:13, Roger Pau Monné wrote: > On Mon, Jan 24, 2022 at 02:20:47PM +0100, Jan Beulich wrote: >> On 20.01.2022 16:23, Roger Pau Monne wrote: >>> Such field uses bits 55:48, but for the purposes the register will be >>> used use bits 55:49 instead. Bit 48 is used to signal an RTE entry is >>> in remappable format which is not supported by the vIO-APIC. >> >> Neither here nor in the cover letter you point at a formal specification >> of this mode of operation. > > I'm not aware of any formal specification of this mode, apart from the > work done to introduce support in Linux and QEMU: > > https://lore.kernel.org/all/20201009104616.1314746-1-dw...@infradead.org/ > https://git.qemu.org/?p=qemu.git;a=commitdiff;h=c1bb5418e > > Adding David in case there's some kind of specification somewhere I'm > not aware of. > >> What I'm aware of are vague indications of >> this mode's existence in some of Intel's chipset data sheets. Yet that >> leaves open, for example, whether indeed bit 48 cannot be used here. > > Bit 48 cannot be used because it's already used to signal an RTE is in > remappable format. We still want to differentiate an RTE entry in > remappable format, as it should be possible to expose both the > extended ID support and an emulated IOMMU. I think I did say so on irc already: There's not really a problem like this. For one I wouldn't expect an OS to use this extended ID at the same time as having an IOMMU to deal with the width restriction. And then, even if they wanted to use both at the same time, they'd simply need to care about the specific meaning of this bit themselves: When the bit is set, it would be unavoidable to have it (perhaps identity-) remapped by the IOMMU. >>> --- a/xen/arch/x86/hvm/vioapic.c >>> +++ b/xen/arch/x86/hvm/vioapic.c >>> @@ -412,7 +412,8 @@ static void ioapic_inj_irq( >>> >>> static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin) >>> { >>> -uint16_t dest = vioapic->redirtbl[pin].fields.dest_id; >>> +uint16_t dest = vioapic->redirtbl[pin].fields.dest_id | >>> +(vioapic->redirtbl[pin].fields.ext_dest_id << 8); >> >> What if an existing guest has been writing non-zero in these bits? Can >> you really use them here without any further indication by the guest? > > Those bits where reserved previously, so no OS should have used them. > There are hypervisors already in the field (QEMU/KVM and HyperV) using > this mode. > > We could add a per-domain option to disable extended ID mode if we are > really worried about OSes having used those bits for some reason. Generally I think previously ignored bits need to be handled with care. If there was a specification, what is being said there might serve as a guideline for us. Even if there was just a proper description of the EDID field found in recent Intel chipset spec, this might already help determining whether we want/need an enable (or disable). But there's not even a bit announcing the functionality in, say, the version register. Jan
Re: [PATCH 1/3] xen/vioapic: add support for the extended destination ID field
On Mon, Jan 24, 2022 at 02:20:47PM +0100, Jan Beulich wrote: > On 20.01.2022 16:23, Roger Pau Monne wrote: > > Such field uses bits 55:48, but for the purposes the register will be > > used use bits 55:49 instead. Bit 48 is used to signal an RTE entry is > > in remappable format which is not supported by the vIO-APIC. > > Neither here nor in the cover letter you point at a formal specification > of this mode of operation. I'm not aware of any formal specification of this mode, apart from the work done to introduce support in Linux and QEMU: https://lore.kernel.org/all/20201009104616.1314746-1-dw...@infradead.org/ https://git.qemu.org/?p=qemu.git;a=commitdiff;h=c1bb5418e Adding David in case there's some kind of specification somewhere I'm not aware of. > What I'm aware of are vague indications of > this mode's existence in some of Intel's chipset data sheets. Yet that > leaves open, for example, whether indeed bit 48 cannot be used here. Bit 48 cannot be used because it's already used to signal an RTE is in remappable format. We still want to differentiate an RTE entry in remappable format, as it should be possible to expose both the extended ID support and an emulated IOMMU. > > --- a/xen/arch/x86/hvm/vioapic.c > > +++ b/xen/arch/x86/hvm/vioapic.c > > @@ -412,7 +412,8 @@ static void ioapic_inj_irq( > > > > static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin) > > { > > -uint16_t dest = vioapic->redirtbl[pin].fields.dest_id; > > +uint16_t dest = vioapic->redirtbl[pin].fields.dest_id | > > +(vioapic->redirtbl[pin].fields.ext_dest_id << 8); > > What if an existing guest has been writing non-zero in these bits? Can > you really use them here without any further indication by the guest? Those bits where reserved previously, so no OS should have used them. There are hypervisors already in the field (QEMU/KVM and HyperV) using this mode. We could add a per-domain option to disable extended ID mode if we are really worried about OSes having used those bits for some reason. Thanks, Roger.
Re: [PATCH 1/3] xen/vioapic: add support for the extended destination ID field
On 20.01.2022 16:23, Roger Pau Monne wrote: > Such field uses bits 55:48, but for the purposes the register will be > used use bits 55:49 instead. Bit 48 is used to signal an RTE entry is > in remappable format which is not supported by the vIO-APIC. Neither here nor in the cover letter you point at a formal specification of this mode of operation. What I'm aware of are vague indications of this mode's existence in some of Intel's chipset data sheets. Yet that leaves open, for example, whether indeed bit 48 cannot be used here. > --- a/xen/arch/x86/hvm/vioapic.c > +++ b/xen/arch/x86/hvm/vioapic.c > @@ -412,7 +412,8 @@ static void ioapic_inj_irq( > > static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin) > { > -uint16_t dest = vioapic->redirtbl[pin].fields.dest_id; > +uint16_t dest = vioapic->redirtbl[pin].fields.dest_id | > +(vioapic->redirtbl[pin].fields.ext_dest_id << 8); What if an existing guest has been writing non-zero in these bits? Can you really use them here without any further indication by the guest? Jan
[PATCH 1/3] xen/vioapic: add support for the extended destination ID field
Such field uses bits 55:48, but for the purposes the register will be used use bits 55:49 instead. Bit 48 is used to signal an RTE entry is in remappable format which is not supported by the vIO-APIC. Use the extended destination ID to store the high bits from the destination ID, thus expanding the size of the destination ID field to 15 bits, allowing an IO-APIC to target APIC IDs up to 32768. Note this is already supported by QEMU/KVM and HyperV. Signed-off-by: Roger Pau Monné --- xen/arch/x86/hvm/vioapic.c | 3 ++- xen/include/public/arch-x86/hvm/save.h | 4 +++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c index 553c0f76ef..1f2305c232 100644 --- a/xen/arch/x86/hvm/vioapic.c +++ b/xen/arch/x86/hvm/vioapic.c @@ -412,7 +412,8 @@ static void ioapic_inj_irq( static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin) { -uint16_t dest = vioapic->redirtbl[pin].fields.dest_id; +uint16_t dest = vioapic->redirtbl[pin].fields.dest_id | +(vioapic->redirtbl[pin].fields.ext_dest_id << 8); uint8_t dest_mode = vioapic->redirtbl[pin].fields.dest_mode; uint8_t delivery_mode = vioapic->redirtbl[pin].fields.delivery_mode; uint8_t vector = vioapic->redirtbl[pin].fields.vector; diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h index 773a380bc2..14a5d94588 100644 --- a/xen/include/public/arch-x86/hvm/save.h +++ b/xen/include/public/arch-x86/hvm/save.h @@ -376,7 +376,9 @@ union vioapic_redir_entry uint8_t trig_mode:1; uint8_t mask:1; uint8_t reserve:7; -uint8_t reserved[4]; +uint8_t reserved[3]; +uint8_t :1; +uint8_t ext_dest_id:7; uint8_t dest_id; } fields; }; -- 2.34.1