Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Tue, Jan 15, 2019 at 03:13:14PM +0800, Yu Zhang wrote:
> On Fri, Dec 28, 2018 at 11:29:41PM -0200, Eduardo Habkost wrote:
> > On Fri, Dec 28, 2018 at 10:32:59AM +0800, Yu Zhang wrote:
> > > On Thu, Dec 27, 2018 at 01:14:11PM -0200, Eduardo Habkost wrote:
> > > > On Wed, Dec 26, 2018 at 01:30:00PM +0800, Yu Zhang wrote:
> > > > > On Tue, Dec 25, 2018 at 11:56:19AM -0500, Michael S. Tsirkin wrote:
> > > > > > On Sat, Dec 22, 2018 at 09:11:26AM +0800, Yu Zhang wrote:
> > > > > > > On Fri, Dec 21, 2018 at 02:02:28PM -0500, Michael S. Tsirkin wrote:
> > > > > > > > On Sat, Dec 22, 2018 at 01:37:58AM +0800, Yu Zhang wrote:
> > > > > > > > > On Fri, Dec 21, 2018 at 12:04:49PM -0500, Michael S. Tsirkin wrote:
> > > > > > > > > > On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote:
> > > > > > > > > > > Well, my understanding of the vt-d spec is that the
> > > > > > > > > > > address limitation in DMAR refers to the same concept as
> > > > > > > > > > > CPUID.MAXPHYSADDR. I do not think there's any difference
> > > > > > > > > > > in the native scenario. :)
> > > > > > > > > >
> > > > > > > > > > I think native machines exist on which the two values are
> > > > > > > > > > different. Is that true?
> > > > > > > > >
> > > > > > > > > I think the answer is no. My understanding is that HAW (host
> > > > > > > > > address width) is the maximum physical address width a CPU
> > > > > > > > > can detect (by CPUID.0x80000008).
> > > > > > > > >
> > > > > > > > > I agree there are some addresses the CPU does not touch, but
> > > > > > > > > they are still in the physical address space, and there's
> > > > > > > > > only one physical address space...
> > > > > > > > >
> > > > > > > > > B.R.
> > > > > > > > > Yu
> > > > > > > >
> > > > > > > > Ouch, I thought we were talking about the virtual address size.
> > > > > > > > I think I did have a box where VTD's virtual address size was > > > > > > > > smaller than CPU's. > > > > > > > > For physical one - we just need to make it as big as max > > > > > > > > supported > > > > > > > > memory right? > > > > > > > > > > > > > > Well, my understanding of the physical one is the maximum > > > > > > > physical address > > > > > > > width. Sorry, this explain seems nonsense... I mean, it's not > > > > > > > just about > > > > > > > the max supported memory, but also covers MMIO. It shall be > > > > > > > detectable > > > > > > > from cpuid, or ACPI's DMAR table, instead of calculated by the > > > > > > > max memory > > > > > > > size. One common usage of this value is to tell the paging > > > > > > > structure entries( > > > > > > > CPU's or IOMMU's) which bits shall be reserved. There are also > > > > > > > some registers > > > > > > > e.g. apic base reg etc, whose contents are physical addresses, > > > > > > > therefore also > > > > > > > need to follow the similar requirement for the reserved bits. > > > > > > > > > > > > > > So I think the correct direction might be to define this property > > > > > > > in the > > > > > > > machine status level, instead of the CPU level. Is this > > > > > > > reasonable to you? > > > > > > > > > > > > At that level yes. But isn't this already specified by > > > > > > "pci-hole64-end"? > > > > > > > > > > But this value is set by guest firmware? Will PCI hotplug change this > > > > > address? > > > > > > > > > > @Eduardo, do you have any plan to calculate the phys-bits by > > > > > "pci-hole64-end"? > > > > > Or introduce another property, say "max-phys-bits" in machine status? > > > > > > > > I agree it may make sense to make the machine code control > > > > phys-bits instead of the CPU object. A machine property sounds > > > > like the simplest solution. 
> > > > > > > > But I don't think we can have a meaningful discussion about > > > > implementation if we don't agree about the command-line > > > > interface. We must decide what will happen to the CPU and iommu > > > > physical address width in cases like: > > > > > > Thanks, Eduardo. > > > > > > What about we just use "-machine phys-bits=52", and remove the > > > "phys-bits" from CPU parameter? > > > > Maybe we can deprecate it, but we can't remove it immediately. > > We still need to decide what to do on the cases below, while the > > option is still available. > > I saw the ACPI DMAR is ininitialized in acpi_build(), which is called > by pc_machine_done(). I guess this is done after the initialization of > vCPU and vIOMMU. > > So I am wondering, instead of moving "phys-bits" from X86CPU into the > MachineState, maybe we could: > > 1> Define a "phys_bits" in MachineState or PCMachineState(not sure which > one is more suitable). > > 2> Set ms->phys_bits in x86_cpu_realizefn(). > > 3> Since DMAR is created after vCPU creation, we can build DMAR table > with ms->phys_bits. > > 4> Also, we can reset the hardware address width for
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Fri, Dec 28, 2018 at 11:29:41PM -0200, Eduardo Habkost wrote: > On Fri, Dec 28, 2018 at 10:32:59AM +0800, Yu Zhang wrote: > > On Thu, Dec 27, 2018 at 01:14:11PM -0200, Eduardo Habkost wrote: > > > On Wed, Dec 26, 2018 at 01:30:00PM +0800, Yu Zhang wrote: > > > > On Tue, Dec 25, 2018 at 11:56:19AM -0500, Michael S. Tsirkin wrote: > > > > > On Sat, Dec 22, 2018 at 09:11:26AM +0800, Yu Zhang wrote: > > > > > > On Fri, Dec 21, 2018 at 02:02:28PM -0500, Michael S. Tsirkin wrote: > > > > > > > On Sat, Dec 22, 2018 at 01:37:58AM +0800, Yu Zhang wrote: > > > > > > > > On Fri, Dec 21, 2018 at 12:04:49PM -0500, Michael S. Tsirkin > > > > > > > > wrote: > > > > > > > > > On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote: > > > > > > > > > > Well, my understanding of the vt-d spec is that the address > > > > > > > > > > limitation in > > > > > > > > > > DMAR are referring to the same concept of > > > > > > > > > > CPUID.MAXPHYSADDR. I do not think > > > > > > > > > > there's any different in the native scenario. :) > > > > > > > > > > > > > > > > > > I think native machines exist on which the two values are > > > > > > > > > different. > > > > > > > > > Is that true? > > > > > > > > > > > > > > > > I think the answer is not. My understanding is that HAW(host > > > > > > > > address wdith) is > > > > > > > > the maximum physical address width a CPU can detects(by > > > > > > > > cpuid.0x8008). > > > > > > > > > > > > > > > > I agree there are some addresses the CPU does not touch, but > > > > > > > > they are still in > > > > > > > > the physical address space, and there's only one physical > > > > > > > > address space... > > > > > > > > > > > > > > > > B.R. > > > > > > > > Yu > > > > > > > > > > > > > > Ouch I thought we are talking about the virtual address size. > > > > > > > I think I did have a box where VTD's virtual address size was > > > > > > > smaller than CPU's. 
> > > > > > > For physical one - we just need to make it as big as max supported > > > > > > > memory right? > > > > > > > > > > > > Well, my understanding of the physical one is the maximum physical > > > > > > address > > > > > > width. Sorry, this explain seems nonsense... I mean, it's not just > > > > > > about > > > > > > the max supported memory, but also covers MMIO. It shall be > > > > > > detectable > > > > > > from cpuid, or ACPI's DMAR table, instead of calculated by the max > > > > > > memory > > > > > > size. One common usage of this value is to tell the paging > > > > > > structure entries( > > > > > > CPU's or IOMMU's) which bits shall be reserved. There are also some > > > > > > registers > > > > > > e.g. apic base reg etc, whose contents are physical addresses, > > > > > > therefore also > > > > > > need to follow the similar requirement for the reserved bits. > > > > > > > > > > > > So I think the correct direction might be to define this property > > > > > > in the > > > > > > machine status level, instead of the CPU level. Is this reasonable > > > > > > to you? > > > > > > > > > > At that level yes. But isn't this already specified by > > > > > "pci-hole64-end"? > > > > > > > > But this value is set by guest firmware? Will PCI hotplug change this > > > > address? > > > > > > > > @Eduardo, do you have any plan to calculate the phys-bits by > > > > "pci-hole64-end"? > > > > Or introduce another property, say "max-phys-bits" in machine status? > > > > > > I agree it may make sense to make the machine code control > > > phys-bits instead of the CPU object. A machine property sounds > > > like the simplest solution. > > > > > > But I don't think we can have a meaningful discussion about > > > implementation if we don't agree about the command-line > > > interface. We must decide what will happen to the CPU and iommu > > > physical address width in cases like: > > > > Thanks, Eduardo. 
> > What about we just use "-machine phys-bits=52", and remove the
> > "phys-bits" from the CPU parameters?
>
> Maybe we can deprecate it, but we can't remove it immediately.
> We still need to decide what to do in the cases below, while the
> option is still available.

I saw the ACPI DMAR is initialized in acpi_build(), which is called
by pc_machine_done(). I guess this is done after the initialization of
vCPU and vIOMMU.

So I am wondering, instead of moving "phys-bits" from X86CPU into the
MachineState, maybe we could:

1> Define a "phys_bits" in MachineState or PCMachineState (not sure which
one is more suitable).

2> Set ms->phys_bits in x86_cpu_realizefn().

3> Since DMAR is created after vCPU creation, we can build the DMAR table
with ms->phys_bits.

4> Also, we can reset the hardware address width for vIOMMU (and the
vtd_paging_entry_rsvd_field array) in pc_machine_done(), based on the value
of ms->phys_bits, or from the ACPI DMAR table (from the spec's point of
view, the address width limitation of the IOMMU shall come from DMAR, yet I
have not figured out any simple approach to probe the ACPI property). This
way,
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Fri, Dec 28, 2018 at 10:32:59AM +0800, Yu Zhang wrote: > On Thu, Dec 27, 2018 at 01:14:11PM -0200, Eduardo Habkost wrote: > > On Wed, Dec 26, 2018 at 01:30:00PM +0800, Yu Zhang wrote: > > > On Tue, Dec 25, 2018 at 11:56:19AM -0500, Michael S. Tsirkin wrote: > > > > On Sat, Dec 22, 2018 at 09:11:26AM +0800, Yu Zhang wrote: > > > > > On Fri, Dec 21, 2018 at 02:02:28PM -0500, Michael S. Tsirkin wrote: > > > > > > On Sat, Dec 22, 2018 at 01:37:58AM +0800, Yu Zhang wrote: > > > > > > > On Fri, Dec 21, 2018 at 12:04:49PM -0500, Michael S. Tsirkin > > > > > > > wrote: > > > > > > > > On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote: > > > > > > > > > Well, my understanding of the vt-d spec is that the address > > > > > > > > > limitation in > > > > > > > > > DMAR are referring to the same concept of CPUID.MAXPHYSADDR. > > > > > > > > > I do not think > > > > > > > > > there's any different in the native scenario. :) > > > > > > > > > > > > > > > > I think native machines exist on which the two values are > > > > > > > > different. > > > > > > > > Is that true? > > > > > > > > > > > > > > I think the answer is not. My understanding is that HAW(host > > > > > > > address wdith) is > > > > > > > the maximum physical address width a CPU can detects(by > > > > > > > cpuid.0x8008). > > > > > > > > > > > > > > I agree there are some addresses the CPU does not touch, but they > > > > > > > are still in > > > > > > > the physical address space, and there's only one physical address > > > > > > > space... > > > > > > > > > > > > > > B.R. > > > > > > > Yu > > > > > > > > > > > > Ouch I thought we are talking about the virtual address size. > > > > > > I think I did have a box where VTD's virtual address size was > > > > > > smaller than CPU's. > > > > > > For physical one - we just need to make it as big as max supported > > > > > > memory right? > > > > > > > > > > Well, my understanding of the physical one is the maximum physical > > > > > address > > > > > width. 
Sorry, this explain seems nonsense... I mean, it's not just > > > > > about > > > > > the max supported memory, but also covers MMIO. It shall be detectable > > > > > from cpuid, or ACPI's DMAR table, instead of calculated by the max > > > > > memory > > > > > size. One common usage of this value is to tell the paging structure > > > > > entries( > > > > > CPU's or IOMMU's) which bits shall be reserved. There are also some > > > > > registers > > > > > e.g. apic base reg etc, whose contents are physical addresses, > > > > > therefore also > > > > > need to follow the similar requirement for the reserved bits. > > > > > > > > > > So I think the correct direction might be to define this property in > > > > > the > > > > > machine status level, instead of the CPU level. Is this reasonable to > > > > > you? > > > > > > > > At that level yes. But isn't this already specified by "pci-hole64-end"? > > > > > > But this value is set by guest firmware? Will PCI hotplug change this > > > address? > > > > > > @Eduardo, do you have any plan to calculate the phys-bits by > > > "pci-hole64-end"? > > > Or introduce another property, say "max-phys-bits" in machine status? > > > > I agree it may make sense to make the machine code control > > phys-bits instead of the CPU object. A machine property sounds > > like the simplest solution. > > > > But I don't think we can have a meaningful discussion about > > implementation if we don't agree about the command-line > > interface. We must decide what will happen to the CPU and iommu > > physical address width in cases like: > > Thanks, Eduardo. > > What about we just use "-machine phys-bits=52", and remove the > "phys-bits" from CPU parameter? Maybe we can deprecate it, but we can't remove it immediately. We still need to decide what to do on the cases below, while the option is still available. 
> > > > > $QEMU -device intel-iommu > > $QEMU -cpu ...,phys-bits=50 -device intel-iommu > > $QEMU -cpu ...,host-phys-bits=on -device intel-iommu > > $QEMU -machine phys-bits=50 -device intel-iommu > > $QEMU -machine phys-bits=50 -cpu ...,phys-bits=48 -device intel-iommu > > > > -- > > Eduardo > > > > B.R. > Yu -- Eduardo
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Thu, 27 Dec 2018 12:54:02 -0200 Eduardo Habkost wrote: > On Fri, Dec 21, 2018 at 03:13:25PM +0100, Igor Mammedov wrote: > > On Thu, 20 Dec 2018 19:18:01 -0200 > > Eduardo Habkost wrote: > > > > > On Wed, Dec 19, 2018 at 11:40:37AM +0100, Igor Mammedov wrote: > > > > On Wed, 19 Dec 2018 10:57:17 +0800 > > > > Yu Zhang wrote: > > > > > > > > > On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > > > > > > On Tue, 18 Dec 2018 17:27:23 +0800 > > > > > > Yu Zhang wrote: > > > > > > > > > > > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > > > > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > > > > > > Yu Zhang wrote: > > > > > > > > > > > > > > > > > Currently, vIOMMU is using the value of IOVA address width, > > > > > > > > > instead of > > > > > > > > > the host address width(HAW) to calculate the number of > > > > > > > > > reserved bits in > > > > > > > > > data structures such as root entries, context entries, and > > > > > > > > > entries of > > > > > > > > > DMA paging structures etc. > > > > > > > > > > > > > > > > > > However values of IOVA address width and of the HAW may not > > > > > > > > > equal. For > > > > > > > > > example, a 48-bit IOVA can only be mapped to host addresses > > > > > > > > > no wider than > > > > > > > > > 46 bits. Using 48, instead of 46 to calculate the reserved > > > > > > > > > bit may result > > > > > > > > > in an invalid IOVA being accepted. > > > > > > > > > > > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > > > > > > IntelIOMMUState, > > > > > > > > > whose value is initialized based on the maximum physical > > > > > > > > > address set to > > > > > > > > > guest CPU. > > > > > > > > > > > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are > > > > > > > > > renamed > > > > > > > > > to clarify. > > > > > > > > > > > > > > > > > > Signed-off-by: Yu Zhang > > > > > > > > > Reviewed-by: Peter Xu > > > > > > > > > --- > > > > > > > > [...] 
> > > > > > > > > > > > > > > > > @@ -3100,6 +3104,8 @@ static void > > > > > > > > > vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier > > > > > > > > > *n) > > > > > > > > > static void vtd_init(IntelIOMMUState *s) > > > > > > > > > { > > > > > > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > > > > > > +CPUState *cs = first_cpu; > > > > > > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > > > > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > > > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > > > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState > > > > > > > > > *s) > > > > > > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > > > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > > > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > > > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > > > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > > > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > > > > > > } > > > > > > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > > > > > > +s->haw_bits = cpu->phys_bits; > > > > > > > > Is it possible to avoid accessing CPU fields directly or cpu > > > > > > > > altogether > > > > > > > > and set phys_bits when iommu is created? > > > > > > > > > > > > > > Thanks for your comments, Igor. > > > > > > > > > > > > > > Well, I guess you prefer not to query the CPU capabilities while > > > > > > > deciding > > > > > > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > > > > > > > > > > > Here the hardware address width in vt-d, and the one in > > > > > > > cpuid.MAXPHYSADDR > > > > > > > are referring to the same concept. In VM, both are the maximum > > > > > > > guest physical > > > > > > > address width. If we do not check the CPU field here, we will > > > > > > > still have to > > > > > > > check the CPU field in other places such as build_dmar_q35(), and > > > > > > > reset the > > > > > > > s->haw_bits again. 
> > > > > > > Is this explanation convincing enough? :)
> > > > > >
> > > > > > current build_dmar_q35() doesn't do it; it's all new code in this
> > > > > > series that contains unacceptable direct access from one device
> > > > > > (iommu) to another (cpu). The proper way would be for the owner of
> > > > > > the iommu to fish the limits from somewhere and set the values
> > > > > > during iommu creation.
> > > > >
> > > > > Well, current build_dmar_q35() doesn't do it because it is using the
> > > > > incorrect value. :) According to the spec, the host address width is
> > > > > the maximum physical address width, yet the current implementation
> > > > > is using the DMA address width. For me, this is not only wrong, but
> > > > > also insecure. On this point, I think we all agree
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Thu, Dec 27, 2018 at 01:14:11PM -0200, Eduardo Habkost wrote: > On Wed, Dec 26, 2018 at 01:30:00PM +0800, Yu Zhang wrote: > > On Tue, Dec 25, 2018 at 11:56:19AM -0500, Michael S. Tsirkin wrote: > > > On Sat, Dec 22, 2018 at 09:11:26AM +0800, Yu Zhang wrote: > > > > On Fri, Dec 21, 2018 at 02:02:28PM -0500, Michael S. Tsirkin wrote: > > > > > On Sat, Dec 22, 2018 at 01:37:58AM +0800, Yu Zhang wrote: > > > > > > On Fri, Dec 21, 2018 at 12:04:49PM -0500, Michael S. Tsirkin wrote: > > > > > > > On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote: > > > > > > > > Well, my understanding of the vt-d spec is that the address > > > > > > > > limitation in > > > > > > > > DMAR are referring to the same concept of CPUID.MAXPHYSADDR. I > > > > > > > > do not think > > > > > > > > there's any different in the native scenario. :) > > > > > > > > > > > > > > I think native machines exist on which the two values are > > > > > > > different. > > > > > > > Is that true? > > > > > > > > > > > > I think the answer is not. My understanding is that HAW(host > > > > > > address wdith) is > > > > > > the maximum physical address width a CPU can detects(by > > > > > > cpuid.0x8008). > > > > > > > > > > > > I agree there are some addresses the CPU does not touch, but they > > > > > > are still in > > > > > > the physical address space, and there's only one physical address > > > > > > space... > > > > > > > > > > > > B.R. > > > > > > Yu > > > > > > > > > > Ouch I thought we are talking about the virtual address size. > > > > > I think I did have a box where VTD's virtual address size was > > > > > smaller than CPU's. > > > > > For physical one - we just need to make it as big as max supported > > > > > memory right? > > > > > > > > Well, my understanding of the physical one is the maximum physical > > > > address > > > > width. Sorry, this explain seems nonsense... I mean, it's not just about > > > > the max supported memory, but also covers MMIO. 
It shall be detectable > > > > from cpuid, or ACPI's DMAR table, instead of calculated by the max > > > > memory > > > > size. One common usage of this value is to tell the paging structure > > > > entries( > > > > CPU's or IOMMU's) which bits shall be reserved. There are also some > > > > registers > > > > e.g. apic base reg etc, whose contents are physical addresses, > > > > therefore also > > > > need to follow the similar requirement for the reserved bits. > > > > > > > > So I think the correct direction might be to define this property in the > > > > machine status level, instead of the CPU level. Is this reasonable to > > > > you? > > > > > > At that level yes. But isn't this already specified by "pci-hole64-end"? > > > > But this value is set by guest firmware? Will PCI hotplug change this > > address? > > > > @Eduardo, do you have any plan to calculate the phys-bits by > > "pci-hole64-end"? > > Or introduce another property, say "max-phys-bits" in machine status? > > I agree it may make sense to make the machine code control > phys-bits instead of the CPU object. A machine property sounds > like the simplest solution. > > But I don't think we can have a meaningful discussion about > implementation if we don't agree about the command-line > interface. We must decide what will happen to the CPU and iommu > physical address width in cases like: Thanks, Eduardo. What about we just use "-machine phys-bits=52", and remove the "phys-bits" from CPU parameter? > > $QEMU -device intel-iommu > $QEMU -cpu ...,phys-bits=50 -device intel-iommu > $QEMU -cpu ...,host-phys-bits=on -device intel-iommu > $QEMU -machine phys-bits=50 -device intel-iommu > $QEMU -machine phys-bits=50 -cpu ...,phys-bits=48 -device intel-iommu > > -- > Eduardo > B.R. Yu
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Wed, Dec 26, 2018 at 01:30:00PM +0800, Yu Zhang wrote: > On Tue, Dec 25, 2018 at 11:56:19AM -0500, Michael S. Tsirkin wrote: > > On Sat, Dec 22, 2018 at 09:11:26AM +0800, Yu Zhang wrote: > > > On Fri, Dec 21, 2018 at 02:02:28PM -0500, Michael S. Tsirkin wrote: > > > > On Sat, Dec 22, 2018 at 01:37:58AM +0800, Yu Zhang wrote: > > > > > On Fri, Dec 21, 2018 at 12:04:49PM -0500, Michael S. Tsirkin wrote: > > > > > > On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote: > > > > > > > Well, my understanding of the vt-d spec is that the address > > > > > > > limitation in > > > > > > > DMAR are referring to the same concept of CPUID.MAXPHYSADDR. I do > > > > > > > not think > > > > > > > there's any different in the native scenario. :) > > > > > > > > > > > > I think native machines exist on which the two values are different. > > > > > > Is that true? > > > > > > > > > > I think the answer is not. My understanding is that HAW(host address > > > > > wdith) is > > > > > the maximum physical address width a CPU can detects(by > > > > > cpuid.0x8008). > > > > > > > > > > I agree there are some addresses the CPU does not touch, but they are > > > > > still in > > > > > the physical address space, and there's only one physical address > > > > > space... > > > > > > > > > > B.R. > > > > > Yu > > > > > > > > Ouch I thought we are talking about the virtual address size. > > > > I think I did have a box where VTD's virtual address size was > > > > smaller than CPU's. > > > > For physical one - we just need to make it as big as max supported > > > > memory right? > > > > > > Well, my understanding of the physical one is the maximum physical address > > > width. Sorry, this explain seems nonsense... I mean, it's not just about > > > the max supported memory, but also covers MMIO. It shall be detectable > > > from cpuid, or ACPI's DMAR table, instead of calculated by the max memory > > > size. 
One common usage of this value is to tell the paging structure > > > entries( > > > CPU's or IOMMU's) which bits shall be reserved. There are also some > > > registers > > > e.g. apic base reg etc, whose contents are physical addresses, therefore > > > also > > > need to follow the similar requirement for the reserved bits. > > > > > > So I think the correct direction might be to define this property in the > > > machine status level, instead of the CPU level. Is this reasonable to you? > > > > At that level yes. But isn't this already specified by "pci-hole64-end"? > > But this value is set by guest firmware? Will PCI hotplug change this address? > > @Eduardo, do you have any plan to calculate the phys-bits by "pci-hole64-end"? > Or introduce another property, say "max-phys-bits" in machine status? I agree it may make sense to make the machine code control phys-bits instead of the CPU object. A machine property sounds like the simplest solution. But I don't think we can have a meaningful discussion about implementation if we don't agree about the command-line interface. We must decide what will happen to the CPU and iommu physical address width in cases like: $QEMU -device intel-iommu $QEMU -cpu ...,phys-bits=50 -device intel-iommu $QEMU -cpu ...,host-phys-bits=on -device intel-iommu $QEMU -machine phys-bits=50 -device intel-iommu $QEMU -machine phys-bits=50 -cpu ...,phys-bits=48 -device intel-iommu -- Eduardo
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Fri, Dec 21, 2018 at 03:13:25PM +0100, Igor Mammedov wrote: > On Thu, 20 Dec 2018 19:18:01 -0200 > Eduardo Habkost wrote: > > > On Wed, Dec 19, 2018 at 11:40:37AM +0100, Igor Mammedov wrote: > > > On Wed, 19 Dec 2018 10:57:17 +0800 > > > Yu Zhang wrote: > > > > > > > On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > > > > > On Tue, 18 Dec 2018 17:27:23 +0800 > > > > > Yu Zhang wrote: > > > > > > > > > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > > > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > > > > > Yu Zhang wrote: > > > > > > > > > > > > > > > Currently, vIOMMU is using the value of IOVA address width, > > > > > > > > instead of > > > > > > > > the host address width(HAW) to calculate the number of reserved > > > > > > > > bits in > > > > > > > > data structures such as root entries, context entries, and > > > > > > > > entries of > > > > > > > > DMA paging structures etc. > > > > > > > > > > > > > > > > However values of IOVA address width and of the HAW may not > > > > > > > > equal. For > > > > > > > > example, a 48-bit IOVA can only be mapped to host addresses no > > > > > > > > wider than > > > > > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit > > > > > > > > may result > > > > > > > > in an invalid IOVA being accepted. > > > > > > > > > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > > > > > IntelIOMMUState, > > > > > > > > whose value is initialized based on the maximum physical > > > > > > > > address set to > > > > > > > > guest CPU. > > > > > > > > > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > > > > > to clarify. > > > > > > > > > > > > > > > > Signed-off-by: Yu Zhang > > > > > > > > Reviewed-by: Peter Xu > > > > > > > > --- > > > > > > > [...] 
> > > > > > > > > > > > > > > @@ -3100,6 +3104,8 @@ static void > > > > > > > > vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > > > > > > > > static void vtd_init(IntelIOMMUState *s) > > > > > > > > { > > > > > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > > > > > +CPUState *cs = first_cpu; > > > > > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > > > > > } > > > > > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > > > > > +s->haw_bits = cpu->phys_bits; > > > > > > > Is it possible to avoid accessing CPU fields directly or cpu > > > > > > > altogether > > > > > > > and set phys_bits when iommu is created? > > > > > > > > > > > > Thanks for your comments, Igor. > > > > > > > > > > > > Well, I guess you prefer not to query the CPU capabilities while > > > > > > deciding > > > > > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > > > > > > > > > Here the hardware address width in vt-d, and the one in > > > > > > cpuid.MAXPHYSADDR > > > > > > are referring to the same concept. In VM, both are the maximum > > > > > > guest physical > > > > > > address width. If we do not check the CPU field here, we will still > > > > > > have to > > > > > > check the CPU field in other places such as build_dmar_q35(), and > > > > > > reset the > > > > > > s->haw_bits again. > > > > > > > > > > > > Is this explanation convincing enough? 
:)

> > > > > current build_dmar_q35() doesn't do it; it's all new code in this
> > > > > series that contains unacceptable direct access from one device
> > > > > (iommu) to another (cpu). The proper way would be for the owner of
> > > > > the iommu to fish the limits from somewhere and set the values
> > > > > during iommu creation.

> > > > Well, current build_dmar_q35() doesn't do it because it is using the
> > > > incorrect value. :) According to the spec, the host address width is
> > > > the maximum physical address width, yet the current implementation is
> > > > using the DMA address width. For me, this is not only wrong, but also
> > > > insecure. On this point, I think we all agree it needs to be fixed.
> > > >
> > > > As to how to fix it - should we query the cpu fields? I still do not
> > > > understand why this is not acceptable. :)
> > > >
> > > > I had thought of other approaches before, yet I did not choose them:
> > > >
> > > > 1> Introduce a new parameter, say,
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Tue, Dec 25, 2018 at 11:56:19AM -0500, Michael S. Tsirkin wrote: > On Sat, Dec 22, 2018 at 09:11:26AM +0800, Yu Zhang wrote: > > On Fri, Dec 21, 2018 at 02:02:28PM -0500, Michael S. Tsirkin wrote: > > > On Sat, Dec 22, 2018 at 01:37:58AM +0800, Yu Zhang wrote: > > > > On Fri, Dec 21, 2018 at 12:04:49PM -0500, Michael S. Tsirkin wrote: > > > > > On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote: > > > > > > Well, my understanding of the vt-d spec is that the address > > > > > > limitation in > > > > > > DMAR are referring to the same concept of CPUID.MAXPHYSADDR. I do > > > > > > not think > > > > > > there's any different in the native scenario. :) > > > > > > > > > > I think native machines exist on which the two values are different. > > > > > Is that true? > > > > > > > > I think the answer is not. My understanding is that HAW(host address > > > > wdith) is > > > > the maximum physical address width a CPU can detects(by > > > > cpuid.0x8008). > > > > > > > > I agree there are some addresses the CPU does not touch, but they are > > > > still in > > > > the physical address space, and there's only one physical address > > > > space... > > > > > > > > B.R. > > > > Yu > > > > > > Ouch I thought we are talking about the virtual address size. > > > I think I did have a box where VTD's virtual address size was > > > smaller than CPU's. > > > For physical one - we just need to make it as big as max supported > > > memory right? > > > > Well, my understanding of the physical one is the maximum physical address > > width. Sorry, this explain seems nonsense... I mean, it's not just about > > the max supported memory, but also covers MMIO. It shall be detectable > > from cpuid, or ACPI's DMAR table, instead of calculated by the max memory > > size. One common usage of this value is to tell the paging structure > > entries( > > CPU's or IOMMU's) which bits shall be reserved. There are also some > > registers > > e.g. 
apic base reg etc, whose contents are physical addresses, therefore > > also > > need to follow the similar requirement for the reserved bits. > > > > So I think the correct direction might be to define this property in the > > machine status level, instead of the CPU level. Is this reasonable to you? > At that level yes. But isn't this already specified by "pci-hole64-end"? But isn't this value set by guest firmware? And will PCI hotplug change this address? @Eduardo, do you have any plan to calculate phys-bits from "pci-hole64-end"? Or to introduce another property, say "max-phys-bits", at the machine level? > > > > > > > > > -- > > > MST > > > > B.R. > > Yu > B.R. Yu
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Sat, Dec 22, 2018 at 09:11:26AM +0800, Yu Zhang wrote: > On Fri, Dec 21, 2018 at 02:02:28PM -0500, Michael S. Tsirkin wrote: > > On Sat, Dec 22, 2018 at 01:37:58AM +0800, Yu Zhang wrote: > > > On Fri, Dec 21, 2018 at 12:04:49PM -0500, Michael S. Tsirkin wrote: > > > > On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote: > > > > > Well, my understanding of the vt-d spec is that the address > > > > > limitation in > > > > > DMAR are referring to the same concept of CPUID.MAXPHYSADDR. I do not > > > > > think > > > > > there's any different in the native scenario. :) > > > > > > > > I think native machines exist on which the two values are different. > > > > Is that true? > > > > > > I think the answer is not. My understanding is that HAW(host address > > > wdith) is > > > the maximum physical address width a CPU can detects(by cpuid.0x8008). > > > > > > I agree there are some addresses the CPU does not touch, but they are > > > still in > > > the physical address space, and there's only one physical address space... > > > > > > B.R. > > > Yu > > > > Ouch I thought we are talking about the virtual address size. > > I think I did have a box where VTD's virtual address size was > > smaller than CPU's. > > For physical one - we just need to make it as big as max supported > > memory right? > > Well, my understanding of the physical one is the maximum physical address > width. Sorry, this explain seems nonsense... I mean, it's not just about > the max supported memory, but also covers MMIO. It shall be detectable > from cpuid, or ACPI's DMAR table, instead of calculated by the max memory > size. One common usage of this value is to tell the paging structure entries( > CPU's or IOMMU's) which bits shall be reserved. There are also some registers > e.g. apic base reg etc, whose contents are physical addresses, therefore also > need to follow the similar requirement for the reserved bits. 
> > So I think the correct direction might be to define this property in the > machine status level, instead of the CPU level. Is this reasonable to you? At that level yes. But isn't this already specified by "pci-hole64-end"? > > > > -- > > MST > > B.R. > Yu
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Fri, Dec 21, 2018 at 02:02:28PM -0500, Michael S. Tsirkin wrote: > On Sat, Dec 22, 2018 at 01:37:58AM +0800, Yu Zhang wrote: > > On Fri, Dec 21, 2018 at 12:04:49PM -0500, Michael S. Tsirkin wrote: > > > On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote: > > > > Well, my understanding of the vt-d spec is that the address limitation > > > > in > > > > DMAR are referring to the same concept of CPUID.MAXPHYSADDR. I do not > > > > think > > > > there's any different in the native scenario. :) > > > > > > I think native machines exist on which the two values are different. > > > Is that true? > > > > I think the answer is not. My understanding is that HAW(host address wdith) > > is > > the maximum physical address width a CPU can detects(by cpuid.0x8008). > > > > I agree there are some addresses the CPU does not touch, but they are still > > in > > the physical address space, and there's only one physical address space... > > > > B.R. > > Yu > > Ouch I thought we are talking about the virtual address size. > I think I did have a box where VTD's virtual address size was > smaller than CPU's. > For physical one - we just need to make it as big as max supported > memory right? Well, my understanding of the physical one is the maximum physical address width. Sorry, that explanation was unclear... I mean, it's not just about the max supported memory; it also covers MMIO. It should be detectable from cpuid or ACPI's DMAR table, rather than calculated from the max memory size. One common use of this value is to tell the paging-structure entries (the CPU's or the IOMMU's) which bits shall be reserved. There are also some registers, e.g. the APIC base register, whose contents are physical addresses and therefore need to follow a similar requirement for the reserved bits. So I think the correct direction might be to define this property at the machine level, instead of the CPU level. Is this reasonable to you? > > -- > MST B.R. Yu
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Fri, Dec 21, 2018 at 02:02:28PM -0500, Michael S. Tsirkin wrote: > On Sat, Dec 22, 2018 at 01:37:58AM +0800, Yu Zhang wrote: > > On Fri, Dec 21, 2018 at 12:04:49PM -0500, Michael S. Tsirkin wrote: > > > On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote: > > > > Well, my understanding of the vt-d spec is that the address limitation > > > > in > > > > DMAR are referring to the same concept of CPUID.MAXPHYSADDR. I do not > > > > think > > > > there's any different in the native scenario. :) > > > > > > I think native machines exist on which the two values are different. > > > Is that true? > > > > I think the answer is not. My understanding is that HAW(host address wdith) > > is > > the maximum physical address width a CPU can detects(by cpuid.0x8008). > > > > I agree there are some addresses the CPU does not touch, but they are still > > in > > the physical address space, and there's only one physical address space... > > > > B.R. > > Yu > > Ouch I thought we are talking about the virtual address size. > I think I did have a box where VTD's virtual address size was > smaller than CPU's. > For physical one - we just need to make it as big as max supported > memory right? What exactly do you mean by "max supported memory"? -- Eduardo
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Sat, Dec 22, 2018 at 01:37:58AM +0800, Yu Zhang wrote: > On Fri, Dec 21, 2018 at 12:04:49PM -0500, Michael S. Tsirkin wrote: > > On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote: > > > Well, my understanding of the vt-d spec is that the address limitation in > > > DMAR are referring to the same concept of CPUID.MAXPHYSADDR. I do not > > > think > > > there's any different in the native scenario. :) > > > > I think native machines exist on which the two values are different. > > Is that true? > > I think the answer is not. My understanding is that HAW(host address wdith) is > the maximum physical address width a CPU can detects(by cpuid.0x8008). > > I agree there are some addresses the CPU does not touch, but they are still in > the physical address space, and there's only one physical address space... > > B.R. > Yu Ouch I thought we are talking about the virtual address size. I think I did have a box where VTD's virtual address size was smaller than CPU's. For physical one - we just need to make it as big as max supported memory right? -- MST
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Fri, Dec 21, 2018 at 12:04:49PM -0500, Michael S. Tsirkin wrote: > On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote: > > Well, my understanding of the vt-d spec is that the address limitation in > > DMAR are referring to the same concept of CPUID.MAXPHYSADDR. I do not think > > there's any different in the native scenario. :) > > I think native machines exist on which the two values are different. > Is that true? I think the answer is no. My understanding is that the HAW (host address width) is the maximum physical address width a CPU can detect (via cpuid.0x80000008). I agree there are some addresses the CPU does not touch, but they are still in the physical address space, and there is only one physical address space... B.R. Yu
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote: > Well, my understanding of the vt-d spec is that the address limitation in > DMAR are referring to the same concept of CPUID.MAXPHYSADDR. I do not think > there's any different in the native scenario. :) I think native machines exist on which the two values are different. Is that true?
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Fri, Dec 21, 2018 at 03:13:25PM +0100, Igor Mammedov wrote: > On Thu, 20 Dec 2018 19:18:01 -0200 > Eduardo Habkost wrote: > > > On Wed, Dec 19, 2018 at 11:40:37AM +0100, Igor Mammedov wrote: > > > On Wed, 19 Dec 2018 10:57:17 +0800 > > > Yu Zhang wrote: > > > > > > > On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > > > > > On Tue, 18 Dec 2018 17:27:23 +0800 > > > > > Yu Zhang wrote: > > > > > > > > > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > > > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > > > > > Yu Zhang wrote: > > > > > > > > > > > > > > > Currently, vIOMMU is using the value of IOVA address width, > > > > > > > > instead of > > > > > > > > the host address width(HAW) to calculate the number of reserved > > > > > > > > bits in > > > > > > > > data structures such as root entries, context entries, and > > > > > > > > entries of > > > > > > > > DMA paging structures etc. > > > > > > > > > > > > > > > > However values of IOVA address width and of the HAW may not > > > > > > > > equal. For > > > > > > > > example, a 48-bit IOVA can only be mapped to host addresses no > > > > > > > > wider than > > > > > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit > > > > > > > > may result > > > > > > > > in an invalid IOVA being accepted. > > > > > > > > > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > > > > > IntelIOMMUState, > > > > > > > > whose value is initialized based on the maximum physical > > > > > > > > address set to > > > > > > > > guest CPU. > > > > > > > > > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > > > > > to clarify. > > > > > > > > > > > > > > > > Signed-off-by: Yu Zhang > > > > > > > > Reviewed-by: Peter Xu > > > > > > > > --- > > > > > > > [...] 
> > > > > > > > > > > > > > > @@ -3100,6 +3104,8 @@ static void > > > > > > > > vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > > > > > > > > static void vtd_init(IntelIOMMUState *s) > > > > > > > > { > > > > > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > > > > > +CPUState *cs = first_cpu; > > > > > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > > > > > } > > > > > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > > > > > +s->haw_bits = cpu->phys_bits; > > > > > > > Is it possible to avoid accessing CPU fields directly or cpu > > > > > > > altogether > > > > > > > and set phys_bits when iommu is created? > > > > > > > > > > > > Thanks for your comments, Igor. > > > > > > > > > > > > Well, I guess you prefer not to query the CPU capabilities while > > > > > > deciding > > > > > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > > > > > > > > > Here the hardware address width in vt-d, and the one in > > > > > > cpuid.MAXPHYSADDR > > > > > > are referring to the same concept. In VM, both are the maximum > > > > > > guest physical > > > > > > address width. If we do not check the CPU field here, we will still > > > > > > have to > > > > > > check the CPU field in other places such as build_dmar_q35(), and > > > > > > reset the > > > > > > s->haw_bits again. > > > > > > > > > > > > Is this explanation convincing enough? 
:) > > > > > current build_dmar_q35() doesn't do it, it's all new code in this > > > > > series that > > > > > contains not acceptable direct access from one device (iommu) to > > > > > another (cpu). > > > > > Proper way would be for the owner of iommu to fish limits from > > > > > somewhere and set > > > > > values during iommu creation. > > > > > > > > Well, current build_dmar_q35() doesn't do it, because it is using the > > > > incorrect value. :) > > > > According to the spec, the host address width is the maximum physical > > > > address width, > > > > yet current implementation is using the DMA address width. For me, this > > > > is not only > > > > wrong, but also unsecure. For this point, I think we all agree this > > > > need to be fixed. > > > > > > > > As to how to fix it - should we query the cpu fields, I still do not > > > > understand why > > > > this is not acceptable. :) > > > > > > > > I had thought of other approaches before, yet I did not choose: > > > > > > > > 1> Introduce a new parameter, say,
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Thu, 20 Dec 2018 19:18:01 -0200 Eduardo Habkost wrote: > On Wed, Dec 19, 2018 at 11:40:37AM +0100, Igor Mammedov wrote: > > On Wed, 19 Dec 2018 10:57:17 +0800 > > Yu Zhang wrote: > > > > > On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > > > > On Tue, 18 Dec 2018 17:27:23 +0800 > > > > Yu Zhang wrote: > > > > > > > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > > > > Yu Zhang wrote: > > > > > > > > > > > > > Currently, vIOMMU is using the value of IOVA address width, > > > > > > > instead of > > > > > > > the host address width(HAW) to calculate the number of reserved > > > > > > > bits in > > > > > > > data structures such as root entries, context entries, and > > > > > > > entries of > > > > > > > DMA paging structures etc. > > > > > > > > > > > > > > However values of IOVA address width and of the HAW may not > > > > > > > equal. For > > > > > > > example, a 48-bit IOVA can only be mapped to host addresses no > > > > > > > wider than > > > > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit > > > > > > > may result > > > > > > > in an invalid IOVA being accepted. > > > > > > > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > > > > IntelIOMMUState, > > > > > > > whose value is initialized based on the maximum physical address > > > > > > > set to > > > > > > > guest CPU. > > > > > > > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > > > > to clarify. > > > > > > > > > > > > > > Signed-off-by: Yu Zhang > > > > > > > Reviewed-by: Peter Xu > > > > > > > --- > > > > > > [...] 
> > > > > > > > > > > > > @@ -3100,6 +3104,8 @@ static void > > > > > > > vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > > > > > > > static void vtd_init(IntelIOMMUState *s) > > > > > > > { > > > > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > > > > +CPUState *cs = first_cpu; > > > > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > > > > } > > > > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > > > > +s->haw_bits = cpu->phys_bits; > > > > > > Is it possible to avoid accessing CPU fields directly or cpu > > > > > > altogether > > > > > > and set phys_bits when iommu is created? > > > > > > > > > > Thanks for your comments, Igor. > > > > > > > > > > Well, I guess you prefer not to query the CPU capabilities while > > > > > deciding > > > > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > > > > > > > Here the hardware address width in vt-d, and the one in > > > > > cpuid.MAXPHYSADDR > > > > > are referring to the same concept. In VM, both are the maximum guest > > > > > physical > > > > > address width. If we do not check the CPU field here, we will still > > > > > have to > > > > > check the CPU field in other places such as build_dmar_q35(), and > > > > > reset the > > > > > s->haw_bits again. > > > > > > > > > > Is this explanation convincing enough? 
:) > > > > current build_dmar_q35() doesn't do it, it's all new code in this > > > > series that > > > > contains not acceptable direct access from one device (iommu) to > > > > another (cpu). > > > > Proper way would be for the owner of iommu to fish limits from > > > > somewhere and set > > > > values during iommu creation. > > > > > > Well, current build_dmar_q35() doesn't do it, because it is using the > > > incorrect value. :) > > > According to the spec, the host address width is the maximum physical > > > address width, > > > yet current implementation is using the DMA address width. For me, this > > > is not only > > > wrong, but also unsecure. For this point, I think we all agree this need > > > to be fixed. > > > > > > As to how to fix it - should we query the cpu fields, I still do not > > > understand why > > > this is not acceptable. :) > > > > > > I had thought of other approaches before, yet I did not choose: > > > > > > 1> Introduce a new parameter, say, "x-haw-bits" which is used for iommu > > > to limit its > > > physical address width(similar to the "x-aw-bits" for IOVA). But what > > > should we check > > > this parameter or not? What if this parameter is set to sth. different > > > than the "phys-bits" > > > or not? > > > > > >
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Wed, Dec 19, 2018 at 11:40:37AM +0100, Igor Mammedov wrote: > On Wed, 19 Dec 2018 10:57:17 +0800 > Yu Zhang wrote: > > > On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > > > On Tue, 18 Dec 2018 17:27:23 +0800 > > > Yu Zhang wrote: > > > > > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > > > Yu Zhang wrote: > > > > > > > > > > > Currently, vIOMMU is using the value of IOVA address width, instead > > > > > > of > > > > > > the host address width(HAW) to calculate the number of reserved > > > > > > bits in > > > > > > data structures such as root entries, context entries, and entries > > > > > > of > > > > > > DMA paging structures etc. > > > > > > > > > > > > However values of IOVA address width and of the HAW may not equal. > > > > > > For > > > > > > example, a 48-bit IOVA can only be mapped to host addresses no > > > > > > wider than > > > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit may > > > > > > result > > > > > > in an invalid IOVA being accepted. > > > > > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > > > IntelIOMMUState, > > > > > > whose value is initialized based on the maximum physical address > > > > > > set to > > > > > > guest CPU. > > > > > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > > > to clarify. > > > > > > > > > > > > Signed-off-by: Yu Zhang > > > > > > Reviewed-by: Peter Xu > > > > > > --- > > > > > [...] 
> > > > > > > > > > > @@ -3100,6 +3104,8 @@ static void > > > > > > vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > > > > > > static void vtd_init(IntelIOMMUState *s) > > > > > > { > > > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > > > +CPUState *cs = first_cpu; > > > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > > > } > > > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > > > +s->haw_bits = cpu->phys_bits; > > > > > Is it possible to avoid accessing CPU fields directly or cpu > > > > > altogether > > > > > and set phys_bits when iommu is created? > > > > > > > > Thanks for your comments, Igor. > > > > > > > > Well, I guess you prefer not to query the CPU capabilities while > > > > deciding > > > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > > > > > Here the hardware address width in vt-d, and the one in > > > > cpuid.MAXPHYSADDR > > > > are referring to the same concept. In VM, both are the maximum guest > > > > physical > > > > address width. If we do not check the CPU field here, we will still > > > > have to > > > > check the CPU field in other places such as build_dmar_q35(), and reset > > > > the > > > > s->haw_bits again. > > > > > > > > Is this explanation convincing enough? :) > > > current build_dmar_q35() doesn't do it, it's all new code in this series > > > that > > > contains not acceptable direct access from one device (iommu) to another > > > (cpu). 
> > > Proper way would be for the owner of iommu to fish limits from somewhere > > > and set > > > values during iommu creation. > > > > Well, current build_dmar_q35() doesn't do it, because it is using the > > incorrect value. :) > > According to the spec, the host address width is the maximum physical > > address width, > > yet current implementation is using the DMA address width. For me, this is > > not only > > wrong, but also unsecure. For this point, I think we all agree this need to > > be fixed. > > > > As to how to fix it - should we query the cpu fields, I still do not > > understand why > > this is not acceptable. :) > > > > I had thought of other approaches before, yet I did not choose: > > > > 1> Introduce a new parameter, say, "x-haw-bits" which is used for iommu to > > limit its > > physical address width(similar to the "x-aw-bits" for IOVA). But what > > should we check > > this parameter or not? What if this parameter is set to sth. different than > > the "phys-bits" > > or not? > > > > 2> Another choice I had thought of is, to query the physical iommu. I > > abandoned this > > idea because my understanding is that vIOMMU is not a passthrued device, it > > is emulated. > > > So Igor, may I ask why you think checking against the cpu fields so not > > acceptable? :) > Because accessing private
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Tue, Dec 18, 2018 at 05:27:23PM +0800, Yu Zhang wrote: > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > On Wed, 12 Dec 2018 21:05:38 +0800 > > Yu Zhang wrote: > > > > > Currently, vIOMMU is using the value of IOVA address width, instead of > > > the host address width(HAW) to calculate the number of reserved bits in > > > data structures such as root entries, context entries, and entries of > > > DMA paging structures etc. > > > > > > However values of IOVA address width and of the HAW may not equal. For > > > example, a 48-bit IOVA can only be mapped to host addresses no wider than > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit may result > > > in an invalid IOVA being accepted. > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > IntelIOMMUState, > > > whose value is initialized based on the maximum physical address set to > > > guest CPU. > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > to clarify. > > > > > > Signed-off-by: Yu Zhang > > > Reviewed-by: Peter Xu > > > --- > > [...] 
> > > > > @@ -3100,6 +3104,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion > > > *iommu_mr, IOMMUNotifier *n) > > > static void vtd_init(IntelIOMMUState *s) > > > { > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > +CPUState *cs = first_cpu; > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > } > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > +s->haw_bits = cpu->phys_bits; > > Is it possible to avoid accessing CPU fields directly or cpu altogether > > and set phys_bits when iommu is created? > > Thanks for your comments, Igor. > > Well, I guess you prefer not to query the CPU capabilities while deciding > the vIOMMU features. But to me, they are not that irrelevant.:) > > Here the hardware address width in vt-d, and the one in cpuid.MAXPHYSADDR > are referring to the same concept. In VM, both are the maximum guest physical > address width. If we do not check the CPU field here, we will still have to > check the CPU field in other places such as build_dmar_q35(), and reset the > s->haw_bits again. > > Is this explanation convincing enough? :) > > > > > Perhaps Eduardo > > can suggest better approach, since he's more familiar with phys_bits topic > > @Eduardo, any comments? Thanks! Configuring IOMMU phys-bits automatically depending on the configured CPU is OK, but accessing first_cpu directly in iommu code is not.
I suggest delegating this to the machine object, e.g.:

uint32_t pc_max_phys_bits(PCMachineState *pcms)
{
    return object_property_get_uint(OBJECT(first_cpu), "phys-bits",
                                    &error_abort);
}

as the machine itself is responsible for creating the CPU objects, and I believe there are other places in PC code where we do physical address calculations that could be affected by the physical address space size. -- Eduardo
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Wed, Dec 19, 2018 at 11:47:23AM -0500, Michael S. Tsirkin wrote: > On Wed, Dec 19, 2018 at 11:40:37AM +0100, Igor Mammedov wrote: > > On Wed, 19 Dec 2018 10:57:17 +0800 > > Yu Zhang wrote: > > > > > On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > > > > On Tue, 18 Dec 2018 17:27:23 +0800 > > > > Yu Zhang wrote: > > > > > > > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > > > > Yu Zhang wrote: > > > > > > > > > > > > > Currently, vIOMMU is using the value of IOVA address width, > > > > > > > instead of > > > > > > > the host address width(HAW) to calculate the number of reserved > > > > > > > bits in > > > > > > > data structures such as root entries, context entries, and > > > > > > > entries of > > > > > > > DMA paging structures etc. > > > > > > > > > > > > > > However values of IOVA address width and of the HAW may not > > > > > > > equal. For > > > > > > > example, a 48-bit IOVA can only be mapped to host addresses no > > > > > > > wider than > > > > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit > > > > > > > may result > > > > > > > in an invalid IOVA being accepted. > > > > > > > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > > > > IntelIOMMUState, > > > > > > > whose value is initialized based on the maximum physical address > > > > > > > set to > > > > > > > guest CPU. > > > > > > > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > > > > to clarify. > > > > > > > > > > > > > > Signed-off-by: Yu Zhang > > > > > > > Reviewed-by: Peter Xu > > > > > > > --- > > > > > > [...] 
> > > > > > > > > > > > > @@ -3100,6 +3104,8 @@ static void > > > > > > > vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > > > > > > > static void vtd_init(IntelIOMMUState *s) > > > > > > > { > > > > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > > > > +CPUState *cs = first_cpu; > > > > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > > > > } > > > > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > > > > +s->haw_bits = cpu->phys_bits; > > > > > > Is it possible to avoid accessing CPU fields directly or cpu > > > > > > altogether > > > > > > and set phys_bits when iommu is created? > > > > > > > > > > Thanks for your comments, Igor. > > > > > > > > > > Well, I guess you prefer not to query the CPU capabilities while > > > > > deciding > > > > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > > > > > > > Here the hardware address width in vt-d, and the one in > > > > > cpuid.MAXPHYSADDR > > > > > are referring to the same concept. In VM, both are the maximum guest > > > > > physical > > > > > address width. If we do not check the CPU field here, we will still > > > > > have to > > > > > check the CPU field in other places such as build_dmar_q35(), and > > > > > reset the > > > > > s->haw_bits again. > > > > > > > > > > Is this explanation convincing enough? 
:) > > > > current build_dmar_q35() doesn't do it, it's all new code in this > > > > series that > > > > contains not acceptable direct access from one device (iommu) to > > > > another (cpu). > > > > Proper way would be for the owner of iommu to fish limits from > > > > somewhere and set > > > > values during iommu creation. > > > > > > Well, current build_dmar_q35() doesn't do it, because it is using the > > > incorrect value. :) > > > According to the spec, the host address width is the maximum physical > > > address width, > > > yet current implementation is using the DMA address width. For me, this > > > is not only > > > wrong, but also unsecure. For this point, I think we all agree this need > > > to be fixed. > > > > > > As to how to fix it - should we query the cpu fields, I still do not > > > understand why > > > this is not acceptable. :) > > > > > > I had thought of other approaches before, yet I did not choose: > > > > > > 1> Introduce a new parameter, say, "x-haw-bits" which is used for iommu > > > to limit its > > > physical address width(similar to the "x-aw-bits" for IOVA). But what > > > should we check > > > this parameter or not? What if this parameter is set to sth. different > > > than the "phys-bits" > > > or not? > > > > > > 2> Another choice I had
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Wed, Dec 19, 2018 at 11:40:37AM +0100, Igor Mammedov wrote: > On Wed, 19 Dec 2018 10:57:17 +0800 > Yu Zhang wrote: > > > On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > > > On Tue, 18 Dec 2018 17:27:23 +0800 > > > Yu Zhang wrote: > > > > > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > > > Yu Zhang wrote: > > > > > > > > > > > Currently, vIOMMU is using the value of IOVA address width, instead > > > > > > of > > > > > > the host address width(HAW) to calculate the number of reserved > > > > > > bits in > > > > > > data structures such as root entries, context entries, and entries > > > > > > of > > > > > > DMA paging structures etc. > > > > > > > > > > > > However values of IOVA address width and of the HAW may not equal. > > > > > > For > > > > > > example, a 48-bit IOVA can only be mapped to host addresses no > > > > > > wider than > > > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit may > > > > > > result > > > > > > in an invalid IOVA being accepted. > > > > > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > > > IntelIOMMUState, > > > > > > whose value is initialized based on the maximum physical address > > > > > > set to > > > > > > guest CPU. > > > > > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > > > to clarify. > > > > > > > > > > > > Signed-off-by: Yu Zhang > > > > > > Reviewed-by: Peter Xu > > > > > > --- > > > > > [...] 
> > > > > > > > > > > @@ -3100,6 +3104,8 @@ static void > > > > > > vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > > > > > > static void vtd_init(IntelIOMMUState *s) > > > > > > { > > > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > > > +CPUState *cs = first_cpu; > > > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > > > } > > > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > > > +s->haw_bits = cpu->phys_bits; > > > > > Is it possible to avoid accessing CPU fields directly or cpu > > > > > altogether > > > > > and set phys_bits when iommu is created? > > > > > > > > Thanks for your comments, Igor. > > > > > > > > Well, I guess you prefer not to query the CPU capabilities while > > > > deciding > > > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > > > > > Here the hardware address width in vt-d, and the one in > > > > cpuid.MAXPHYSADDR > > > > are referring to the same concept. In VM, both are the maximum guest > > > > physical > > > > address width. If we do not check the CPU field here, we will still > > > > have to > > > > check the CPU field in other places such as build_dmar_q35(), and reset > > > > the > > > > s->haw_bits again. > > > > > > > > Is this explanation convincing enough? :) > > > current build_dmar_q35() doesn't do it, it's all new code in this series > > > that > > > contains not acceptable direct access from one device (iommu) to another > > > (cpu). 
> > > The proper way would be for the owner of the iommu to fish limits from somewhere > > > and set > > > values during iommu creation. > > Well, current build_dmar_q35() doesn't do it because it is using the > > incorrect value. :) > > According to the spec, the host address width is the maximum physical > > address width, > > yet the current implementation is using the DMA address width. For me, this is > > not only > > wrong, but also insecure. On this point, I think we all agree this needs to > > be fixed. > > > > As to how to fix it - should we query the cpu fields - I still do not > > understand why > > this is not acceptable. :) > > > > I had thought of other approaches before, yet I did not choose them: > > > > 1> Introduce a new parameter, say, "x-haw-bits", which is used for the iommu to > > limit its > > physical address width (similar to the "x-aw-bits" for IOVA). But should we check > > this parameter or not? And what if this parameter is set to something different from > > "phys-bits"? > > > > 2> Another choice I had thought of is to query the physical iommu. I > > abandoned this > > idea because my understanding is that vIOMMU is not a passed-through device; it > > is emulated. > > > So Igor, may I ask why you think checking against the cpu fields is not > > acceptable? :) > Because accessing private fields of one device from another random device is not robust and is subject to breaking in unpredictable ways when field meaning or initialization order changes.
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Wed, Dec 19, 2018 at 02:28:10PM +0800, Yu Zhang wrote: > On Tue, Dec 18, 2018 at 10:12:45PM -0500, Michael S. Tsirkin wrote: > > On Wed, Dec 19, 2018 at 11:03:58AM +0800, Yu Zhang wrote: > > > On Tue, Dec 18, 2018 at 09:58:35AM -0500, Michael S. Tsirkin wrote: > > > > On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > > > > > On Tue, 18 Dec 2018 17:27:23 +0800 > > > > > Yu Zhang wrote: > > > > > > > > > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > > > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > > > > > Yu Zhang wrote: > > > > > > > > > > > > > > > Currently, vIOMMU is using the value of IOVA address width, > > > > > > > > instead of > > > > > > > > the host address width(HAW) to calculate the number of reserved > > > > > > > > bits in > > > > > > > > data structures such as root entries, context entries, and > > > > > > > > entries of > > > > > > > > DMA paging structures etc. > > > > > > > > > > > > > > > > However values of IOVA address width and of the HAW may not > > > > > > > > equal. For > > > > > > > > example, a 48-bit IOVA can only be mapped to host addresses no > > > > > > > > wider than > > > > > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit > > > > > > > > may result > > > > > > > > in an invalid IOVA being accepted. > > > > > > > > > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > > > > > IntelIOMMUState, > > > > > > > > whose value is initialized based on the maximum physical > > > > > > > > address set to > > > > > > > > guest CPU. > > > > > > > > > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > > > > > to clarify. > > > > > > > > > > > > > > > > Signed-off-by: Yu Zhang > > > > > > > > Reviewed-by: Peter Xu > > > > > > > > --- > > > > > > > [...] 
> > > > > > > > > > > > > > > @@ -3100,6 +3104,8 @@ static void > > > > > > > > vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > > > > > > > > static void vtd_init(IntelIOMMUState *s) > > > > > > > > { > > > > > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > > > > > +CPUState *cs = first_cpu; > > > > > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > > > > > } > > > > > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > > > > > +s->haw_bits = cpu->phys_bits; > > > > > > > Is it possible to avoid accessing CPU fields directly or cpu > > > > > > > altogether > > > > > > > and set phys_bits when iommu is created? > > > > > > > > > > > > Thanks for your comments, Igor. > > > > > > > > > > > > Well, I guess you prefer not to query the CPU capabilities while > > > > > > deciding > > > > > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > > > > > > > > > Here the hardware address width in vt-d, and the one in > > > > > > cpuid.MAXPHYSADDR > > > > > > are referring to the same concept. In VM, both are the maximum > > > > > > guest physical > > > > > > address width. If we do not check the CPU field here, we will still > > > > > > have to > > > > > > check the CPU field in other places such as build_dmar_q35(), and > > > > > > reset the > > > > > > s->haw_bits again. > > > > > > > > > > > > Is this explanation convincing enough? 
:) > > > > > current build_dmar_q35() doesn't do it, it's all new code in this > > > > > series that > > > > > contains not acceptable direct access from one device (iommu) to > > > > > another (cpu). > > > > > Proper way would be for the owner of iommu to fish limits from > > > > > somewhere and set > > > > > values during iommu creation. > > > > > > > > Maybe it's a good idea to add documentation for now. > > > > > > Thanks Michael. So what kind of documentation do you refer? > > > > The idea would be to have two properties, AW for the CPU and > > the IOMMU. In the documentation explain that they > > should normally be set to the same value. > > > > > > > > > > It would be nice not to push this stuff up the stack, > > > > it's unfortunate that our internal APIs make it hard. > > > > > > Sorry, I do not quite get it. What do you mean "internal APIs make it > > > hard"? :) > > > > The API doesn't actually guarantee any initialization order. > > CPU happens to be initialized first but I do not > > think there's a guarantee that it will keep being the case. > > This makes it hard
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Wed, 19 Dec 2018 10:57:17 +0800 Yu Zhang wrote: > On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > > On Tue, 18 Dec 2018 17:27:23 +0800 > > Yu Zhang wrote: > > > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > > Yu Zhang wrote: > > > > > > > > > Currently, vIOMMU is using the value of IOVA address width, instead of > > > > > the host address width(HAW) to calculate the number of reserved bits > > > > > in > > > > > data structures such as root entries, context entries, and entries of > > > > > DMA paging structures etc. > > > > > > > > > > However values of IOVA address width and of the HAW may not equal. For > > > > > example, a 48-bit IOVA can only be mapped to host addresses no wider > > > > > than > > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit may > > > > > result > > > > > in an invalid IOVA being accepted. > > > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > > IntelIOMMUState, > > > > > whose value is initialized based on the maximum physical address set > > > > > to > > > > > guest CPU. > > > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > > to clarify. > > > > > > > > > > Signed-off-by: Yu Zhang > > > > > Reviewed-by: Peter Xu > > > > > --- > > > > [...] 
> > > > > > > > > @@ -3100,6 +3104,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion > > > > > *iommu_mr, IOMMUNotifier *n) > > > > > static void vtd_init(IntelIOMMUState *s) > > > > > { > > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > > +CPUState *cs = first_cpu; > > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > > } > > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > > +s->haw_bits = cpu->phys_bits; > > > > Is it possible to avoid accessing CPU fields directly or cpu altogether > > > > and set phys_bits when iommu is created? > > > > > > Thanks for your comments, Igor. > > > > > > Well, I guess you prefer not to query the CPU capabilities while deciding > > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > > > Here the hardware address width in vt-d, and the one in cpuid.MAXPHYSADDR > > > are referring to the same concept. In VM, both are the maximum guest > > > physical > > > address width. If we do not check the CPU field here, we will still have > > > to > > > check the CPU field in other places such as build_dmar_q35(), and reset > > > the > > > s->haw_bits again. > > > > > > Is this explanation convincing enough? :) > > current build_dmar_q35() doesn't do it, it's all new code in this series > > that > > contains not acceptable direct access from one device (iommu) to another > > (cpu). > > Proper way would be for the owner of iommu to fish limits from somewhere > > and set > > values during iommu creation. 
> > Well, current build_dmar_q35() doesn't do it because it is using the > incorrect value. :) > According to the spec, the host address width is the maximum physical address > width, > yet the current implementation is using the DMA address width. For me, this is > not only > wrong, but also insecure. On this point, I think we all agree this needs to > be fixed. > > As to how to fix it - should we query the cpu fields - I still do not > understand why > this is not acceptable. :) > > I had thought of other approaches before, yet I did not choose them: > > 1> Introduce a new parameter, say, "x-haw-bits", which is used for the iommu to > limit its > physical address width (similar to the "x-aw-bits" for IOVA). But should we check > this parameter or not? And what if this parameter is set to something different from > "phys-bits"? > > 2> Another choice I had thought of is to query the physical iommu. I > abandoned this > idea because my understanding is that vIOMMU is not a passed-through device; it > is emulated. > So Igor, may I ask why you think checking against the cpu fields is not > acceptable? :) Because accessing private fields of one device from another random device is not robust and is subject to breaking in unpredictable ways when field meaning or initialization order changes. (Analogy to bare metal: one does not solder a wire onto a CPU die to let a random device access some piece of data.) I've looked at the intel-iommu code and how it's created, so here is a way to do the thing
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Tue, Dec 18, 2018 at 10:12:45PM -0500, Michael S. Tsirkin wrote: > On Wed, Dec 19, 2018 at 11:03:58AM +0800, Yu Zhang wrote: > > On Tue, Dec 18, 2018 at 09:58:35AM -0500, Michael S. Tsirkin wrote: > > > On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > > > > On Tue, 18 Dec 2018 17:27:23 +0800 > > > > Yu Zhang wrote: > > > > > > > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > > > > Yu Zhang wrote: > > > > > > > > > > > > > Currently, vIOMMU is using the value of IOVA address width, > > > > > > > instead of > > > > > > > the host address width(HAW) to calculate the number of reserved > > > > > > > bits in > > > > > > > data structures such as root entries, context entries, and > > > > > > > entries of > > > > > > > DMA paging structures etc. > > > > > > > > > > > > > > However values of IOVA address width and of the HAW may not > > > > > > > equal. For > > > > > > > example, a 48-bit IOVA can only be mapped to host addresses no > > > > > > > wider than > > > > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit > > > > > > > may result > > > > > > > in an invalid IOVA being accepted. > > > > > > > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > > > > IntelIOMMUState, > > > > > > > whose value is initialized based on the maximum physical address > > > > > > > set to > > > > > > > guest CPU. > > > > > > > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > > > > to clarify. > > > > > > > > > > > > > > Signed-off-by: Yu Zhang > > > > > > > Reviewed-by: Peter Xu > > > > > > > --- > > > > > > [...] 
> > > > > > > > > > > > > @@ -3100,6 +3104,8 @@ static void > > > > > > > vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > > > > > > > static void vtd_init(IntelIOMMUState *s) > > > > > > > { > > > > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > > > > +CPUState *cs = first_cpu; > > > > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > > > > } > > > > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > > > > +s->haw_bits = cpu->phys_bits; > > > > > > Is it possible to avoid accessing CPU fields directly or cpu > > > > > > altogether > > > > > > and set phys_bits when iommu is created? > > > > > > > > > > Thanks for your comments, Igor. > > > > > > > > > > Well, I guess you prefer not to query the CPU capabilities while > > > > > deciding > > > > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > > > > > > > Here the hardware address width in vt-d, and the one in > > > > > cpuid.MAXPHYSADDR > > > > > are referring to the same concept. In VM, both are the maximum guest > > > > > physical > > > > > address width. If we do not check the CPU field here, we will still > > > > > have to > > > > > check the CPU field in other places such as build_dmar_q35(), and > > > > > reset the > > > > > s->haw_bits again. > > > > > > > > > > Is this explanation convincing enough? 
:) > > > > current build_dmar_q35() doesn't do it; it's all new code in this > > > > series that > > > > contains unacceptable direct access from one device (iommu) to > > > > another (cpu). > > > > The proper way would be for the owner of the iommu to fish limits from > > > > somewhere and set > > > > values during iommu creation. > > > > > > Maybe it's a good idea to add documentation for now. > > > > Thanks Michael. So what kind of documentation do you refer to? > > The idea would be to have two properties, AW for the CPU and > the IOMMU. In the documentation, explain that they > should normally be set to the same value. > > > > > > > It would be nice not to push this stuff up the stack; > > > it's unfortunate that our internal APIs make it hard. > > > > Sorry, I do not quite get it. What do you mean by "internal APIs make it > > hard"? :) > > The API doesn't actually guarantee any initialization order. > CPU happens to be initialized first, but I do not > think there's a guarantee that it will keep being the case. > This makes it hard to get properties from one device > and use them in another one. > Oops... Then there can be no easy way to guarantee this at runtime. BTW, could we initialize the CPU before other components? Is it hard to do, or not reasonable to do so? I plan to draft a doc in qemu on
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Tue, Dec 18, 2018 at 09:58:35AM -0500, Michael S. Tsirkin wrote: > On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > > On Tue, 18 Dec 2018 17:27:23 +0800 > > Yu Zhang wrote: > > > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > > Yu Zhang wrote: > > > > > > > > > Currently, vIOMMU is using the value of IOVA address width, instead of > > > > > the host address width(HAW) to calculate the number of reserved bits > > > > > in > > > > > data structures such as root entries, context entries, and entries of > > > > > DMA paging structures etc. > > > > > > > > > > However values of IOVA address width and of the HAW may not equal. For > > > > > example, a 48-bit IOVA can only be mapped to host addresses no wider > > > > > than > > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit may > > > > > result > > > > > in an invalid IOVA being accepted. > > > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > > IntelIOMMUState, > > > > > whose value is initialized based on the maximum physical address set > > > > > to > > > > > guest CPU. > > > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > > to clarify. > > > > > > > > > > Signed-off-by: Yu Zhang > > > > > Reviewed-by: Peter Xu > > > > > --- > > > > [...] 
> > > > > > > > > @@ -3100,6 +3104,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion > > > > > *iommu_mr, IOMMUNotifier *n) > > > > > static void vtd_init(IntelIOMMUState *s) > > > > > { > > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > > +CPUState *cs = first_cpu; > > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > > } > > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > > +s->haw_bits = cpu->phys_bits; > > > > Is it possible to avoid accessing CPU fields directly or cpu altogether > > > > and set phys_bits when iommu is created? > > > > > > Thanks for your comments, Igor. > > > > > > Well, I guess you prefer not to query the CPU capabilities while deciding > > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > > > Here the hardware address width in vt-d, and the one in cpuid.MAXPHYSADDR > > > are referring to the same concept. In VM, both are the maximum guest > > > physical > > > address width. If we do not check the CPU field here, we will still have > > > to > > > check the CPU field in other places such as build_dmar_q35(), and reset > > > the > > > s->haw_bits again. > > > > > > Is this explanation convincing enough? :) > > current build_dmar_q35() doesn't do it, it's all new code in this series > > that > > contains not acceptable direct access from one device (iommu) to another > > (cpu). > > Proper way would be for the owner of iommu to fish limits from somewhere > > and set > > values during iommu creation. 
> > Maybe it's a good idea to add documentation for now. Thanks Michael. So what kind of documentation do you refer? > > It would be nice not to push this stuff up the stack, > it's unfortunate that our internal APIs make it hard. Sorry, I do not quite get it. What do you mean "internal APIs make it hard"? :) > > > > > > > > > > Perhaps Eduardo > > > > can suggest better approach, since he's more familiar with phys_bits > > > > topic > > > > > > @Eduardo, any comments? Thanks! > > > > > > > > > > > > /* > > > > > * Rsvd field masks for spte > > > > > */ > > > > > vtd_paging_entry_rsvd_field[0] = ~0ULL; > > > > > -vtd_paging_entry_rsvd_field[1] = > > > > > VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits); > > > > > -vtd_paging_entry_rsvd_field[2] = > > > > > VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits); > > > > > -vtd_paging_entry_rsvd_field[3] = > > > > > VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits); > > > > > -vtd_paging_entry_rsvd_field[4] = > > > > > VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits); > > > > > -vtd_paging_entry_rsvd_field[5] = > > > > > VTD_SPTE_LPAGE_L1_RSVD_MASK(s->aw_bits); > > > > > -vtd_paging_entry_rsvd_field[6] = > > > > > VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits); > > > > > -vtd_paging_entry_rsvd_field[7] = > > > > > VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits); > > > > > -vtd_paging_entry_rsvd_field[8] = > > > > > VTD_SPTE_LPAGE_L4_RSVD_MASK(s->aw_bits); > > > > > +vtd_paging_entry_rsvd_field[1] = > > > > >
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Wed, Dec 19, 2018 at 11:03:58AM +0800, Yu Zhang wrote: > On Tue, Dec 18, 2018 at 09:58:35AM -0500, Michael S. Tsirkin wrote: > > On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > > > On Tue, 18 Dec 2018 17:27:23 +0800 > > > Yu Zhang wrote: > > > > > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > > > Yu Zhang wrote: > > > > > > > > > > > Currently, vIOMMU is using the value of IOVA address width, instead > > > > > > of > > > > > > the host address width(HAW) to calculate the number of reserved > > > > > > bits in > > > > > > data structures such as root entries, context entries, and entries > > > > > > of > > > > > > DMA paging structures etc. > > > > > > > > > > > > However values of IOVA address width and of the HAW may not equal. > > > > > > For > > > > > > example, a 48-bit IOVA can only be mapped to host addresses no > > > > > > wider than > > > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit may > > > > > > result > > > > > > in an invalid IOVA being accepted. > > > > > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > > > IntelIOMMUState, > > > > > > whose value is initialized based on the maximum physical address > > > > > > set to > > > > > > guest CPU. > > > > > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > > > to clarify. > > > > > > > > > > > > Signed-off-by: Yu Zhang > > > > > > Reviewed-by: Peter Xu > > > > > > --- > > > > > [...] 
> > > > > > > > > > > @@ -3100,6 +3104,8 @@ static void > > > > > > vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > > > > > > static void vtd_init(IntelIOMMUState *s) > > > > > > { > > > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > > > +CPUState *cs = first_cpu; > > > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > > > } > > > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > > > +s->haw_bits = cpu->phys_bits; > > > > > Is it possible to avoid accessing CPU fields directly or cpu > > > > > altogether > > > > > and set phys_bits when iommu is created? > > > > > > > > Thanks for your comments, Igor. > > > > > > > > Well, I guess you prefer not to query the CPU capabilities while > > > > deciding > > > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > > > > > Here the hardware address width in vt-d, and the one in > > > > cpuid.MAXPHYSADDR > > > > are referring to the same concept. In VM, both are the maximum guest > > > > physical > > > > address width. If we do not check the CPU field here, we will still > > > > have to > > > > check the CPU field in other places such as build_dmar_q35(), and reset > > > > the > > > > s->haw_bits again. > > > > > > > > Is this explanation convincing enough? :) > > > current build_dmar_q35() doesn't do it, it's all new code in this series > > > that > > > contains not acceptable direct access from one device (iommu) to another > > > (cpu). 
> > > Proper way would be for the owner of iommu to fish limits from somewhere > > > and set > > > values during iommu creation. > > > > Maybe it's a good idea to add documentation for now. > > Thanks Michael. So what kind of documentation do you refer? The idea would be to have two properties, AW for the CPU and the IOMMU. In the documentation explain that they should normally be set to the same value. > > > > It would be nice not to push this stuff up the stack, > > it's unfortunate that our internal APIs make it hard. > > Sorry, I do not quite get it. What do you mean "internal APIs make it hard"? > :) The API doesn't actually guarantee any initialization order. CPU happens to be initialized first but I do not think there's a guarantee that it will keep being the case. This makes it hard to get properties from one device and use in another one. > > > > > > > > > > > > > > Perhaps Eduardo > > > > > can suggest better approach, since he's more familiar with phys_bits > > > > > topic > > > > > > > > @Eduardo, any comments? Thanks! > > > > > > > > > > > > > > > /* > > > > > > * Rsvd field masks for spte > > > > > > */ > > > > > > vtd_paging_entry_rsvd_field[0] = ~0ULL; > > > > > > -vtd_paging_entry_rsvd_field[1] = > > > > > > VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits); > > > > > > -
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > On Tue, 18 Dec 2018 17:27:23 +0800 > Yu Zhang wrote: > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > Yu Zhang wrote: > > > > > > > Currently, vIOMMU is using the value of IOVA address width, instead of > > > > the host address width(HAW) to calculate the number of reserved bits in > > > > data structures such as root entries, context entries, and entries of > > > > DMA paging structures etc. > > > > > > > > However values of IOVA address width and of the HAW may not equal. For > > > > example, a 48-bit IOVA can only be mapped to host addresses no wider > > > > than > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit may > > > > result > > > > in an invalid IOVA being accepted. > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > IntelIOMMUState, > > > > whose value is initialized based on the maximum physical address set to > > > > guest CPU. > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > to clarify. > > > > > > > > Signed-off-by: Yu Zhang > > > > Reviewed-by: Peter Xu > > > > --- > > > [...] 
> > > > > > > @@ -3100,6 +3104,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion > > > > *iommu_mr, IOMMUNotifier *n) > > > > static void vtd_init(IntelIOMMUState *s) > > > > { > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > +CPUState *cs = first_cpu; > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > } > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > +s->haw_bits = cpu->phys_bits; > > > Is it possible to avoid accessing CPU fields directly or cpu altogether > > > and set phys_bits when iommu is created? > > > > Thanks for your comments, Igor. > > > > Well, I guess you prefer not to query the CPU capabilities while deciding > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > Here the hardware address width in vt-d, and the one in cpuid.MAXPHYSADDR > > are referring to the same concept. In VM, both are the maximum guest > > physical > > address width. If we do not check the CPU field here, we will still have to > > check the CPU field in other places such as build_dmar_q35(), and reset the > > s->haw_bits again. > > > > Is this explanation convincing enough? :) > current build_dmar_q35() doesn't do it, it's all new code in this series that > contains not acceptable direct access from one device (iommu) to another > (cpu). > Proper way would be for the owner of iommu to fish limits from somewhere and > set > values during iommu creation. Well, current build_dmar_q35() doesn't do it, because it is using the incorrect value. 
:) According to the spec, the host address width is the maximum physical address width, yet the current implementation uses the DMA address width. To me this is not only wrong, but also insecure. On that point, I think we all agree it needs to be fixed.

As to how to fix it - whether we should query the CPU fields - I still do not understand why this is not acceptable. :) I had thought of other approaches before, yet did not choose them:

1> Introduce a new parameter, say, "x-haw-bits", for the iommu to limit its physical address width (similar to "x-aw-bits" for the IOVA). But should we validate this parameter or not? And what if it is set to something different from "phys-bits"?

2> Query the physical iommu. I abandoned this idea because my understanding is that the vIOMMU is not a passed-through device; it is emulated.

So Igor, may I ask why you think checking against the CPU fields is not acceptable? :)

> > >
> > > Perhaps Eduardo
> > > can suggest better approach, since he's more familiar with phys_bits
> > > topic
> >
> > @Eduardo, any comments? Thanks!
> >
> > > >
> > > >     /*
> > > >      * Rsvd field masks for spte
> > > >      */
> > > >     vtd_paging_entry_rsvd_field[0] = ~0ULL;
> > > > -    vtd_paging_entry_rsvd_field[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
> > > > -
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Tue, Dec 18, 2018 at 03:55:36PM +0100, Igor Mammedov wrote: > On Tue, 18 Dec 2018 17:27:23 +0800 > Yu Zhang wrote: > > > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > > On Wed, 12 Dec 2018 21:05:38 +0800 > > > Yu Zhang wrote: > > > > > > > Currently, vIOMMU is using the value of IOVA address width, instead of > > > > the host address width(HAW) to calculate the number of reserved bits in > > > > data structures such as root entries, context entries, and entries of > > > > DMA paging structures etc. > > > > > > > > However values of IOVA address width and of the HAW may not equal. For > > > > example, a 48-bit IOVA can only be mapped to host addresses no wider > > > > than > > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit may > > > > result > > > > in an invalid IOVA being accepted. > > > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > > IntelIOMMUState, > > > > whose value is initialized based on the maximum physical address set to > > > > guest CPU. > > > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > > to clarify. > > > > > > > > Signed-off-by: Yu Zhang > > > > Reviewed-by: Peter Xu > > > > --- > > > [...] 
> > > > > > > @@ -3100,6 +3104,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion > > > > *iommu_mr, IOMMUNotifier *n) > > > > static void vtd_init(IntelIOMMUState *s) > > > > { > > > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > > > +CPUState *cs = first_cpu; > > > > +X86CPU *cpu = X86_CPU(cs); > > > > > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > > > memset(s->wmask, 0, DMAR_REG_SIZE); > > > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > > > +if (s->aw_bits == VTD_AW_48BIT) { > > > > s->cap |= VTD_CAP_SAGAW_48bit; > > > > } > > > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > > > +s->haw_bits = cpu->phys_bits; > > > Is it possible to avoid accessing CPU fields directly or cpu altogether > > > and set phys_bits when iommu is created? > > > > Thanks for your comments, Igor. > > > > Well, I guess you prefer not to query the CPU capabilities while deciding > > the vIOMMU features. But to me, they are not that irrelevant.:) > > > > Here the hardware address width in vt-d, and the one in cpuid.MAXPHYSADDR > > are referring to the same concept. In VM, both are the maximum guest > > physical > > address width. If we do not check the CPU field here, we will still have to > > check the CPU field in other places such as build_dmar_q35(), and reset the > > s->haw_bits again. > > > > Is this explanation convincing enough? :) > current build_dmar_q35() doesn't do it, it's all new code in this series that > contains not acceptable direct access from one device (iommu) to another > (cpu). > Proper way would be for the owner of iommu to fish limits from somewhere and > set > values during iommu creation. Maybe it's a good idea to add documentation for now. 
It would be nice not to push this stuff up the stack; it's unfortunate that our internal APIs make it hard.

> > >
> > > Perhaps Eduardo
> > > can suggest better approach, since he's more familiar with phys_bits topic
> >
> > @Eduardo, any comments? Thanks!
> >
> > > >
> > > >     /*
> > > >      * Rsvd field masks for spte
> > > >      */
> > > >     vtd_paging_entry_rsvd_field[0] = ~0ULL;
> > > > -    vtd_paging_entry_rsvd_field[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[5] = VTD_SPTE_LPAGE_L1_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[6] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[7] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits);
> > > > -    vtd_paging_entry_rsvd_field[8] = VTD_SPTE_LPAGE_L4_RSVD_MASK(s->aw_bits);
> > > > +    vtd_paging_entry_rsvd_field[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->haw_bits);
> > > > +    vtd_paging_entry_rsvd_field[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->haw_bits);
> > > > +    vtd_paging_entry_rsvd_field[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->haw_bits);
> > > > +    vtd_paging_entry_rsvd_field[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->haw_bits);
> > > > +    vtd_paging_entry_rsvd_field[5] = VTD_SPTE_LPAGE_L1_RSVD_MASK(s->haw_bits);
> > > > +    vtd_paging_entry_rsvd_field[6] =
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Tue, 18 Dec 2018 17:27:23 +0800 Yu Zhang wrote: > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > On Wed, 12 Dec 2018 21:05:38 +0800 > > Yu Zhang wrote: > > > > > Currently, vIOMMU is using the value of IOVA address width, instead of > > > the host address width(HAW) to calculate the number of reserved bits in > > > data structures such as root entries, context entries, and entries of > > > DMA paging structures etc. > > > > > > However values of IOVA address width and of the HAW may not equal. For > > > example, a 48-bit IOVA can only be mapped to host addresses no wider than > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit may result > > > in an invalid IOVA being accepted. > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > IntelIOMMUState, > > > whose value is initialized based on the maximum physical address set to > > > guest CPU. > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > to clarify. > > > > > > Signed-off-by: Yu Zhang > > > Reviewed-by: Peter Xu > > > --- > > [...] 
> > > @@ -3100,6 +3104,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
> > > static void vtd_init(IntelIOMMUState *s)
> > > {
> > >     X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
> > > +    CPUState *cs = first_cpu;
> > > +    X86CPU *cpu = X86_CPU(cs);
> > >
> > >     memset(s->csr, 0, DMAR_REG_SIZE);
> > >     memset(s->wmask, 0, DMAR_REG_SIZE);
> > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s)
> > >     s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND |
> > >              VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS |
> > >              VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits);
> > > -    if (s->aw_bits == VTD_HOST_AW_48BIT) {
> > > +    if (s->aw_bits == VTD_AW_48BIT) {
> > >         s->cap |= VTD_CAP_SAGAW_48bit;
> > >     }
> > >     s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
> > > +    s->haw_bits = cpu->phys_bits;
> > Is it possible to avoid accessing CPU fields directly or cpu altogether
> > and set phys_bits when iommu is created?
>
> Thanks for your comments, Igor.
>
> Well, I guess you prefer not to query the CPU capabilities while deciding
> the vIOMMU features. But to me, they are not that irrelevant. :)
>
> Here the hardware address width in vt-d, and the one in cpuid.MAXPHYSADDR
> are referring to the same concept. In VM, both are the maximum guest physical
> address width. If we do not check the CPU field here, we will still have to
> check the CPU field in other places such as build_dmar_q35(), and reset the
> s->haw_bits again.
>
> Is this explanation convincing enough? :)

The current build_dmar_q35() doesn't do it; it's all new code in this series that contains an unacceptable direct access from one device (iommu) to another (cpu). The proper way would be for the owner of the iommu to fish the limits from somewhere and set the values during iommu creation.

> >
> > Perhaps Eduardo
> > can suggest better approach, since he's more familiar with phys_bits topic
>
> @Eduardo, any comments? Thanks!
> > > > > > /* > > > * Rsvd field masks for spte > > > */ > > > vtd_paging_entry_rsvd_field[0] = ~0ULL; > > > -vtd_paging_entry_rsvd_field[1] = > > > VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[2] = > > > VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[3] = > > > VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[4] = > > > VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[5] = > > > VTD_SPTE_LPAGE_L1_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[6] = > > > VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[7] = > > > VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[8] = > > > VTD_SPTE_LPAGE_L4_RSVD_MASK(s->aw_bits); > > > +vtd_paging_entry_rsvd_field[1] = > > > VTD_SPTE_PAGE_L1_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[2] = > > > VTD_SPTE_PAGE_L2_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[3] = > > > VTD_SPTE_PAGE_L3_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[4] = > > > VTD_SPTE_PAGE_L4_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[5] = > > > VTD_SPTE_LPAGE_L1_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[6] = > > > VTD_SPTE_LPAGE_L2_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[7] = > > > VTD_SPTE_LPAGE_L3_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[8] = > > > VTD_SPTE_LPAGE_L4_RSVD_MASK(s->haw_bits); > > > > > > if (x86_iommu->intr_supported) { > > > s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV; > > > @@ -3261,10 +3268,10 @@ static bool vtd_decide_config(IntelIOMMUState *s, > > > Error **errp) > > > } > > > > > > /* Currently only address widths
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Tue, Dec 18, 2018 at 05:27:23PM +0800, Yu Zhang wrote: > On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > > On Wed, 12 Dec 2018 21:05:38 +0800 > > Yu Zhang wrote: > > > > > Currently, vIOMMU is using the value of IOVA address width, instead of > > > the host address width(HAW) to calculate the number of reserved bits in > > > data structures such as root entries, context entries, and entries of > > > DMA paging structures etc. > > > > > > However values of IOVA address width and of the HAW may not equal. For > > > example, a 48-bit IOVA can only be mapped to host addresses no wider than > > > 46 bits. Using 48, instead of 46 to calculate the reserved bit may result > > > in an invalid IOVA being accepted. > > > > > > To fix this, a new field - haw_bits is introduced in struct > > > IntelIOMMUState, > > > whose value is initialized based on the maximum physical address set to > > > guest CPU. > > > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > > to clarify. > > > > > > Signed-off-by: Yu Zhang > > > Reviewed-by: Peter Xu > > > --- > > [...] 
> > > @@ -3100,6 +3104,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
> > > static void vtd_init(IntelIOMMUState *s)
> > > {
> > >     X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
> > > +    CPUState *cs = first_cpu;
> > > +    X86CPU *cpu = X86_CPU(cs);
> > >
> > >     memset(s->csr, 0, DMAR_REG_SIZE);
> > >     memset(s->wmask, 0, DMAR_REG_SIZE);
> > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s)
> > >     s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND |
> > >              VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS |
> > >              VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits);
> > > -    if (s->aw_bits == VTD_HOST_AW_48BIT) {
> > > +    if (s->aw_bits == VTD_AW_48BIT) {
> > >         s->cap |= VTD_CAP_SAGAW_48bit;
> > >     }
> > >     s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
> > > +    s->haw_bits = cpu->phys_bits;
> > Is it possible to avoid accessing CPU fields directly or cpu altogether
> > and set phys_bits when iommu is created?
>
> Thanks for your comments, Igor.
>
> Well, I guess you prefer not to query the CPU capabilities while deciding
> the vIOMMU features. But to me, they are not that irrelevant. :)
>
> Here the hardware address width in vt-d, and the one in cpuid.MAXPHYSADDR
> are referring to the same concept. In VM, both are the maximum guest physical
> address width. If we do not check the CPU field here, we will still have to
> check the CPU field in other places such as build_dmar_q35(), and reset the
> s->haw_bits again.
>
> Is this explanation convincing enough? :)

So what happens if these don't match? I guess the guest can configure the vtd to put data into some memory which isn't then accessible to the CPU, or the CPU can use some memory not accessible to devices. I guess some guests might be confused - is this what you observe? If yes, some comments that tell people which guests get confused would be nice. Is Windows happy? Is Linux happy?
> > > > Perhaps Eduardo > > can suggest better approach, since he's more familiar with phys_bits topic > > @Eduardo, any comments? Thanks! > > > > > > /* > > > * Rsvd field masks for spte > > > */ > > > vtd_paging_entry_rsvd_field[0] = ~0ULL; > > > -vtd_paging_entry_rsvd_field[1] = > > > VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[2] = > > > VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[3] = > > > VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[4] = > > > VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[5] = > > > VTD_SPTE_LPAGE_L1_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[6] = > > > VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[7] = > > > VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits); > > > -vtd_paging_entry_rsvd_field[8] = > > > VTD_SPTE_LPAGE_L4_RSVD_MASK(s->aw_bits); > > > +vtd_paging_entry_rsvd_field[1] = > > > VTD_SPTE_PAGE_L1_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[2] = > > > VTD_SPTE_PAGE_L2_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[3] = > > > VTD_SPTE_PAGE_L3_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[4] = > > > VTD_SPTE_PAGE_L4_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[5] = > > > VTD_SPTE_LPAGE_L1_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[6] = > > > VTD_SPTE_LPAGE_L2_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[7] = > > > VTD_SPTE_LPAGE_L3_RSVD_MASK(s->haw_bits); > > > +vtd_paging_entry_rsvd_field[8] = > > > VTD_SPTE_LPAGE_L4_RSVD_MASK(s->haw_bits); > > > > > > if (x86_iommu->intr_supported) { > > > s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV; > > > @@ -3261,10 +3268,10 @@ static bool
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Mon, Dec 17, 2018 at 02:17:40PM +0100, Igor Mammedov wrote: > On Wed, 12 Dec 2018 21:05:38 +0800 > Yu Zhang wrote: > > > Currently, vIOMMU is using the value of IOVA address width, instead of > > the host address width(HAW) to calculate the number of reserved bits in > > data structures such as root entries, context entries, and entries of > > DMA paging structures etc. > > > > However values of IOVA address width and of the HAW may not equal. For > > example, a 48-bit IOVA can only be mapped to host addresses no wider than > > 46 bits. Using 48, instead of 46 to calculate the reserved bit may result > > in an invalid IOVA being accepted. > > > > To fix this, a new field - haw_bits is introduced in struct IntelIOMMUState, > > whose value is initialized based on the maximum physical address set to > > guest CPU. > > > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > > to clarify. > > > > Signed-off-by: Yu Zhang > > Reviewed-by: Peter Xu > > --- > [...] > > > @@ -3100,6 +3104,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion > > *iommu_mr, IOMMUNotifier *n) > > static void vtd_init(IntelIOMMUState *s) > > { > > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > > +CPUState *cs = first_cpu; > > +X86CPU *cpu = X86_CPU(cs); > > > > memset(s->csr, 0, DMAR_REG_SIZE); > > memset(s->wmask, 0, DMAR_REG_SIZE); > > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > > +if (s->aw_bits == VTD_AW_48BIT) { > > s->cap |= VTD_CAP_SAGAW_48bit; > > } > > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > > +s->haw_bits = cpu->phys_bits; > Is it possible to avoid accessing CPU fields directly or cpu altogether > and set phys_bits when iommu is created? Thanks for your comments, Igor. 
Well, I guess you prefer not to query the CPU capabilities while deciding the vIOMMU features. But to me, they are not that irrelevant.:) Here the hardware address width in vt-d, and the one in cpuid.MAXPHYSADDR are referring to the same concept. In VM, both are the maximum guest physical address width. If we do not check the CPU field here, we will still have to check the CPU field in other places such as build_dmar_q35(), and reset the s->haw_bits again. Is this explanation convincing enough? :) > > Perhaps Eduardo > can suggest better approach, since he's more familiar with phys_bits topic @Eduardo, any comments? Thanks! > > > /* > > * Rsvd field masks for spte > > */ > > vtd_paging_entry_rsvd_field[0] = ~0ULL; > > -vtd_paging_entry_rsvd_field[1] = > > VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits); > > -vtd_paging_entry_rsvd_field[2] = > > VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits); > > -vtd_paging_entry_rsvd_field[3] = > > VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits); > > -vtd_paging_entry_rsvd_field[4] = > > VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits); > > -vtd_paging_entry_rsvd_field[5] = > > VTD_SPTE_LPAGE_L1_RSVD_MASK(s->aw_bits); > > -vtd_paging_entry_rsvd_field[6] = > > VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits); > > -vtd_paging_entry_rsvd_field[7] = > > VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits); > > -vtd_paging_entry_rsvd_field[8] = > > VTD_SPTE_LPAGE_L4_RSVD_MASK(s->aw_bits); > > +vtd_paging_entry_rsvd_field[1] = > > VTD_SPTE_PAGE_L1_RSVD_MASK(s->haw_bits); > > +vtd_paging_entry_rsvd_field[2] = > > VTD_SPTE_PAGE_L2_RSVD_MASK(s->haw_bits); > > +vtd_paging_entry_rsvd_field[3] = > > VTD_SPTE_PAGE_L3_RSVD_MASK(s->haw_bits); > > +vtd_paging_entry_rsvd_field[4] = > > VTD_SPTE_PAGE_L4_RSVD_MASK(s->haw_bits); > > +vtd_paging_entry_rsvd_field[5] = > > VTD_SPTE_LPAGE_L1_RSVD_MASK(s->haw_bits); > > +vtd_paging_entry_rsvd_field[6] = > > VTD_SPTE_LPAGE_L2_RSVD_MASK(s->haw_bits); > > +vtd_paging_entry_rsvd_field[7] = > > VTD_SPTE_LPAGE_L3_RSVD_MASK(s->haw_bits); > > 
+vtd_paging_entry_rsvd_field[8] = > > VTD_SPTE_LPAGE_L4_RSVD_MASK(s->haw_bits); > > > > if (x86_iommu->intr_supported) { > > s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV; > > @@ -3261,10 +3268,10 @@ static bool vtd_decide_config(IntelIOMMUState *s, > > Error **errp) > > } > > > > /* Currently only address widths supported are 39 and 48 bits */ > > -if ((s->aw_bits != VTD_HOST_AW_39BIT) && > > -(s->aw_bits != VTD_HOST_AW_48BIT)) { > > +if ((s->aw_bits != VTD_AW_39BIT) && > > +(s->aw_bits != VTD_AW_48BIT)) { > > error_setg(errp, "Supported values for x-aw-bits are: %d, %d", > > - VTD_HOST_AW_39BIT, VTD_HOST_AW_48BIT); > > + VTD_AW_39BIT, VTD_AW_48BIT); > > return false; > > } > > > > diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h > > index
Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
On Wed, 12 Dec 2018 21:05:38 +0800 Yu Zhang wrote: > Currently, vIOMMU is using the value of IOVA address width, instead of > the host address width(HAW) to calculate the number of reserved bits in > data structures such as root entries, context entries, and entries of > DMA paging structures etc. > > However values of IOVA address width and of the HAW may not equal. For > example, a 48-bit IOVA can only be mapped to host addresses no wider than > 46 bits. Using 48, instead of 46 to calculate the reserved bit may result > in an invalid IOVA being accepted. > > To fix this, a new field - haw_bits is introduced in struct IntelIOMMUState, > whose value is initialized based on the maximum physical address set to > guest CPU. > Also, definitions such as VTD_HOST_AW_39/48BIT etc. are renamed > to clarify. > > Signed-off-by: Yu Zhang > Reviewed-by: Peter Xu > --- [...] > @@ -3100,6 +3104,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion > *iommu_mr, IOMMUNotifier *n) > static void vtd_init(IntelIOMMUState *s) > { > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > +CPUState *cs = first_cpu; > +X86CPU *cpu = X86_CPU(cs); > > memset(s->csr, 0, DMAR_REG_SIZE); > memset(s->wmask, 0, DMAR_REG_SIZE); > @@ -3119,23 +3125,24 @@ static void vtd_init(IntelIOMMUState *s) > s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND | > VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS | > VTD_CAP_SAGAW_39bit | VTD_CAP_MGAW(s->aw_bits); > -if (s->aw_bits == VTD_HOST_AW_48BIT) { > +if (s->aw_bits == VTD_AW_48BIT) { > s->cap |= VTD_CAP_SAGAW_48bit; > } > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; > +s->haw_bits = cpu->phys_bits; Is it possible to avoid accessing CPU fields directly or cpu altogether and set phys_bits when iommu is created? 
Perhaps Eduardo can suggest better approach, since he's more familiar with phys_bits topic > /* > * Rsvd field masks for spte > */ > vtd_paging_entry_rsvd_field[0] = ~0ULL; > -vtd_paging_entry_rsvd_field[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits); > -vtd_paging_entry_rsvd_field[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits); > -vtd_paging_entry_rsvd_field[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits); > -vtd_paging_entry_rsvd_field[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits); > -vtd_paging_entry_rsvd_field[5] = VTD_SPTE_LPAGE_L1_RSVD_MASK(s->aw_bits); > -vtd_paging_entry_rsvd_field[6] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits); > -vtd_paging_entry_rsvd_field[7] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits); > -vtd_paging_entry_rsvd_field[8] = VTD_SPTE_LPAGE_L4_RSVD_MASK(s->aw_bits); > +vtd_paging_entry_rsvd_field[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->haw_bits); > +vtd_paging_entry_rsvd_field[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->haw_bits); > +vtd_paging_entry_rsvd_field[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->haw_bits); > +vtd_paging_entry_rsvd_field[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->haw_bits); > +vtd_paging_entry_rsvd_field[5] = > VTD_SPTE_LPAGE_L1_RSVD_MASK(s->haw_bits); > +vtd_paging_entry_rsvd_field[6] = > VTD_SPTE_LPAGE_L2_RSVD_MASK(s->haw_bits); > +vtd_paging_entry_rsvd_field[7] = > VTD_SPTE_LPAGE_L3_RSVD_MASK(s->haw_bits); > +vtd_paging_entry_rsvd_field[8] = > VTD_SPTE_LPAGE_L4_RSVD_MASK(s->haw_bits); > > if (x86_iommu->intr_supported) { > s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV; > @@ -3261,10 +3268,10 @@ static bool vtd_decide_config(IntelIOMMUState *s, > Error **errp) > } > > /* Currently only address widths supported are 39 and 48 bits */ > -if ((s->aw_bits != VTD_HOST_AW_39BIT) && > -(s->aw_bits != VTD_HOST_AW_48BIT)) { > +if ((s->aw_bits != VTD_AW_39BIT) && > +(s->aw_bits != VTD_AW_48BIT)) { > error_setg(errp, "Supported values for x-aw-bits are: %d, %d", > - VTD_HOST_AW_39BIT, VTD_HOST_AW_48BIT); > + VTD_AW_39BIT, VTD_AW_48BIT); > return false; > } > > diff --git 
a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h > index ed4e758..820451c 100644 > --- a/include/hw/i386/intel_iommu.h > +++ b/include/hw/i386/intel_iommu.h > @@ -47,9 +47,9 @@ > #define VTD_SID_TO_DEVFN(sid) ((sid) & 0xff) > > #define DMAR_REG_SIZE 0x230 > -#define VTD_HOST_AW_39BIT 39 > -#define VTD_HOST_AW_48BIT 48 > -#define VTD_HOST_ADDRESS_WIDTH VTD_HOST_AW_39BIT > +#define VTD_AW_39BIT39 > +#define VTD_AW_48BIT48 > +#define VTD_ADDRESS_WIDTH VTD_AW_39BIT > #define VTD_HAW_MASK(aw)((1ULL << (aw)) - 1) > > #define DMAR_REPORT_F_INTR (1) > @@ -244,7 +244,8 @@ struct IntelIOMMUState { > bool intr_eime; /* Extended interrupt mode enabled */ > OnOffAuto intr_eim; /* Toggle for EIM cabability */ > bool buggy_eim; /* Force buggy EIM unless eim=off */ > -uint8_t
[Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width.
Currently, vIOMMU uses the value of the IOVA address width, instead of the
host address width (HAW), to calculate the number of reserved bits in data
structures such as root entries, context entries, and entries of the DMA
paging structures.

However, the values of the IOVA address width and of the HAW may not be
equal. For example, a 48-bit IOVA can only be mapped to host addresses no
wider than 46 bits. Using 48, instead of 46, to calculate the reserved bits
may result in an invalid IOVA being accepted.

To fix this, a new field - haw_bits - is introduced in struct
IntelIOMMUState, whose value is initialized based on the maximum physical
address set for the guest CPU. Also, definitions such as
VTD_HOST_AW_39/48BIT etc. are renamed for clarity.

Signed-off-by: Yu Zhang
Reviewed-by: Peter Xu
---
Cc: "Michael S. Tsirkin"
Cc: Igor Mammedov
Cc: Marcel Apfelbaum
Cc: Paolo Bonzini
Cc: Richard Henderson
Cc: Eduardo Habkost
Cc: Peter Xu
---
 hw/i386/acpi-build.c          |  2 +-
 hw/i386/intel_iommu.c         | 55 ---
 include/hw/i386/intel_iommu.h |  9 +++
 3 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 236a20e..b989523 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2431,7 +2431,7 @@ build_dmar_q35(GArray *table_data, BIOSLinker *linker)
     }

     dmar = acpi_data_push(table_data, sizeof(*dmar));
-    dmar->host_address_width = intel_iommu->aw_bits - 1;
+    dmar->host_address_width = intel_iommu->haw_bits - 1;
     dmar->flags = dmar_flags;

     /* DMAR Remapping Hardware Unit Definition structure */
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index d97bcbc..0e88c63 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -707,7 +707,8 @@ static VTDBus *vtd_find_as_from_bus_num(IntelIOMMUState *s, uint8_t bus_num)
  */
 static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t iova, bool is_write,
                              uint64_t *slptep, uint32_t *slpte_level,
-                             bool *reads, bool *writes, uint8_t aw_bits)
+                             bool *reads, bool *writes, uint8_t aw_bits,
+                             uint8_t haw_bits)
 {
     dma_addr_t addr = vtd_ce_get_slpt_base(ce);
     uint32_t level = vtd_ce_get_level(ce);
@@ -760,7 +761,7 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t iova, bool is_write,
             *slpte_level = level;
             return 0;
         }
-        addr = vtd_get_slpte_addr(slpte, aw_bits);
+        addr = vtd_get_slpte_addr(slpte, haw_bits);
         level--;
     }
 }
@@ -783,6 +784,7 @@ typedef struct {
     void *private;
     bool notify_unmap;
     uint8_t aw;
+    uint8_t haw;
     uint16_t domain_id;
 } vtd_page_walk_info;
@@ -925,7 +927,7 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_t start,
              * This is a valid PDE (or even bigger than PDE). We need
              * to walk one further level.
              */
-            ret = vtd_page_walk_level(vtd_get_slpte_addr(slpte, info->aw),
+            ret = vtd_page_walk_level(vtd_get_slpte_addr(slpte, info->haw),
                                       iova, MIN(iova_next, end),
                                       level - 1, read_cur, write_cur, info);
         } else {
@@ -942,7 +944,7 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_t start,
             entry.perm = IOMMU_ACCESS_FLAG(read_cur, write_cur);
             entry.addr_mask = ~subpage_mask;
             /* NOTE: this is only meaningful if entry_valid == true */
-            entry.translated_addr = vtd_get_slpte_addr(slpte, info->aw);
+            entry.translated_addr = vtd_get_slpte_addr(slpte, info->haw);
             ret = vtd_page_walk_one(&entry, info);
         }
@@ -1002,7 +1004,7 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num,
         return -VTD_FR_ROOT_ENTRY_P;
     }

-    if (re.rsvd || (re.val & VTD_ROOT_ENTRY_RSVD(s->aw_bits))) {
+    if (re.rsvd || (re.val & VTD_ROOT_ENTRY_RSVD(s->haw_bits))) {
         trace_vtd_re_invalid(re.rsvd, re.val);
         return -VTD_FR_ROOT_ENTRY_RSVD;
     }
@@ -1019,7 +1021,7 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num,
     }

     if ((ce->hi & VTD_CONTEXT_ENTRY_RSVD_HI) ||
-        (ce->lo & VTD_CONTEXT_ENTRY_RSVD_LO(s->aw_bits))) {
+        (ce->lo & VTD_CONTEXT_ENTRY_RSVD_LO(s->haw_bits))) {
         trace_vtd_ce_invalid(ce->hi, ce->lo);
         return -VTD_FR_CONTEXT_ENTRY_RSVD;
     }
@@ -1056,6 +1058,7 @@ static int vtd_sync_shadow_page_table_range(VTDAddressSpace *vtd_as,
         .private = (void *)&vtd_as->iommu,
         .notify_unmap = true,
         .aw = s->aw_bits,
+        .haw = s->haw_bits,
         .as = vtd_as,
         .domain_id = VTD_CONTEXT_ENTRY_DID(ce->hi),
     };
@@ -1360,7 +1363,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
     }

     ret_fr =