Re: [virtio-dev] [RFC] virtio-iommu version 0.4
Hi Eric, On 03/10/17 14:04, Auger Eric wrote: > When rebasing the v0.4 driver on master I observe a regression: commands > are not received properly by QEMU (typically an attach command is > received with a type of 0). After a bisection of the guest kernel the > first commit the problem appears is: > > commit e3067861ba6650a566a6273738c23c956ad55c02 > arm64: add basic VMAP_STACK support Indeed, thanks for the report. I can reproduce the problem with 4.14 and kvmtool. We should allocate buffers on the heap for any request (see for example 9472fe704 or e37e2ff35). I rebased onto 4.14 and pushed the following patch at: git://linux-arm.org/linux-jpb.git virtio-iommu/v0.4.1 It's not very nice, but should fix the issue. I didn't manage to measure the difference though, my benchmark isn't precise enough. So I don't know if we should try to use a kmem cache instead of kmalloc. Thanks, Jean --- 8< --- >From 3fc957560e1e6f070a0468bf75ebc4862d37ff82 Mon Sep 17 00:00:00 2001 From: Jean-Philippe Brucker Date: Mon, 9 Oct 2017 20:13:57 +0100 Subject: [PATCH] iommu/virtio-iommu: Allocate all requests on the heap When CONFIG_VMAP_STACK is enabled, DMA on the stack isn't allowed. Instead of allocating virtio-iommu requests on the stack, use kmalloc. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/virtio-iommu.c | 86 +--- 1 file changed, 58 insertions(+), 28 deletions(-) diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c index ec1cfaba5997..2d8fd5e99fa7 100644 --- a/drivers/iommu/virtio-iommu.c +++ b/drivers/iommu/virtio-iommu.c @@ -473,13 +473,10 @@ static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev) { int i; int ret = 0; + struct virtio_iommu_req_attach *req; struct iommu_fwspec *fwspec = dev->iommu_fwspec; struct viommu_endpoint *vdev = fwspec->iommu_priv; struct viommu_domain *vdomain = to_viommu_domain(domain); - struct virtio_iommu_req_attach req = { - .head.type = VIRTIO_IOMMU_T_ATTACH, - .address_space = cpu_to_le32(vdomain->id), - }; mutex_lock(&vdomain->mutex); if (!vdomain->viommu) { @@ -531,14 +528,25 @@ static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev) dev_dbg(dev, "attach to domain %llu\n", vdomain->id); + req = kzalloc(sizeof(*req), GFP_KERNEL); + if (!req) + return -ENOMEM; + + *req = (struct virtio_iommu_req_attach) { + .head.type = VIRTIO_IOMMU_T_ATTACH, + .address_space = cpu_to_le32(vdomain->id), + }; + for (i = 0; i < fwspec->num_ids; i++) { - req.device = cpu_to_le32(fwspec->ids[i]); + req->device = cpu_to_le32(fwspec->ids[i]); - ret = viommu_send_req_sync(vdomain->viommu, &req); + ret = viommu_send_req_sync(vdomain->viommu, req); if (ret) break; } + kfree(req); + vdomain->attached++; vdev->vdomain = vdomain; @@ -550,13 +558,7 @@ static int viommu_map(struct iommu_domain *domain, unsigned long iova, { int ret; struct viommu_domain *vdomain = to_viommu_domain(domain); - struct virtio_iommu_req_map req = { - .head.type = VIRTIO_IOMMU_T_MAP, - .address_space = cpu_to_le32(vdomain->id), - .virt_addr = cpu_to_le64(iova), - .phys_addr = cpu_to_le64(paddr), - .size = cpu_to_le64(size), - }; + struct virtio_iommu_req_map *req; pr_debug("map %llu 0x%lx -> 0x%llx (%zu)\n", vdomain->id, iova, paddr, size); @@ -564,17 +566,30 @@ static int viommu_map(struct iommu_domain *domain, unsigned long iova, if (!vdomain->attached) return -ENODEV; - if (prot & IOMMU_READ) - req.flags |= cpu_to_le32(VIRTIO_IOMMU_MAP_F_READ); - - if (prot & IOMMU_WRITE) - req.flags |= cpu_to_le32(VIRTIO_IOMMU_MAP_F_WRITE); - ret = viommu_tlb_map(vdomain, iova, paddr, size); if (ret) return ret; - ret = viommu_send_req_sync(vdomain->viommu, &req); + req = kzalloc(sizeof(*req), GFP_KERNEL); + if (!req) + return -ENOMEM; + + *req = (struct virtio_iommu_req_map) { + .head.type = VIRTIO_IOMMU_T_MAP, + .address_space = cpu_to_le32(vdomain->id), + .virt_addr = cpu_to_le64(iova), + .phys_addr = cpu_to_le64(paddr), + .size = cpu_to_le64(size), + }; + + if (prot & IOMMU_READ) + req->flags |= cpu_to_le32(VIRTIO_IOMMU_MAP_F_READ); + + if (prot & IOMMU_WRITE) + req->flags |= cpu_to_le32(VIRTIO_IOMMU_MAP_F_WRITE); + + ret = viommu_send_req_sync(vdomain->viommu, req
Re: [virtio-dev] [RFC] virtio-iommu version 0.4
Hi Jean, On 04/08/2017 20:19, Jean-Philippe Brucker wrote: > This is the continuation of my proposal for virtio-iommu, the para- > virtualized IOMMU. Here is a summary of the changes since last time [1]: > > * The virtio-iommu document now resembles an actual specification. It is > split into a formal description of the virtio device, and implementation > notes. Please find sources and binaries at [2]. > > * Added a probe request to describe to the guest different properties that > do not fit in firmware or in the virtio config space. This is a > necessary stepping stone for extending the virtio-iommu. > > * There is a working Qemu prototype [3], thanks to Eric Auger and Bharat > Bhushan. > > You can find the Linux driver and kvmtool device at [4] and [5]. I > plan to rework driver and kvmtool device slightly before sending the > patches. > > To understand the virtio-iommu, I advise to first read introduction and > motivation, then skim through implementation notes and finally look at the > device specification. > > I wasn't sure how to organize the review. For those who prefer to comment > inline, I attached v0.4 of device-operations.tex and topology.tex+MSI.tex > to this thread. They are the biggest chunks of the document. But LaTeX > isn't very pleasant to read, so you can simply send a list of comments in > relation to section numbers and a few words of context, we'll manage. > > --- > Version numbers 0.1-0.4 are arbitrary. I'm hoping they allow to compare > more easily differences since the RFC (see [6]), but haven't been made > public so far. This is the first public posting since initial proposal > [1], and the following describes all changes. > > ## v0.1 ## > > Content is the same as the RFC, but formatted to LaTeX. 'make' generates > one PDF and one HTML document. > > ## v0.2 ## > > Add introductions, improve topology example and firmware description based > on feedback and a number of useful discussions. > > ## v0.3 ## > > Add normative sections (MUST, SHOULD, etc). Clarify some things, tighten > the device and driver behaviour. Unmap semantics are consolidated; they > are now closer to VFIO Type1 v2 semantics. > > ## v0.4 ## > > Introduce PROBE requests. They provide per-endpoint information to the > driver that couldn't be described otherwise. > > For the moment, they allow to handle MSIs on x86 virtual platforms (see > 3.2). To do that we communicate reserved IOVA regions, that will also be > useful for describing regions that cannot be mapped for a given endpoint, > for instance addresses that correspond to a PCI bridge window. > > Introducing such a large framework for this tiny feature may seem > overkill, but it is needed for future extensions of the virtio-iommu and I > believe it really is worth the effort. > > ## Future ## > > Other extensions are in preparation. I won't detail them here because v0.4 > already is a lot to digest, but in short, building on top of PROBE: > > * First, since the IOMMU is paravirtualized, the device can expose some > properties of the physical topology to the guest, and let it allocate > resources more efficiently. For example, when the virtio-iommu manages > both physical and emulated endpoints, with different underlying IOMMUs, > we now have a way to describe multiple page and block granularities, > instead of forcing the guest to use the most restricted one for all > endpoints. This will most likely be in v0.5. > > * Then on top of that, a major improvement will describe hardware > acceleration features available to the guest. There is what I call "Page > Table Handover" (or simply, from the host POV, "Nested"), the ability > for the guest to manipulate its own page tables instead of sending > MAP/UNMAP requests to the host. This, along with IO Page Fault > reporting, will also permit SVM virtualization on different platforms. > > Thanks, > Jean > > [1] http://www.spinics.net/lists/kvm/msg147990.html > [2] git://linux-arm.org/virtio-iommu.git branch viommu/v0.4 > > http://www.linux-arm.org/git?p=virtio-iommu.git;a=blob;f=dist/v0.4/virtio-iommu-v0.4.pdf > I reiterate the disclaimers: don't use this document as a reference, > it's a draft. It's also not an OASIS document yet. It may be riddled > with mistakes. As this is a working draft, it is unstable and I do not > guarantee backward compatibility of future versions. > [3] https://lists.gnu.org/archive/html/qemu-arm/2017-08/msg4.html > [4] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.4 > Warning: UAPI headers have changed! They didn't follow the spec, > please update. (Use branch v0.1, that has the old headers, for the > Qemu prototype [3]) When rebasing the v0.4 driver on master I observe a regression: commands are not received properly by QEMU (typically an attach command is received with a type of 0). After a bisection of the guest kernel the first commit the problem appears is: commit e3067861ba6650a566a
Re: [virtio-dev] [RFC] virtio-iommu version 0.4
On 21/09/17 07:41, Tian, Kevin wrote: >> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com] >> Sent: Wednesday, September 6, 2017 7:49 PM >> >> >>> 2.6.8.2.1 >>> Multiple overlapping RESV_MEM properties are merged together. Device >>> requirement? if same types I assume? >> >> Combination rules apply to device and driver. When the driver puts >> multiple endpoints in the same domain, combination rules apply. They will >> become important once the guest attempts to do things like combining >> endpoints with different PASID capabilities in the same domain. I'm >> considering these endpoints to be behind different physical IOMMUs. >> >> For reserved regions, we specify what the driver should do and what the >> device should enforce with regard to mapping IOVAs of overlapping regions. >> For example, if a device has some virtual address RESV_T_MSI and an other >> device has the same virtual address RESV_T_IDENTITY, what should the >> driver do? >> >> I think it should apply the RESV_T_IDENTITY. RESV_T_MSI is just a special >> case of RESV_T_RESERVED, it's a hint for the IRQ subsystem and doesn't >> have a meaning within a domain. From DMA mappings point of view, it is >> effectively the same as RESV_T_RESERVED. When merging >> RESV_T_RESERVED and >> RESV_T_IDENTITY, we should make it RESV_T_IDENTITY. Because it is >> required >> for one endpoint to work (the one with IDENTITY) and is not accessed by >> the other (the driver must not allocate this IOVA range for DMA). >> >> More generally, I'd like to add the following combination table to the >> spec, that shows the resulting reserved region type after merging regions >> from two endpoints. N: no reserved region, R: RESV_T_RESERVED, M: >> RESV_T_MSI, I: RESV_T_IDENTITY. It is symmetric so I didn't fill the >> bottom half. >> >> | N | R | M | I >> -- >> N | N | R | M | ? >> -- >> R | | R | R | I >> -- >> M | | | M | I >> -- >> I | | | | I >> >> The 'I' column is problematic. If one endpoint has an IDENTITY region and >> the other doesn't have anything at that virtual address, then making that >> region an identity mapping will result in the second endpoint being able >> to access firmware memory. If using nested translation, stage-2 can help >> us here. But in general we should allow the device to reject an attach >> that would result in a N+I combination if the host considers it too dodgy. >> I think the other combinations are safe enough. >> > > will overlap happen in real? Can we simplify the spec to have device > not reporting overlapping regions? I think it's likely to happen, to have two endpoints with different overlapping types. Reporting is easy, but handling mismatched reserved regions is hard. So what I can do is leave these combination rules as implementation detail, and let the device reject any domain attach that overlaps mismatched RESV regions. After all, putting multiple devices in the same domain is a nice feature but might not be widely used in reality. It's not worth spending too much time describing all possible behaviors for it (but still worth thinking about to avoid introducing security holes). Thanks, Jean ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [virtio-dev] [RFC] virtio-iommu version 0.4
> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com] > Sent: Wednesday, September 6, 2017 7:49 PM > > > > 2.6.8.2.1 > > Multiple overlapping RESV_MEM properties are merged together. Device > > requirement? if same types I assume? > > Combination rules apply to device and driver. When the driver puts > multiple endpoints in the same domain, combination rules apply. They will > become important once the guest attempts to do things like combining > endpoints with different PASID capabilities in the same domain. I'm > considering these endpoints to be behind different physical IOMMUs. > > For reserved regions, we specify what the driver should do and what the > device should enforce with regard to mapping IOVAs of overlapping regions. > For example, if a device has some virtual address RESV_T_MSI and an other > device has the same virtual address RESV_T_IDENTITY, what should the > driver do? > > I think it should apply the RESV_T_IDENTITY. RESV_T_MSI is just a special > case of RESV_T_RESERVED, it's a hint for the IRQ subsystem and doesn't > have a meaning within a domain. From DMA mappings point of view, it is > effectively the same as RESV_T_RESERVED. When merging > RESV_T_RESERVED and > RESV_T_IDENTITY, we should make it RESV_T_IDENTITY. Because it is > required > for one endpoint to work (the one with IDENTITY) and is not accessed by > the other (the driver must not allocate this IOVA range for DMA). > > More generally, I'd like to add the following combination table to the > spec, that shows the resulting reserved region type after merging regions > from two endpoints. N: no reserved region, R: RESV_T_RESERVED, M: > RESV_T_MSI, I: RESV_T_IDENTITY. It is symmetric so I didn't fill the > bottom half. > > | N | R | M | I > -- > N | N | R | M | ? > -- > R | | R | R | I > -- > M | | | M | I > -- > I | | | | I > > The 'I' column is problematic. If one endpoint has an IDENTITY region and > the other doesn't have anything at that virtual address, then making that > region an identity mapping will result in the second endpoint being able > to access firmware memory. If using nested translation, stage-2 can help > us here. But in general we should allow the device to reject an attach > that would result in a N+I combination if the host considers it too dodgy. > I think the other combinations are safe enough. > will overlap happen in real? Can we simplify the spec to have device not reporting overlapping regions? Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [virtio-dev] [RFC] virtio-iommu version 0.4
Hi Eric, On 23/08/17 14:55, Auger Eric wrote: > Please find some comments/questions below: Thanks a lot for this. Sorry for the delay, I was on holiday and it took me a while to sort out the details. > 2.6.7:1 > I do not understand the footnode #6 sentence: 'Without a specific > definition of the ACK requirements for a given property type, it simply > means that the driver read all fields of that property." That doesn't serve any purpose, I'll remove the footnote. > 2.6.7:2 > what if the driver has not provided a big enough buffer and the device > cannot report all supported properties? Good point, how about adding the following to 2.6.7.2 - Device Requirements: PROBE Request If the properties array is smaller than \field{probe_size}, then the device SHOULD NOT write any property and SHOULD set the request \field{status} to VIRTIO_IOMMU_S_INVAL. > 2.6.8.2: > Couldn't we use the same terminology as iommu_resv_type in iommu.h? > the negative form is not the easiest to understand for me. > Why F_MSI? It also needs to be understandable by someone not familiar with the Linux APIs. It's not entirely obvious to me what the iommu.h regions are without looking at the implementation, maybe using names from the device point of view is more clear. I don't mind simplifying though, as I don't see a reason for the driver to know about BYPASS regions other than HW MSIs. How about I remove ABORT and BYPASS, make the only subtype RESV_MEM_T_MSI (1)? We'd use subtype 0 as "don't care, just don't map this". Can call it RESV_MEM_T_RESERVED. Flags won't be used for these subtypes. Another subtype will be RESV_MEM_T_IDENTITY (see my reply to Kevin). IDENTITY regions must be identity-mapped by the guest and, as I understand it, are virtual addresses used for DMA by firmware. The flags field will then contain read/write protection flags. > 2.6.8.2.1 > Multiple overlapping RESV_MEM properties are merged together. Device > requirement? if same types I assume? Combination rules apply to device and driver. When the driver puts multiple endpoints in the same domain, combination rules apply. They will become important once the guest attempts to do things like combining endpoints with different PASID capabilities in the same domain. I'm considering these endpoints to be behind different physical IOMMUs. For reserved regions, we specify what the driver should do and what the device should enforce with regard to mapping IOVAs of overlapping regions. For example, if a device has some virtual address RESV_T_MSI and an other device has the same virtual address RESV_T_IDENTITY, what should the driver do? I think it should apply the RESV_T_IDENTITY. RESV_T_MSI is just a special case of RESV_T_RESERVED, it's a hint for the IRQ subsystem and doesn't have a meaning within a domain. From DMA mappings point of view, it is effectively the same as RESV_T_RESERVED. When merging RESV_T_RESERVED and RESV_T_IDENTITY, we should make it RESV_T_IDENTITY. Because it is required for one endpoint to work (the one with IDENTITY) and is not accessed by the other (the driver must not allocate this IOVA range for DMA). More generally, I'd like to add the following combination table to the spec, that shows the resulting reserved region type after merging regions from two endpoints. N: no reserved region, R: RESV_T_RESERVED, M: RESV_T_MSI, I: RESV_T_IDENTITY. It is symmetric so I didn't fill the bottom half. | N | R | M | I -- N | N | R | M | ? -- R | | R | R | I -- M | | | M | I -- I | | | | I The 'I' column is problematic. If one endpoint has an IDENTITY region and the other doesn't have anything at that virtual address, then making that region an identity mapping will result in the second endpoint being able to access firmware memory. If using nested translation, stage-2 can help us here. But in general we should allow the device to reject an attach that would result in a N+I combination if the host considers it too dodgy. I think the other combinations are safe enough. > what if the device is a physical assigned device. does the device report > the host doorbell or the guest doorbell, may be worth to clarify. Indeed. For a physical assigned device it is the guest doorbell as it corresponds to the virtual ITS or APIC (though with the APIC it's the same region). > 3.1.1 last paragraph > you talk about device ID but this is a terminology only known by ARM I > think. Actually it is introduced later in 3.1.2. Indeed, I'll move the explanation up. > 3.1.2 > About ACPI integration. Isn't IORT ARM specific? IORT is currently edited by ARM, as is virtio-iommu. Their contents are not ARM specific however, they describe software interfaces that solve mostly generic problems, and aim to be open. IORT describes the relation between endpoints and their translation agents. This problem has already been solved by IVRS, DMAR and IORT. From
Re: [virtio-dev] [RFC] virtio-iommu version 0.4
Hi Jean-Philippe, On 04/08/2017 20:19, Jean-Philippe Brucker wrote: > This is the continuation of my proposal for virtio-iommu, the para- > virtualized IOMMU. Here is a summary of the changes since last time [1]: > > * The virtio-iommu document now resembles an actual specification. It is > split into a formal description of the virtio device, and implementation > notes. Please find sources and binaries at [2]. > > * Added a probe request to describe to the guest different properties that > do not fit in firmware or in the virtio config space. This is a > necessary stepping stone for extending the virtio-iommu. > > * There is a working Qemu prototype [3], thanks to Eric Auger and Bharat > Bhushan. > > You can find the Linux driver and kvmtool device at [4] and [5]. I > plan to rework driver and kvmtool device slightly before sending the > patches. > > To understand the virtio-iommu, I advise to first read introduction and > motivation, then skim through implementation notes and finally look at the > device specification. > > I wasn't sure how to organize the review. For those who prefer to comment > inline, I attached v0.4 of device-operations.tex and topology.tex+MSI.tex > to this thread. They are the biggest chunks of the document. But LaTeX > isn't very pleasant to read, so you can simply send a list of comments in > relation to section numbers and a few words of context, we'll manage. Please find some comments/questions below: 2.6.7:1 I do not understand the footnode #6 sentence: 'Without a specific definition of the ACK requirements for a given property type, it simply means that the driver read all fields of that property." 2.6.7:2 what if the driver has not provided a big enough buffer and the device cannot report all supported properties? 2.6.8.2: Couldn't we use the same terminology as iommu_resv_type in iommu.h? the negative form is not the easiest to understand for me. Why F_MSI? 2.6.8.2.1 Multiple overlapping RESV_MEM properties are merged together. Device requirement? if same types I assume? what if the device is a physical assigned device. does the device report the host doorbell or the guest doorbell, may be worth to clarify. 3.1.1 last paragraph you talk about device ID but this is a terminology only known by ARM I think. Actually it is introduced later in 3.1.2. 3.1.2 About ACPI integration. Isn't IORT ARM specific? Will other architectures use that table? In IORT we also have Named Component node which look pretty the same. Could you elaborate of the difference? Device Object name: isn't it the path to the LNR0005 in the namespace? Thanks Eric > > --- > Version numbers 0.1-0.4 are arbitrary. I'm hoping they allow to compare > more easily differences since the RFC (see [6]), but haven't been made > public so far. This is the first public posting since initial proposal > [1], and the following describes all changes. > > ## v0.1 ## > > Content is the same as the RFC, but formatted to LaTeX. 'make' generates > one PDF and one HTML document. > > ## v0.2 ## > > Add introductions, improve topology example and firmware description based > on feedback and a number of useful discussions. > > ## v0.3 ## > > Add normative sections (MUST, SHOULD, etc). Clarify some things, tighten > the device and driver behaviour. Unmap semantics are consolidated; they > are now closer to VFIO Type1 v2 semantics. > > ## v0.4 ## > > Introduce PROBE requests. They provide per-endpoint information to the > driver that couldn't be described otherwise. > > For the moment, they allow to handle MSIs on x86 virtual platforms (see > 3.2). To do that we communicate reserved IOVA regions, that will also be > useful for describing regions that cannot be mapped for a given endpoint, > for instance addresses that correspond to a PCI bridge window. > > Introducing such a large framework for this tiny feature may seem > overkill, but it is needed for future extensions of the virtio-iommu and I > believe it really is worth the effort. > > ## Future ## > > Other extensions are in preparation. I won't detail them here because v0.4 > already is a lot to digest, but in short, building on top of PROBE: > > * First, since the IOMMU is paravirtualized, the device can expose some > properties of the physical topology to the guest, and let it allocate > resources more efficiently. For example, when the virtio-iommu manages > both physical and emulated endpoints, with different underlying IOMMUs, > we now have a way to describe multiple page and block granularities, > instead of forcing the guest to use the most restricted one for all > endpoints. This will most likely be in v0.5. > > * Then on top of that, a major improvement will describe hardware > acceleration features available to the guest. There is what I call "Page > Table Handover" (or simply, from the host POV, "Nested"), the ability > for the guest to manipulate its own page tables instead of sending > MAP/UNMAP requests to
RE: [virtio-dev] [RFC] virtio-iommu version 0.4
Hi, > -Original Message- > From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com] > Sent: Thursday, August 17, 2017 3:42 PM > To: Adam Tao > Cc: iommu@lists.linux-foundation.org; k...@vger.kernel.org; > virtualizat...@lists.linux-foundation.org; virtio-...@lists.oasis-open.org; > will.dea...@arm.com; robin.mur...@arm.com; lorenzo.pieral...@arm.com; > m...@redhat.com; jasow...@redhat.com; marc.zyng...@arm.com; > eric.au...@redhat.com; eric.auger@gmail.com; Bharat Bhushan > ; pet...@redhat.com; kevin.t...@intel.com > Subject: Re: [virtio-dev] [RFC] virtio-iommu version 0.4 > > Hi Adam, > > On 16/08/17 05:08, Adam Tao wrote: > >> * There is a working Qemu prototype [3], thanks to Eric Auger and Bharat > >> Bhushan. > > Hi, Brucker > > I read the related spec for virtio IOMMU, I am wondering if we support > > both the virtual and physical devices in the guest to use the virtio > > IOMMU, how can we config the related address translation for the > > different type of devices. > > e.g. > > 1. for Virtio devs(e.g. virtio-net), we can change the memmap table > > used by the vhost device. > > 2. for physical devices we can directly config the IOMMU/SMMU on the > > host. > > > > So do you know are there any RFCs for the back-end implementation for > > the virtio-IOMMU thanks Adam > > Eric's series [3] adds support for virtual devices (I'v only had time to test > virtio-net so far, but as vhost-iotlb seems to be supported by Qemu, it should > work with vhost-net as well). Bharat is working on the VFIO backend for > physical devices. As far as I'm aware, his latest version is: > > [7] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg04094.html I am working on newer version which will be based on Eric-v3. Actually I wanted to include the fix for the issue reported by Eric, it fails if we assign more than one interface. I reproduced the issues on my side and know the root cause. I am working on fixing this. Thanks -Bharat > > Thanks, > Jean > > > >> [1] http://www.spinics.net/lists/kvm/msg147990.html > >> [2] git://linux-arm.org/virtio-iommu.git branch viommu/v0.4 > >> http://www.linux-arm.org/git?p=virtio- > iommu.git;a=blob;f=dist/v0.4/virtio-iommu-v0.4.pdf > >> I reiterate the disclaimers: don't use this document as a reference, > >> it's a draft. It's also not an OASIS document yet. It may be riddled > >> with mistakes. As this is a working draft, it is unstable and I do not > >> guarantee backward compatibility of future versions. > >> [3] https://lists.gnu.org/archive/html/qemu-arm/2017-08/msg4.html > >> [4] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.4 > >> Warning: UAPI headers have changed! They didn't follow the spec, > >> please update. (Use branch v0.1, that has the old headers, for the > >> Qemu prototype [3]) > >> [5] git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.4 > >> Warning: command-line has changed! Use --viommu vfio[,opts] and > >> --viommu virtio[,opts] to instantiate a device. > >> [6] > >> http://www.linux-arm.org/git?p=virtio-iommu.git;a=tree;f=dist/diffs ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [virtio-dev] [RFC] virtio-iommu version 0.4
Hi Adam, On 16/08/17 05:08, Adam Tao wrote: >> * There is a working Qemu prototype [3], thanks to Eric Auger and Bharat >> Bhushan. > Hi, Brucker > I read the related spec for virtio IOMMU, > I am wondering if we support both the virtual and physical devices in > the guest to use the virtio IOMMU, how can we config the related address > translation for the different type of devices. > e.g. > 1. for Virtio devs(e.g. virtio-net), we can change the memmap table used > by the vhost device. > 2. for physical devices we can directly config the IOMMU/SMMU on the > host. > > So do you know are there any RFCs for the back-end implementation for the > virtio-IOMMU > thanks > Adam Eric's series [3] adds support for virtual devices (I'v only had time to test virtio-net so far, but as vhost-iotlb seems to be supported by Qemu, it should work with vhost-net as well). Bharat is working on the VFIO backend for physical devices. As far as I'm aware, his latest version is: [7] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg04094.html Thanks, Jean >> [1] http://www.spinics.net/lists/kvm/msg147990.html >> [2] git://linux-arm.org/virtio-iommu.git branch viommu/v0.4 >> >> http://www.linux-arm.org/git?p=virtio-iommu.git;a=blob;f=dist/v0.4/virtio-iommu-v0.4.pdf >> I reiterate the disclaimers: don't use this document as a reference, >> it's a draft. It's also not an OASIS document yet. It may be riddled >> with mistakes. As this is a working draft, it is unstable and I do not >> guarantee backward compatibility of future versions. >> [3] https://lists.gnu.org/archive/html/qemu-arm/2017-08/msg4.html >> [4] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.4 >> Warning: UAPI headers have changed! They didn't follow the spec, >> please update. (Use branch v0.1, that has the old headers, for the >> Qemu prototype [3]) >> [5] git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.4 >> Warning: command-line has changed! Use --viommu vfio[,opts] and >> --viommu virtio[,opts] to instantiate a device. >> [6] http://www.linux-arm.org/git?p=virtio-iommu.git;a=tree;f=dist/diffs ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [virtio-dev] [RFC] virtio-iommu version 0.4
On Fri, Aug 04, 2017 at 07:19:25PM +0100, Jean-Philippe Brucker wrote: > This is the continuation of my proposal for virtio-iommu, the para- > virtualized IOMMU. Here is a summary of the changes since last time [1]: > > * The virtio-iommu document now resembles an actual specification. It is > split into a formal description of the virtio device, and implementation > notes. Please find sources and binaries at [2]. > > * Added a probe request to describe to the guest different properties that > do not fit in firmware or in the virtio config space. This is a > necessary stepping stone for extending the virtio-iommu. > > * There is a working Qemu prototype [3], thanks to Eric Auger and Bharat > Bhushan. Hi, Brucker I read the related spec for virtio IOMMU, I am wondering if we support both the virtual and physical devices in the guest to use the virtio IOMMU, how can we config the related address translation for the different type of devices. e.g. 1. for Virtio devs(e.g. virtio-net), we can change the memmap table used by the vhost device. 2. for physical devices we can directly config the IOMMU/SMMU on the host. So do you know are there any RFCs for the back-end implementation for the virtio-IOMMU thanks Adam > > You can find the Linux driver and kvmtool device at [4] and [5]. I > plan to rework driver and kvmtool device slightly before sending the > patches. > > To understand the virtio-iommu, I advise to first read introduction and > motivation, then skim through implementation notes and finally look at the > device specification. > > I wasn't sure how to organize the review. For those who prefer to comment > inline, I attached v0.4 of device-operations.tex and topology.tex+MSI.tex > to this thread. They are the biggest chunks of the document. But LaTeX > isn't very pleasant to read, so you can simply send a list of comments in > relation to section numbers and a few words of context, we'll manage. > > --- > Version numbers 0.1-0.4 are arbitrary. I'm hoping they allow to compare > more easily differences since the RFC (see [6]), but haven't been made > public so far. This is the first public posting since initial proposal > [1], and the following describes all changes. > > ## v0.1 ## > > Content is the same as the RFC, but formatted to LaTeX. 'make' generates > one PDF and one HTML document. > > ## v0.2 ## > > Add introductions, improve topology example and firmware description based > on feedback and a number of useful discussions. > > ## v0.3 ## > > Add normative sections (MUST, SHOULD, etc). Clarify some things, tighten > the device and driver behaviour. Unmap semantics are consolidated; they > are now closer to VFIO Type1 v2 semantics. > > ## v0.4 ## > > Introduce PROBE requests. They provide per-endpoint information to the > driver that couldn't be described otherwise. > > For the moment, they allow to handle MSIs on x86 virtual platforms (see > 3.2). To do that we communicate reserved IOVA regions, that will also be > useful for describing regions that cannot be mapped for a given endpoint, > for instance addresses that correspond to a PCI bridge window. > > Introducing such a large framework for this tiny feature may seem > overkill, but it is needed for future extensions of the virtio-iommu and I > believe it really is worth the effort. > > ## Future ## > > Other extensions are in preparation. I won't detail them here because v0.4 > already is a lot to digest, but in short, building on top of PROBE: > > * First, since the IOMMU is paravirtualized, the device can expose some > properties of the physical topology to the guest, and let it allocate > resources more efficiently. For example, when the virtio-iommu manages > both physical and emulated endpoints, with different underlying IOMMUs, > we now have a way to describe multiple page and block granularities, > instead of forcing the guest to use the most restricted one for all > endpoints. This will most likely be in v0.5. > > * Then on top of that, a major improvement will describe hardware > acceleration features available to the guest. There is what I call "Page > Table Handover" (or simply, from the host POV, "Nested"), the ability > for the guest to manipulate its own page tables instead of sending > MAP/UNMAP requests to the host. This, along with IO Page Fault > reporting, will also permit SVM virtualization on different platforms. > > Thanks, > Jean > > [1] http://www.spinics.net/lists/kvm/msg147990.html > [2] git://linux-arm.org/virtio-iommu.git branch viommu/v0.4 > > http://www.linux-arm.org/git?p=virtio-iommu.git;a=blob;f=dist/v0.4/virtio-iommu-v0.4.pdf > I reiterate the disclaimers: don't use this document as a reference, > it's a draft. It's also not an OASIS document yet. It may be riddled > with mistakes. As this is a working draft, it is unstable and I do not > guarantee backward compatibility of future versions. > [3] https://lists.gnu.org/archive/html/qe