Re: [PATCH 03/10] vfio: add external user support
On 07/23/2013 12:23 PM, Alex Williamson wrote: On Tue, 2013-07-16 at 10:53 +1000, Alexey Kardashevskiy wrote: VFIO is designed to be used via ioctls on file descriptors returned by VFIO. However in some situations support for an external user is required. The first user is KVM on PPC64 (SPAPR TCE protocol) which is going to use the existing VFIO groups for exclusive access in real/virtual mode on a host to avoid passing map/unmap requests to the user space which would made things pretty slow. The protocol includes: 1. do normal VFIO init operation: - opening a new container; - attaching group(s) to it; - setting an IOMMU driver for a container. When IOMMU is set for a container, all groups in it are considered ready to use by an external user. 2. User space passes a group fd to an external user. The external user calls vfio_group_get_external_user() to verify that: - the group is initialized; - IOMMU is set for it. If both checks passed, vfio_group_get_external_user() increments the container user counter to prevent the VFIO group from disposal before KVM exits. 3. The external user calls vfio_external_user_iommu_id() to know an IOMMU ID. PPC64 KVM uses it to link logical bus number (LIOBN) with IOMMU ID. 4. When the external KVM finishes, it calls vfio_group_put_external_user() to release the VFIO group. This call decrements the container user counter. Everything gets released. The vfio: Limit group opens patch is also required for the consistency. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru This looks fine to me. Is the plan to add this through the ppc tree again? Thanks, Nope, better to add this through your tree. And faster for sure :) Thanks! -- Alexey -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: Wood Scott-B07421 Sent: Thursday, July 18, 2013 10:48 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Bhushan Bharat- R65777 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal RAM and the mapping is set to M bit (coherent, cacheable) otherwise this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: some cleanup and added comment - arch/powerpc/kvm/e500_mmu_host.c | 23 ++- 1 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..02eb973 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + /* +* RAM is always mappable on e500 systems, so this is identical +* to kvm_is_mmio_pfn(), just without its overhead. +*/ + if (!pfn_valid(pfn)) { Please use page_is_ram(), which is what gets used when setting the WIMG for the host userspace mapping. We want to make sure the two are consistent. + /* Pages not managed by Kernel are treated as I/O, set I + G */ + mas2_attr |= MAS2_I | MAS2_G; #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + } else { + /* Kernel managed pages are actually RAM so set M */ + mas2_attr |= MAS2_M; #endif Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings. Scott, can you please point to the code where MAS2_M is always set for TLB0? -Bharat The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32- bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case). As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require. -Scott -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [v1][PATCH 1/1] powerpc/ppc64: rename SOFT_DISABLE_INTS with RECONCILE_IRQ_STATE
On Tue, Jul 16, 2013 at 11:09:30AM +0800, Tiejun Chen wrote: The SOFT_DISABLE_INTS seems an odd name for something that updates the software state to be consistent with interrupts being hard disabled, so rename SOFT_DISABLE_INTS with RECONCILE_IRQ_STATE to avoid this confusion. Yes! I have a similar but unfinished patch locally. cheers -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/22/2013 10:39:16 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Tuesday, July 23, 2013 12:18 AM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/21/2013 11:39:45 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Thursday, July 18, 2013 11:09 PM To: Alexander Graf Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 12:32:18 PM, Alexander Graf wrote: On 18.07.2013, at 19:17, Scott Wood wrote: On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote: Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings. The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32-bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case). As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require. How about we always set M then for RAM? M is like I in that bad things happen if you mix them. I am trying to list the invalid mixing of WIMG: 1) I M 2) W I 3) W M (Scott mentioned that he observed issues when mixing these two) 4) is there any other? That's not what I was talking about (and I don't think I mentioned W at all, though it is also potentially problematic). Here is cut paste of your one response: The architecture makes it illegal to mix cacheable and cache-inhibited mappings to the same physical page. Mixing W or M bits is generally bad as well. I've seen it cause machine checks, error interrupts, etc. -- not just corrupting the page in question. So I added not mixing W M. But at that time I missed to understood why mixing M I for same physical address can be issue :). W or M, not W and M. I meant that each one, separately, is in a similar situation as the I bit. None of this is about invalid combinations of attributes on a single TLB entry (though there are architectural restrictions there as well). -Scott -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: Wood Scott-B07421 Sent: Tuesday, July 23, 2013 10:15 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/22/2013 10:39:16 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Tuesday, July 23, 2013 12:18 AM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/21/2013 11:39:45 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Thursday, July 18, 2013 11:09 PM To: Alexander Graf Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 12:32:18 PM, Alexander Graf wrote: On 18.07.2013, at 19:17, Scott Wood wrote: On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote: Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings. The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32-bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case). As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require. How about we always set M then for RAM? M is like I in that bad things happen if you mix them. I am trying to list the invalid mixing of WIMG: 1) I M 2) W I 3) W M (Scott mentioned that he observed issues when mixing these two) 4) is there any other? That's not what I was talking about (and I don't think I mentioned W at all, though it is also potentially problematic). Here is cut paste of your one response: The architecture makes it illegal to mix cacheable and cache-inhibited mappings to the same physical page. Mixing W or M bits is generally bad as well. I've seen it cause machine checks, error interrupts, etc. -- not just corrupting the page in question. So I added not mixing W M. But at that time I missed to understood why mixing M I for same physical address can be issue :). W or M, not W and M. I meant that each one, separately, is in a similar situation as the I bit. None of this is about invalid combinations of attributes on a single TLB entry (though there are architectural restrictions there as well). Ok, I misread again :(. The second part of comment was (looks like you missed so copy pasted below) When we say all RAM (page_is_ram() is true) will be having M bit, then same RAM physical address will not have M mixed with any other, right? Similarly, For IO (which is not RAM), we will set I+G, so I will not be mixed with M. Is not that? -Bharat -Scott -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/23/2013 11:50:35 AM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Tuesday, July 23, 2013 10:15 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/22/2013 10:39:16 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Tuesday, July 23, 2013 12:18 AM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; Alexander Graf; kvm-ppc@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/21/2013 11:39:45 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Thursday, July 18, 2013 11:09 PM To: Alexander Graf Cc: Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 12:32:18 PM, Alexander Graf wrote: On 18.07.2013, at 19:17, Scott Wood wrote: On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote: Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings. The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32-bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case). As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require. How about we always set M then for RAM? M is like I in that bad things happen if you mix them. I am trying to list the invalid mixing of WIMG: 1) I M 2) W I 3) W M (Scott mentioned that he observed issues when mixing these two) 4) is there any other? That's not what I was talking about (and I don't think I mentioned W at all, though it is also potentially problematic). Here is cut paste of your one response: The architecture makes it illegal to mix cacheable and cache-inhibited mappings to the same physical page. Mixing W or M bits is generally bad as well. I've seen it cause machine checks, error interrupts, etc. -- not just corrupting the page in question. So I added not mixing W M. But at that time I missed to understood why mixing M I for same physical address can be issue :). W or M, not W and M. I meant that each one, separately, is in a similar situation as the I bit. None of this is about invalid combinations of attributes on a single TLB entry (though there are architectural restrictions there as well). Ok, I misread again :(. The second part of comment was (looks like you missed so copy pasted below) When we say all RAM (page_is_ram() is true) will be having M bit, then same RAM physical address will not have M mixed with any other, right? Similarly, For IO (which is not RAM), we will set I+G, so I will not be mixed with M. Is not that? I didn't miss it; it just seemed moot given the earlier confusion. But yes, for now we will set all RAM to M, and all I/O to I+G. Eventually that will change if/when we do vfio for QMan portals or other devices that require cacheable I/O. -Scott -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 06:27 PM, Alexander Graf wrote: On 18.07.2013, at 12:19, “tiejun.chen” wrote: On 07/18/2013 06:12 PM, Alexander Graf wrote: On 18.07.2013, at 12:08, “tiejun.chen” wrote: On 07/18/2013 05:48 PM, Alexander Graf wrote: On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check for pages that are declared reserved on the host. This happens in 2 cases: 1) Non cache coherent DMA 2) Memory hot remove The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by: depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON default n if PPC_47x default y so we never hit it with any core we care about ;). Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either. Thanks for this good information :) So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside kvm_is_mmio_pfn() to make sure that check is only valid when that is really needed? This can decrease those unnecessary performance loss. If I'm wrong please correct me :) You're perfectly right, but this is generic KVM code. So it gets run across all