Re: [PATCH v2 2/4] eventfd: simplify eventfd_signal()
On Wed, Nov 22, 2023 at 1:49 PM Christian Brauner wrote: > > Ever since the evenfd type was introduced back in 2007 in commit > e1ad7468c77d ("signal/timer/event: eventfd core") the eventfd_signal() > function only ever passed 1 as a value for @n. There's no point in > keeping that additional argument. > > Signed-off-by: Christian Brauner > --- > arch/x86/kvm/hyperv.c | 2 +- > arch/x86/kvm/xen.c| 2 +- > virt/kvm/eventfd.c| 4 ++-- > 30 files changed, 60 insertions(+), 63 deletions(-) For KVM: Acked-by: Paolo Bonzini
Re: [Intel-gfx] [PATCH v2 1/5] KVM: do not allow mapping valid but non-refcounted pages
On 25/06/21 09:58, Christian Borntraeger wrote: On 25.06.21 09:36, David Stevens wrote: From: Nicholas Piggin It's possible to create a region which maps valid but non-refcounted pages (e.g., tail pages of non-compound higher order allocations). These host pages can then be returned by gfn_to_page, gfn_to_pfn, etc., family of APIs, which take a reference to the page, which takes it from 0 to 1. When the reference is dropped, this will free the page incorrectly. Fix this by only taking a reference on the page if it was non-zero, which indicates it is participating in normal refcounting (and can be released with put_page). Signed-off-by: Nicholas Piggin I guess this would be the small fix for stable? Do we want to add that cc? Reviewed-by: Christian Borntraeger Yes, this one is going to Linus today. The rest is for 5.15. Paolo --- virt/kvm/kvm_main.c | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 3dcc2abbfc60..f7445c3bcd90 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -2175,6 +2175,13 @@ static bool vma_is_valid(struct vm_area_struct *vma, bool write_fault) return true; } +static int kvm_try_get_pfn(kvm_pfn_t pfn) +{ + if (kvm_is_reserved_pfn(pfn)) + return 1; + return get_page_unless_zero(pfn_to_page(pfn)); +} + static int hva_to_pfn_remapped(struct vm_area_struct *vma, unsigned long addr, bool *async, bool write_fault, bool *writable, @@ -2224,13 +2231,21 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma, * Whoever called remap_pfn_range is also going to call e.g. * unmap_mapping_range before the underlying pages are freed, * causing a call to our MMU notifier. + * + * Certain IO or PFNMAP mappings can be backed with valid + * struct pages, but be allocated without refcounting e.g., + * tail pages of non-compound higher order allocations, which + * would then underflow the refcount when the caller does the + * required put_page. Don't allow those pages here. */ - kvm_get_pfn(pfn); + if (!kvm_try_get_pfn(pfn)) + r = -EFAULT; out: pte_unmap_unlock(ptep, ptl); *p_pfn = pfn; - return 0; + + return r; } /* ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/6] KVM: Remove uses of struct page from x86 and arm64 MMU
On 24/06/21 14:57, Nicholas Piggin wrote: KVM: Fix page ref underflow for regions with valid but non-refcounted pages It doesn't really fix the underflow, it disallows mapping them in the first place. Since in principle things can break, I'd rather be explicit, so let's go with "KVM: do not allow mapping valid but non-reference-counted pages". It's possible to create a region which maps valid but non-refcounted pages (e.g., tail pages of non-compound higher order allocations). These host pages can then be returned by gfn_to_page, gfn_to_pfn, etc., family of APIs, which take a reference to the page, which takes it from 0 to 1. When the reference is dropped, this will free the page incorrectly. Fix this by only taking a reference on the page if it was non-zero, s/on the page/on valid pages/ (makes clear that invalid pages are fine without refcounting). Thank you *so* much, I'm awful at Linux mm. Paolo which indicates it is participating in normal refcounting (and can be released with put_page). Signed-off-by: Nicholas Piggin ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/6] KVM: Remove uses of struct page from x86 and arm64 MMU
On 24/06/21 13:42, Nicholas Piggin wrote: Excerpts from Nicholas Piggin's message of June 24, 2021 8:34 pm: Excerpts from David Stevens's message of June 24, 2021 1:57 pm: KVM supports mapping VM_IO and VM_PFNMAP memory into the guest by using follow_pte in gfn_to_pfn. However, the resolved pfns may not have assoicated struct pages, so they should not be passed to pfn_to_page. This series removes such calls from the x86 and arm64 secondary MMU. To do this, this series modifies gfn_to_pfn to return a struct page in addition to a pfn, if the hva was resolved by gup. This allows the caller to call put_page only when necessated by gup. This series provides a helper function that unwraps the new return type of gfn_to_pfn to provide behavior identical to the old behavior. As I have no hardware to test powerpc/mips changes, the function is used there for minimally invasive changes. Additionally, as gfn_to_page and gfn_to_pfn_cache are not integrated with mmu notifier, they cannot be easily changed over to only use pfns. This addresses CVE-2021-22543 on x86 and arm64. Does this fix the problem? (untested I don't have a POC setup at hand, but at least in concept) This one actually compiles at least. Unfortunately I don't have much time in the near future to test, and I only just found out about this CVE a few hours ago. And it also works (the reproducer gets an infinite stream of userspace exits and especially does not crash). We can still go for David's solution later since MMU notifiers are able to deal with this pages, but it's a very nice patch for stable kernels. If you provide a Signed-off-by, I can integrate it. Paolo --- It's possible to create a region which maps valid but non-refcounted pages (e.g., tail pages of non-compound higher order allocations). These host pages can then be returned by gfn_to_page, gfn_to_pfn, etc., family of APIs, which take a reference to the page, which takes it from 0 to 1. When the reference is dropped, this will free the page incorrectly. Fix this by only taking a reference on the page if it was non-zero, which indicates it is participating in normal refcounting (and can be released with put_page). --- virt/kvm/kvm_main.c | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 6a6bc7af0e28..46fb042837d2 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -2055,6 +2055,13 @@ static bool vma_is_valid(struct vm_area_struct *vma, bool write_fault) return true; } +static int kvm_try_get_pfn(kvm_pfn_t pfn) +{ + if (kvm_is_reserved_pfn(pfn)) + return 1; + return get_page_unless_zero(pfn_to_page(pfn)); +} + static int hva_to_pfn_remapped(struct vm_area_struct *vma, unsigned long addr, bool *async, bool write_fault, bool *writable, @@ -2104,13 +2111,21 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma, * Whoever called remap_pfn_range is also going to call e.g. * unmap_mapping_range before the underlying pages are freed, * causing a call to our MMU notifier. +* +* Certain IO or PFNMAP mappings can be backed with valid +* struct pages, but be allocated without refcounting e.g., +* tail pages of non-compound higher order allocations, which +* would then underflow the refcount when the caller does the +* required put_page. Don't allow those pages here. */ - kvm_get_pfn(pfn); + if (!kvm_try_get_pfn(pfn)) + r = -EFAULT; out: pte_unmap_unlock(ptep, ptl); *p_pfn = pfn; - return 0; + + return r; } /* ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/6] KVM: Remove uses of struct page from x86 and arm64 MMU
On 24/06/21 13:42, Nicholas Piggin wrote: +static int kvm_try_get_pfn(kvm_pfn_t pfn) +{ + if (kvm_is_reserved_pfn(pfn)) + return 1; So !pfn_valid would always return true. Yeah, this should work and is certainly appealing! Paolo + return get_page_unless_zero(pfn_to_page(pfn)); +} + static int hva_to_pfn_remapped(struct vm_area_struct *vma, unsigned long addr, bool *async, bool write_fault, bool *writable, @@ -2104,13 +2111,21 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma, * Whoever called remap_pfn_range is also going to call e.g. * unmap_mapping_range before the underlying pages are freed, * causing a call to our MMU notifier. +* +* Certain IO or PFNMAP mappings can be backed with valid +* struct pages, but be allocated without refcounting e.g., +* tail pages of non-compound higher order allocations, which +* would then underflow the refcount when the caller does the +* required put_page. Don't allow those pages here. */ - kvm_get_pfn(pfn); + if (!kvm_try_get_pfn(pfn)) + r = -EFAULT; out: pte_unmap_unlock(ptep, ptl); *p_pfn = pfn; - return 0; + + return r; } /* ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/6] KVM: mmu: also return page from gfn_to_pfn
On 24/06/21 12:17, Nicholas Piggin wrote: If all callers were updated that is one thing, but from the changelog it sounds like that would not happen and there would be some gfn_to_pfn users left over. But yes in the end you would either need to make gfn_to_pfn never return a page found via follow_pte, or change all callers to the new way. If the plan is for the latter then I guess that's fine. Actually in that case anyway I don't see the need -- the existence of gfn_to_pfn is enough to know it might be buggy. It can just as easily be grepped for as kvm_pfn_page_unwrap. Sure, but that would leave us with longer function names (gfn_to_pfn_page* instead of gfn_to_pfn*). So the "safe" use is the one that looks worse and the unsafe use is the one that looks safe. And are gfn_to_page cases also vulernable to the same issue? No, they're just broken for the VM_IO|VM_PFNMAP case. Paolo So I think it could be marked deprecated or something if not everything will be converted in the one series, and don't need to touch all that arch code with this patch. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/6] KVM: x86/mmu: avoid struct page in MMU
On 24/06/21 12:06, Marc Zyngier wrote: On Thu, 24 Jun 2021 09:58:00 +0100, Nicholas Piggin wrote: Excerpts from David Stevens's message of June 24, 2021 1:57 pm: From: David Stevens out_unlock: if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa)) read_unlock(&vcpu->kvm->mmu_lock); else write_unlock(&vcpu->kvm->mmu_lock); - kvm_release_pfn_clean(pfn); + if (pfnpg.page) + put_page(pfnpg.page); return r; } How about kvm_release_pfn_page_clean(pfnpg); I'm not sure. I always found kvm_release_pfn_clean() ugly, because it doesn't mark the page 'clean'. I find put_page() more correct. Something like 'kvm_put_pfn_page()' would make more sense, but I'm so bad at naming things that I could just as well call it 'bob()'. The best way to go would be to get rid of kvm_release_pfn_clean() and always go through a pfn_page. Then we could or could not introduce wrappers kvm_put_pfn_page{,_dirty}. I think for now it's best to limit the churn since these patches will go in the stable releases too, and clean up the resulting API once we have a clear idea of how all architectures are using kvm_pfn_page. Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/6] KVM: mmu: also return page from gfn_to_pfn
On 24/06/21 11:57, Nicholas Piggin wrote: Needing kvm_pfn_page_unwrap is a sign that something might be buggy, so it's a good idea to move the short name to the common case and the ugly kvm_pfn_page_unwrap(gfn_to_pfn(...)) for the weird one. In fact I'm not sure there should be any kvm_pfn_page_unwrap in the end. If all callers were updated that is one thing, but from the changelog it sounds like that would not happen and there would be some gfn_to_pfn users left over. In this patches there are, so yeah the plan is to always change the callers to the new way. But yes in the end you would either need to make gfn_to_pfn never return a page found via follow_pte, or change all callers to the new way. If the plan is for the latter then I guess that's fine. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/6] KVM: x86/mmu: release audited pfns
On 24/06/21 10:43, Nicholas Piggin wrote: Excerpts from David Stevens's message of June 24, 2021 1:57 pm: From: David Stevens Changelog? This looks like a bug, should it have a Fixes: tag? Probably has been there forever... The best way to fix the bug would be to nuke mmu_audit.c, which I've threatened to do many times but never followed up on. Paolo Thanks, Nick Signed-off-by: David Stevens --- arch/x86/kvm/mmu/mmu_audit.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/kvm/mmu/mmu_audit.c b/arch/x86/kvm/mmu/mmu_audit.c index cedc17b2f60e..97ff184084b4 100644 --- a/arch/x86/kvm/mmu/mmu_audit.c +++ b/arch/x86/kvm/mmu/mmu_audit.c @@ -121,6 +121,8 @@ static void audit_mappings(struct kvm_vcpu *vcpu, u64 *sptep, int level) audit_printk(vcpu->kvm, "levels %d pfn %llx hpa %llx " "ent %llxn", vcpu->arch.mmu->root_level, pfn, hpa, *sptep); + + kvm_release_pfn_clean(pfn); } static void inspect_spte_has_rmap(struct kvm *kvm, u64 *sptep) -- 2.32.0.93.g670b81a890-goog ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/6] KVM: mmu: also return page from gfn_to_pfn
On 24/06/21 10:52, Nicholas Piggin wrote: For now, wrap all calls to gfn_to_pfn functions in the new helper function. Callers which don't need the page struct will be updated in follow-up patches. Hmm. You mean callers that do need the page will be updated? Normally if there will be leftover users that don't need the struct page then you would go the other way and keep the old call the same, and add a new one (gfn_to_pfn_page) just for those that need it. Needing kvm_pfn_page_unwrap is a sign that something might be buggy, so it's a good idea to move the short name to the common case and the ugly kvm_pfn_page_unwrap(gfn_to_pfn(...)) for the weird one. In fact I'm not sure there should be any kvm_pfn_page_unwrap in the end. Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/6] KVM: x86/mmu: avoid struct page in MMU
On 24/06/21 05:57, David Stevens wrote: static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, - int map_writable, int max_level, kvm_pfn_t pfn, + int map_writable, int max_level, + const struct kvm_pfn_page *pfnpg, bool prefault, bool is_tdp) { bool nx_huge_page_workaround_enabled = is_nx_huge_pa So the main difference with my tentative patches is that here I was passing the struct by value. I'll try to clean this up for 5.15, but for now it's all good like this. Thanks again! Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/6] KVM: Remove uses of struct page from x86 and arm64 MMU
On 24/06/21 05:57, David Stevens wrote: KVM supports mapping VM_IO and VM_PFNMAP memory into the guest by using follow_pte in gfn_to_pfn. However, the resolved pfns may not have assoicated struct pages, so they should not be passed to pfn_to_page. This series removes such calls from the x86 and arm64 secondary MMU. To do this, this series modifies gfn_to_pfn to return a struct page in addition to a pfn, if the hva was resolved by gup. This allows the caller to call put_page only when necessated by gup. This series provides a helper function that unwraps the new return type of gfn_to_pfn to provide behavior identical to the old behavior. As I have no hardware to test powerpc/mips changes, the function is used there for minimally invasive changes. Additionally, as gfn_to_page and gfn_to_pfn_cache are not integrated with mmu notifier, they cannot be easily changed over to only use pfns. This addresses CVE-2021-22543 on x86 and arm64. Thank you very much for this. I agree that it makes sense to have a minimal change; I had similar changes almost ready, but was stuck with deadlocks in the gfn_to_pfn_cache case. In retrospect I should have posted something similar to your patches. I have started reviewing the patches, and they look good. I will try to include them in 5.13. Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [FYI PATCH] i915: kvmgt: the KVM mmu_lock is now an rwlock
Adjust the KVMGT page tracking callbacks. Cc: Zhenyu Wang Cc: Zhi Wang Cc: intel-gvt-...@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Signed-off-by: Paolo Bonzini --- drivers/gpu/drm/i915/gvt/kvmgt.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 60f1a386dd06..b4348256ae95 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1703,7 +1703,7 @@ static int kvmgt_page_track_add(unsigned long handle, u64 gfn) return -EINVAL; } - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); if (kvmgt_gfn_is_write_protected(info, gfn)) goto out; @@ -1712,7 +1712,7 @@ static int kvmgt_page_track_add(unsigned long handle, u64 gfn) kvmgt_protect_table_add(info, gfn); out: - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); srcu_read_unlock(&kvm->srcu, idx); return 0; } @@ -1737,7 +1737,7 @@ static int kvmgt_page_track_remove(unsigned long handle, u64 gfn) return -EINVAL; } - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); if (!kvmgt_gfn_is_write_protected(info, gfn)) goto out; @@ -1746,7 +1746,7 @@ static int kvmgt_page_track_remove(unsigned long handle, u64 gfn) kvmgt_protect_table_del(info, gfn); out: - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); srcu_read_unlock(&kvm->srcu, idx); return 0; } @@ -1772,7 +1772,7 @@ static void kvmgt_page_track_flush_slot(struct kvm *kvm, struct kvmgt_guest_info *info = container_of(node, struct kvmgt_guest_info, track_node); - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); for (i = 0; i < slot->npages; i++) { gfn = slot->base_gfn + i; if (kvmgt_gfn_is_write_protected(info, gfn)) { @@ -1781,7 +1781,7 @@ static void kvmgt_page_track_flush_slot(struct kvm *kvm, kvmgt_protect_table_del(info, gfn); } } - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); } static bool __kvmgt_vgpu_exist(struct intel_vgpu *vgpu, struct kvm *kvm) -- 2.26.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Possible use_mm() mis-uses
On 23/08/2018 08:07, Zhenyu Wang wrote: > On 2018.08.22 20:20:46 +0200, Paolo Bonzini wrote: >> On 22/08/2018 18:44, Linus Torvalds wrote: >>> An example of something that *isn't* right, is the i915 kvm interface, >>> which does >>> >>> use_mm(kvm->mm); >>> >>> on an mm that was initialized in virt/kvm/kvm_main.c using >>> >>> mmgrab(current->mm); >>> kvm->mm = current->mm; >>> >>> which is *not* right. Using "mmgrab()" does indeed guarantee the >>> lifetime of the 'struct mm_struct' itself, but it does *not* guarantee >>> the lifetime of the page tables. You need to use "mmget()" and >>> "mmput()", which get the reference to the actual process address >>> space! >>> >>> Now, it is *possible* that the kvm use is correct too, because kvm >>> does register a mmu_notifier chain, and in theory you can avoid the >>> proper refcounting by just making sure the mmu "release" notifier >>> kills any existing uses, but I don't really see kvm doing that. Kvm >>> does register a release notifier, but that just flushes the shadow >>> page tables, it doesn't kill any use_mm() use by some i915 use case. >> >> Yes, KVM is correct but the i915 bits are at least fishy. It's probably >> as simple as adding a mmget/mmput pair respectively in kvmgt_guest_init >> and kvmgt_guest_exit, or maybe mmget_not_zero. >> > > yeah, that's the clear way to fix this imo. We only depend on guest > life cycle to access guest memory properly. Here's proposed fix, will > verify and integrate it later. > > Thanks! > > From 5e5a8d0409aa150884adf5a4d0b956fd0b9906b3 Mon Sep 17 00:00:00 2001 > From: Zhenyu Wang > Date: Thu, 23 Aug 2018 14:08:06 +0800 > Subject: [PATCH] drm/i915/gvt: Fix life cycle reference on KVM mm > > Handle guest mm access life cycle properly with mmget()/mmput() > through guest init()/exit(). As noted by Linus, use_mm() depends > on valid live page table but KVM's mmgrab() doesn't guarantee that. > As vGPU usage depends on guest VM life cycle, need to make sure to > use mmget()/mmput() to guarantee VM address access. > > Cc: Linus Torvalds > Cc: Paolo Bonzini > Cc: Zhi Wang > Signed-off-by: Zhenyu Wang Reviewed-by: Paolo Bonzini > --- > drivers/gpu/drm/i915/gvt/kvmgt.c | 12 +++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c > b/drivers/gpu/drm/i915/gvt/kvmgt.c > index 71751be329e3..4a0988747d08 100644 > --- a/drivers/gpu/drm/i915/gvt/kvmgt.c > +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c > @@ -32,6 +32,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -1614,9 +1615,16 @@ static int kvmgt_guest_init(struct mdev_device *mdev) > if (__kvmgt_vgpu_exist(vgpu, kvm)) > return -EEXIST; > > + if (!mmget_not_zero(kvm->mm)) { > + gvt_vgpu_err("Can't get KVM mm reference\n"); > + return -EINVAL; > + } > + > info = vzalloc(sizeof(struct kvmgt_guest_info)); > - if (!info) > + if (!info) { > + mmput(kvm->mm); > return -ENOMEM; > + } > > vgpu->handle = (unsigned long)info; > info->vgpu = vgpu; > @@ -1647,6 +1655,8 @@ static bool kvmgt_guest_exit(struct kvmgt_guest_info > *info) > debugfs_remove(info->debugfs_cache_entries); > > kvm_page_track_unregister_notifier(info->kvm, &info->track_node); > + if (info->kvm->mm) > + mmput(info->kvm->mm); > kvm_put_kvm(info->kvm); > kvmgt_protect_table_destroy(info); > gvt_cache_destroy(info->vgpu); > signature.asc Description: OpenPGP digital signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Possible use_mm() mis-uses
On 22/08/2018 18:44, Linus Torvalds wrote: > An example of something that *isn't* right, is the i915 kvm interface, > which does > > use_mm(kvm->mm); > > on an mm that was initialized in virt/kvm/kvm_main.c using > > mmgrab(current->mm); > kvm->mm = current->mm; > > which is *not* right. Using "mmgrab()" does indeed guarantee the > lifetime of the 'struct mm_struct' itself, but it does *not* guarantee > the lifetime of the page tables. You need to use "mmget()" and > "mmput()", which get the reference to the actual process address > space! > > Now, it is *possible* that the kvm use is correct too, because kvm > does register a mmu_notifier chain, and in theory you can avoid the > proper refcounting by just making sure the mmu "release" notifier > kills any existing uses, but I don't really see kvm doing that. Kvm > does register a release notifier, but that just flushes the shadow > page tables, it doesn't kill any use_mm() use by some i915 use case. Yes, KVM is correct but the i915 bits are at least fishy. It's probably as simple as adding a mmget/mmput pair respectively in kvmgt_guest_init and kvmgt_guest_exit, or maybe mmget_not_zero. Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [GIT PULL] kvm/page_track changes for i915 KVMGT
On 09/11/2016 16:15, Daniel Vetter wrote: > On Wed, Nov 09, 2016 at 03:25:19PM +0100, Paolo Bonzini wrote: >> Daniel, >> >> The following changes since commit d9092f52d7e61dd1557f2db2400ddb430e85937e: >> >> kvm: x86: Check memopp before dereference (CVE-2016-8630) (2016-11-02 >> 21:31:53 +0100) >> >> are available in the git repository at: >> >> git://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-kvmgt >> >> for you to fetch changes up to 871b7ef2a1850d0b435c8b324bf4a5d391adde3f: >> >> kvm/page_track: export symbols for external usage (2016-11-04 12:13:20 >> +0100) > > Pulled into drm-intel, thanks. Please also pull this into your kvm-next > tree to make sure we can land kvm/drm trees in any order for the 4.10 > merge window. Yes, I've already done that (though I have not pushed yet). Paolo > Thanks, Daniel > >> >> >> The three KVM patches that KVMGT needs. >> >> >> Jike Song (2): >> kvm/page_track: call notifiers with kvm_page_track_notifier_node >> kvm/page_track: export symbols for external usage >> >> Xiaoguang Chen (1): >> KVM: x86: add track_flush_slot page track notifier >> >> arch/x86/include/asm/kvm_page_track.h | 14 +- >> arch/x86/kvm/mmu.c| 11 ++- >> arch/x86/kvm/page_track.c | 31 ++- >> arch/x86/kvm/x86.c| 2 +- >> 4 files changed, 54 insertions(+), 4 deletions(-) > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [GIT PULL] kvm/page_track changes for i915 KVMGT
Daniel, The following changes since commit d9092f52d7e61dd1557f2db2400ddb430e85937e: kvm: x86: Check memopp before dereference (CVE-2016-8630) (2016-11-02 21:31:53 +0100) are available in the git repository at: git://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-kvmgt for you to fetch changes up to 871b7ef2a1850d0b435c8b324bf4a5d391adde3f: kvm/page_track: export symbols for external usage (2016-11-04 12:13:20 +0100) The three KVM patches that KVMGT needs. Jike Song (2): kvm/page_track: call notifiers with kvm_page_track_notifier_node kvm/page_track: export symbols for external usage Xiaoguang Chen (1): KVM: x86: add track_flush_slot page track notifier arch/x86/include/asm/kvm_page_track.h | 14 +- arch/x86/kvm/mmu.c| 11 ++- arch/x86/kvm/page_track.c | 31 ++- arch/x86/kvm/x86.c| 2 +- 4 files changed, 54 insertions(+), 4 deletions(-) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/2] extend page_track for external usage
On 09/11/2016 03:03, Zhenyu Wang wrote: > On 2016.11.07 10:17:54 +0100, Daniel Vetter wrote: Paolo, for this case, do you think it's feasible we pick them through drm/i915 merge path? As currently initial KVMGT patch sets require these exported symbols, that's why I ask how we should handle this dependency. >>> >>> Then it's actually a good thing that I dropped from kvm/queue! You can >>> certainly include these patches, but please do that through a topic branch. >>> >>> I've prepared a branch for you >>> (git://git.kernel.org/pub/scm/virt/kvm/kvm.git branch for-kvmgt). Once >>> Linus processes my outstanding pull request, the branch will only >>> include the three page-tracking patches. Please pull that topic branch >>> into your own branch, and ensure you have a merge commit when you send >>> the pull request to Daniel. The merge commit ensures that the workflow >>> was correct; use --no-ff if necessary. >>> >>> You can do the same for Jike's patches for the KVM-VFIO device, when >>> Alex reviews them, and I suppose you'll need a topic branch for mdev >>> too? I didn't know that KVMGT was planned for 4.10. In the future, >>> let's synchronize ahead so that we can prepare topic branches for you. >> >> Ok, back from the useless wifi at plumbers, I can mail again. Zhenyu >> confirmed on irc that the initial code pile only needs this. For the >> cross-maintainer topic tree I prefer a formal pull request with stable >> tag. Please also cc: intel-gfx on that, since I plan to merge that one >> directly into i915. >> > > Paolo, could you help to do this for Daniel? Daniel would like to merge > current KVMGT required change for KVM directly, then I'd base KVMGT change > on that. Yes, I'll send it today. Paolo signature.asc Description: OpenPGP digital signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 02/10] mm: remove write/force parameters from __get_user_pages_unlocked()
_t rc = 0; > unsigned long max_pages_per_loop = PVM_MAX_KMALLOC_PAGES > / sizeof(struct pages *); > + unsigned int flags = FOLL_REMOTE; > > /* Work out address and page range required */ > if (len == 0) > return 0; > nr_pages = (addr + len - 1) / PAGE_SIZE - addr / PAGE_SIZE + 1; > > + if (vm_write) > + flags |= FOLL_WRITE; > + > while (!rc && nr_pages && iov_iter_count(iter)) { > int pages = min(nr_pages, max_pages_per_loop); > size_t bytes; > @@ -104,8 +108,7 @@ static int process_vm_rw_single_vec(unsigned long addr, >* current/current->mm >*/ > pages = __get_user_pages_unlocked(task, mm, pa, pages, > - vm_write, 0, process_pages, > - FOLL_REMOTE); > + process_pages, flags); > if (pages <= 0) > return -EFAULT; > > diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c > index db96688..8035cc1 100644 > --- a/virt/kvm/async_pf.c > +++ b/virt/kvm/async_pf.c > @@ -84,7 +84,8 @@ static void async_pf_execute(struct work_struct *work) >* mm and might be done in another context, so we must >* use FOLL_REMOTE. >*/ > - __get_user_pages_unlocked(NULL, mm, addr, 1, 1, 0, NULL, FOLL_REMOTE); > + __get_user_pages_unlocked(NULL, mm, addr, 1, NULL, > + FOLL_WRITE | FOLL_REMOTE); > > kvm_async_page_present_sync(vcpu, apf); > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 81dfc73..28510e7 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -1416,10 +1416,15 @@ static int hva_to_pfn_slow(unsigned long addr, bool > *async, bool write_fault, > down_read(¤t->mm->mmap_sem); > npages = get_user_page_nowait(addr, write_fault, page); > up_read(¤t->mm->mmap_sem); > - } else > + } else { > + unsigned int flags = FOLL_TOUCH | FOLL_HWPOISON; > + > + if (write_fault) > + flags |= FOLL_WRITE; > + > npages = __get_user_pages_unlocked(current, current->mm, addr, > 1, > -write_fault, 0, page, > -FOLL_TOUCH|FOLL_HWPOISON); > +page, flags); > + } > if (npages != 1) > return npages; > > Acked-by: Paolo Bonzini ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel
On 19/11/2015 16:32, Stefano Stabellini wrote: > > In addition to Kevin's replies, I have a high-level question: can VFIO > > be used by QEMU for both KVM and Xen? > > No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU > is owned by Xen. I don't think QEMU command line compatibility between KVM and Xen should be a design goal for GVT-g. Nevertheless, it shouldn't be a problem to use a "virtual" VFIO (which doesn't need the IOMMU, because it uses the MMU in the physical GPU) even under Xen. Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel
On 19/11/2015 09:40, Gerd Hoffmann wrote: >> > But this code should be >> > minor to be maintained in libvirt. > As far I know libvirt only needs to discover those devices. If they > look like sr/iov devices in sysfs this might work without any changes to > libvirt. I don't think they will look like SR/IOV devices. The interface may look a little like the sysfs interface that GVT-g is already using. However, it should at least be extended to support multiple vGPUs in a single VM. This might not be possible for Intel integrated graphics, but it should definitely be possible for discrete graphics cards. Another nit is that the VM id should probably be replaced by a UUID (because it's too easy to stumble on an existing VM id), assuming a VM id is needed at all. Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM
On 11/12/2014 01:33, Tian, Kevin wrote: > My point is that KVMGT doesn't introduce new requirements as what's > required in IGD passthrough case, because all the hacks you see now > is to satisfy guest graphics driver's expectation. I haven't follow up the > KVM IGD passthrough progress, but if it doesn't require ISA bridge hacking > the same trick can be adopted by KVMGT too. Right now it did require ISA bridge hacking. > You may know Allen is > working on driver changes to avoid causing those hacks in Qemu side. > That effort will benefit us too. That's good to know, thanks! Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM
On 09/12/2014 03:49, Tian, Kevin wrote: > - Now we have XenGT/KVMGT separately maintained, and KVMGT lags > behind XenGT regarding to features and qualities. Likely you'll continue > see stale code (like Xen inst decoder) for some time. In the future we > plan to maintain a single kernel repo for both, so KVMGT can share > same quality as XenGT once KVM in-kernel dm framework is stable. > > - Regarding to Qemu hacks, KVMGT really doesn't have any different > requirements as what have been discussed for GPU pass-through, e.g. > about ISA bridge. Our implementation is based on an old Qemu repo, > and honestly speaking not cleanly developed, because we know we > can leverage from GPU pass-through support once it's in Qemu. At > that time we'll leverage the same logic with minimal changes to > hook KVMGT mgmt. APIs (e.g. create/destroy a vGPU instance). So > we can ignore this area for now. :-) Could the virtual device model introduce new registers in order to avoid poking at the ISA bridge? I'm not sure that you "can leverage from GPU pass-through support once it's in Qemu", since the Xen IGD passthrough support is being added to a separate machine that is specific to Xen IGD passthrough; no ISA bridge hacking will probably be allowed on the "-M pc" and "-M q35" machine types. Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM
On 05/12/2014 09:50, Gerd Hoffmann wrote: > A few comments on the kernel stuff (brief look so far, also > compile-tested only, intel gfx on my test machine is too old). > > * Noticed the kernel bits don't even compile when configured as >module. Everything (vgt, i915, kvm) must be compiled into the >kernel. I'll add that the patch is basically impossible to review with all the XenGT bits still in. For example, the x86 emulator seems to be unnecessary for KVMGT, but I am not 100% sure. I would like a clear understanding of why/how Andrew Barnes was able to do i915 passthrough (GVT-d) without hacking the ISA bridge, and why this does not apply to GVT-g. Paolo > * Design approach still seems to be i915 on vgt not the other way >around. > > Qemu/SeaBIOS bits: > > I've seen the host bridge changes identity from i440fx to > copy-pci-ids-from-host. Guess the reason for this is that seabios uses > this device to figure whenever it is running on i440fx or q35. Correct? > > What are the exact requirements for the device? Must it match the host > exactly, to not confuse the guest intel graphics driver? Or would > something more recent -- such as the q35 emulation qemu has -- be good > enough to make things work (assuming we add support for the > graphic-related pci config space registers there)? > > The patch also adds a dummy isa bridge at 0x1f. Simliar question here: > What exactly is needed here? Would things work if we simply use the q35 > lpc device here? > > more to come after I've read the paper linked above ... > > cheers, > Gerd > > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [Qemu-devel] [IGDVFIO] [PATCH 3/8] RFC and help completing: Intel IGD Direct Assignment with VFIO
Il 24/09/2014 22:57, Alex Williamson ha scritto: > Right, that's the physical mapping, Andy's patches are mimicking that > behavior virtually. Seabios reserves memory, creates e820 entries, and > "maps" the hardware by writing to these registers. That triggers QEMU > to adjust the MemoryRegion in the guest address space which is an mmap > to the host address space, using /dev/mem for now, but hopefully the > vfio file descriptor in the future (I should be careful what I hope > for). Yeah, I remember discussing that with Andrew on IRC. So he did implement that idea. > The opregion is pretty trivial because the write is to the IGD itself. > The others are to the host bridge, so we need to figure out what sort of > abstraction makes sense to get that back into vfio code. Do we have to support all chipsets? IIUC the more recent devices need fewer and fewer "backdoors". Paolo > Perhaps vfio creates all the memory regions and registers them into an > igd service and the host bridge can make calls like: > > gtt = igd_get_gtt_mr(); > > which returns NULL and nothing happens or the registered MemoryRegion > and the host bridge places it into the address space. Thanks, ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [IGDVFIO] [PATCH 3/8] RFC and help completing: Intel IGD Direct Assignment with VFIO
Il 24/09/2014 21:47, Alex Williamson ha scritto: > So the opregion is mapped by a config write on the IGD device itself and > the other 3 regions, that we know about so far, are mapped via writes to > the host bridge. AFAIU the opregion is mapped by the (host) BIOS, that writes the address to a well-known scratch dword in the configuration space. The host reads from the dword and finds the opregion that way. Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] ResettRe: [Xen-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support
Il 29/07/2014 08:59, Chen, Tiejun ha scritto: >> >> (see https://lkml.org/lkml/2014/6/19/121) >>> "gpu:drm:i915:intel_detect_pch: back to check devfn instead of check >>> class >>> type". Because Windows always use this way, so I think this point >>> should be >>> same between Linux and Windows. >> >> Didn't we discuss that we did not want to pass in PCH at all if we can? > > I'm a bit confused since I guess I'm missing something important in this > long discussion. I guess we just fake a simple PCIe device but just has > PCI_VENDOR_ID_XEN/Real_PCH_Device_ID, and then well probe such a PCIe > device inside intel_detect_pch(), right? Yes, for the special PIIX4 legacy machine type we want to do that and place the device at 00:1f.0. Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type
Il 07/07/2014 19:54, Daniel Vetter ha scritto: On Mon, Jul 07, 2014 at 04:57:45PM +0200, Paolo Bonzini wrote: Il 07/07/2014 16:49, Daniel Vetter ha scritto: So the correct fix to forward intel gpus to guests is indeed to somehow fake the pch pci ids since the driver really needs them. Gross design, but that's how the hardware works. A way that could work for virtualization is this: if you find the card has a magic subsystem vendor id, fetch the subsystem device id and use _that_ as the PCH device id. Would that work for you? I guess for quemu it also depends upon what windows does since we can't change that. If we care about that part. Another consideration is supporting older kernels, if that's possible at all. Yes, but right now it's more important to get something that's not too gross for the future, for both Linux and Windows. Hacks for existing guests can be done separately, especially since they might differ between Linux (check ISA bridge) and Windows (check 1f.0). Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type
Il 07/07/2014 16:49, Daniel Vetter ha scritto: So the correct fix to forward intel gpus to guests is indeed to somehow fake the pch pci ids since the driver really needs them. Gross design, but that's how the hardware works. A way that could work for virtualization is this: if you find the card has a magic subsystem vendor id, fetch the subsystem device id and use _that_ as the PCH device id. Would that work for you? Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] ResettRe: [Xen-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support
Il 03/07/2014 21:09, Jesse Barnes ha scritto: Practically speaking, we could probably assume specific CPU/PCH combos, as I don't think they're generally mixed across generations, though SNB and IVB did have compatible sockets, so there is the possibility of mixing CPT and PPT PCHs, but those are register identical as far as the graphics driver is concerned, so even that should be safe. I guess the driver could do that if it finds an unknown PCH device ID. But encoding it in the subsystem device ID could also work and it would be easy to do in the hypervisor. Beyond that, the other MCH data we need to look at is mirrored into the GPU's MMIO space on current gens. Heh, that's exactly the same as the paravirtualized solution we were suggesting. ;) Paolo On older gens, we do need to poke at the memory controller a bit to get some info (see intel_setup_mchbar()), but that's not true of newer stuff. Looks like we only short circuit that on VLV though; we could probably do it on SNB+. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] ResettRe: [Xen-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support
Il 02/07/2014 18:23, Konrad Rzeszutek Wilk ha scritto: diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 651e65e..03f2829 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -433,6 +433,8 @@ void intel_detect_pch(struct drm_device *dev) unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK; dev_priv->pch_id = id; + if (pch->subsystem_vendor == PCI_VENDOR_ID_XEN) + id = pch->device & INTEL_PCH_DEVICE_ID_MASK; Actually you could look at *dev*'s subsystem IDs and skip the pch lookup completely. Paolo if (id == INTEL_PCH_IBX_DEVICE_ID_TYPE) { dev_priv->pch_type = PCH_IBX; DRM_DEBUG_KMS("Found Ibex Peak PCH\n"); > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type
Il 30/06/2014 05:13, Chen, Tiejun ha scritto: After I discuss internal, we think even we just set the real vendor/device ids to this ISA bridge at 00:1f.0, guest firmware should still work well with these pair of real vendor/device ids. So if you think something would conflict or be broken, could you tell us what's exactly that? Then we will double check. The Xen hvmloader doesn't break since it only supports one chipset. But SeaBIOS checks for the exact vendor/device ids since Q35 support was added. If you want to add this feature, try to implement it in a way that is a bit more forward-looking. I'm sure that Xen sooner or later will want a PCIe chipset, otherwise things such as AER forwarding are impossible. Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type
Il 25/06/2014 09:34, Chen, Tiejun ha scritto: On 2014/6/25 14:48, Paolo Bonzini wrote: Second problem. Your IGD passthrough code currently works with QEMU's PIIX4-based machine. But what happens if you try to extend it, so that Yes, current xen machine, xenpv, is based on pii4, and also I don't known if we will plan to migrate to q35 or others. So its hard to further say more now. it works with QEMU's ICH9-based machine? That's a more modern machine that has a PCIe chipset and hence has its ISA bridge at 00:1f.0. Now But even in this case, could we set the real vendor/device ids for that ISA bridge at 00:1f.0? If not, what's broken? The config space layout changes for different vendor/device ids, so the guest firmware only works if you have the right vendor/device id. It is only slightly better, but the right solution is to fix the driver. There is absolutely zero reason why a graphics driver should know about the vendor/device ids of the PCH. This means we have to fix this both on Linux and Windows but I'm not sure if this is feasible to us. You have to do it if you want this feature in QEMU in a future-proof way. You _can_ provide the ugly PIIX4-specific hack as a compatibility fallback (and this patch is okay to make the compatibility fallback less hacky). However, I don't think QEMU should accept the patch for IGD passthrough unless Intel is willing to make drivers virtualization-friendly. Once you assign the IGD, it is not that integrated anymore and the drivers must take that into account. It is worthwhile pointing out that neither AMD nor nVidia need any of this. The right way could be to make QEMU add a vendor-specific capability to the video device. The driver can probe for that capability before Do you mean we can pick two unused offsets in the configuration space of the video device as a vendor-specific capability to hold the vendor/device ids of the PCH? Yes, either that or add a new capability (which lets you choose the offsets more freely). If the IGD driver needs config space fields of the MCH, those fields could also be mirrored in the new capability. QEMU would forward them automatically. It could even be a new BAR, which gives even more freedom to allocate the fields. looking at the PCI bus. QEMU can add the capability to the list, it is easy because all accesses to the video device's configuration space trap to QEMU. Then you do not need to add fake devices to the machine. In fact, it would be nice if Intel added such a capability on the next generation of integrated graphics, too. On real hardware, ACPI or some Maybe, but even this would be implemented, shouldn't we need to be compatible with those old generations? Yes. - old generation / old driver: use 00:1f.0 hack, only guaranteed to work on PIIX4-based virtual guest - old generation / new driver: use 00:1f.0 hack on real hardware, use capability on 00:02.0 on virtual guest, can work on PCIe virtual guest - new generation / old driver: doesn't exist - new generation / new driver: always use capability on 00:02.0, can work on PCIe virtual guest. Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type
Il 22/06/2014 10:25, Chen, Tiejun ha scritto: In qemu-upstream, as you commented we can't create this as a ISA class type explicitly. Note I didn't say that QEMU doesn't like having two ISA bridges. I commented that the firmware will see two ISA bridges and will try to initialize both of them. It currently doesn't just because it only knows of two southbridge PCI IDs, and both of them are old or relatively old (PIIX3/4 and ICH9). But what would happen if you did graphics passthrough on a machine with an ICH9 southbridge? Your code will create the PIIX4 ISA bridge, and will create an ICH9 southbridge just to please the i915 driver. The BIOS will recognize the ICH9's vendor/device ids and try to initialize it. It will write to the configuration space to set up PCI PIRQ[A-H] routing, for example, which makes no sense since your ICH9 is just a dummy device. Second problem. Your IGD passthrough code currently works with QEMU's PIIX4-based machine. But what happens if you try to extend it, so that it works with QEMU's ICH9-based machine? That's a more modern machine that has a PCIe chipset and hence has its ISA bridge at 00:1f.0. Now you have a conflict. In other words, if you want IGD passthrough in QEMU, you need a solution that is future-proof. So we compromise by faking this ISA bridge without ISA class type setting (as I recall you already said this way is slightly better). It is only slightly better, but the right solution is to fix the driver. There is absolutely zero reason why a graphics driver should know about the vendor/device ids of the PCH. The right way could be to make QEMU add a vendor-specific capability to the video device. The driver can probe for that capability before looking at the PCI bus. QEMU can add the capability to the list, it is easy because all accesses to the video device's configuration space trap to QEMU. Then you do not need to add fake devices to the machine. In fact, it would be nice if Intel added such a capability on the next generation of integrated graphics, too. On real hardware, ACPI or some other kind of firmware can probe the PCH at 00:1f.0 and initialize that vendor-specific capability. QEMU would just forward the value from the host, so that virtualized guests never depend on the PCH at 00:1f.0. Paolo Maybe we will figure better way in the future. But anyway, this is always registered as 00:15.0, right? So I think the i915 driver can go back to probe the devfn like the native environment. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type
Il 19/06/2014 11:53, Tiejun Chen ha scritto: so this mean that isa bridge is still represented with Dev31:Func0 like the native OS. Furthermore, currently we're pushing VGA passthrough support into qemu upstream, and with some discussion, we wouldn't set the bridge class type and just expose this devfn. Even this is not really optimal. It just happens to work just because QEMU's machine is currently a PCI machine with the ISA bridge on 00:01.0. As soon as you'll try doing integrated graphics passthrough on a PCIe machine type (such as QEMU's "-M q35") things will break down just as badly. Paolo ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx