Re: [PATCH v2 2/4] eventfd: simplify eventfd_signal()

2024-02-07 Thread Paolo Bonzini
On Wed, Nov 22, 2023 at 1:49 PM Christian Brauner  wrote:
>
> Ever since the evenfd type was introduced back in 2007 in commit
> e1ad7468c77d ("signal/timer/event: eventfd core") the eventfd_signal()
> function only ever passed 1 as a value for @n. There's no point in
> keeping that additional argument.
>
> Signed-off-by: Christian Brauner 
> ---
>  arch/x86/kvm/hyperv.c |  2 +-
>  arch/x86/kvm/xen.c|  2 +-
>  virt/kvm/eventfd.c|  4 ++--
>  30 files changed, 60 insertions(+), 63 deletions(-)

For KVM:

Acked-by: Paolo Bonzini 



Re: [Intel-gfx] [PATCH v2 1/5] KVM: do not allow mapping valid but non-refcounted pages

2021-06-25 Thread Paolo Bonzini

On 25/06/21 09:58, Christian Borntraeger wrote:



On 25.06.21 09:36, David Stevens wrote:

From: Nicholas Piggin 

It's possible to create a region which maps valid but non-refcounted
pages (e.g., tail pages of non-compound higher order allocations). These
host pages can then be returned by gfn_to_page, gfn_to_pfn, etc., family
of APIs, which take a reference to the page, which takes it from 0 to 1.
When the reference is dropped, this will free the page incorrectly.

Fix this by only taking a reference on the page if it was non-zero,
which indicates it is participating in normal refcounting (and can be
released with put_page).

Signed-off-by: Nicholas Piggin 


I guess this would be the small fix for stable? Do we want to add that cc?

Reviewed-by: Christian Borntraeger 


Yes, this one is going to Linus today.  The rest is for 5.15.

Paolo


---
  virt/kvm/kvm_main.c | 19 +--
  1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3dcc2abbfc60..f7445c3bcd90 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2175,6 +2175,13 @@ static bool vma_is_valid(struct vm_area_struct 
*vma, bool write_fault)

  return true;
  }

+static int kvm_try_get_pfn(kvm_pfn_t pfn)
+{
+    if (kvm_is_reserved_pfn(pfn))
+    return 1;
+    return get_page_unless_zero(pfn_to_page(pfn));
+}
+
  static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 unsigned long addr, bool *async,
 bool write_fault, bool *writable,
@@ -2224,13 +2231,21 @@ static int hva_to_pfn_remapped(struct 
vm_area_struct *vma,

   * Whoever called remap_pfn_range is also going to call e.g.
   * unmap_mapping_range before the underlying pages are freed,
   * causing a call to our MMU notifier.
+ *
+ * Certain IO or PFNMAP mappings can be backed with valid
+ * struct pages, but be allocated without refcounting e.g.,
+ * tail pages of non-compound higher order allocations, which
+ * would then underflow the refcount when the caller does the
+ * required put_page. Don't allow those pages here.
   */
-    kvm_get_pfn(pfn);
+    if (!kvm_try_get_pfn(pfn))
+    r = -EFAULT;

  out:
  pte_unmap_unlock(ptep, ptl);
  *p_pfn = pfn;
-    return 0;
+
+    return r;
  }

  /*





___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 0/6] KVM: Remove uses of struct page from x86 and arm64 MMU

2021-06-24 Thread Paolo Bonzini

On 24/06/21 14:57, Nicholas Piggin wrote:

KVM: Fix page ref underflow for regions with valid but non-refcounted pages


It doesn't really fix the underflow, it disallows mapping them in the 
first place.  Since in principle things can break, I'd rather be 
explicit, so let's go with "KVM: do not allow mapping valid but 
non-reference-counted pages".



It's possible to create a region which maps valid but non-refcounted
pages (e.g., tail pages of non-compound higher order allocations). These
host pages can then be returned by gfn_to_page, gfn_to_pfn, etc., family
of APIs, which take a reference to the page, which takes it from 0 to 1.
When the reference is dropped, this will free the page incorrectly.

Fix this by only taking a reference on the page if it was non-zero,


s/on the page/on valid pages/ (makes clear that invalid pages are fine 
without refcounting).


Thank you *so* much, I'm awful at Linux mm.

Paolo


which indicates it is participating in normal refcounting (and can be
released with put_page).

Signed-off-by: Nicholas Piggin


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 0/6] KVM: Remove uses of struct page from x86 and arm64 MMU

2021-06-24 Thread Paolo Bonzini

On 24/06/21 13:42, Nicholas Piggin wrote:

Excerpts from Nicholas Piggin's message of June 24, 2021 8:34 pm:

Excerpts from David Stevens's message of June 24, 2021 1:57 pm:

KVM supports mapping VM_IO and VM_PFNMAP memory into the guest by using
follow_pte in gfn_to_pfn. However, the resolved pfns may not have
assoicated struct pages, so they should not be passed to pfn_to_page.
This series removes such calls from the x86 and arm64 secondary MMU. To
do this, this series modifies gfn_to_pfn to return a struct page in
addition to a pfn, if the hva was resolved by gup. This allows the
caller to call put_page only when necessated by gup.

This series provides a helper function that unwraps the new return type
of gfn_to_pfn to provide behavior identical to the old behavior. As I
have no hardware to test powerpc/mips changes, the function is used
there for minimally invasive changes. Additionally, as gfn_to_page and
gfn_to_pfn_cache are not integrated with mmu notifier, they cannot be
easily changed over to only use pfns.

This addresses CVE-2021-22543 on x86 and arm64.


Does this fix the problem? (untested I don't have a POC setup at hand,
but at least in concept)


This one actually compiles at least. Unfortunately I don't have much
time in the near future to test, and I only just found out about this
CVE a few hours ago.


And it also works (the reproducer gets an infinite stream of userspace 
exits and especially does not crash).  We can still go for David's 
solution later since MMU notifiers are able to deal with this pages, but 
it's a very nice patch for stable kernels.


If you provide a Signed-off-by, I can integrate it.

Paolo


---


It's possible to create a region which maps valid but non-refcounted
pages (e.g., tail pages of non-compound higher order allocations). These
host pages can then be returned by gfn_to_page, gfn_to_pfn, etc., family
of APIs, which take a reference to the page, which takes it from 0 to 1.
When the reference is dropped, this will free the page incorrectly.

Fix this by only taking a reference on the page if it was non-zero,
which indicates it is participating in normal refcounting (and can be
released with put_page).

---
  virt/kvm/kvm_main.c | 19 +--
  1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6a6bc7af0e28..46fb042837d2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2055,6 +2055,13 @@ static bool vma_is_valid(struct vm_area_struct *vma, 
bool write_fault)
return true;
  }
  
+static int kvm_try_get_pfn(kvm_pfn_t pfn)

+{
+   if (kvm_is_reserved_pfn(pfn))
+   return 1;
+   return get_page_unless_zero(pfn_to_page(pfn));
+}
+
  static int hva_to_pfn_remapped(struct vm_area_struct *vma,
   unsigned long addr, bool *async,
   bool write_fault, bool *writable,
@@ -2104,13 +2111,21 @@ static int hva_to_pfn_remapped(struct vm_area_struct 
*vma,
 * Whoever called remap_pfn_range is also going to call e.g.
 * unmap_mapping_range before the underlying pages are freed,
 * causing a call to our MMU notifier.
+*
+* Certain IO or PFNMAP mappings can be backed with valid
+* struct pages, but be allocated without refcounting e.g.,
+* tail pages of non-compound higher order allocations, which
+* would then underflow the refcount when the caller does the
+* required put_page. Don't allow those pages here.
 */
-   kvm_get_pfn(pfn);
+   if (!kvm_try_get_pfn(pfn))
+   r = -EFAULT;
  
  out:

pte_unmap_unlock(ptep, ptl);
*p_pfn = pfn;
-   return 0;
+
+   return r;
  }
  
  /*




___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 0/6] KVM: Remove uses of struct page from x86 and arm64 MMU

2021-06-24 Thread Paolo Bonzini

On 24/06/21 13:42, Nicholas Piggin wrote:

+static int kvm_try_get_pfn(kvm_pfn_t pfn)
+{
+   if (kvm_is_reserved_pfn(pfn))
+   return 1;


So !pfn_valid would always return true.  Yeah, this should work and is 
certainly appealing!


Paolo



+   return get_page_unless_zero(pfn_to_page(pfn));
+}
+
  static int hva_to_pfn_remapped(struct vm_area_struct *vma,
   unsigned long addr, bool *async,
   bool write_fault, bool *writable,
@@ -2104,13 +2111,21 @@ static int hva_to_pfn_remapped(struct vm_area_struct 
*vma,
 * Whoever called remap_pfn_range is also going to call e.g.
 * unmap_mapping_range before the underlying pages are freed,
 * causing a call to our MMU notifier.
+*
+* Certain IO or PFNMAP mappings can be backed with valid
+* struct pages, but be allocated without refcounting e.g.,
+* tail pages of non-compound higher order allocations, which
+* would then underflow the refcount when the caller does the
+* required put_page. Don't allow those pages here.
 */
-   kvm_get_pfn(pfn);
+   if (!kvm_try_get_pfn(pfn))
+   r = -EFAULT;
  
  out:

pte_unmap_unlock(ptep, ptl);
*p_pfn = pfn;
-   return 0;
+
+   return r;
  }
  
  /*




___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/6] KVM: mmu: also return page from gfn_to_pfn

2021-06-24 Thread Paolo Bonzini

On 24/06/21 12:17, Nicholas Piggin wrote:

If all callers were updated that is one thing, but from the changelog
it sounds like that would not happen and there would be some gfn_to_pfn
users left over.

But yes in the end you would either need to make gfn_to_pfn never return
a page found via follow_pte, or change all callers to the new way. If
the plan is for the latter then I guess that's fine.


Actually in that case anyway I don't see the need -- the existence of
gfn_to_pfn is enough to know it might be buggy. It can just as easily
be grepped for as kvm_pfn_page_unwrap.


Sure, but that would leave us with longer function names 
(gfn_to_pfn_page* instead of gfn_to_pfn*).  So the "safe" use is the one 
that looks worse and the unsafe use is the one that looks safe.



And are gfn_to_page cases also
vulernable to the same issue?


No, they're just broken for the VM_IO|VM_PFNMAP case.

Paolo


So I think it could be marked deprecated or something if not everything
will be converted in the one series, and don't need to touch all that
arch code with this patch.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 3/6] KVM: x86/mmu: avoid struct page in MMU

2021-06-24 Thread Paolo Bonzini

On 24/06/21 12:06, Marc Zyngier wrote:

On Thu, 24 Jun 2021 09:58:00 +0100,
Nicholas Piggin  wrote:


Excerpts from David Stevens's message of June 24, 2021 1:57 pm:

From: David Stevens 
  out_unlock:
if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
read_unlock(>kvm->mmu_lock);
else
write_unlock(>kvm->mmu_lock);
-   kvm_release_pfn_clean(pfn);
+   if (pfnpg.page)
+   put_page(pfnpg.page);
return r;
  }


How about

   kvm_release_pfn_page_clean(pfnpg);


I'm not sure. I always found kvm_release_pfn_clean() ugly, because it
doesn't mark the page 'clean'. I find put_page() more correct.

Something like 'kvm_put_pfn_page()' would make more sense, but I'm so
bad at naming things that I could just as well call it 'bob()'.


The best way to go would be to get rid of kvm_release_pfn_clean() and 
always go through a pfn_page.  Then we could or could not introduce 
wrappers kvm_put_pfn_page{,_dirty}.


I think for now it's best to limit the churn since these patches will go 
in the stable releases too, and clean up the resulting API once we have 
a clear idea of how all architectures are using kvm_pfn_page.


Paolo

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/6] KVM: mmu: also return page from gfn_to_pfn

2021-06-24 Thread Paolo Bonzini

On 24/06/21 11:57, Nicholas Piggin wrote:

Needing kvm_pfn_page_unwrap is a sign that something might be buggy, so
it's a good idea to move the short name to the common case and the ugly
kvm_pfn_page_unwrap(gfn_to_pfn(...)) for the weird one.  In fact I'm not
sure there should be any kvm_pfn_page_unwrap in the end.


If all callers were updated that is one thing, but from the changelog
it sounds like that would not happen and there would be some gfn_to_pfn
users left over.


In this patches there are, so yeah the plan is to always change the 
callers to the new way.



But yes in the end you would either need to make gfn_to_pfn never return
a page found via follow_pte, or change all callers to the new way. If
the plan is for the latter then I guess that's fine.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/6] KVM: x86/mmu: release audited pfns

2021-06-24 Thread Paolo Bonzini

On 24/06/21 10:43, Nicholas Piggin wrote:

Excerpts from David Stevens's message of June 24, 2021 1:57 pm:

From: David Stevens 


Changelog? This looks like a bug, should it have a Fixes: tag?


Probably has been there forever... The best way to fix the bug would be 
to nuke mmu_audit.c, which I've threatened to do many times but never 
followed up on.


Paolo


Thanks,
Nick



Signed-off-by: David Stevens 
---
  arch/x86/kvm/mmu/mmu_audit.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu_audit.c b/arch/x86/kvm/mmu/mmu_audit.c
index cedc17b2f60e..97ff184084b4 100644
--- a/arch/x86/kvm/mmu/mmu_audit.c
+++ b/arch/x86/kvm/mmu/mmu_audit.c
@@ -121,6 +121,8 @@ static void audit_mappings(struct kvm_vcpu *vcpu, u64 
*sptep, int level)
audit_printk(vcpu->kvm, "levels %d pfn %llx hpa %llx "
 "ent %llxn", vcpu->arch.mmu->root_level, pfn,
 hpa, *sptep);
+
+   kvm_release_pfn_clean(pfn);
  }
  
  static void inspect_spte_has_rmap(struct kvm *kvm, u64 *sptep)

--
2.32.0.93.g670b81a890-goog






___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/6] KVM: mmu: also return page from gfn_to_pfn

2021-06-24 Thread Paolo Bonzini

On 24/06/21 10:52, Nicholas Piggin wrote:

For now, wrap all calls to gfn_to_pfn functions in the new helper
function. Callers which don't need the page struct will be updated in
follow-up patches.

Hmm. You mean callers that do need the page will be updated? Normally
if there will be leftover users that don't need the struct page then
you would go the other way and keep the old call the same, and add a new
one (gfn_to_pfn_page) just for those that need it.


Needing kvm_pfn_page_unwrap is a sign that something might be buggy, so 
it's a good idea to move the short name to the common case and the ugly 
kvm_pfn_page_unwrap(gfn_to_pfn(...)) for the weird one.  In fact I'm not 
sure there should be any kvm_pfn_page_unwrap in the end.


Paolo

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 3/6] KVM: x86/mmu: avoid struct page in MMU

2021-06-24 Thread Paolo Bonzini

On 24/06/21 05:57, David Stevens wrote:

  static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
-   int map_writable, int max_level, kvm_pfn_t pfn,
+   int map_writable, int max_level,
+   const struct kvm_pfn_page *pfnpg,
bool prefault, bool is_tdp)
  {
bool nx_huge_page_workaround_enabled = is_nx_huge_pa


So the main difference with my tentative patches is that here I was 
passing the struct by value.  I'll try to clean this up for 5.15, but 
for now it's all good like this.  Thanks again!


Paolo

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 0/6] KVM: Remove uses of struct page from x86 and arm64 MMU

2021-06-24 Thread Paolo Bonzini

On 24/06/21 05:57, David Stevens wrote:

KVM supports mapping VM_IO and VM_PFNMAP memory into the guest by using
follow_pte in gfn_to_pfn. However, the resolved pfns may not have
assoicated struct pages, so they should not be passed to pfn_to_page.
This series removes such calls from the x86 and arm64 secondary MMU. To
do this, this series modifies gfn_to_pfn to return a struct page in
addition to a pfn, if the hva was resolved by gup. This allows the
caller to call put_page only when necessated by gup.

This series provides a helper function that unwraps the new return type
of gfn_to_pfn to provide behavior identical to the old behavior. As I
have no hardware to test powerpc/mips changes, the function is used
there for minimally invasive changes. Additionally, as gfn_to_page and
gfn_to_pfn_cache are not integrated with mmu notifier, they cannot be
easily changed over to only use pfns.

This addresses CVE-2021-22543 on x86 and arm64.


Thank you very much for this.  I agree that it makes sense to have a 
minimal change; I had similar changes almost ready, but was stuck with 
deadlocks in the gfn_to_pfn_cache case.  In retrospect I should have 
posted something similar to your patches.


I have started reviewing the patches, and they look good.  I will try to 
include them in 5.13.


Paolo

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [FYI PATCH] i915: kvmgt: the KVM mmu_lock is now an rwlock

2021-02-08 Thread Paolo Bonzini
Adjust the KVMGT page tracking callbacks.

Cc: Zhenyu Wang 
Cc: Zhi Wang 
Cc: intel-gvt-...@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Signed-off-by: Paolo Bonzini 
---
 drivers/gpu/drm/i915/gvt/kvmgt.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 60f1a386dd06..b4348256ae95 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -1703,7 +1703,7 @@ static int kvmgt_page_track_add(unsigned long handle, u64 
gfn)
return -EINVAL;
}
 
-   spin_lock(>mmu_lock);
+   write_lock(>mmu_lock);
 
if (kvmgt_gfn_is_write_protected(info, gfn))
goto out;
@@ -1712,7 +1712,7 @@ static int kvmgt_page_track_add(unsigned long handle, u64 
gfn)
kvmgt_protect_table_add(info, gfn);
 
 out:
-   spin_unlock(>mmu_lock);
+   write_unlock(>mmu_lock);
srcu_read_unlock(>srcu, idx);
return 0;
 }
@@ -1737,7 +1737,7 @@ static int kvmgt_page_track_remove(unsigned long handle, 
u64 gfn)
return -EINVAL;
}
 
-   spin_lock(>mmu_lock);
+   write_lock(>mmu_lock);
 
if (!kvmgt_gfn_is_write_protected(info, gfn))
goto out;
@@ -1746,7 +1746,7 @@ static int kvmgt_page_track_remove(unsigned long handle, 
u64 gfn)
kvmgt_protect_table_del(info, gfn);
 
 out:
-   spin_unlock(>mmu_lock);
+   write_unlock(>mmu_lock);
srcu_read_unlock(>srcu, idx);
return 0;
 }
@@ -1772,7 +1772,7 @@ static void kvmgt_page_track_flush_slot(struct kvm *kvm,
struct kvmgt_guest_info *info = container_of(node,
struct kvmgt_guest_info, track_node);
 
-   spin_lock(>mmu_lock);
+   write_lock(>mmu_lock);
for (i = 0; i < slot->npages; i++) {
gfn = slot->base_gfn + i;
if (kvmgt_gfn_is_write_protected(info, gfn)) {
@@ -1781,7 +1781,7 @@ static void kvmgt_page_track_flush_slot(struct kvm *kvm,
kvmgt_protect_table_del(info, gfn);
}
}
-   spin_unlock(>mmu_lock);
+   write_unlock(>mmu_lock);
 }
 
 static bool __kvmgt_vgpu_exist(struct intel_vgpu *vgpu, struct kvm *kvm)
-- 
2.26.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Possible use_mm() mis-uses

2018-08-23 Thread Paolo Bonzini
On 23/08/2018 08:07, Zhenyu Wang wrote:
> On 2018.08.22 20:20:46 +0200, Paolo Bonzini wrote:
>> On 22/08/2018 18:44, Linus Torvalds wrote:
>>> An example of something that *isn't* right, is the i915 kvm interface,
>>> which does
>>>
>>> use_mm(kvm->mm);
>>>
>>> on an mm that was initialized in virt/kvm/kvm_main.c using
>>>
>>> mmgrab(current->mm);
>>> kvm->mm = current->mm;
>>>
>>> which is *not* right. Using "mmgrab()" does indeed guarantee the
>>> lifetime of the 'struct mm_struct' itself, but it does *not* guarantee
>>> the lifetime of the page tables. You need to use "mmget()" and
>>> "mmput()", which get the reference to the actual process address
>>> space!
>>>
>>> Now, it is *possible* that the kvm use is correct too, because kvm
>>> does register a mmu_notifier chain, and in theory you can avoid the
>>> proper refcounting by just making sure the mmu "release" notifier
>>> kills any existing uses, but I don't really see kvm doing that. Kvm
>>> does register a release notifier, but that just flushes the shadow
>>> page tables, it doesn't kill any use_mm() use by some i915 use case.
>>
>> Yes, KVM is correct but the i915 bits are at least fishy.  It's probably
>> as simple as adding a mmget/mmput pair respectively in kvmgt_guest_init
>> and kvmgt_guest_exit, or maybe mmget_not_zero.
>>
> 
> yeah, that's the clear way to fix this imo. We only depend on guest
> life cycle to access guest memory properly. Here's proposed fix, will
> verify and integrate it later.
> 
> Thanks!
> 
> From 5e5a8d0409aa150884adf5a4d0b956fd0b9906b3 Mon Sep 17 00:00:00 2001
> From: Zhenyu Wang 
> Date: Thu, 23 Aug 2018 14:08:06 +0800
> Subject: [PATCH] drm/i915/gvt: Fix life cycle reference on KVM mm
> 
> Handle guest mm access life cycle properly with mmget()/mmput()
> through guest init()/exit(). As noted by Linus, use_mm() depends
> on valid live page table but KVM's mmgrab() doesn't guarantee that.
> As vGPU usage depends on guest VM life cycle, need to make sure to
> use mmget()/mmput() to guarantee VM address access.
> 
> Cc: Linus Torvalds 
> Cc: Paolo Bonzini 
> Cc: Zhi Wang 
> Signed-off-by: Zhenyu Wang 

Reviewed-by: Paolo Bonzini 

> ---
>  drivers/gpu/drm/i915/gvt/kvmgt.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c 
> b/drivers/gpu/drm/i915/gvt/kvmgt.c
> index 71751be329e3..4a0988747d08 100644
> --- a/drivers/gpu/drm/i915/gvt/kvmgt.c
> +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
> @@ -32,6 +32,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1614,9 +1615,16 @@ static int kvmgt_guest_init(struct mdev_device *mdev)
>   if (__kvmgt_vgpu_exist(vgpu, kvm))
>   return -EEXIST;
>  
> + if (!mmget_not_zero(kvm->mm)) {
> + gvt_vgpu_err("Can't get KVM mm reference\n");
> + return -EINVAL;
> + }
> +
>   info = vzalloc(sizeof(struct kvmgt_guest_info));
> - if (!info)
> + if (!info) {
> + mmput(kvm->mm);
>   return -ENOMEM;
> + }
>  
>   vgpu->handle = (unsigned long)info;
>   info->vgpu = vgpu;
> @@ -1647,6 +1655,8 @@ static bool kvmgt_guest_exit(struct kvmgt_guest_info 
> *info)
>   debugfs_remove(info->debugfs_cache_entries);
>  
>   kvm_page_track_unregister_notifier(info->kvm, >track_node);
> + if (info->kvm->mm)
> + mmput(info->kvm->mm);
>   kvm_put_kvm(info->kvm);
>   kvmgt_protect_table_destroy(info);
>   gvt_cache_destroy(info->vgpu);
> 




signature.asc
Description: OpenPGP digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Possible use_mm() mis-uses

2018-08-22 Thread Paolo Bonzini
On 22/08/2018 18:44, Linus Torvalds wrote:
> An example of something that *isn't* right, is the i915 kvm interface,
> which does
> 
> use_mm(kvm->mm);
> 
> on an mm that was initialized in virt/kvm/kvm_main.c using
> 
> mmgrab(current->mm);
> kvm->mm = current->mm;
> 
> which is *not* right. Using "mmgrab()" does indeed guarantee the
> lifetime of the 'struct mm_struct' itself, but it does *not* guarantee
> the lifetime of the page tables. You need to use "mmget()" and
> "mmput()", which get the reference to the actual process address
> space!
> 
> Now, it is *possible* that the kvm use is correct too, because kvm
> does register a mmu_notifier chain, and in theory you can avoid the
> proper refcounting by just making sure the mmu "release" notifier
> kills any existing uses, but I don't really see kvm doing that. Kvm
> does register a release notifier, but that just flushes the shadow
> page tables, it doesn't kill any use_mm() use by some i915 use case.

Yes, KVM is correct but the i915 bits are at least fishy.  It's probably
as simple as adding a mmget/mmput pair respectively in kvmgt_guest_init
and kvmgt_guest_exit, or maybe mmget_not_zero.

Paolo
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [GIT PULL] kvm/page_track changes for i915 KVMGT

2016-11-09 Thread Paolo Bonzini


On 09/11/2016 16:15, Daniel Vetter wrote:
> On Wed, Nov 09, 2016 at 03:25:19PM +0100, Paolo Bonzini wrote:
>> Daniel,
>>
>> The following changes since commit d9092f52d7e61dd1557f2db2400ddb430e85937e:
>>
>>   kvm: x86: Check memopp before dereference (CVE-2016-8630) (2016-11-02 
>> 21:31:53 +0100)
>>
>> are available in the git repository at:
>>
>>   git://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-kvmgt
>>
>> for you to fetch changes up to 871b7ef2a1850d0b435c8b324bf4a5d391adde3f:
>>
>>   kvm/page_track: export symbols for external usage (2016-11-04 12:13:20 
>> +0100)
> 
> Pulled into drm-intel, thanks. Please also pull this into your kvm-next
> tree to make sure we can land kvm/drm trees in any order for the 4.10
> merge window.

Yes, I've already done that (though I have not pushed yet).

Paolo

> Thanks, Daniel
> 
>>
>> 
>> The three KVM patches that KVMGT needs.
>>
>> 
>> Jike Song (2):
>>   kvm/page_track: call notifiers with kvm_page_track_notifier_node
>>   kvm/page_track: export symbols for external usage
>>
>> Xiaoguang Chen (1):
>>   KVM: x86: add track_flush_slot page track notifier
>>
>>  arch/x86/include/asm/kvm_page_track.h | 14 +-
>>  arch/x86/kvm/mmu.c| 11 ++-
>>  arch/x86/kvm/page_track.c | 31 ++-
>>  arch/x86/kvm/x86.c|  2 +-
>>  4 files changed, 54 insertions(+), 4 deletions(-)
> 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [GIT PULL] kvm/page_track changes for i915 KVMGT

2016-11-09 Thread Paolo Bonzini
Daniel,

The following changes since commit d9092f52d7e61dd1557f2db2400ddb430e85937e:

  kvm: x86: Check memopp before dereference (CVE-2016-8630) (2016-11-02 
21:31:53 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-kvmgt

for you to fetch changes up to 871b7ef2a1850d0b435c8b324bf4a5d391adde3f:

  kvm/page_track: export symbols for external usage (2016-11-04 12:13:20 +0100)


The three KVM patches that KVMGT needs.


Jike Song (2):
  kvm/page_track: call notifiers with kvm_page_track_notifier_node
  kvm/page_track: export symbols for external usage

Xiaoguang Chen (1):
  KVM: x86: add track_flush_slot page track notifier

 arch/x86/include/asm/kvm_page_track.h | 14 +-
 arch/x86/kvm/mmu.c| 11 ++-
 arch/x86/kvm/page_track.c | 31 ++-
 arch/x86/kvm/x86.c|  2 +-
 4 files changed, 54 insertions(+), 4 deletions(-)
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 0/2] extend page_track for external usage

2016-11-09 Thread Paolo Bonzini


On 09/11/2016 03:03, Zhenyu Wang wrote:
> On 2016.11.07 10:17:54 +0100, Daniel Vetter wrote:
 Paolo, for this case, do you think it's feasible we pick them through
 drm/i915 merge path? As currently initial KVMGT patch sets require these
 exported symbols, that's why I ask how we should handle this dependency.
>>>
>>> Then it's actually a good thing that I dropped from kvm/queue!  You can
>>> certainly include these patches, but please do that through a topic branch.
>>>
>>> I've prepared a branch for you
>>> (git://git.kernel.org/pub/scm/virt/kvm/kvm.git branch for-kvmgt).  Once
>>> Linus processes my outstanding pull request, the branch will only
>>> include the three page-tracking patches.  Please pull that topic branch
>>> into your own branch, and ensure you have a merge commit when you send
>>> the pull request to Daniel.  The merge commit ensures that the workflow
>>> was correct; use --no-ff if necessary.
>>>
>>> You can do the same for Jike's patches for the KVM-VFIO device, when
>>> Alex reviews them, and I suppose you'll need a topic branch for mdev
>>> too?  I didn't know that KVMGT was planned for 4.10.  In the future,
>>> let's synchronize ahead so that we can prepare topic branches for you.
>>
>> Ok, back from the useless wifi at plumbers, I can mail again. Zhenyu
>> confirmed on irc that the initial code pile only needs this. For the
>> cross-maintainer topic tree I prefer a formal pull request with stable
>> tag. Please also cc: intel-gfx on that, since I plan to merge that one
>> directly into i915.
>>
> 
> Paolo, could you help to do this for Daniel? Daniel would like to merge
> current KVMGT required change for KVM directly, then I'd base KVMGT change
> on that.

Yes, I'll send it today.

Paolo



signature.asc
Description: OpenPGP digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 02/10] mm: remove write/force parameters from __get_user_pages_unlocked()

2016-10-13 Thread Paolo Bonzini
= 0;
>   unsigned long max_pages_per_loop = PVM_MAX_KMALLOC_PAGES
>   / sizeof(struct pages *);
> + unsigned int flags = FOLL_REMOTE;
>  
>   /* Work out address and page range required */
>   if (len == 0)
>   return 0;
>   nr_pages = (addr + len - 1) / PAGE_SIZE - addr / PAGE_SIZE + 1;
>  
> + if (vm_write)
> + flags |= FOLL_WRITE;
> +
>   while (!rc && nr_pages && iov_iter_count(iter)) {
>   int pages = min(nr_pages, max_pages_per_loop);
>   size_t bytes;
> @@ -104,8 +108,7 @@ static int process_vm_rw_single_vec(unsigned long addr,
>* current/current->mm
>*/
>   pages = __get_user_pages_unlocked(task, mm, pa, pages,
> -   vm_write, 0, process_pages,
> -   FOLL_REMOTE);
> +   process_pages, flags);
>   if (pages <= 0)
>   return -EFAULT;
>  
> diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
> index db96688..8035cc1 100644
> --- a/virt/kvm/async_pf.c
> +++ b/virt/kvm/async_pf.c
> @@ -84,7 +84,8 @@ static void async_pf_execute(struct work_struct *work)
>* mm and might be done in another context, so we must
>* use FOLL_REMOTE.
>*/
> - __get_user_pages_unlocked(NULL, mm, addr, 1, 1, 0, NULL, FOLL_REMOTE);
> + __get_user_pages_unlocked(NULL, mm, addr, 1, NULL,
> + FOLL_WRITE | FOLL_REMOTE);
>  
>   kvm_async_page_present_sync(vcpu, apf);
>  
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 81dfc73..28510e7 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1416,10 +1416,15 @@ static int hva_to_pfn_slow(unsigned long addr, bool 
> *async, bool write_fault,
>   down_read(>mm->mmap_sem);
>   npages = get_user_page_nowait(addr, write_fault, page);
>   up_read(>mm->mmap_sem);
> - } else
> + } else {
> +     unsigned int flags = FOLL_TOUCH | FOLL_HWPOISON;
> +
> + if (write_fault)
> + flags |= FOLL_WRITE;
> +
>   npages = __get_user_pages_unlocked(current, current->mm, addr, 
> 1,
> -write_fault, 0, page,
> -FOLL_TOUCH|FOLL_HWPOISON);
> +page, flags);
> + }
>   if (npages != 1)
>   return npages;
>  
> 

Acked-by: Paolo Bonzini <pbonz...@redhat.com>
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Paolo Bonzini


On 19/11/2015 09:40, Gerd Hoffmann wrote:
>> > But this code should be
>> > minor to be maintained in libvirt.
> As far I know libvirt only needs to discover those devices.  If they
> look like sr/iov devices in sysfs this might work without any changes to
> libvirt.

I don't think they will look like SR/IOV devices.

The interface may look a little like the sysfs interface that GVT-g is
already using.  However, it should at least be extended to support
multiple vGPUs in a single VM.  This might not be possible for Intel
integrated graphics, but it should definitely be possible for discrete
graphics cards.

Another nit is that the VM id should probably be replaced by a UUID
(because it's too easy to stumble on an existing VM id), assuming a VM
id is needed at all.

Paolo
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Paolo Bonzini


On 19/11/2015 16:32, Stefano Stabellini wrote:
> > In addition to Kevin's replies, I have a high-level question: can VFIO
> > be used by QEMU for both KVM and Xen?
> 
> No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU
> is owned by Xen.

I don't think QEMU command line compatibility between KVM and Xen should
be a design goal for GVT-g.

Nevertheless, it shouldn't be a problem to use a "virtual" VFIO (which
doesn't need the IOMMU, because it uses the MMU in the physical GPU)
even under Xen.

Paolo
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

2014-12-10 Thread Paolo Bonzini


On 09/12/2014 03:49, Tian, Kevin wrote:
 - Now we have XenGT/KVMGT separately maintained, and KVMGT lags
 behind XenGT regarding to features and qualities. Likely you'll continue
 see stale code (like Xen inst decoder) for some time. In the future we
 plan to maintain a single kernel repo for both, so KVMGT can share
 same quality as XenGT once KVM in-kernel dm framework is stable.
 
 - Regarding to Qemu hacks, KVMGT really doesn't have any different 
 requirements as what have been discussed for GPU pass-through, e.g. 
 about ISA bridge. Our implementation is based on an old Qemu repo, 
 and honestly speaking not cleanly developed, because we know we
 can leverage from GPU pass-through support once it's in Qemu. At 
 that time we'll leverage the same logic with minimal changes to 
 hook KVMGT mgmt. APIs (e.g. create/destroy a vGPU instance). So
 we can ignore this area for now. :-)

Could the virtual device model introduce new registers in order to avoid
poking at the ISA bridge?  I'm not sure that you can leverage from GPU
pass-through support once it's in Qemu, since the Xen IGD passthrough
support is being added to a separate machine that is specific to Xen IGD
passthrough; no ISA bridge hacking will probably be allowed on the -M
pc and -M q35 machine types.

Paolo
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

2014-12-10 Thread Paolo Bonzini


On 11/12/2014 01:33, Tian, Kevin wrote:
 My point is that KVMGT doesn't introduce new requirements as what's
 required in IGD passthrough case, because all the hacks you see now
 is to satisfy guest graphics driver's expectation. I haven't follow up the
 KVM IGD passthrough progress, but if it doesn't require ISA bridge hacking
 the same trick can be adopted by KVMGT too.

Right now it did require ISA bridge hacking.

 You may know Allen is
 working on driver changes to avoid causing those hacks in Qemu side.
 That effort will benefit us too.

That's good to know, thanks!

Paolo
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

2014-12-05 Thread Paolo Bonzini


On 05/12/2014 09:50, Gerd Hoffmann wrote:
 A few comments on the kernel stuff (brief look so far, also
 compile-tested only, intel gfx on my test machine is too old).
 
  * Noticed the kernel bits don't even compile when configured as
module.  Everything (vgt, i915, kvm) must be compiled into the
kernel.

I'll add that the patch is basically impossible to review with all the
XenGT bits still in.  For example, the x86 emulator seems to be
unnecessary for KVMGT, but I am not 100% sure.

I would like a clear understanding of why/how Andrew Barnes was able to
do i915 passthrough (GVT-d) without hacking the ISA bridge, and why this
does not apply to GVT-g.

Paolo

  * Design approach still seems to be i915 on vgt not the other way
around.
 
 Qemu/SeaBIOS bits:
 
 I've seen the host bridge changes identity from i440fx to
 copy-pci-ids-from-host.  Guess the reason for this is that seabios uses
 this device to figure whenever it is running on i440fx or q35.  Correct?
 
 What are the exact requirements for the device?  Must it match the host
 exactly, to not confuse the guest intel graphics driver?  Or would
 something more recent -- such as the q35 emulation qemu has -- be good
 enough to make things work (assuming we add support for the
 graphic-related pci config space registers there)?
 
 The patch also adds a dummy isa bridge at 0x1f.  Simliar question here:
 What exactly is needed here?  Would things work if we simply use the q35
 lpc device here?
 
 more to come after I've read the paper linked above ...
 
 cheers,
   Gerd
 
 
 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx
 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [IGDVFIO] [PATCH 3/8] RFC and help completing: Intel IGD Direct Assignment with VFIO

2014-09-24 Thread Paolo Bonzini
Il 24/09/2014 21:47, Alex Williamson ha scritto:
 So the opregion is mapped by a config write on the IGD device itself and
 the other 3 regions, that we know about so far, are mapped via writes to
 the host bridge.

AFAIU the opregion is mapped by the (host) BIOS, that writes the address
to a well-known scratch dword in the configuration space.  The host
reads from the dword and finds the opregion that way.

Paolo
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [Qemu-devel] [IGDVFIO] [PATCH 3/8] RFC and help completing: Intel IGD Direct Assignment with VFIO

2014-09-24 Thread Paolo Bonzini
Il 24/09/2014 22:57, Alex Williamson ha scritto:
 Right, that's the physical mapping, Andy's patches are mimicking that
 behavior virtually.  Seabios reserves memory, creates e820 entries, and
 maps the hardware by writing to these registers.  That triggers QEMU
 to adjust the MemoryRegion in the guest address space which is an mmap
 to the host address space, using /dev/mem for now, but hopefully the
 vfio file descriptor in the future (I should be careful what I hope
 for).

Yeah, I remember discussing that with Andrew on IRC.  So he did
implement that idea.

 The opregion is pretty trivial because the write is to the IGD itself.
 The others are to the host bridge, so we need to figure out what sort of
 abstraction makes sense to get that back into vfio code.

Do we have to support all chipsets?  IIUC the more recent devices need
fewer and fewer backdoors.

Paolo

 Perhaps vfio creates all the memory regions and registers them into an
 igd service and the host bridge can make calls like:
 
 gtt = igd_get_gtt_mr();
 
 which returns NULL and nothing happens or the registered MemoryRegion
 and the host bridge places it into the address space.  Thanks,

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] ResettRe: [Xen-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support

2014-07-29 Thread Paolo Bonzini
Il 29/07/2014 08:59, Chen, Tiejun ha scritto:

 (see https://lkml.org/lkml/2014/6/19/121)
 gpu:drm:i915:intel_detect_pch: back to check devfn instead of check
 class
 type. Because Windows always use this way, so I think this point
 should be
 same between Linux and Windows.

 Didn't we discuss that we did not want to pass in PCH at all if we can?
 
 I'm a bit confused since I guess I'm missing something important in this
 long discussion. I guess we just fake a simple PCIe device but just has
 PCI_VENDOR_ID_XEN/Real_PCH_Device_ID, and then well probe such a PCIe
 device inside intel_detect_pch(), right?

Yes, for the special PIIX4 legacy machine type we want to do that and
place the device at 00:1f.0.

Paolo
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type

2014-07-07 Thread Paolo Bonzini

Il 07/07/2014 16:49, Daniel Vetter ha scritto:

So the correct fix to forward intel gpus to guests is indeed to somehow
fake the pch pci ids since the driver really needs them. Gross design, but
that's how the hardware works.


A way that could work for virtualization is this: if you find the card 
has a magic subsystem vendor id, fetch the subsystem device id and use 
_that_ as the PCH device id.


Would that work for you?

Paolo
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type

2014-07-07 Thread Paolo Bonzini

Il 07/07/2014 19:54, Daniel Vetter ha scritto:

On Mon, Jul 07, 2014 at 04:57:45PM +0200, Paolo Bonzini wrote:

Il 07/07/2014 16:49, Daniel Vetter ha scritto:

So the correct fix to forward intel gpus to guests is indeed to somehow
fake the pch pci ids since the driver really needs them. Gross design, but
that's how the hardware works.


A way that could work for virtualization is this: if you find the card has a
magic subsystem vendor id, fetch the subsystem device id and use _that_ as
the PCH device id.

Would that work for you?


I guess for quemu it also depends upon what windows does since we can't
change that. If we care about that part. Another consideration is
supporting older kernels, if that's possible at all.


Yes, but right now it's more important to get something that's not too 
gross for the future, for both Linux and Windows.  Hacks for existing 
guests can be done separately, especially since they might differ 
between Linux (check ISA bridge) and Windows (check 1f.0).


Paolo

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] ResettRe: [Xen-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support

2014-07-02 Thread Paolo Bonzini

Il 02/07/2014 18:23, Konrad Rzeszutek Wilk ha scritto:

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 651e65e..03f2829 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -433,6 +433,8 @@ void intel_detect_pch(struct drm_device *dev)
unsigned short id = pch-device  
INTEL_PCH_DEVICE_ID_MASK;
dev_priv-pch_id = id;

+   if (pch-subsystem_vendor == PCI_VENDOR_ID_XEN)
+   id = pch-device  INTEL_PCH_DEVICE_ID_MASK;


Actually you could look at *dev*'s subsystem IDs and skip the pch lookup 
completely.


Paolo


if (id == INTEL_PCH_IBX_DEVICE_ID_TYPE) {
dev_priv-pch_type = PCH_IBX;
DRM_DEBUG_KMS(Found Ibex Peak PCH\n);




___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type

2014-06-25 Thread Paolo Bonzini

Il 22/06/2014 10:25, Chen, Tiejun ha scritto:

In qemu-upstream, as you commented we can't create this as a ISA class
type explicitly.


Note I didn't say that QEMU doesn't like having two ISA bridges.

I commented that the firmware will see two ISA bridges and will try to 
initialize both of them.  It currently doesn't just because it only 
knows of two southbridge PCI IDs, and both of them are old or relatively 
old (PIIX3/4 and ICH9).


But what would happen if you did graphics passthrough on a machine with 
an ICH9 southbridge?  Your code will create the PIIX4 ISA bridge, and 
will create an ICH9 southbridge just to please the i915 driver.  The 
BIOS will recognize the ICH9's vendor/device ids and try to initialize 
it.  It will write to the configuration space to set up PCI PIRQ[A-H] 
routing, for example, which makes no sense since your ICH9 is just a 
dummy device.


Second problem.  Your IGD passthrough code currently works with QEMU's 
PIIX4-based machine.  But what happens if you try to extend it, so that 
it works with QEMU's ICH9-based machine?  That's a more modern machine 
that has a PCIe chipset and hence has its ISA bridge at 00:1f.0.  Now 
you have a conflict.


In other words, if you want IGD passthrough in QEMU, you need a solution 
that is future-proof.



So we compromise by faking this ISA bridge without ISA
class type setting (as I recall you already said this way is slightly
better).


It is only slightly better, but the right solution is to fix the driver. 
 There is absolutely zero reason why a graphics driver should know 
about the vendor/device ids of the PCH.


The right way could be to make QEMU add a vendor-specific capability to 
the video device. The driver can probe for that capability before 
looking at the PCI bus.  QEMU can add the capability to the list, it is 
easy because all accesses to the video device's configuration space trap 
to QEMU.  Then you do not need to add fake devices to the machine.


In fact, it would be nice if Intel added such a capability on the next 
generation of integrated graphics, too.  On real hardware, ACPI or some 
other kind of firmware can probe the PCH at 00:1f.0 and initialize that 
vendor-specific capability.  QEMU would just forward the value from the 
host, so that virtualized guests never depend on the PCH at 00:1f.0.


Paolo


Maybe we will figure better way in the future. But anyway, this
is always registered as 00:15.0, right? So I think the i915 driver can
go back to probe the devfn like the native environment.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type

2014-06-20 Thread Paolo Bonzini

Il 19/06/2014 11:53, Tiejun Chen ha scritto:

so this mean that isa bridge is still represented with Dev31:Func0
like the native OS. Furthermore, currently we're pushing VGA
passthrough support into qemu upstream, and with some discussion,
we wouldn't set the bridge class type and just expose this devfn.


Even this is not really optimal.  It just happens to work just because 
QEMU's machine is currently a PCI machine with the ISA bridge on 00:01.0.


As soon as you'll try doing integrated graphics passthrough on a PCIe 
machine type (such as QEMU's -M q35) things will break down just as badly.


Paolo
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx