[Intel-gfx] ✓ Fi.CI.BAT: success for enhanced i915 vgpu with PV feature support

2020-09-06 Thread Patchwork
== Series Details ==

Series: enhanced i915 vgpu with PV feature support
URL   : https://patchwork.freedesktop.org/series/81400/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8968 -> Patchwork_18446


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18446/index.html

Known issues
------------

  Here are the changes found in Patchwork_18446 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live@gem_contexts:
- fi-tgl-u2:  [PASS][1] -> [INCOMPLETE][2] ([i915#2045])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8968/fi-tgl-u2/igt@i915_selftest@live@gem_contexts.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18446/fi-tgl-u2/igt@i915_selftest@live@gem_contexts.html

  * igt@kms_cursor_legacy@basic-flip-after-cursor-atomic:
- fi-icl-u2:  [PASS][3] -> [DMESG-WARN][4] ([i915#1982]) +1 similar 
issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8968/fi-icl-u2/igt@kms_cursor_leg...@basic-flip-after-cursor-atomic.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18446/fi-icl-u2/igt@kms_cursor_leg...@basic-flip-after-cursor-atomic.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@c-hdmi-a2:
- fi-skl-guc: [PASS][5] -> [DMESG-WARN][6] ([i915#2203])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8968/fi-skl-guc/igt@kms_flip@basic-flip-vs-wf_vbl...@c-hdmi-a2.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18446/fi-skl-guc/igt@kms_flip@basic-flip-vs-wf_vbl...@c-hdmi-a2.html

  
#### Possible fixes ####

  * igt@i915_pm_rpm@basic-pci-d3-state:
- fi-bsw-kefka:   [DMESG-WARN][7] ([i915#1982]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8968/fi-bsw-kefka/igt@i915_pm_...@basic-pci-d3-state.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18446/fi-bsw-kefka/igt@i915_pm_...@basic-pci-d3-state.html

  
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2045]: https://gitlab.freedesktop.org/drm/intel/issues/2045
  [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203


Participating hosts (35 -> 32)
--

  Missing(3): fi-byt-clapper fi-byt-squawks fi-bsw-cyan 


Build changes
-

  * Linux: CI_DRM_8968 -> Patchwork_18446

  CI-20190529: 20190529
  CI_DRM_8968: 4d831cabbba82294ba008ec7999c71f443b1864f @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5779: f52bf19b5f02d52fc3e201c6467ec3f511227fba @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_18446: d4828a33c2335b6b8865993683f6821c6ce4ba4d @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

d4828a33c233 drm/i915/gvt: GVTg support pv workload submssion
9352c15a9bc8 drm/i915/gvt: GVTg support ggtt pv operations
0ab96123813a drm/i915/gvt: GVTg support ppgtt pv operations
faded3875072 drm/i915/gvt: GVTg support vgpu pv CTB protocol
b87ea2334728 drm/i915/gvt: GVTg handle guest shared_page setup
e104ce5af762 drm/i915/gvt: GVTg expose pv_caps PVINFO register
1d7f2ad07d7e drm/i915: vgpu workload submisison pv support
ebdd3ccc911e drm/i915: vgpu ggtt page table pv support
4de75f774c0a drm/i915: vgpu ppgtt page table pv support
3357dfa9bfcf drm/i915: vgpu pv command buffer transport protocol
6c29bb1bd899 drm/i915: vgpu shared memory setup for pv support
b72c9e3578ae drm/i915: introduced vgpu pv capability

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18446/index.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Fi.CI.DOCS: warning for enhanced i915 vgpu with PV feature support

2020-09-06 Thread Patchwork
== Series Details ==

Series: enhanced i915 vgpu with PV feature support
URL   : https://patchwork.freedesktop.org/series/81400/
State : warning

== Summary ==

$ make htmldocs 2>&1 > /dev/null | grep i915
./drivers/gpu/drm/i915/i915_vgpu.c:846: warning: Excess function parameter 
'dev_priv' description in 'intel_vgpu_detect_pv_caps'
./drivers/gpu/drm/i915/i915_vgpu.c:547: warning: Function parameter or member 
'pv' not described in 'pv_command_buffer_write'
./drivers/gpu/drm/i915/i915_vgpu.c:547: warning: Function parameter or member 
'action' not described in 'pv_command_buffer_write'
./drivers/gpu/drm/i915/i915_vgpu.c:547: warning: Function parameter or member 
'len' not described in 'pv_command_buffer_write'
./drivers/gpu/drm/i915/i915_vgpu.c:547: warning: Function parameter or member 
'fence' not described in 'pv_command_buffer_write'
./drivers/gpu/drm/i915/i915_vgpu.c:847: warning: Function parameter or member 
'i915' not described in 'intel_vgpu_detect_pv_caps'
./drivers/gpu/drm/i915/i915_vgpu.c:847: warning: Function parameter or member 
'shared_area' not described in 'intel_vgpu_detect_pv_caps'
./drivers/gpu/drm/i915/i915_vgpu.c:847: warning: Excess function parameter 
'dev_priv' description in 'intel_vgpu_detect_pv_caps'
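The htmldocs warnings come from kernel-doc comments that went stale when the `dev_priv` parameter was renamed to `i915`, plus an entirely undocumented `pv_command_buffer_write`. A sketch of the shape kernel-doc expects for `intel_vgpu_detect_pv_caps` (the parameter names are taken from the warnings; the description wording is assumed):

```c
/**
 * intel_vgpu_detect_pv_caps - detect i915 vgpu PV capabilities
 * @i915: i915 device private
 * @shared_area: mapped PVINFO page used to negotiate PV features with GVTg
 *
 * Return: true if the host granted any of the requested PV capabilities.
 */
```

Every parameter must have an `@name:` line matching the current signature; a leftover `@dev_priv:` line is what triggers the "Excess function parameter" warning.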




[Intel-gfx] ✗ Fi.CI.SPARSE: warning for enhanced i915 vgpu with PV feature support

2020-09-06 Thread Patchwork
== Series Details ==

Series: enhanced i915 vgpu with PV feature support
URL   : https://patchwork.freedesktop.org/series/81400/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
+drivers/gpu/drm/i915/gt/intel_reset.c:1311:5: warning: context imbalance in 
'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/i915_perf.c:1440:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/i915_perf.c:1494:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/intel_pv_submission.c:275:6: warning: symbol 
'pv_submit_request' was not declared. Should it be static?
+./include/linux/seqlock.h:752:24: warning: trying to copy expression type 31
+./include/linux/seqlock.h:778:16: warning: trying to copy expression type 31
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read16' 
- different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read32' 
- different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read64' 
- different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read8' - 
different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write16' 
- different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write32' 
- different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write8' 
- different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write16' 
- different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write32' 
- different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write8' 
- different lock contexts for basic block




[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for enhanced i915 vgpu with PV feature support

2020-09-06 Thread Patchwork
== Series Details ==

Series: enhanced i915 vgpu with PV feature support
URL   : https://patchwork.freedesktop.org/series/81400/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
b72c9e3578ae drm/i915: introduced vgpu pv capability
-:98: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#98: FILE: drivers/gpu/drm/i915/i915_vgpu.c:144:
+static bool intel_vgpu_check_pv_cap(struct drm_i915_private *dev_priv,
+   enum pv_caps cap)

-:101: CHECK:LOGICAL_CONTINUATIONS: Logical continuations should be on the 
previous line
#101: FILE: drivers/gpu/drm/i915/i915_vgpu.c:147:
+   return (dev_priv->vgpu.active && (dev_priv->vgpu.caps & VGT_CAPS_PV)
+   && (dev_priv->vgpu.pv_caps & cap));

-:125: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#125: FILE: drivers/gpu/drm/i915/i915_vgpu.c:366:
+void intel_vgpu_config_pv_caps(struct drm_i915_private *i915,
+   enum pv_caps cap, void *data)

-:127: CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#127: FILE: drivers/gpu/drm/i915/i915_vgpu.c:368:
+{
+

-:140: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#140: FILE: drivers/gpu/drm/i915/i915_vgpu.c:381:
+bool intel_vgpu_detect_pv_caps(struct drm_i915_private *i915,
+   void __iomem *shared_area)

-:181: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#181: FILE: drivers/gpu/drm/i915/i915_vgpu.h:49:
+bool intel_vgpu_detect_pv_caps(struct drm_i915_private *i915,
+   void __iomem *shared_area);

-:183: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#183: FILE: drivers/gpu/drm/i915/i915_vgpu.h:51:
+void intel_vgpu_config_pv_caps(struct drm_i915_private *i915,
+   enum pv_caps cap, void *data);

total: 0 errors, 0 warnings, 7 checks, 137 lines checked
6c29bb1bd899 drm/i915: vgpu shared memory setup for pv support
-:104: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#104: FILE: drivers/gpu/drm/i915/i915_vgpu.c:377:
+static int intel_vgpu_setup_shared_page(struct drm_i915_private *i915,
+   void __iomem *shared_area)

-:162: CHECK:ALLOC_SIZEOF_STRUCT: Prefer kzalloc(sizeof(*pv)...) over 
kzalloc(sizeof(struct i915_virtual_gpu_pv)...)
#162: FILE: drivers/gpu/drm/i915/i915_vgpu.c:435:
+   pv = kzalloc(sizeof(struct i915_virtual_gpu_pv), GFP_KERNEL);

total: 0 errors, 0 warnings, 2 checks, 172 lines checked
3357dfa9bfcf drm/i915: vgpu pv command buffer transport protocol
-:52: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#52: FILE: drivers/gpu/drm/i915/i915_vgpu.c:389:
+static int wait_for_desc_update(struct vgpu_pv_ct_buffer_desc *desc,
+   u32 fence, u32 *status)

-:64: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#64: FILE: drivers/gpu/drm/i915/i915_vgpu.c:401:
+   DRM_ERROR("CT: fence %u failed; reported fence=%u\n",
+   fence, desc->fence);

-:89: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#89: FILE: drivers/gpu/drm/i915/i915_vgpu.c:426:
+static int pv_command_buffer_write(struct i915_virtual_gpu_pv *pv,
+   const u32 *action, u32 len /* in dwords */, u32 fence)

-:150: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#150: FILE: drivers/gpu/drm/i915/i915_vgpu.c:487:
+static int pv_send(struct drm_i915_private *i915,
+   const u32 *action, u32 len, u32 *status)

-:185: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#185: FILE: drivers/gpu/drm/i915/i915_vgpu.c:522:
+static int intel_vgpu_pv_send_command_buffer(struct drm_i915_private *i915,
+   u32 *action, u32 len)

-:201: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#201: FILE: drivers/gpu/drm/i915/i915_vgpu.c:538:
+   DRM_ERROR("PV: send action %#x returned %d (%#x)\n",
+   action[0], ret, ret);

-:251: CHECK:SPACING: spaces preferred around that '/' (ctx:VxV)
#251: FILE: drivers/gpu/drm/i915/i915_vgpu.c:627:
+   pv->ctb.desc->size = PAGE_SIZE/2;
  ^

-:270: CHECK:SPACING: spaces preferred around that '/' (ctx:VxV)
#270: FILE: drivers/gpu/drm/i915/i915_vgpu.h:34:
+#define PV_DESC_OFF (PAGE_SIZE/256)
   ^

-:271: CHECK:SPACING: spaces preferred around that '/' (ctx:VxV)
#271: FILE: drivers/gpu/drm/i915/i915_vgpu.h:35:
+#define PV_CMD_OFF  (PAGE_SIZE/2)
   ^

total: 0 errors, 0 warnings, 9 checks, 299 lines checked
4de75f774c0a drm/i915: vgpu ppgtt page table pv support
-:55: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#55: FILE: drivers/gpu/drm/i915/i915_vgpu.c:377:
+static int vgpu_pv_vma_action(struct i915_address_space *vm,
+   struct i915_vma *vma,

-:87: CHECK:COMPARISON_TO_NULL: Comparison to NULL could be

[Intel-gfx] [PATCH v1 12/12] drm/i915/gvt: GVTg support pv workload submission

2020-09-06 Thread Xiaolin Zhang
implemented pv workload submission support within GVTg.

GVTg reads the engine submission data (engine lrc) from the shared_page
via the pv interface to reduce mmio trap cost, and eliminates execlist
HW behavior emulation by no longer injecting context switch interrupts
into the guest under workload submission pv mode, improving efficiency.

Signed-off-by: Xiaolin Zhang 
---
 drivers/gpu/drm/i915/gvt/gvt.h  |   1 +
 drivers/gpu/drm/i915/gvt/handlers.c | 101 
 drivers/gpu/drm/i915/gvt/vgpu.c |   1 +
 3 files changed, 103 insertions(+)

diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index 05c2f13..18c0926 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -217,6 +217,7 @@ struct intel_vgpu {
u32 pv_caps;
u64 shared_page_gpa;
bool shared_page_enabled;
+   u64 pv_sub_gpa;
 };
 
 static inline void *intel_vgpu_vdev(struct intel_vgpu *vgpu)
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c 
b/drivers/gpu/drm/i915/gvt/handlers.c
index f1ad024..399427d 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -1323,6 +1323,7 @@ static int pv_command_buffer_read(struct intel_vgpu *vgpu,
 
 static int handle_pv_commands(struct intel_vgpu *vgpu)
 {
+   struct pv_cap_addr *cap_addr;
struct intel_vgpu_mm *mm;
struct pv_vma *vma;
u64 pdp;
@@ -1336,6 +1337,17 @@ static int handle_pv_commands(struct intel_vgpu *vgpu)
return ret;
 
switch (cmd) {
+   case PV_CMD_REGISTER_CAP_GPA:
+   cap_addr = (struct pv_cap_addr *)data;
+   switch (cap_addr->cap) {
+   case PV_SUBMISSION:
+   vgpu->pv_sub_gpa = cap_addr->gpa;
+   break;
+   default:
+   gvt_vgpu_err("invalid pv cap 0x%x\n", cap_addr->cap);
+   break;
+   }
+   break;
case PV_CMD_BIND_PPGTT:
case PV_CMD_UNBIND_PPGTT:
vma = (struct pv_vma *)data;
@@ -1858,6 +1870,91 @@ static int mmio_read_from_hw(struct intel_vgpu *vgpu,
return intel_vgpu_default_mmio_read(vgpu, offset, p_data, bytes);
 }
 
+static int pv_prepare_workload(struct intel_vgpu_workload *workload)
+{
+   return 0;
+}
+
+static int pv_complete_workload(struct intel_vgpu_workload *workload)
+{
+   return 0;
+}
+
+static int submit_context_pv(struct intel_vgpu *vgpu,
+ const struct intel_engine_cs *engine,
+ struct execlist_ctx_descriptor_format *desc,
+ bool emulate_schedule_in)
+{
+   struct intel_vgpu_workload *workload = NULL;
+
+   workload = intel_vgpu_create_workload(vgpu, engine, desc);
+   if (IS_ERR(workload))
+   return PTR_ERR(workload);
+
+   workload->prepare = pv_prepare_workload;
+   workload->complete = pv_complete_workload;
+
+   intel_vgpu_queue_workload(workload);
+   return 0;
+}
+
+#define get_desc_from_elsp_dwords(ed, i) \
+   ((struct execlist_ctx_descriptor_format *)&((ed)->data[i * 2]))
+
+static int handle_pv_submission(struct intel_vgpu *vgpu,
+   const struct intel_engine_cs *engine)
+{
+   struct intel_vgpu_execlist *execlist;
+   struct pv_submission subdata;
+   struct execlist_ctx_descriptor_format *desc[2];
+   u32 ring_id = engine->id;
+   u64 base = vgpu->pv_sub_gpa + ring_id * sizeof(struct pv_submission);
+   u64 submit_off = offsetof(struct pv_submission, submitted) + base;
+   bool submitted = false;
+   int i, ret;
+
+   execlist = &vgpu->submission.execlist[ring_id];
+   if (intel_gvt_hypervisor_read_gpa(vgpu, base, &subdata, 
sizeof(subdata)))
+   return -EINVAL;
+
+   desc[0] = (struct execlist_ctx_descriptor_format *)&(subdata.descs[0]);
+   desc[1] = (struct execlist_ctx_descriptor_format *)&(subdata.descs[1]);
+
+   if (!desc[0]->valid) {
+   gvt_vgpu_err("invalid elsp submission, desc0 is invalid\n");
+   goto inv_desc;
+   }
+
+   for (i = 0; i < ARRAY_SIZE(desc); i++) {
+   if (!desc[i]->valid)
+   continue;
+   if (!desc[i]->privilege_access) {
+   gvt_vgpu_err("unexpected GGTT elsp submission\n");
+   goto inv_desc;
+   }
+   }
+
+   /* submit workload */
+   for (i = 0; i < ARRAY_SIZE(desc); i++) {
+   if (!desc[i]->valid)
+   continue;
+
+   ret = submit_context_pv(vgpu, engine, desc[i], i == 0);
+   if (ret) {
+   gvt_vgpu_err("failed to submit desc %d\n", i);
+   return ret;
+   }
+   }
+
+   ret = intel_gvt_hypervisor_write_gpa(vgpu, submit_off, &submitted, 1);
+   return ret;
+
+inv_desc:
+   gvt_vgpu

[Intel-gfx] [PATCH v1 03/12] drm/i915: vgpu pv command buffer transport protocol

2020-09-06 Thread Xiaolin Zhang
based on the common shared memory, a vgpu pv command transport buffer (CTB)
protocol is implemented: a simple pv command buffer ring, with pv command
descriptors, used to perform single-direction guest-to-GVT communication
between the guest and host GVTg.

with this CTB, the guest can send a PV command together with PV data for
the host to execute.

Signed-off-by: Xiaolin Zhang 
---
 drivers/gpu/drm/i915/i915_pvinfo.h |   1 +
 drivers/gpu/drm/i915/i915_vgpu.c   | 195 -
 drivers/gpu/drm/i915/i915_vgpu.h   |  53 ++
 3 files changed, 247 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pvinfo.h 
b/drivers/gpu/drm/i915/i915_pvinfo.h
index 1d44876..ded93c5 100644
--- a/drivers/gpu/drm/i915/i915_pvinfo.h
+++ b/drivers/gpu/drm/i915/i915_pvinfo.h
@@ -49,6 +49,7 @@ enum vgt_g2v_type {
VGT_G2V_EXECLIST_CONTEXT_CREATE,
VGT_G2V_EXECLIST_CONTEXT_DESTROY,
VGT_G2V_SHARED_PAGE_REGISTER,
+   VGT_G2V_PV_SEND_TRIGGER,
VGT_G2V_MAX,
 };
 
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index 8b2b451..e856eff 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -370,6 +370,183 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt)
  * i915 vgpu PV support for Linux
  */
 
+/**
+ * wait_for_desc_update - Wait for the command buffer descriptor update.
+ * @desc:  buffer descriptor
+ * @fence: response fence
+ * @status:placeholder for status
+ *
+ * GVTg will update command buffer descriptor with new fence and status
+ * after processing the command identified by the fence. Wait for
+ * specified fence and then read from the descriptor status of the
+ * command.
+ *
+ * Return:
+ * *   0 response received (status is valid)
+ * *   -ETIMEDOUT no response within hardcoded timeout
+ */
+static int wait_for_desc_update(struct vgpu_pv_ct_buffer_desc *desc,
+   u32 fence, u32 *status)
+{
+   int err;
+
+#define done (READ_ONCE(desc->fence) == fence)
+   err = wait_for_us(done, 5);
+   if (err)
+   err = wait_for(done, 10);
+#undef done
+
+   if (unlikely(err)) {
+   DRM_ERROR("CT: fence %u failed; reported fence=%u\n",
+   fence, desc->fence);
+   }
+
+   *status = desc->status;
+   return err;
+}
+
+/**
+ * CTB Guest to GVT request
+ *
+ * Format of the CTB Guest to GVT request message is as follows::
+ *
+ *  ++-+-+-+-+
+ *  |   msg[0]   |   [1]   |   [2]   |   ...   |  [n-1]  |
+ *  ++-+-+-+-+
+ *  |   MESSAGE  |   MESSAGE PAYLOAD |
+ *  +   HEADER   +-+-+-+-+
+ *  ||0|1|   ...   |n|
+ *  ++=+=+=+=+
+ *  |  len >= 1  |  FENCE  | request specific data   |
+ *  +--+-+-+-+-+-+
+ *
+ *   ^-len---^
+ */
+static int pv_command_buffer_write(struct i915_virtual_gpu_pv *pv,
+   const u32 *action, u32 len /* in dwords */, u32 fence)
+{
+   struct vgpu_pv_ct_buffer_desc *desc = pv->ctb.desc;
+   u32 head = desc->head / 4;  /* in dwords */
+   u32 tail = desc->tail / 4;  /* in dwords */
+   u32 size = desc->size / 4;  /* in dwords */
+   u32 used;   /* in dwords */
+   u32 header;
+   u32 *cmds = pv->ctb.cmds;
+   unsigned int i;
+
+   GEM_BUG_ON(desc->size % 4);
+   GEM_BUG_ON(desc->head % 4);
+   GEM_BUG_ON(desc->tail % 4);
+   GEM_BUG_ON(tail >= size);
+
+/* tail == head condition indicates empty */
+   if (tail < head)
+   used = (size - head) + tail;
+   else
+   used = tail - head;
+
+   /* make sure there is a space including extra dw for the fence */
+   if (unlikely(used + len + 1 >= size))
+   return -ENOSPC;
+
+   /*
+* Write the message. The format is the following:
+* DW0: header (including action code)
+* DW1: fence
+* DW2+: action data
+*/
+   header = (len << PV_CT_MSG_LEN_SHIFT) |
+(PV_CT_MSG_WRITE_FENCE_TO_DESC) |
+(action[0] << PV_CT_MSG_ACTION_SHIFT);
+
+   cmds[tail] = header;
+   tail = (tail + 1) % size;
+
+   cmds[tail] = fence;
+   tail = (tail + 1) % size;
+
+   for (i = 1; i < len; i++) {
+   cmds[tail] = action[i];
+   tail = (tail + 1) % size;
+   }
+
+   /* now update desc tail (back in bytes) */
+   desc->tail = tail * 4;
+   GEM_BUG_ON(desc->tail > desc->size);
+
+   return 0;
+}
+
+static u32 pv_get_next_fence(struct i915_virtual_gpu_pv *pv)
+{
+   /* For now it's trivial */
+   return ++pv->next_fence;
+}
+
+

[Intel-gfx] [PATCH v1 05/12] drm/i915: vgpu ggtt page table pv support

2020-09-06 Thread Xiaolin Zhang
to improve efficiency and reduce the complexity of vgpu ggtt support,
vgpu ggtt page table operations are implemented in a pv fashion, and
pv versions of bind/unbind are implemented for the ggtt vma ops.

The pv versions of the ggtt vma ops use the CTB protocol to send a pv ggtt
command, along with a struct pv_vma payload, from the guest to GVT; GVT
then implements handlers for PV_CMD_BIND_GGTT and PV_CMD_UNBIND_GGTT to
support the vgpu GGTT feature.

A new PV_GGTT pv_cap controls this level of pv support on both the guest
and host sides.

Signed-off-by: Xiaolin Zhang 
---
 drivers/gpu/drm/i915/i915_gem.c  |  4 +++-
 drivers/gpu/drm/i915/i915_vgpu.c | 37 -
 drivers/gpu/drm/i915/i915_vgpu.h |  3 +++
 3 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bb0c129..77cd09b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1129,9 +1129,11 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
int ret;
 
/* We need to fallback to 4K pages if host doesn't support huge gtt. */
-   if (intel_vgpu_active(dev_priv) && !intel_vgpu_has_huge_gtt(dev_priv))
+   if (intel_vgpu_active(dev_priv)) {
mkwrite_device_info(dev_priv)->page_sizes =
I915_GTT_PAGE_SIZE_4K;
+   intel_vgpu_config_pv_caps(dev_priv, PV_GGTT, &dev_priv->ggtt);
+   }
 
ret = i915_gem_init_userptr(dev_priv);
if (ret)
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index 9875e2f..4e50694 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -100,7 +100,7 @@ void intel_vgpu_detect(struct drm_i915_private *dev_priv)
mutex_init(&dev_priv->vgpu.lock);
 
/* guest driver PV capability */
-   dev_priv->vgpu.pv_caps = PV_PPGTT;
+   dev_priv->vgpu.pv_caps = PV_PPGTT | PV_GGTT;
 
if (!intel_vgpu_detect_pv_caps(dev_priv, shared_area)) {
DRM_INFO("Virtual GPU for Intel GVT-g detected.\n");
@@ -458,6 +458,34 @@ static void ppgtt_unbind_vma_pv(struct i915_address_space 
*vm,
vgpu_pv_vma_action(vm, vma, PV_CMD_UNBIND_PPGTT, 0, 0);
 }
 
+static void ggtt_bind_vma_pv(struct i915_address_space *vm,
+ struct i915_vm_pt_stash *stash,
+ struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags)
+{
+   struct drm_i915_gem_object *obj = vma->obj;
+   u32 pte_flags;
+
+   if (i915_vma_is_bound(vma, ~flags & I915_VMA_BIND_MASK))
+   return;
+
+   /* Applicable to VLV (gen8+ do not support RO in the GGTT) */
+   pte_flags = 0;
+   if (i915_gem_object_is_readonly(obj))
+   pte_flags |= PTE_READ_ONLY;
+
+   pte_flags = vma->vm->pte_encode(0, cache_level, pte_flags);
+   vgpu_pv_vma_action(vm, vma, PV_CMD_BIND_GGTT, 0, pte_flags);
+   vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+}
+
+static void ggtt_unbind_vma_pv_nop(struct i915_address_space *vm,
+   struct i915_vma *vma)
+{
+
+}
+
 /**
  * wait_for_desc_update - Wait for the command buffer descriptor update.
  * @desc:  buffer descriptor
@@ -733,6 +761,7 @@ void intel_vgpu_config_pv_caps(struct drm_i915_private 
*i915,
enum pv_caps cap, void *data)
 {
struct i915_ppgtt *ppgtt;
+   struct i915_ggtt *ggtt;
 
if (!intel_vgpu_check_pv_cap(i915, cap))
return;
@@ -742,6 +771,12 @@ void intel_vgpu_config_pv_caps(struct drm_i915_private 
*i915,
ppgtt->vm.vma_ops.bind_vma= ppgtt_bind_vma_pv;
ppgtt->vm.vma_ops.unbind_vma  = ppgtt_unbind_vma_pv;
}
+
+   if (cap == PV_GGTT) {
+   ggtt = (struct i915_ggtt *)data;
+   ggtt->vm.vma_ops.bind_vma= ggtt_bind_vma_pv;
+   ggtt->vm.vma_ops.unbind_vma  = ggtt_unbind_vma_pv_nop;
+   }
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/i915_vgpu.h b/drivers/gpu/drm/i915/i915_vgpu.h
index 7e4ea99..588e361 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.h
+++ b/drivers/gpu/drm/i915/i915_vgpu.h
@@ -38,6 +38,7 @@ struct i915_ggtt;
 enum pv_caps {
PV_NONE = 0,
PV_PPGTT = BIT(0),
+   PV_GGTT = BIT(1),
 };
 
 /* vgpu PV commands */
@@ -45,6 +46,8 @@ enum intel_vgpu_pv_cmd {
PV_CMD_DEFAULT = 0x0,
PV_CMD_BIND_PPGTT,
PV_CMD_UNBIND_PPGTT,
+   PV_CMD_BIND_GGTT,
+   PV_CMD_UNBIND_GGTT,
 };
 
 /* A common shared page(4KB) between GVTg and vgpu allocated by guest */
-- 
2.7.4



[Intel-gfx] [PATCH v1 04/12] drm/i915: vgpu ppgtt page table pv support

2020-09-06 Thread Xiaolin Zhang
to improve efficiency and reduce the complexity of vgpu ppgtt support,
vgpu ppgtt page table operations are implemented in a pv fashion, and
pv versions of bind/unbind are implemented for the ppgtt vma ops.

The pv versions of the ppgtt vma ops use the CTB protocol to send a pv
ppgtt command, along with a struct pv_vma payload, from the guest to GVT;
GVT then implements handlers for PV_CMD_BIND_PPGTT and PV_CMD_UNBIND_PPGTT
to support the vgpu PPGTT feature.

A new PV_PPGTT pv_cap controls this level of pv support on both the guest
and host sides.

Signed-off-by: Xiaolin Zhang 
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c |  4 +-
 drivers/gpu/drm/i915/i915_vgpu.c | 95 
 drivers/gpu/drm/i915/i915_vgpu.h | 17 +++
 3 files changed, 115 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index eb64f47..de0eb6d 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -729,8 +729,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 
ppgtt->vm.pte_encode = gen8_pte_encode;
 
-   if (intel_vgpu_active(gt->i915))
+   if (intel_vgpu_active(gt->i915)) {
+   intel_vgpu_config_pv_caps(gt->i915, PV_PPGTT, ppgtt);
gen8_ppgtt_notify_vgt(ppgtt, true);
+   }
 
ppgtt->vm.cleanup = gen8_ppgtt_cleanup;
 
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index e856eff..9875e2f 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -99,6 +99,9 @@ void intel_vgpu_detect(struct drm_i915_private *dev_priv)
dev_priv->vgpu.active = true;
mutex_init(&dev_priv->vgpu.lock);
 
+   /* guest driver PV capability */
+   dev_priv->vgpu.pv_caps = PV_PPGTT;
+
if (!intel_vgpu_detect_pv_caps(dev_priv, shared_area)) {
DRM_INFO("Virtual GPU for Intel GVT-g detected.\n");
goto out;
@@ -370,6 +373,91 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt)
  * i915 vgpu PV support for Linux
  */
 
+static int vgpu_pv_vma_action(struct i915_address_space *vm,
+   struct i915_vma *vma,
+   u32 action, u64 flags, u64 pte_flag)
+{
+   struct drm_i915_private *i915 = vma->vm->i915;
+   struct sgt_iter sgt_iter;
+   dma_addr_t addr;
+   struct pv_vma pvvma;
+   u32 num_pages;
+   u64 *gpas;
+   int i = 0;
+   u32 data[32];
+   int ret;
+   u32 size = sizeof(pvvma) / 4;
+
+   if (1 + size > ARRAY_SIZE(data))
+   return -EIO;
+
+   memset(&pvvma, 0, sizeof(pvvma));
+   num_pages = vma->node.size >> PAGE_SHIFT;
+   pvvma.size = num_pages;
+   pvvma.start = vma->node.start;
+   pvvma.flags = flags;
+
+   if (action == PV_CMD_BIND_PPGTT || action == PV_CMD_UNBIND_PPGTT)
+   pvvma.pml4 = px_dma(i915_vm_to_ppgtt(vm)->pd);
+
+   if (num_pages == 1) {
+   pvvma.dma_addrs = vma->pages->sgl->dma_address | pte_flag;
+   goto out;
+   }
+
+   gpas = kmalloc_array(num_pages, sizeof(u64), GFP_KERNEL);
+   if (gpas == NULL)
+   return -ENOMEM;
+
+   pvvma.dma_addrs = virt_to_phys((void *)gpas);
+   for_each_sgt_daddr(addr, sgt_iter, vma->pages)
+   gpas[i++] = addr | pte_flag;
+
+   /* Fill the allocated but "unused" space beyond the end of the buffer */
+   while (i < num_pages)
+   gpas[i++] = vm->scratch[0]->encode;
+out:
+   data[0] = action;
+   memcpy(&data[1], &pvvma, sizeof(pvvma));
+   ret = i915->vgpu.pv->send(i915, data, 1 + size);
+
+   if (num_pages > 1)
+   kfree(gpas);
+
+   return ret;
+}
+
+static void ppgtt_bind_vma_pv(struct i915_address_space *vm,
+   struct i915_vm_pt_stash *stash,
+   struct i915_vma *vma,
+   enum i915_cache_level cache_level,
+   u32 flags)
+{
+   u32 pte_flags;
+   u64 pte_encode;
+
+   if (!test_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma))) {
+   set_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma));
+   flags |= BIT(I915_VMA_ALLOC_BIT);
+   }
+
+   /* Applicable to VLV, and gen8+ */
+   pte_flags = 0;
+   if (i915_gem_object_is_readonly(vma->obj))
+   pte_flags |= PTE_READ_ONLY;
+
+   pte_encode = vma->vm->pte_encode(0, cache_level, pte_flags);
+
+   vgpu_pv_vma_action(vm, vma, PV_CMD_BIND_PPGTT, flags, pte_encode);
+}
+
+static void ppgtt_unbind_vma_pv(struct i915_address_space *vm,
+   struct i915_vma *vma)
+{
+   if (test_and_clear_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma)))
+   vgpu_pv_vma_action(vm, vma, PV_CMD_UNBIND_PPGTT, 0, 0);
+}
+
 /**
  * wait_for_desc_update - Wait for the command buffer descriptor update.
  * @desc:  buffer descriptor
@@ -644,9 +732,16 @@ static int intel_vgpu_setup_share

[Intel-gfx] [PATCH v1 11/12] drm/i915/gvt: GVTg support ggtt pv operations

2020-09-06 Thread Xiaolin Zhang
This patch handles the PV_CMD_BIND_GGTT and PV_CMD_UNBIND_GGTT commands.

With PV GGTT, bind/unbind is performed per VMA instead of per GGTT-entry
MMIO update, which improves efficiency

Signed-off-by: Xiaolin Zhang 
---
 drivers/gpu/drm/i915/gvt/gtt.c  | 83 +
 drivers/gpu/drm/i915/gvt/handlers.c |  4 ++
 drivers/gpu/drm/i915/gvt/vgpu.c |  2 +-
 3 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
index c13560a..c79171f 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.c
+++ b/drivers/gpu/drm/i915/gvt/gtt.c
@@ -2732,6 +2732,83 @@ static void intel_vgpu_pv_ppgtt_unbind(struct intel_vgpu 
*vgpu,
 
 }
 
+static int intel_vgpu_pv_ggtt_bind(struct intel_vgpu *vgpu,
+   struct pv_vma *vma, u64 *gpas)
+{
+   u64 off = (vma->start / I915_GTT_PAGE_SIZE) << 3;
+   u32 size = vma->size;
+   struct intel_vgpu_mm *ggtt_mm = vgpu->gtt.ggtt_mm;
+   struct intel_gvt_gtt_pte_ops *ops = vgpu->gvt->gtt.pte_ops;
+   unsigned long g_gtt_index = off >> 3;
+   struct intel_gvt_gtt_entry e = {.val64 = 0, .type = GTT_TYPE_GGTT_PTE};
+   struct intel_gvt_gtt_entry m = {.val64 = 0, .type = GTT_TYPE_GGTT_PTE};
+   int ret = 0;
+   int i;
+   u64 gfn;
+   dma_addr_t dma_addr;
+
+   for (i = 0; i < size; i++) {
+   e.val64 = gpas[i];
+   if (!ops->test_present(&e)) {
+   ops->set_pfn(&m, vgpu->gvt->gtt.scratch_mfn);
+   ops->clear_present(&m);
+   goto out;
+   }
+
+   gfn = ops->get_pfn(&e);
+   m.val64 = e.val64;
+   ret = intel_gvt_hypervisor_dma_map_guest_page(vgpu,
+   gfn, PAGE_SIZE, &dma_addr);
+   if (ret) {
+   gvt_vgpu_err("failed to map guest ggtt entry\n");
+   ops->set_pfn(&m, vgpu->gvt->gtt.scratch_mfn);
+   } else
+   ops->set_pfn(&m, dma_addr >> PAGE_SHIFT);
+out:
+   g_gtt_index = off >> 3;
+   ggtt_set_guest_entry(ggtt_mm, &e, g_gtt_index);
+   ggtt_get_host_entry(ggtt_mm, &e, g_gtt_index);
+   ggtt_invalidate_pte(vgpu, &e);
+   ggtt_set_host_entry(ggtt_mm, &m, g_gtt_index);
+   off += 8;
+   }
+
+   ggtt_invalidate(vgpu->gvt->gt);
+   return ret;
+}
+
+
+static int intel_vgpu_pv_ggtt_unbind(struct intel_vgpu *vgpu,
+   struct pv_vma *vma, u64 *gpas)
+{
+   u64 off = (vma->start / I915_GTT_PAGE_SIZE) << 3;
+   u32 size = vma->size;
+   struct intel_vgpu_mm *ggtt_mm = vgpu->gtt.ggtt_mm;
+   struct intel_gvt_gtt_pte_ops *ops = vgpu->gvt->gtt.pte_ops;
+   unsigned long g_gtt_index = off >> 3;
+   struct intel_gvt_gtt_entry e = {.val64 = 0, .type = GTT_TYPE_GGTT_PTE};
+   struct intel_gvt_gtt_entry m = {.val64 = 0, .type = GTT_TYPE_GGTT_PTE};
+   int ret = 0;
+   int i;
+
+   for (i = 0; i < size; i++) {
+   g_gtt_index = off >> 3;
+   e.val64 = gpas[i];
+   ggtt_invalidate_pte(vgpu, &e);
+   ops->clear_present(&e);
+   ggtt_set_guest_entry(ggtt_mm, &e, g_gtt_index);
+   ops->set_pfn(&m, vgpu->gvt->gtt.scratch_mfn);
+   ops->clear_present(&m);
+   ggtt_get_host_entry(ggtt_mm, &e, g_gtt_index);
+   ggtt_set_host_entry(ggtt_mm, &m, g_gtt_index);
+   off += 8;
+   }
+
+   ggtt_invalidate(vgpu->gvt->gt);
+
+   return ret;
+}
+
 int intel_vgpu_handle_pv_vma(struct intel_vgpu *vgpu,
struct intel_vgpu_mm *mm, u32 cmd, u32 data[])
 {
@@ -2768,6 +2845,12 @@ int intel_vgpu_handle_pv_vma(struct intel_vgpu *vgpu,
case PV_CMD_BIND_PPGTT:
intel_vgpu_pv_ppgtt_bind(vgpu, mm, vma, dma_addrs);
break;
+   case PV_CMD_BIND_GGTT:
+   ret = intel_vgpu_pv_ggtt_bind(vgpu, vma, dma_addrs);
+   break;
+   case PV_CMD_UNBIND_GGTT:
+   ret = intel_vgpu_pv_ggtt_unbind(vgpu, vma, dma_addrs);
+   break;
default:
break;
}
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c 
b/drivers/gpu/drm/i915/gvt/handlers.c
index a3637d86..f1ad024 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -1349,6 +1349,10 @@ static int handle_pv_commands(struct intel_vgpu *vgpu)
}
ret = intel_vgpu_handle_pv_vma(vgpu, mm, cmd, data);
break;
+   case PV_CMD_BIND_GGTT:
+   case PV_CMD_UNBIND_GGTT:
+   ret = intel_vgpu_handle_pv_vma(vgpu, NULL, cmd, data);
+   break;
default:
break;
}
diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c
index c898e0d..1411c7b5 100644
--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+

[Intel-gfx] [PATCH v1 02/12] drm/i915: vgpu shared memory setup for pv support

2020-09-06 Thread Xiaolin Zhang
To support vgpu PV features, a common shared memory region is set up for
communication and data exchange between the guest and host GVTg, reducing
data-access overhead and trap cost.

The guest i915 driver allocates this shared memory (one page) and passes
its physical address to host GVTg through a PVINFO register, so that host
GVTg can access the shared guest page without trap cost using the
hypervisor's facilities.

The guest i915 driver sends the VGT_G2V_SHARED_PAGE_REGISTER notification
to host GVTg once shared memory setup has finished successfully.

The layout of the shared_page is also defined: the first part holds the
PV version information used for compatibility support.

Signed-off-by: Xiaolin Zhang 
---
 drivers/gpu/drm/i915/i915_drv.c|  2 +
 drivers/gpu/drm/i915/i915_drv.h|  4 +-
 drivers/gpu/drm/i915/i915_pvinfo.h |  5 +-
 drivers/gpu/drm/i915/i915_vgpu.c   | 94 ++
 drivers/gpu/drm/i915/i915_vgpu.h   | 14 ++
 5 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 00292a8..5fbb4ab 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1071,6 +1071,8 @@ static void i915_driver_release(struct drm_device *dev)
 
disable_rpm_wakeref_asserts(rpm);
 
+   intel_vgpu_destroy(dev_priv);
+
i915_gem_driver_release(dev_priv);
 
intel_memory_regions_driver_release(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 16d1b51..3cde2c5f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -809,7 +809,9 @@ struct i915_virtual_gpu {
bool active;
u32 caps;
u32 pv_caps;
-};
+
+   struct i915_virtual_gpu_pv *pv;
+};
 
 struct intel_cdclk_config {
unsigned int cdclk, vco, ref, bypass;
diff --git a/drivers/gpu/drm/i915/i915_pvinfo.h 
b/drivers/gpu/drm/i915/i915_pvinfo.h
index 8b0dc25..1d44876 100644
--- a/drivers/gpu/drm/i915/i915_pvinfo.h
+++ b/drivers/gpu/drm/i915/i915_pvinfo.h
@@ -48,6 +48,7 @@ enum vgt_g2v_type {
VGT_G2V_PPGTT_L4_PAGE_TABLE_DESTROY,
VGT_G2V_EXECLIST_CONTEXT_CREATE,
VGT_G2V_EXECLIST_CONTEXT_DESTROY,
+   VGT_G2V_SHARED_PAGE_REGISTER,
VGT_G2V_MAX,
 };
 
@@ -112,7 +113,9 @@ struct vgt_if {
 
u32 pv_caps;
 
-   u32  rsv7[0x200 - 25];/* pad to one page */
+   u64 shared_page_gpa;
+
+   u32  rsv7[0x200 - 27];/* pad to one page */
 } __packed;
 
 #define vgtif_offset(x) (offsetof(struct vgt_if, x))
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index 10960125..8b2b451 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -110,6 +110,17 @@ void intel_vgpu_detect(struct drm_i915_private *dev_priv)
pci_iounmap(pdev, shared_area);
 }
 
+void intel_vgpu_destroy(struct drm_i915_private *i915)
+{
+   struct i915_virtual_gpu_pv *pv = i915->vgpu.pv;
+
+   if (!intel_vgpu_active(i915) || !pv)
+   return;
+
+   __free_page(virt_to_page(pv->shared_page));
+   kfree(pv);
+}
+
 void intel_vgpu_register(struct drm_i915_private *i915)
 {
/*
@@ -360,6 +371,83 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt)
  */
 
 /*
+ * shared_page setup for VGPU PV features
+ */
+static int intel_vgpu_setup_shared_page(struct drm_i915_private *i915,
+   void __iomem *shared_area)
+{
+   void __iomem *addr;
+   struct i915_virtual_gpu_pv *pv;
+   struct gvt_shared_page *base;
+   u64 gpa;
+   u16 ver_maj, ver_min;
+   int ret = 0;
+
+   /* We allocate 1 page shared between guest and GVT for data exchange.
+*   ___
+*  |version|
+*  |___PAGE/8
+*  |   |
+*  |___PAGE/4
+*  |   |
+*  |   |
+*  |   |
+*  |___PAGE/2
+*  |   |
+*  |   |
+*  |   |
+*  |   |
+*  |   |
+*  |   |
+*  |   |
+*  |___|
+*
+* 0 offset: PV version area
+*/
+
+   base =  (struct gvt_shared_page *)get_zeroed_page(GFP_KERNEL);
+   if (!base) {
+   dev_info(i915->drm.dev, "out of memory for shared memory\n");
+   return -ENOMEM;
+   }
+
+   /* pass guest memory pa address to GVT and then read back to verify */
+   gpa = __pa(base);
+   addr = sha

[Intel-gfx] [PATCH v1 01/12] drm/i915: introduced vgpu pv capability

2020-09-06 Thread Xiaolin Zhang
To enable the vgpu PV feature, a PV capability is introduced for the
guest via the new pv_caps member in struct i915_virtual_gpu, and for the
host GVT via the new pv_caps register in struct vgt_if.

Both are used to control individual PV feature support in each domain;
the final PV caps are negotiated at runtime between guest and host.

This also adds the VGT_CAPS_PV capability bit, used by the guest to query
whether the host GVTg supports any PV feature at all.

Signed-off-by: Xiaolin Zhang 
---
 drivers/gpu/drm/i915/i915_debugfs.c |  3 ++
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_pvinfo.h  |  5 ++-
 drivers/gpu/drm/i915/i915_vgpu.c| 63 -
 drivers/gpu/drm/i915/i915_vgpu.h| 10 ++
 5 files changed, 80 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 7842199..fd1e0fc 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -48,6 +48,7 @@
 #include "i915_trace.h"
 #include "intel_pm.h"
 #include "intel_sideband.h"
+#include "i915_vgpu.h"
 
 static inline struct drm_i915_private *node_to_i915(struct drm_info_node *node)
 {
@@ -60,6 +61,8 @@ static int i915_capabilities(struct seq_file *m, void *data)
struct drm_printer p = drm_seq_file_printer(m);
 
seq_printf(m, "pch: %d\n", INTEL_PCH_TYPE(i915));
+   if (intel_vgpu_active(i915))
+   seq_printf(m, "vgpu pv_caps: 0x%x\n", i915->vgpu.pv_caps);
 
intel_device_info_print_static(INTEL_INFO(i915), &p);
intel_device_info_print_runtime(RUNTIME_INFO(i915), &p);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a455752..16d1b51 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -808,6 +808,7 @@ struct i915_virtual_gpu {
struct mutex lock; /* serialises sending of g2v_notify command pkts */
bool active;
u32 caps;
+   u32 pv_caps;
 };
 
 struct intel_cdclk_config {
diff --git a/drivers/gpu/drm/i915/i915_pvinfo.h 
b/drivers/gpu/drm/i915/i915_pvinfo.h
index 683e97a..8b0dc25 100644
--- a/drivers/gpu/drm/i915/i915_pvinfo.h
+++ b/drivers/gpu/drm/i915/i915_pvinfo.h
@@ -57,6 +57,7 @@ enum vgt_g2v_type {
 #define VGT_CAPS_FULL_PPGTTBIT(2)
 #define VGT_CAPS_HWSP_EMULATIONBIT(3)
 #define VGT_CAPS_HUGE_GTT  BIT(4)
+#define VGT_CAPS_PVBIT(5)
 
 struct vgt_if {
u64 magic;  /* VGT_MAGIC */
@@ -109,7 +110,9 @@ struct vgt_if {
u32 execlist_context_descriptor_lo;
u32 execlist_context_descriptor_hi;
 
-   u32  rsv7[0x200 - 24];/* pad to one page */
+   u32 pv_caps;
+
+   u32  rsv7[0x200 - 25];/* pad to one page */
 } __packed;
 
 #define vgtif_offset(x) (offsetof(struct vgt_if, x))
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index 70fca72..10960125 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -98,7 +98,13 @@ void intel_vgpu_detect(struct drm_i915_private *dev_priv)
 
dev_priv->vgpu.active = true;
mutex_init(&dev_priv->vgpu.lock);
-   drm_info(&dev_priv->drm, "Virtual GPU for Intel GVT-g detected.\n");
+
+   if (!intel_vgpu_detect_pv_caps(dev_priv, shared_area)) {
+   DRM_INFO("Virtual GPU for Intel GVT-g detected.\n");
+   goto out;
+   }
+
+   DRM_INFO("Virtual GPU for Intel GVT-g detected with PV Optimized.\n");
 
 out:
pci_iounmap(pdev, shared_area);
@@ -134,6 +140,18 @@ bool intel_vgpu_has_huge_gtt(struct drm_i915_private 
*dev_priv)
return dev_priv->vgpu.caps & VGT_CAPS_HUGE_GTT;
 }
 
+static bool intel_vgpu_check_pv_cap(struct drm_i915_private *dev_priv,
+   enum pv_caps cap)
+{
+   return (dev_priv->vgpu.active && (dev_priv->vgpu.caps & VGT_CAPS_PV)
+   && (dev_priv->vgpu.pv_caps & cap));
+}
+
+static bool intel_vgpu_has_pv_caps(struct drm_i915_private *dev_priv)
+{
+   return dev_priv->vgpu.caps & VGT_CAPS_PV;
+}
+
 struct _balloon_info_ {
/*
 * There are up to 2 regions per mappable/unmappable graphic
@@ -336,3 +354,46 @@ int intel_vgt_balloon(struct i915_ggtt *ggtt)
drm_err(&dev_priv->drm, "VGT balloon fail\n");
return ret;
 }
+
+/*
+ * i915 vgpu PV support for Linux
+ */
+
+/*
+ * Config vgpu PV ops for different PV capabilities
+ */
+void intel_vgpu_config_pv_caps(struct drm_i915_private *i915,
+   enum pv_caps cap, void *data)
+{
+
+   if (!intel_vgpu_check_pv_cap(i915, cap))
+   return;
+}
+
+/**
+ * intel_vgpu_detect_pv_caps - detect virtual GPU PV capabilities
+ * @dev_priv: i915 device private
+ *
+ * This function is called at the initialization stage, to detect VGPU
+ * PV capabilities
+ */
+bool intel_vgpu_detect_pv_caps(struct drm_i915_private *i915,
+   void __iomem *shared_area)
+{
+   u32 gvt_pvc

[Intel-gfx] [PATCH v1 09/12] drm/i915/gvt: GVTg support vgpu pv CTB protocol

2020-09-06 Thread Xiaolin Zhang
Implement the host side of the vgpu PV CTB protocol. Based on this
protocol, CTB read functionality is implemented to handle PV commands
from the guest.

Signed-off-by: Xiaolin Zhang 
---
 drivers/gpu/drm/i915/gvt/handlers.c | 119 +++-
 1 file changed, 118 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gvt/handlers.c 
b/drivers/gpu/drm/i915/gvt/handlers.c
index 295e43a..b9c9f62 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -1218,6 +1218,119 @@ static int pvinfo_mmio_read(struct intel_vgpu *vgpu, 
unsigned int offset,
return 0;
 }
 
+static inline unsigned int ct_header_get_len(u32 header)
+{
+   return (header >> PV_CT_MSG_LEN_SHIFT) & PV_CT_MSG_LEN_MASK;
+}
+
+static inline unsigned int ct_header_get_action(u32 header)
+{
+   return (header >> PV_CT_MSG_ACTION_SHIFT) & PV_CT_MSG_ACTION_MASK;
+}
+
+static int fetch_pv_command_buffer(struct intel_vgpu *vgpu,
+   struct vgpu_pv_ct_buffer_desc *desc,
+   u32 *fence, u32 *action, u32 *data)
+{
+   u32 head, tail, len, size, off;
+   u32 cmd_head;
+   u32 avail;
+   u32 ret;
+
+   /* fetch command descriptor */
+   off = PV_DESC_OFF;
+   ret = intel_gvt_read_shared_page(vgpu, off, desc, sizeof(*desc));
+   if (ret)
+   return ret;
+
+   GEM_BUG_ON(desc->size % 4);
+   GEM_BUG_ON(desc->head % 4);
+   GEM_BUG_ON(desc->tail % 4);
+
+   /* head/tail/size are in dwords; tail == head indicates empty */
+   head = desc->head / 4;
+   tail = desc->tail / 4;
+   size = desc->size / 4;
+
+   GEM_BUG_ON(tail >= size);
+   GEM_BUG_ON(head >= size);
+
+   if (unlikely((tail - head) == 0))
+   return -ENODATA;
+
+   /* fetch command head */
+   off = desc->addr + head * 4;
+   ret = intel_gvt_read_shared_page(vgpu, off, &cmd_head, 4);
+   head = (head + 1) % size;
+   if (ret)
+   goto err;
+
+   len = ct_header_get_len(cmd_head) - 1;
+   *action = ct_header_get_action(cmd_head);
+
+   /* fetch command fence */
+   off = desc->addr + head * 4;
+   ret = intel_gvt_read_shared_page(vgpu, off, fence, 4);
+   head = (head + 1) % size;
+   if (ret)
+   goto err;
+
+   /* no command data */
+   if (len == 0)
+   goto err;
+
+   /* fetch command data */
+   avail = size - head;
+   if (len <= avail) {
+   off =  desc->addr + head * 4;
+   ret = intel_gvt_read_shared_page(vgpu, off, data, len * 4);
+   head = (head + len) % size;
+   } else {
+   /* swap case */
+   off =  desc->addr + head * 4;
+   ret = intel_gvt_read_shared_page(vgpu, off, data, avail * 4);
+   head = (head + avail) % size;
+   if (ret)
+   goto err;
+
+   off = desc->addr;
+   ret = intel_gvt_read_shared_page(vgpu, off, &data[avail],
+   (len - avail) * 4);
+   head = (head + len - avail) % size;
+   }
+
+err:
+   desc->head = head * 4;
+   return ret;
+}
+
+static int pv_command_buffer_read(struct intel_vgpu *vgpu,
+   u32 *cmd, u32 *data)
+{
+   struct vgpu_pv_ct_buffer_desc desc;
+   u32 fence, off = PV_DESC_OFF;
+   int ret;
+
+   ret = fetch_pv_command_buffer(vgpu, &desc, &fence, cmd, data);
+
+   /* write command descriptor back */
+   desc.fence = fence;
+   desc.status = ret;
+
+   ret = intel_gvt_write_shared_page(vgpu, off, &desc, sizeof(desc));
+   return ret;
+
+}
+
+static int handle_pv_commands(struct intel_vgpu *vgpu)
+{
+   u32 cmd;
+   u32 data[32];
+   int ret;
+
+   ret = pv_command_buffer_read(vgpu, &cmd, data);
+   return ret;
+}
+
 static int handle_g2v_notification(struct intel_vgpu *vgpu, int notification)
 {
enum intel_gvt_gtt_type root_entry_type = GTT_TYPE_PPGTT_ROOT_L4_ENTRY;
@@ -1226,6 +1339,7 @@ static int handle_g2v_notification(struct intel_vgpu 
*vgpu, int notification)
unsigned long gpa, gfn;
u16 ver_major = PV_MAJOR;
u16 ver_minor = PV_MINOR;
+   int ret = 0;
 
pdps = (u64 *)&vgpu_vreg64_t(vgpu, vgtif_reg(pdp[0]));
 
@@ -1252,6 +1366,9 @@ static int handle_g2v_notification(struct intel_vgpu 
*vgpu, int notification)
intel_gvt_write_shared_page(vgpu, 0, &ver_major, 2);
intel_gvt_write_shared_page(vgpu, 2, &ver_minor, 2);
break;
+   case VGT_G2V_PV_SEND_TRIGGER:
+   ret = handle_pv_commands(vgpu);
+   break;
case VGT_G2V_EXECLIST_CONTEXT_CREATE:
case VGT_G2V_EXECLIST_CONTEXT_DESTROY:
case 1: /* Remove this in guest driver. */
@@ -1259,7 +1376,7 @@ static int handle_g2v_notification(struct intel_vgpu 
*vgpu, int notification)
default:
gvt_vgpu_err("Invalid PV notifica

[Intel-gfx] [PATCH v1 08/12] drm/i915/gvt: GVTg handle guest shared_page setup

2020-09-06 Thread Xiaolin Zhang
GVTg implements the guest shared_page registration operation, plus
shared_page read and write functionality built on the hypervisor's
read/write facilities.

The shared_page_gpa is passed from the guest driver through the PVINFO
shared_page_gpa register.

Signed-off-by: Xiaolin Zhang 
---
 drivers/gpu/drm/i915/gvt/gvt.h  |  9 ++--
 drivers/gpu/drm/i915/gvt/handlers.c | 20 +
 drivers/gpu/drm/i915/gvt/vgpu.c | 43 +
 3 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index 31d8a2bcc..d635313 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -214,6 +214,8 @@ struct intel_vgpu {
 
u32 scan_nonprivbb;
u32 pv_caps;
+   u64 shared_page_gpa;
+   bool shared_page_enabled;
 };
 
 static inline void *intel_vgpu_vdev(struct intel_vgpu *vgpu)
@@ -536,7 +538,7 @@ static inline u64 intel_vgpu_get_bar_gpa(struct intel_vgpu 
*vgpu, int bar)
 static inline bool intel_vgpu_enabled_pv_cap(struct intel_vgpu *vgpu,
enum pv_caps cap)
 {
-   return (vgpu->pv_caps & cap);
+   return vgpu->shared_page_enabled && (vgpu->pv_caps & cap);
 }
 
 void intel_vgpu_clean_opregion(struct intel_vgpu *vgpu);
@@ -692,7 +694,10 @@ void intel_gvt_debugfs_add_vgpu(struct intel_vgpu *vgpu);
 void intel_gvt_debugfs_remove_vgpu(struct intel_vgpu *vgpu);
 void intel_gvt_debugfs_init(struct intel_gvt *gvt);
 void intel_gvt_debugfs_clean(struct intel_gvt *gvt);
-
+int intel_gvt_read_shared_page(struct intel_vgpu *vgpu,
+   unsigned int offset, void *buf, unsigned long len);
+int intel_gvt_write_shared_page(struct intel_vgpu *vgpu,
+   unsigned int offset, void *buf, unsigned long len);
 
 #include "trace.h"
 #include "mpt.h"
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c 
b/drivers/gpu/drm/i915/gvt/handlers.c
index bfea065..295e43a 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -1204,6 +1204,8 @@ static int pvinfo_mmio_read(struct intel_vgpu *vgpu, 
unsigned int offset,
case 0x78010:   /* vgt_caps */
case 0x7881c:
case _vgtif_reg(pv_caps):
+   case _vgtif_reg(shared_page_gpa):
+   case _vgtif_reg(shared_page_gpa) + 4:
break;
default:
invalid_read = true;
@@ -1221,6 +1223,9 @@ static int handle_g2v_notification(struct intel_vgpu 
*vgpu, int notification)
enum intel_gvt_gtt_type root_entry_type = GTT_TYPE_PPGTT_ROOT_L4_ENTRY;
struct intel_vgpu_mm *mm;
u64 *pdps;
+   unsigned long gpa, gfn;
+   u16 ver_major = PV_MAJOR;
+   u16 ver_minor = PV_MINOR;
 
pdps = (u64 *)&vgpu_vreg64_t(vgpu, vgtif_reg(pdp[0]));
 
@@ -1234,6 +1239,19 @@ static int handle_g2v_notification(struct intel_vgpu 
*vgpu, int notification)
case VGT_G2V_PPGTT_L3_PAGE_TABLE_DESTROY:
case VGT_G2V_PPGTT_L4_PAGE_TABLE_DESTROY:
return intel_vgpu_put_ppgtt_mm(vgpu, pdps);
+   case VGT_G2V_SHARED_PAGE_REGISTER:
+   gpa = vgpu_vreg64_t(vgpu, vgtif_reg(shared_page_gpa));
+   gfn = gpa >> PAGE_SHIFT;
+   if (!intel_gvt_hypervisor_is_valid_gfn(vgpu, gfn)) {
+   vgpu_vreg_t(vgpu, vgtif_reg(pv_caps)) = 0;
+   return 0;
+   }
+   vgpu->shared_page_gpa = gpa;
+   vgpu->shared_page_enabled = true;
+
+   intel_gvt_write_shared_page(vgpu, 0, &ver_major, 2);
+   intel_gvt_write_shared_page(vgpu, 2, &ver_minor, 2);
+   break;
case VGT_G2V_EXECLIST_CONTEXT_CREATE:
case VGT_G2V_EXECLIST_CONTEXT_DESTROY:
case 1: /* Remove this in guest driver. */
@@ -1290,6 +1308,8 @@ static int pvinfo_mmio_write(struct intel_vgpu *vgpu, 
unsigned int offset,
case _vgtif_reg(pdp[3].hi):
case _vgtif_reg(execlist_context_descriptor_lo):
case _vgtif_reg(execlist_context_descriptor_hi):
+   case _vgtif_reg(shared_page_gpa):
+   case _vgtif_reg(shared_page_gpa) + 4:
break;
case _vgtif_reg(rsv5[0])..._vgtif_reg(rsv5[3]):
invalid_write = true;
diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c
index 4867426..e9bc683 100644
--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -64,6 +64,8 @@ void populate_pvinfo_page(struct intel_vgpu *vgpu)
vgpu_vreg_t(vgpu, vgtif_reg(cursor_x_hot)) = UINT_MAX;
vgpu_vreg_t(vgpu, vgtif_reg(cursor_y_hot)) = UINT_MAX;
 
+   vgpu_vreg64_t(vgpu, vgtif_reg(shared_page_gpa)) = 0;
+
gvt_dbg_core("Populate PVINFO PAGE for vGPU %d\n", vgpu->id);
gvt_dbg_core("aperture base [GMADR] 0x%llx size 0x%llx\n",
vgpu_aperture_gmadr_base(vgpu), vgpu_aperture_sz(vgpu));
@@ -609,3 +611,44 @@ void intel_gvt_reset_vgpu(struct intel_vgpu *vgpu)
intel_gv

[Intel-gfx] [PATCH v1 06/12] drm/i915: vgpu workload submisison pv support

2020-09-06 Thread Xiaolin Zhang
To improve efficiency and reduce the complexity of vgpu workload
submission, a PV version of the workload submission backend is
implemented, with engine submission data placed in shared memory,
eliminating execlists CSB processing and the context-switch interrupt
from the submission routine.

A new PV_SUBMISSION pv_cap controls this level of PV support on both the
guest and host side.

Signed-off-by: Xiaolin Zhang 
---
 drivers/gpu/drm/i915/Makefile  |   2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c|   2 +
 drivers/gpu/drm/i915/i915_vgpu.c   |  67 +-
 drivers/gpu/drm/i915/i915_vgpu.h   |  25 +++
 drivers/gpu/drm/i915/intel_pv_submission.c | 324 +
 5 files changed, 413 insertions(+), 7 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_pv_submission.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index e5574e50..13d1739 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -269,7 +269,7 @@ i915-$(CONFIG_DRM_I915_SELFTEST) += \
selftests/librapl.o
 
 # virtual gpu code
-i915-y += i915_vgpu.o
+i915-y += i915_vgpu.o intel_pv_submission.o
 
 ifeq ($(CONFIG_DRM_I915_GVT),y)
 i915-y += intel_gvt.o
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 0412a44..4f77226 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -5018,6 +5018,8 @@ void intel_execlists_set_default_submission(struct 
intel_engine_cs *engine)
if (IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
engine->flags |= I915_ENGINE_HAS_TIMESLICES;
}
+   } else {
+   intel_vgpu_config_pv_caps(engine->i915, PV_SUBMISSION, engine);
}
 
if (INTEL_GEN(engine->i915) >= 12)
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index 4e50694..ba7a1f9 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -101,6 +101,7 @@ void intel_vgpu_detect(struct drm_i915_private *dev_priv)
 
/* guest driver PV capability */
dev_priv->vgpu.pv_caps = PV_PPGTT | PV_GGTT;
+   dev_priv->vgpu.pv_caps |= PV_SUBMISSION;
 
if (!intel_vgpu_detect_pv_caps(dev_priv, shared_area)) {
DRM_INFO("Virtual GPU for Intel GVT-g detected.\n");
@@ -120,6 +121,8 @@ void intel_vgpu_destroy(struct drm_i915_private *i915)
if (!intel_vgpu_active(i915) || !pv)
return;
 
+   kfree(pv->submission);
+
__free_page(virt_to_page(pv->shared_page));
kfree(pv);
 }
@@ -600,7 +603,8 @@ static u32 pv_get_next_fence(struct i915_virtual_gpu_pv *pv)
 }
 
 static int pv_send(struct drm_i915_private *i915,
-   const u32 *action, u32 len, u32 *status)
+   const u32 *action, u32 len, u32 *status,
+   void __iomem *addr)
 {
struct i915_virtual_gpu *vgpu = &i915->vgpu;
struct i915_virtual_gpu_pv *pv = vgpu->pv;
@@ -618,7 +622,10 @@ static int pv_send(struct drm_i915_private *i915,
if (unlikely(err))
goto unlink;
 
-   i915->vgpu.pv->notify(i915);
+   if (addr)
+   writel(VGT_G2V_PV_SEND_TRIGGER, addr + 
vgtif_offset(g2v_notify));
+   else
+   i915->vgpu.pv->notify(i915);
 
err = wait_for_desc_update(desc, fence, status);
if (unlikely(err))
@@ -645,7 +652,7 @@ static int intel_vgpu_pv_send_command_buffer(struct 
drm_i915_private *i915,
 
spin_lock_irqsave(&vgpu->pv->lock, flags);
 
-   ret = pv_send(i915, action, len, &status);
+   ret = pv_send(i915, action, len, &status, NULL);
if (unlikely(ret < 0)) {
DRM_ERROR("PV: send action %#X failed; err=%d status=%#X\n",
  action[0], ret, status);
@@ -663,6 +670,17 @@ static void intel_vgpu_pv_notify_mmio(struct 
drm_i915_private *dev_priv)
I915_WRITE(vgtif_reg(g2v_notify), VGT_G2V_PV_SEND_TRIGGER);
 }
 
+static void intel_vgpu_register_cap_gpa(struct drm_i915_private *i915,
+   struct pv_cap_addr *cap_addr, void __iomem *shared_area)
+{
+   u32 data[32];
+   u32 status = ~0;
+
+   data[0] = PV_CMD_REGISTER_CAP_GPA;
+   memcpy(&data[1], cap_addr, sizeof(*cap_addr));
+   pv_send(i915, data, 1 + sizeof(*cap_addr) / 4, &status, shared_area);
+}
+
 /*
  * shared_page setup for VGPU PV features
  */
@@ -672,17 +690,21 @@ static int intel_vgpu_setup_shared_page(struct 
drm_i915_private *i915,
void __iomem *addr;
struct i915_virtual_gpu_pv *pv;
struct gvt_shared_page *base;
-   u64 gpa;
+   struct pv_cap_addr cap_addr;
+   void *sub_base;
+   u64 gpa, sub_gpa;
u16 ver_maj, ver_min;
int ret = 0;
+   int i;
+   u32 size;
 
/* We allocate 1 page shared between guest and GVT for data exchange.
 *   ___
 

[Intel-gfx] [PATCH v1 07/12] drm/i915/gvt: GVTg expose pv_caps PVINFO register

2020-09-06 Thread Xiaolin Zhang
Expose the pv_caps PVINFO register from GVTg to the guest so that the
guest can query and control individual PV capability support.

Report the VGT_CAPS_PV capability in the pvinfo page for the guest.

Signed-off-by: Xiaolin Zhang 
---
 drivers/gpu/drm/i915/gvt/gvt.h  | 8 
 drivers/gpu/drm/i915/gvt/handlers.c | 5 +
 drivers/gpu/drm/i915/gvt/vgpu.c | 1 +
 3 files changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index 9831361..31d8a2bcc 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -49,6 +49,7 @@
 #include "fb_decoder.h"
 #include "dmabuf.h"
 #include "page_track.h"
+#include "i915_vgpu.h"
 
 #define GVT_MAX_VGPU 8
 
@@ -212,6 +213,7 @@ struct intel_vgpu {
struct idr object_idr;
 
u32 scan_nonprivbb;
+   u32 pv_caps;
 };
 
 static inline void *intel_vgpu_vdev(struct intel_vgpu *vgpu)
@@ -531,6 +533,12 @@ static inline u64 intel_vgpu_get_bar_gpa(struct intel_vgpu 
*vgpu, int bar)
PCI_BASE_ADDRESS_MEM_MASK;
 }
 
+static inline bool intel_vgpu_enabled_pv_cap(struct intel_vgpu *vgpu,
+   enum pv_caps cap)
+{
+   return (vgpu->pv_caps & cap);
+}
+
 void intel_vgpu_clean_opregion(struct intel_vgpu *vgpu);
 int intel_vgpu_init_opregion(struct intel_vgpu *vgpu);
 int intel_vgpu_opregion_base_write_handler(struct intel_vgpu *vgpu, u32 gpa);
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c 
b/drivers/gpu/drm/i915/gvt/handlers.c
index ee3648d..bfea065 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -1203,6 +1203,7 @@ static int pvinfo_mmio_read(struct intel_vgpu *vgpu, 
unsigned int offset,
break;
case 0x78010:   /* vgt_caps */
case 0x7881c:
+   case _vgtif_reg(pv_caps):
break;
default:
invalid_read = true;
@@ -1272,6 +1273,10 @@ static int pvinfo_mmio_write(struct intel_vgpu *vgpu, 
unsigned int offset,
case _vgtif_reg(g2v_notify):
handle_g2v_notification(vgpu, data);
break;
+   case _vgtif_reg(pv_caps):
+   DRM_INFO("vgpu id=%d pv caps =0x%x\n", vgpu->id, data);
+   vgpu->pv_caps = data;
+   break;
/* add xhot and yhot to handled list to avoid error log */
case _vgtif_reg(cursor_x_hot):
case _vgtif_reg(cursor_y_hot):
diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c
index 8fa9b31..4867426 100644
--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -48,6 +48,7 @@ void populate_pvinfo_page(struct intel_vgpu *vgpu)
vgpu_vreg_t(vgpu, vgtif_reg(vgt_caps)) = VGT_CAPS_FULL_PPGTT;
vgpu_vreg_t(vgpu, vgtif_reg(vgt_caps)) |= VGT_CAPS_HWSP_EMULATION;
vgpu_vreg_t(vgpu, vgtif_reg(vgt_caps)) |= VGT_CAPS_HUGE_GTT;
+   vgpu_vreg_t(vgpu, vgtif_reg(vgt_caps)) |= VGT_CAPS_PV;
 
vgpu_vreg_t(vgpu, vgtif_reg(avail_rs.mappable_gmadr.base)) =
vgpu_aperture_gmadr_base(vgpu);
-- 
2.7.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v1 10/12] drm/i915/gvt: GVTg support ppgtt pv operations

2020-09-06 Thread Xiaolin Zhang
This patch handles the PV_CMD_BIND_PPGTT and PV_CMD_UNBIND_PPGTT commands.

For PV PPGTT, GVTg creates local ppgtt tables and sets up the PTEs
directly from the PV command data sent by the guest; it does not track
guest page-table usage, removing the write-protection cost of the
original PPGTT shadow page mechanism.

Signed-off-by: Xiaolin Zhang 
---
 drivers/gpu/drm/i915/gvt/gtt.c  | 172 
 drivers/gpu/drm/i915/gvt/gtt.h  |   4 +
 drivers/gpu/drm/i915/gvt/gvt.h  |   1 +
 drivers/gpu/drm/i915/gvt/handlers.c |  25 ++
 drivers/gpu/drm/i915/gvt/vgpu.c |   2 +
 5 files changed, 204 insertions(+)

diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
index 04bf018..c13560a 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.c
+++ b/drivers/gpu/drm/i915/gvt/gtt.c
@@ -1777,6 +1777,25 @@ static int ppgtt_handle_guest_write_page_table_bytes(
return 0;
 }
 
+static void invalidate_mm_pv(struct intel_vgpu_mm *mm)
+{
+   struct intel_vgpu *vgpu = mm->vgpu;
+   struct intel_gvt *gvt = vgpu->gvt;
+   struct intel_gvt_gtt *gtt = &gvt->gtt;
+   struct intel_gvt_gtt_pte_ops *ops = gtt->pte_ops;
+   struct intel_gvt_gtt_entry se;
+
+   i915_vm_put(&mm->ppgtt->vm);
+
+   ppgtt_get_shadow_root_entry(mm, &se, 0);
+   if (!ops->test_present(&se))
+   return;
+   se.val64 = 0;
+   ppgtt_set_shadow_root_entry(mm, &se, 0);
+
+   mm->ppgtt_mm.shadowed  = false;
+}
+
 static void invalidate_ppgtt_mm(struct intel_vgpu_mm *mm)
 {
struct intel_vgpu *vgpu = mm->vgpu;
@@ -1789,6 +1808,11 @@ static void invalidate_ppgtt_mm(struct intel_vgpu_mm *mm)
if (!mm->ppgtt_mm.shadowed)
return;
 
+   if (intel_vgpu_enabled_pv_cap(vgpu, PV_PPGTT)) {
+   invalidate_mm_pv(mm);
+   return;
+   }
+
for (index = 0; index < ARRAY_SIZE(mm->ppgtt_mm.shadow_pdps); index++) {
ppgtt_get_shadow_root_entry(mm, &se, index);
 
@@ -1806,6 +1830,26 @@ static void invalidate_ppgtt_mm(struct intel_vgpu_mm *mm)
mm->ppgtt_mm.shadowed = false;
 }
 
+static int shadow_mm_pv(struct intel_vgpu_mm *mm)
+{
+   struct intel_vgpu *vgpu = mm->vgpu;
+   struct intel_gvt *gvt = vgpu->gvt;
+   struct intel_gvt_gtt_entry se;
+
+   mm->ppgtt = i915_ppgtt_create(gvt->gt);
+   if (IS_ERR(mm->ppgtt)) {
+   gvt_vgpu_err("fail to create ppgtt for pdp 0x%llx\n",
+   px_dma(mm->ppgtt->pd));
+   return PTR_ERR(mm->ppgtt);
+   }
+
+   se.type = GTT_TYPE_PPGTT_ROOT_L4_ENTRY;
+   se.val64 = px_dma(mm->ppgtt->pd);
+   ppgtt_set_shadow_root_entry(mm, &se, 0);
+   mm->ppgtt_mm.shadowed  = true;
+
+   return 0;
+}
 
 static int shadow_ppgtt_mm(struct intel_vgpu_mm *mm)
 {
@@ -1820,6 +1864,9 @@ static int shadow_ppgtt_mm(struct intel_vgpu_mm *mm)
if (mm->ppgtt_mm.shadowed)
return 0;
 
+   if (intel_vgpu_enabled_pv_cap(vgpu, PV_PPGTT))
+   return shadow_mm_pv(mm);
+
mm->ppgtt_mm.shadowed = true;
 
for (index = 0; index < ARRAY_SIZE(mm->ppgtt_mm.guest_pdps); index++) {
@@ -2606,6 +2653,131 @@ static int setup_spt_oos(struct intel_gvt *gvt)
return ret;
 }
 
+static int intel_vgpu_pv_ppgtt_insert_4lvl(struct intel_vgpu *vgpu,
+   struct intel_vgpu_mm *mm, struct pv_vma *pvvma, u64 *gfns)
+{
+   u32 size = pvvma->size;
+   int ret = 0;
+   u32 cache_level;
+   struct sg_table st;
+   struct scatterlist *sg = NULL;
+   struct i915_vma vma;
+   unsigned long gfn;
+   dma_addr_t dma_addr;
+   int i;
+   u64 pte_flag;
+
+   cache_level = pvvma->flags & 0x;
+
+   if (sg_alloc_table(&st, size, GFP_KERNEL)) {
+   ret = -ENOMEM;
+   goto fail;
+   }
+
+   pte_flag = gfns[0] & 0xFFF;
+   for_each_sg(st.sgl, sg, size, i) {
+   sg->offset = 0;
+   sg->length = PAGE_SIZE;
+
+   gfn = gfns[i] >> PAGE_SHIFT;
+   ret = intel_gvt_hypervisor_dma_map_guest_page(vgpu,
+   gfn, PAGE_SIZE, &dma_addr);
+   if (ret) {
+   gvt_vgpu_err("fail to translate gfn: 0x%lx\n", gfn);
+   sg_free_table(&st);
+   return -ENXIO;
+   }
+   sg_dma_address(sg) = dma_addr | pte_flag;
+   sg_dma_len(sg) = PAGE_SIZE;
+   }
+
+   memset(&vma, 0, sizeof(vma));
+   vma.node.start = pvvma->start;
+   vma.pages = &st;
+   mm->ppgtt->vm.insert_entries(&mm->ppgtt->vm, &vma, 0, 0);
+   sg_free_table(&st);
+
+fail:
+   return ret;
+}
+
+static void intel_vgpu_pv_ppgtt_bind(struct intel_vgpu *vgpu,
+   struct intel_vgpu_mm *mm, struct pv_vma *vma, u64 *gfns)
+{
+   struct i915_vm_pt_stash stash = {};
+
+   if (vma->flags & BIT(I915_VMA_ALLOC_BIT)) {
+   i915_vm_alloc_pt_stash(&mm-

[Intel-gfx] [PATCH v1 00/12] enhanced i915 vgpu with PV feature support

2020-09-06 Thread Xiaolin Zhang
This is the new i915 VGPU PV design, based on last year's proposal [1].

This is a new series of patches; the old series is discontinued due to
this new design.

To improve vgpu performance, PV optimizations can be implemented in
different GPU resource domains to reduce data access overhead or
modeling complexity.

In this patch set, PPGTT and GGTT are identified as PV optimizations
from the VGPU memory resource point of view, and workload submission is
identified as a PV optimization from the VGPU compute resource point of
view. So 3 PV features (PV PPGTT, PV GGTT and PV submission) are
designed and implemented to support the VGPU model better.

To provide the mechanism for PV feature development and implementation,
a simple PV framework is implemented, consisting of 3 sub items:
a. PV capability: it indicates what kind of PV capability is provided by
both the guest system and the host GVTg subsystem.
b. PV shared memory: this memory is allocated in the guest and shared
between guest and host for data exchange, i.e. PV command & PV data
communication.
c. PV command transport protocol: on top of the PV shared memory, it
defines the communication protocol & channel between guest and host to
circulate PV commands and PV command data.
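The shared-memory command transport described above can be sketched as a
small producer/consumer ring in plain C. This is an illustrative model
only: the struct names, ring size, and command layout here are
hypothetical, not taken from the patches, and the real protocol would
add barriers and a guest-to-host notification.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of the guest/host shared page: a command ring
 * plus head/tail indices.  Sizes and names are illustrative only. */
#define PV_RING_SLOTS 16

struct pv_cmd {
	uint32_t opcode;   /* e.g. a PV_CMD_BIND_PPGTT-style command id */
	uint32_t data_len; /* payload length in the shared data area */
};

struct pv_shared_page {
	uint32_t head;     /* consumer (host) index */
	uint32_t tail;     /* producer (guest) index */
	struct pv_cmd ring[PV_RING_SLOTS];
};

/* Guest side: enqueue one command; returns 0 on success, -1 if full. */
static int pv_send(struct pv_shared_page *sp, uint32_t opcode, uint32_t len)
{
	uint32_t next = (sp->tail + 1) % PV_RING_SLOTS;

	if (next == sp->head)
		return -1; /* ring full */
	sp->ring[sp->tail].opcode = opcode;
	sp->ring[sp->tail].data_len = len;
	sp->tail = next;
	return 0;
}

/* Host side: dequeue one command; returns 0 on success, -1 if empty. */
static int pv_recv(struct pv_shared_page *sp, struct pv_cmd *out)
{
	if (sp->head == sp->tail)
		return -1; /* ring empty */
	*out = sp->ring[sp->head];
	sp->head = (sp->head + 1) % PV_RING_SLOTS;
	return 0;
}
```

The one-slot-free convention (`next == head` means full) keeps full and
empty states distinguishable without an extra counter.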

For PV PPGTT, to improve efficiency and reduce the complexity of ppgtt
support, vgpu ppgtt page table operations are implemented in a PV
fashion with PV versions of bind/unbind for the ppgtt vma ops. The PV
version of the ppgtt vma ops uses the CTB protocol to send a PV ppgtt
command along with the data struct pv_vma from guest to GVT; GVT then
implements the command handlers for PV_CMD_BIND_PPGTT and
PV_CMD_UNBIND_PPGTT to achieve GVA->HPA address translation.
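On the host side, the bind handler walks the array of u64 entries the
guest supplies: the low 12 bits carry PTE flags and the upper bits the
guest frame number, in the style of the `gfns[i] >> PAGE_SHIFT` /
`& 0xfff` split visible in the patch. A self-contained sketch of that
decode (helper names here are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative split of a guest-provided PV entry into gfn and PTE
 * flag bits; 12 matches the usual 4 KiB page shift. */
#define PV_PAGE_SHIFT 12

static uint64_t pv_entry_gfn(uint64_t entry)
{
	return entry >> PV_PAGE_SHIFT;
}

static uint64_t pv_entry_flags(uint64_t entry)
{
	return entry & ((1ull << PV_PAGE_SHIFT) - 1);
}

/* Recombine a host dma address with the guest's PTE flags, as the
 * handler does when filling the scatterlist entries. */
static uint64_t pv_make_pte(uint64_t dma_addr, uint64_t flags)
{
	return dma_addr | flags;
}
```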

For PV GGTT, it is similar to PV PPGTT, except that it uses the
PV_CMD_BIND_GGTT and PV_CMD_UNBIND_GGTT commands.

For PV workload submission, a PV version of the workload submission
backend is implemented with the engine submission data placed in the
shared memory, meanwhile eliminating the execlists CSB processing and
context switch interrupts in the submission routine to improve
efficiency and reduce complexity.
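The submission flow can be modeled as the guest writing per-port context
descriptors into shared memory and setting a doorbell flag, rather than
programming ELSP and later parsing CSB events. The sketch below is
illustrative only; the struct, field names, and two-port size are
assumptions, and a real implementation would add a write barrier and a
host notification (e.g. a PVINFO register write).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-engine submission slot in the shared page. */
#define PV_SUBMIT_PORTS 2

struct pv_submission {
	uint64_t descs[PV_SUBMIT_PORTS]; /* context descriptors per port */
	uint32_t submitted;              /* doorbell: guest sets, host clears */
};

/* Guest side: returns -1 if the previous submission is still pending. */
static int pv_submit(struct pv_submission *slot,
		     const uint64_t *descs, int count)
{
	int i;

	if (slot->submitted)
		return -1;
	for (i = 0; i < PV_SUBMIT_PORTS; i++)
		slot->descs[i] = i < count ? descs[i] : 0;
	/* A real flow would place a memory barrier and notify the host
	 * here; this sketch only raises the doorbell flag. */
	slot->submitted = 1;
	return 0;
}
```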

Based on experiments, small workloads such as the glmark2 and Antutu 3D
benchmarks gain at least 10% performance from these PV features. Large
workloads such as media and 3D get some benefit, but not much.

[1]: https://patchwork.kernel.org/cover/11148059/

Xiaolin Zhang (12):
  drm/i915: introduced vgpu pv capability
  drm/i915: vgpu shared memory setup for pv support
  drm/i915: vgpu pv command buffer transport protocol
  drm/i915: vgpu ppgtt page table pv support
  drm/i915: vgpu ggtt page table pv support
  drm/i915: vgpu workload submisison pv support
  drm/i915/gvt: GVTg expose pv_caps PVINFO register
  drm/i915/gvt: GVTg handle guest shared_page setup
  drm/i915/gvt: GVTg support vgpu pv CTB protocol
  drm/i915/gvt: GVTg support ppgtt pv operations
  drm/i915/gvt: GVTg support ggtt pv operations
  drm/i915/gvt: GVTg support pv workload submssion

 drivers/gpu/drm/i915/Makefile  |   2 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c   |   4 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c|   2 +
 drivers/gpu/drm/i915/gvt/gtt.c | 255 ++
 drivers/gpu/drm/i915/gvt/gtt.h |   4 +
 drivers/gpu/drm/i915/gvt/gvt.h |  17 +-
 drivers/gpu/drm/i915/gvt/handlers.c| 274 ++-
 drivers/gpu/drm/i915/gvt/vgpu.c|  47 +++
 drivers/gpu/drm/i915/i915_debugfs.c|   3 +
 drivers/gpu/drm/i915/i915_drv.c|   2 +
 drivers/gpu/drm/i915/i915_drv.h|   5 +-
 drivers/gpu/drm/i915/i915_gem.c|   4 +-
 drivers/gpu/drm/i915/i915_pvinfo.h |   9 +-
 drivers/gpu/drm/i915/i915_vgpu.c   | 533 -
 drivers/gpu/drm/i915/i915_vgpu.h   | 122 +++
 drivers/gpu/drm/i915/intel_pv_submission.c | 324 ++
 16 files changed, 1599 insertions(+), 8 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_pv_submission.c

-- 
2.7.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] Pushed atomic-pwm changes, had to manually resolve a conflict in drm-tip

2020-09-06 Thread Hans de Goede

Hi All,

Note this is just FYI, in case I did anything wrong...

Now that it is finally fully acked and has passed CI,
I have pushed my atomic-pwm support for i915 series to
dinq.

This led to a conflict in drm-tip. The problem was that
in dinq, prior to my push, intel_panel.c had the following
around line 1942:

   level = DIV_ROUND_UP(pwm_get_duty_cycle(panel->backlight.pwm) * 100,
CRC_PMIC_PWM_PERIOD_NS);

Whereas Linus' master, and drm-tip (presumably via some
fixes branch), has:

   level = DIV_ROUND_UP_ULL(pwm_get_duty_cycle(panel->backlight.pwm) * 100,
CRC_PMIC_PWM_PERIOD_NS);

Notice the extra _ULL in Linus' master / the fixes
branch, which is necessary because
pwm_get_duty_cycle(panel->backlight.pwm) went from
returning a u32 to a u64 in 5.9.

My patch set removes the lines with the
DIV_ROUND_UP[_ULL], replacing them with a
call to pwm_get_relative_duty_cycle(), which nicely
abstracts this away.
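As an aside, the overflow the _ULL variant guards against is easy to
reproduce in plain C: with 32-bit math, duty_cycle * 100 wraps once the
duty cycle exceeds roughly 43 ms in nanoseconds. A minimal sketch, where
DIV_ROUND_UP_ULL is a local stand-in for the kernel macro and
duty_percent is a hypothetical helper (not the kernel's
pwm_get_relative_duty_cycle implementation):

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's DIV_ROUND_UP_ULL: round-up division
 * carried out in 64 bits, so the multiplication cannot wrap. */
#define DIV_ROUND_UP_ULL(n, d) (((uint64_t)(n) + (d) - 1) / (d))

/* Duty cycle as a rounded-up percentage of the period. */
static unsigned int duty_percent(uint64_t duty_ns, uint64_t period_ns)
{
	return (unsigned int)DIV_ROUND_UP_ULL(duty_ns * 100, period_ns);
}
```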

Resolving this was easy, I followed:
https://drm.pages.freedesktop.org/maintainer-tools/drm-tip.html#resolving-conflicts-when-rebuilding-drm-tip

And I believe I did everything right :)

Still I'm sending this email for 2 reasons:

1. In case I did anything wrong.
2. This will likely also cause a conflict in -next
   I guess, I hope this email will make resolving
   that easier.

Regards,

Hans
