Re: [Intel-gfx] [PATCH v2] drm/i915/gvt: Fix cached atomics setting for Windows VM

2021-08-05 Thread Colin Xu

On Fri, 6 Aug 2021, Zhenyu Wang wrote:

Thanks for the fix! Otherwise Windows VM is unusable with recent kernel.

Reviewed-by: Colin Xu 


We've seen a recent regression where running the host and a Windows VM
simultaneously causes GPU hangs or even crashes. We finally bisected it to
commit 58586680ffad ("drm/i915: Disable atomics in L3 for gen9"); the
difference in cached atomics behavior appears to cause the regression.

This adds a new scratch register handler and adds those registers to the
mmio save/restore list for context switch. No GPU hang is produced with
this change.

Cc: sta...@vger.kernel.org # 5.12+
Cc: "Xu, Terrence" 
Cc: "Bloomfield, Jon" 
Cc: "Ekstrand, Jason" 
Fixes: 58586680ffad ("drm/i915: Disable atomics in L3 for gen9")
Signed-off-by: Zhenyu Wang 
---
drivers/gpu/drm/i915/gvt/handlers.c | 1 +
drivers/gpu/drm/i915/gvt/mmio_context.c | 2 ++
2 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index 06024d321a1a..cde0a477fb49 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -3149,6 +3149,7 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
MMIO_DFH(_MMIO(0xb100), D_BDW, F_CMD_ACCESS, NULL, NULL);
MMIO_DFH(_MMIO(0xb10c), D_BDW, F_CMD_ACCESS, NULL, NULL);
MMIO_D(_MMIO(0xb110), D_BDW);
+   MMIO_D(GEN9_SCRATCH_LNCF1, D_BDW_PLUS);

MMIO_F(_MMIO(0x24d0), 48, F_CMD_ACCESS | F_CMD_WRITE_PATCH, 0, 0,
D_BDW_PLUS, NULL, force_nonpriv_write);
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
index b8ac80765461..f776c470914d 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -105,6 +105,8 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = {
{RCS0, COMMON_SLICE_CHICKEN2, 0xffff, true}, /* 0x7014 */
{RCS0, GEN9_CS_DEBUG_MODE1, 0xffff, false}, /* 0x20ec */
{RCS0, GEN8_L3SQCREG4, 0, false}, /* 0xb118 */
+   {RCS0, GEN9_SCRATCH1, 0, false}, /* 0xb11c */
+   {RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */
{RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */
{RCS0, HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
--
2.32.0.rc2




--
Best Regards,
Colin Xu


[Intel-gfx] [RFC 1/2] drm/doc/rfc: VM_BIND feature design document

2021-08-05 Thread Niranjana Vishwanathapura
VM_BIND design document with description of intended use cases.

Signed-off-by: Niranjana Vishwanathapura 
---
 Documentation/gpu/rfc/i915_vm_bind.rst | 126 +
 Documentation/gpu/rfc/index.rst|   4 +
 2 files changed, 130 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_vm_bind.rst

diff --git a/Documentation/gpu/rfc/i915_vm_bind.rst b/Documentation/gpu/rfc/i915_vm_bind.rst
new file mode 100644
index 000000000000..dbc35262a554
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_vm_bind.rst
@@ -0,0 +1,126 @@
+==========================================
+I915 VM_BIND feature design and use cases
+==========================================
+
+VM_BIND feature
+================
+DRM_I915_GEM_VM_BIND/UNBIND ioctls allow the UMD to bind/unbind GEM buffer
+objects (BOs) or sections of a BO at specified GPU virtual addresses on
+a specified address space (VM).
+
+These mappings (also referred to as persistent mappings) will be persistent
+across multiple GPU submissions (execbuff) issued by the UMD, making execbuff
+path leaner with fast path submission latency of O(1) w.r.t the number of
+objects required for that submission.
+
+UMDs can still send BOs of these persistent mappings in execlist of execbuff
+for specifying BO dependencies (implicit fencing) and to use BO as a batch.
+
+The persistent mappings are not individually tracked; instead, the address
+space (VM) they are mapped in is tracked to determine whether the mappings
+are being referenced by a GPU job (active) or not.
+
+VM_BIND features include:
+- Different VA mappings can map to the same physical pages of an object
+  (aliasing).
+- VA mapping can map to a partial section of the BO (partial binding).
+- Support capture of mapping in the dump upon GPU error.
+- TLB is flushed upon unbind completion.
+- Asynchronous vm_bind and vm_unbind support.
+- VM_BIND uses user/memory fence mechanism (explained below) for signaling
+  bind completion.
+
+
+User/Memory Fence
+==================
+The idea is to take a user process virtual address and install an interrupt
+handler to wake up the current task when the memory location passes the user
+supplied filter.
+
+It also allows the user to emit their own MI_FLUSH/PIPE_CONTROL notify
+interrupt within their batches after updating the value on the GPU to
+have sub-batch precision on the wakeup.
+
+User/Memory fence  can also be supplied to the
+kernel driver to signal/wake up the user process after completion of an
+asynchronous operation.
+
+This feature will be derived from the below original work:
+https://patchwork.freedesktop.org/patch/349417/
+
+When the VM_BIND ioctl is provided with a user/memory fence via the SYNC_FENCE
+extension, the fence is signaled upon completion of the binding of that
+mapping. All async binds/unbinds are serialized, hence signaling of the
+user/memory fence also indicates the completion of all previous binds/unbinds.
+
+
+TODOs
+======
+- Rebase VM_BIND on top of ongoing i915 TTM adoption changes including
+  eviction support.
+- Various optimizations, such as LRU ordering of persistent mappings and
+  batching of TLB flushes.
+
+
+Intended use cases
+===================
+
+Debugger
+---------
+With the debug event interface, a user space process (debugger) is able to keep
+track of and act upon resources created by another process (debuggee) and
+attached to the GPU via the vm_bind interface.
+
+Mesa/Vulkan
+------------
+VM_BIND can potentially reduce the CPU overhead in Mesa, thus improving
+performance. For Vulkan it should be straightforward to use VM_BIND.
+For Iris, implicit buffer tracking must be implemented before we can harness
+VM_BIND benefits. With increasing GPU hardware performance, reducing CPU
+overhead becomes more important.
+
+Page level hints settings
+--------------------------
+VM_BIND allows hints to be set per mapping instead of per BO.
+Possible hints include read-only, placement and atomicity.
+A sub-BO level placement hint will be even more relevant with
+upcoming GPU on-demand page fault support.
+
+Page level Cache/CLOS settings
+-------------------------------
+VM_BIND allows cache/CLOS settings per mapping instead of per BO.
+
+Compute
+--------
+dma-fences are expected to complete in a reasonable amount of time.
+Compute, on the other hand, can be long running. Hence it is appropriate for
+compute to use user/memory fences (explained above), and dma-fence usage will
+be limited to in-kernel consumption only. Compute must opt in to this
+mechanism at context creation time with a 'compute_ctx' flag.
+
+Where GPU page faults are not available, the kernel driver must, upon buffer
+invalidation, initiate a compute context suspend with a dma-fence attached to
+it. Upon completion of that suspend fence, it finishes the invalidation and
+then resumes the compute context.
+
+This is much easier to support with VM_BIND instead of the current heavier
+execbuff path resource attachment.
+
+Low Latency Submission
+-----------------------
+Allow compute 

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gvt: Fix cached atomics setting for Windows VM (rev2)

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915/gvt: Fix cached atomics setting for Windows VM (rev2)
URL   : https://patchwork.freedesktop.org/series/92809/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10454 -> Patchwork_20779


Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/index.html

Known issues
------------

  Here are the changes found in Patchwork_20779 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@cs-gfx:
    - fi-rkl-guc: NOTRUN -> [SKIP][1] ([fdo#109315]) +17 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/fi-rkl-guc/igt@amdgpu/amd_ba...@cs-gfx.html

  * igt@gem_exec_suspend@basic-s0:
    - fi-tgl-1115g4:  [PASS][2] -> [FAIL][3] ([i915#1888])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s0.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s0.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@workarounds:
    - fi-rkl-guc: [INCOMPLETE][4] -> [PASS][5]
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html

  
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888


Participating hosts (37 -> 34)
------------------------------

  Missing(3): fi-bdw-samus fi-bsw-cyan bat-jsl-1 


Build changes
-------------

  * Linux: CI_DRM_10454 -> Patchwork_20779

  CI-20190529: 20190529
  CI_DRM_10454: 224f5b80ba7d0caf6c4899e30d13cabde980bf49 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6160: 4287344dd6a39d9036c5fb9a047a7d8f10bee981 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20779: da0d214e746ec4efcdd16e668a8029d168cdc322 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

da0d214e746e drm/i915/gvt: Fix cached atomics setting for Windows VM

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/index.html


[Intel-gfx] [RFC 2/2] drm/doc/rfc: VM_BIND uapi definition

2021-08-05 Thread Niranjana Vishwanathapura
VM_BIND and GEM_WAIT_USER_FENCE uapi document

Signed-off-by: Niranjana Vishwanathapura 
---
 Documentation/gpu/rfc/i915_vm_bind.h   | 113 +
 Documentation/gpu/rfc/i915_vm_bind.rst |   6 ++
 2 files changed, 119 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h

diff --git a/Documentation/gpu/rfc/i915_vm_bind.h b/Documentation/gpu/rfc/i915_vm_bind.h
new file mode 100644
index 000000000000..3aaf66e62aa0
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_vm_bind.h
@@ -0,0 +1,113 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+/* VM_BIND feature availability through drm_i915_getparam */
+#define I915_PARAM_HAS_VM_BIND  59
+
+/**
+ * struct drm_i915_gem_vm_bind - VA to object/buffer mapping to [un]bind.
+ */
+struct drm_i915_gem_vm_bind {
+   /** vm to [un]bind **/
+   __u32 vm_id;
+
+   /** BO handle or file descriptor. Set 'fd' to -1 for system pages (SVM) **/
+   union {
+   __u32 handle;
+   __s32 fd;
+   };
+
+   /** VA start to [un]bind **/
+   __u64 start;
+
+   /** Offset in object to [un]bind **/
+   __u64 offset;
+
+   /** VA length to [un]bind **/
+   __u64 length;
+
+   /** Flags **/
+   __u64 flags;
+   /** Bind the mapping immediately instead of during next submission */
+#define I915_GEM_VM_BIND_IMMEDIATE   (1 << 0)
+   /** Read-only mapping */
+#define I915_GEM_VM_BIND_READONLY(1 << 1)
+   /** Capture this mapping in the dump upon GPU error */
+#define I915_GEM_VM_BIND_CAPTURE (1 << 2)
+
+   /**
+* Zero-terminated chain of extensions.
+*
+* No current extensions defined; mbz.
+*/
+   __u64 extensions;
+};
+
+/**
+ * struct drm_i915_vm_bind_ext_sync_fence - Bind completion signaling extension.
+ */
+struct drm_i915_vm_bind_ext_sync_fence {
+#define I915_VM_BIND_EXT_SYNC_FENCE 0
+   /** @base: Extension link. See struct i915_user_extension. */
+   struct i915_user_extension base;
+
+   /** User/Memory fence address */
+   __u64 addr;
+
+   /** User/Memory fence value to be written after bind completion */
+   __u64 val;
+};
+
+/**
+ * struct drm_i915_gem_wait_user_fence
+ *
+ * Wait on user/memory fence. User/Memory fence can be woken up either by:
+ *    1. GPU context indicated by 'ctx_id', or,
+ *    2. Kernel driver async worker upon I915_UFENCE_WAIT_SOFT.
+ *       'ctx_id' is ignored when this flag is set.
+ *
+ * Wakeup when below condition is true.
+ * (*addr & MASK) OP (VALUE & MASK)
+ *
+ */
+struct drm_i915_gem_wait_user_fence {
+   /** @base: Extension link. See struct i915_user_extension. */
+   __u64 extensions;
+
+   /** User/Memory fence address */
+   __u64 addr;
+
+   /** Id of the Context which will signal the fence. */
+   __u32 ctx_id;
+
+   /** Wakeup condition operator */
+   __u16 op;
+#define I915_UFENCE_WAIT_EQ  0
+#define I915_UFENCE_WAIT_NEQ 1
+#define I915_UFENCE_WAIT_GT  2
+#define I915_UFENCE_WAIT_GTE 3
+#define I915_UFENCE_WAIT_LT  4
+#define I915_UFENCE_WAIT_LTE 5
+#define I915_UFENCE_WAIT_BEFORE  6
+#define I915_UFENCE_WAIT_AFTER   7
+
+   /** Flags */
+   __u16 flags;
+#define I915_UFENCE_WAIT_SOFT0x1
+#define I915_UFENCE_WAIT_ABSTIME 0x2
+
+   /** Wakeup value */
+   __u64 value;
+
+   /** Wakeup mask */
+   __u64 mask;
+#define I915_UFENCE_WAIT_U8     0xffu
+#define I915_UFENCE_WAIT_U16    0xffffu
+#define I915_UFENCE_WAIT_U32    0xfffffffful
+#define I915_UFENCE_WAIT_U64    0xffffffffffffffffull
+
+   /** Timeout */
+   __s64 timeout;
+};
diff --git a/Documentation/gpu/rfc/i915_vm_bind.rst b/Documentation/gpu/rfc/i915_vm_bind.rst
index dbc35262a554..dc843e32a1cd 100644
--- a/Documentation/gpu/rfc/i915_vm_bind.rst
+++ b/Documentation/gpu/rfc/i915_vm_bind.rst
@@ -117,6 +117,12 @@ VM_BIND interface can be used to map system memory directly (without gem BO
 abstraction) using the HMM interface.
 
 
+UAPI
+=====
+Uapi definition can be found here:
+.. kernel-doc:: Documentation/gpu/rfc/i915_vm_bind.h
+
+
 Links:
 ======
 - Reference WIP VM_BIND implementation can be found here.
-- 
2.21.0.rc0.32.g243a4c7e27
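
The wakeup rule documented for struct drm_i915_gem_wait_user_fence, `(*addr & MASK) OP (VALUE & MASK)`, can be illustrated with a small user-space sketch. This is not the driver's implementation: the `demo_` names are hypothetical, the comparison is assumed to be unsigned, and the BEFORE/AFTER operators are left out because the RFC does not define their semantics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Operator values as listed in the RFC header; demo_ names are hypothetical. */
enum demo_ufence_op {
	DEMO_UFENCE_WAIT_EQ  = 0,
	DEMO_UFENCE_WAIT_NEQ = 1,
	DEMO_UFENCE_WAIT_GT  = 2,
	DEMO_UFENCE_WAIT_GTE = 3,
	DEMO_UFENCE_WAIT_LT  = 4,
	DEMO_UFENCE_WAIT_LTE = 5,
};

/*
 * Evaluate (*addr & mask) OP (value & mask).
 * Assumption: unsigned comparison. Operators outside EQ..LTE
 * (for example BEFORE/AFTER) are not modelled and return false.
 */
static bool demo_ufence_condition_met(const uint64_t *addr, unsigned int op,
				      uint64_t value, uint64_t mask)
{
	uint64_t lhs, rhs;

	assert(addr != NULL);
	lhs = *addr & mask;
	rhs = value & mask;

	switch (op) {
	case DEMO_UFENCE_WAIT_EQ:
		return lhs == rhs;
	case DEMO_UFENCE_WAIT_NEQ:
		return lhs != rhs;
	case DEMO_UFENCE_WAIT_GT:
		return lhs > rhs;
	case DEMO_UFENCE_WAIT_GTE:
		return lhs >= rhs;
	case DEMO_UFENCE_WAIT_LT:
		return lhs < rhs;
	case DEMO_UFENCE_WAIT_LTE:
		return lhs <= rhs;
	default:
		return false;
	}
}
```

A waiter using an 8-bit mask only compares the low byte of the fence location, so a fence word of 0x1234 satisfies an EQ wait for any value whose low byte is 0x34.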



[Intel-gfx] [RFC 0/2] drm/doc/rfc: i915 VM_BIND feature design + uapi

2021-08-05 Thread Niranjana Vishwanathapura
This is the i915 driver VM_BIND feature design RFC patch series along
with the required uapi definition and description of intended use cases.

Signed-off-by: Niranjana Vishwanathapura 

Niranjana Vishwanathapura (2):
  drm/doc/rfc: VM_BIND feature design document
  drm/doc/rfc: VM_BIND uapi definition

 Documentation/gpu/rfc/i915_vm_bind.h   | 113 +
 Documentation/gpu/rfc/i915_vm_bind.rst | 132 +
 Documentation/gpu/rfc/index.rst|   4 +
 3 files changed, 249 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
 create mode 100644 Documentation/gpu/rfc/i915_vm_bind.rst

-- 
2.21.0.rc0.32.g243a4c7e27
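
The aliasing and partial-binding features named in the design document can be pictured with a user-space sketch. The struct below is a simplified, hypothetical mirror of a few fields of the proposed struct drm_i915_gem_vm_bind; the `demo_` helpers are illustrative only and do not issue any ioctl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified mirror of the RFC bind request. */
struct demo_vm_bind {
	uint32_t vm_id;   /* address space (VM) */
	uint32_t handle;  /* BO handle */
	uint64_t start;   /* GPU virtual address */
	uint64_t offset;  /* offset into the BO */
	uint64_t length;  /* length of the mapping */
	uint64_t flags;
};

/* Same bit position as I915_GEM_VM_BIND_READONLY in the RFC header. */
#define DEMO_VM_BIND_READONLY (1ull << 1)

/* Build one bind request; a zero-length mapping is rejected here. */
static struct demo_vm_bind demo_bind(uint32_t vm_id, uint32_t handle,
				     uint64_t va, uint64_t offset,
				     uint64_t length, uint64_t flags)
{
	struct demo_vm_bind b = {
		.vm_id = vm_id, .handle = handle, .start = va,
		.offset = offset, .length = length, .flags = flags,
	};

	assert(length > 0);
	return b;
}

/*
 * Two mappings alias when they sit at different VAs in the same VM
 * and cover overlapping byte ranges of the same BO.
 */
static bool demo_binds_alias(const struct demo_vm_bind *a,
			     const struct demo_vm_bind *b)
{
	return a->vm_id == b->vm_id && a->handle == b->handle &&
	       a->start != b->start &&
	       a->offset < b->offset + b->length &&
	       b->offset < a->offset + a->length;
}
```

Binding a whole BO at one VA and then one page of it read-only at another VA gives a partial binding that aliases the first mapping.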



[Intel-gfx] [PATCH v4 05/14] vfio/samples: Delete useless open/close

2021-08-05 Thread Jason Gunthorpe
The core code no longer requires these ops to be defined, so delete these
empty functions and leave the op as NULL. mtty's functions only log a
pointless message, delete that entirely.

Signed-off-by: Yishai Hadas 
Reviewed-by: Cornelia Huck 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 samples/vfio-mdev/mbochs.c |  6 --
 samples/vfio-mdev/mdpy.c   | 11 ---
 samples/vfio-mdev/mtty.c   | 13 -
 3 files changed, 30 deletions(-)

diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index 0f1511849b7c3c..7b2e12fe70827c 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -1278,11 +1278,6 @@ static long mbochs_ioctl(struct vfio_device *vdev, unsigned int cmd,
return -ENOTTY;
 }
 
-static int mbochs_open(struct vfio_device *vdev)
-{
-   return 0;
-}
-
 static void mbochs_close(struct vfio_device *vdev)
 {
struct mdev_state *mdev_state =
@@ -1401,7 +1396,6 @@ static struct attribute_group *mdev_type_groups[] = {
 };
 
 static const struct vfio_device_ops mbochs_dev_ops = {
-   .open = mbochs_open,
.release = mbochs_close,
.read = mbochs_read,
.write = mbochs_write,
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index 57334034cde6dd..8d1a80a0722aa9 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -614,15 +614,6 @@ static long mdpy_ioctl(struct vfio_device *vdev, unsigned int cmd,
return -ENOTTY;
 }
 
-static int mdpy_open(struct vfio_device *vdev)
-{
-   return 0;
-}
-
-static void mdpy_close(struct vfio_device *vdev)
-{
-}
-
 static ssize_t
 resolution_show(struct device *dev, struct device_attribute *attr,
char *buf)
@@ -717,8 +708,6 @@ static struct attribute_group *mdev_type_groups[] = {
 };
 
 static const struct vfio_device_ops mdpy_dev_ops = {
-   .open = mdpy_open,
-   .release = mdpy_close,
.read = mdpy_read,
.write = mdpy_write,
.ioctl = mdpy_ioctl,
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 37cc9067e1601d..5983cdb16e3d1d 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -1207,17 +1207,6 @@ static long mtty_ioctl(struct vfio_device *vdev, unsigned int cmd,
return -ENOTTY;
 }
 
-static int mtty_open(struct vfio_device *vdev)
-{
-   pr_info("%s\n", __func__);
-   return 0;
-}
-
-static void mtty_close(struct vfio_device *mdev)
-{
-   pr_info("%s\n", __func__);
-}
-
 static ssize_t
 sample_mtty_dev_show(struct device *dev, struct device_attribute *attr,
 char *buf)
@@ -1325,8 +1314,6 @@ static struct attribute_group *mdev_type_groups[] = {
 
 static const struct vfio_device_ops mtty_dev_ops = {
.name = "vfio-mtty",
-   .open = mtty_open,
-   .release = mtty_close,
.read = mtty_read,
.write = mtty_write,
.ioctl = mtty_ioctl,
-- 
2.32.0



[Intel-gfx] [PATCH v4 14/14] vfio: Remove struct vfio_device_ops open/release

2021-08-05 Thread Jason Gunthorpe
Nothing uses this anymore, delete it.

Signed-off-by: Yishai Hadas 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 drivers/vfio/mdev/vfio_mdev.c | 22 --
 drivers/vfio/vfio.c   | 14 +-
 include/linux/mdev.h  |  7 ---
 include/linux/vfio.h  |  4 
 4 files changed, 1 insertion(+), 46 deletions(-)

diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index e12196ffd48718..7a9883048216e7 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -37,26 +37,6 @@ static void vfio_mdev_close_device(struct vfio_device *core_vdev)
parent->ops->close_device(mdev);
 }
 
-static int vfio_mdev_open(struct vfio_device *core_vdev)
-{
-   struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
-   struct mdev_parent *parent = mdev->type->parent;
-
-   if (unlikely(!parent->ops->open))
-   return 0;
-
-   return parent->ops->open(mdev);
-}
-
-static void vfio_mdev_release(struct vfio_device *core_vdev)
-{
-   struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
-   struct mdev_parent *parent = mdev->type->parent;
-
-   if (likely(parent->ops->release))
-   parent->ops->release(mdev);
-}
-
 static long vfio_mdev_unlocked_ioctl(struct vfio_device *core_vdev,
 unsigned int cmd, unsigned long arg)
 {
@@ -122,8 +102,6 @@ static const struct vfio_device_ops vfio_mdev_dev_ops = {
.name   = "vfio-mdev",
.open_device= vfio_mdev_open_device,
.close_device   = vfio_mdev_close_device,
-   .open   = vfio_mdev_open,
-   .release= vfio_mdev_release,
.ioctl  = vfio_mdev_unlocked_ioctl,
.read   = vfio_mdev_read,
.write  = vfio_mdev_write,
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 9cc17768c42554..3c034fe14ccb03 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1470,19 +1470,13 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
}
mutex_unlock(&device->dev_set->lock);
 
-   if (device->ops->open) {
-   ret = device->ops->open(device);
-   if (ret)
-   goto err_close_device;
-   }
-
/*
 * We can't use anon_inode_getfd() because we need to modify
 * the f_mode flags directly to allow more than just ioctls
 */
fdno = ret = get_unused_fd_flags(O_CLOEXEC);
if (ret < 0)
-   goto err_release;
+   goto err_close_device;
 
filep = anon_inode_getfile("[vfio-device]", &vfio_device_fops,
   device, O_RDWR);
@@ -1509,9 +1503,6 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 
 err_fd:
put_unused_fd(fdno);
-err_release:
-   if (device->ops->release)
-   device->ops->release(device);
 err_close_device:
mutex_lock(&device->dev_set->lock);
if (device->open_count == 1 && device->ops->close_device)
@@ -1659,9 +1650,6 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 {
struct vfio_device *device = filep->private_data;
 
-   if (device->ops->release)
-   device->ops->release(device);
-
mutex_lock(&device->dev_set->lock);
if (!--device->open_count && device->ops->close_device)
device->ops->close_device(device);
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index cb5b7ed1d7c30d..68427e8fadebd6 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -72,11 +72,6 @@ struct device *mtype_get_parent_dev(struct mdev_type *mtype);
  * @mdev: mdev_device device structure which is being
  *destroyed
  * Returns integer: success (0) or error (< 0)
- * @open:  Open mediated device.
- * @mdev: mediated device.
- * Returns integer: success (0) or error (< 0)
- * @release:   release mediated device
- * @mdev: mediated device.
  * @read:  Read emulation callback
  * @mdev: mediated device structure
  * @buf: read buffer
@@ -113,8 +108,6 @@ struct mdev_parent_ops {
int (*remove)(struct mdev_device *mdev);
int (*open_device)(struct mdev_device *mdev);
void(*close_device)(struct mdev_device *mdev);
-   int (*open)(struct mdev_device *mdev);
-   void(*release)(struct mdev_device *mdev);
ssize_t (*read)(struct mdev_device *mdev, char __user *buf,
size_t count, loff_t *ppos);
ssize_t (*write)(struct mdev_device *mdev, const char __user *buf,
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index f0e6a72875e471..b53a9557884ada 100644
--- a/include/linux/vfio.h
+++ 

[Intel-gfx] [PATCH v4 08/14] vfio/pci: Move to the device set infrastructure

2021-08-05 Thread Jason Gunthorpe
From: Yishai Hadas 

PCI wants to have the usual open/close_device() logic with the slight
twist that the open/close_device() must be done under a singleton lock
shared by all of the vfio_devices that are in the PCI "reset group".

The reset group, and thus the device set, is determined by what devices
pci_reset_bus() touches, which is either the entire bus or only the slot.

Rely on the core code to do everything reflck was doing and delete reflck
entirely.

Signed-off-by: Yishai Hadas 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 drivers/vfio/pci/vfio_pci.c | 162 +++-
 drivers/vfio/pci/vfio_pci_private.h |   7 --
 2 files changed, 37 insertions(+), 132 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index fab3715d60d4ba..5d6db93d6c680f 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -530,53 +530,40 @@ static void vfio_pci_vf_token_user_add(struct vfio_pci_device *vdev, int val)
vfio_device_put(&pf_vdev->vdev);
 }
 
-static void vfio_pci_release(struct vfio_device *core_vdev)
+static void vfio_pci_close_device(struct vfio_device *core_vdev)
 {
struct vfio_pci_device *vdev =
container_of(core_vdev, struct vfio_pci_device, vdev);
 
-   mutex_lock(&vdev->reflck->lock);
+   vfio_pci_vf_token_user_add(vdev, -1);
+   vfio_spapr_pci_eeh_release(vdev->pdev);
+   vfio_pci_disable(vdev);
 
-   if (!(--vdev->refcnt)) {
-   vfio_pci_vf_token_user_add(vdev, -1);
-   vfio_spapr_pci_eeh_release(vdev->pdev);
-   vfio_pci_disable(vdev);
-
-   mutex_lock(&vdev->igate);
-   if (vdev->err_trigger) {
-   eventfd_ctx_put(vdev->err_trigger);
-   vdev->err_trigger = NULL;
-   }
-   if (vdev->req_trigger) {
-   eventfd_ctx_put(vdev->req_trigger);
-   vdev->req_trigger = NULL;
-   }
-   mutex_unlock(&vdev->igate);
+   mutex_lock(&vdev->igate);
+   if (vdev->err_trigger) {
+   eventfd_ctx_put(vdev->err_trigger);
+   vdev->err_trigger = NULL;
}
-
-   mutex_unlock(&vdev->reflck->lock);
+   if (vdev->req_trigger) {
+   eventfd_ctx_put(vdev->req_trigger);
+   vdev->req_trigger = NULL;
+   }
+   mutex_unlock(&vdev->igate);
 }
 
-static int vfio_pci_open(struct vfio_device *core_vdev)
+static int vfio_pci_open_device(struct vfio_device *core_vdev)
 {
struct vfio_pci_device *vdev =
container_of(core_vdev, struct vfio_pci_device, vdev);
int ret = 0;
 
-   mutex_lock(&vdev->reflck->lock);
+   ret = vfio_pci_enable(vdev);
+   if (ret)
+   return ret;
 
-   if (!vdev->refcnt) {
-   ret = vfio_pci_enable(vdev);
-   if (ret)
-   goto error;
-
-   vfio_spapr_pci_eeh_open(vdev->pdev);
-   vfio_pci_vf_token_user_add(vdev, 1);
-   }
-   vdev->refcnt++;
-error:
-   mutex_unlock(&vdev->reflck->lock);
-   return ret;
+   vfio_spapr_pci_eeh_open(vdev->pdev);
+   vfio_pci_vf_token_user_add(vdev, 1);
+   return 0;
 }
 
 static int vfio_pci_get_irq_count(struct vfio_pci_device *vdev, int irq_type)
@@ -1870,8 +1857,8 @@ static int vfio_pci_match(struct vfio_device *core_vdev, char *buf)
 
 static const struct vfio_device_ops vfio_pci_ops = {
.name   = "vfio-pci",
-   .open   = vfio_pci_open,
-   .release= vfio_pci_release,
+   .open_device= vfio_pci_open_device,
+   .close_device   = vfio_pci_close_device,
.ioctl  = vfio_pci_ioctl,
.read   = vfio_pci_read,
.write  = vfio_pci_write,
@@ -1880,9 +1867,6 @@ static const struct vfio_device_ops vfio_pci_ops = {
.match  = vfio_pci_match,
 };
 
-static int vfio_pci_reflck_attach(struct vfio_pci_device *vdev);
-static void vfio_pci_reflck_put(struct vfio_pci_reflck *reflck);
-
 static int vfio_pci_bus_notifier(struct notifier_block *nb,
 unsigned long action, void *data)
 {
@@ -2020,12 +2004,23 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
INIT_LIST_HEAD(&vdev->vma_list);
init_rwsem(&vdev->memory_lock);
 
-   ret = vfio_pci_reflck_attach(vdev);
+   if (pci_is_root_bus(pdev->bus)) {
+   ret = vfio_assign_device_set(&vdev->vdev, vdev);
+   } else if (!pci_probe_reset_slot(pdev->slot)) {
+   ret = vfio_assign_device_set(&vdev->vdev, pdev->slot);
+   } else {
+   /*
+* If there is no slot reset support for this device, the whole
+* bus needs to be grouped together to support bus-wide resets.
+*/
+   ret = vfio_assign_device_set(&vdev->vdev, pdev->bus);
+   }
+
if (ret)

[Intel-gfx] [PATCH v4 12/14] vfio/ap, ccw: Fix open/close when multiple device FDs are open

2021-08-05 Thread Jason Gunthorpe
The user can open multiple device FDs if it likes, however these open()
functions call vfio_register_notifier() on some device global
state. Calling vfio_register_notifier() twice will trigger a WARN_ON
from notifier_chain_register() and the first close will wrongly delete the
notifier and more.

Since these really want the new open/close_device() semantics just change
the functions over.

Reviewed-by: Cornelia Huck 
Signed-off-by: Jason Gunthorpe 
---
 drivers/s390/cio/vfio_ccw_ops.c   | 8 
 drivers/s390/crypto/vfio_ap_ops.c | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index c57d2a7f091975..7f540ad0b568bc 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -159,7 +159,7 @@ static int vfio_ccw_mdev_remove(struct mdev_device *mdev)
return 0;
 }
 
-static int vfio_ccw_mdev_open(struct mdev_device *mdev)
+static int vfio_ccw_mdev_open_device(struct mdev_device *mdev)
 {
struct vfio_ccw_private *private =
dev_get_drvdata(mdev_parent_dev(mdev));
@@ -194,7 +194,7 @@ static int vfio_ccw_mdev_open(struct mdev_device *mdev)
return ret;
 }
 
-static void vfio_ccw_mdev_release(struct mdev_device *mdev)
+static void vfio_ccw_mdev_close_device(struct mdev_device *mdev)
 {
struct vfio_ccw_private *private =
dev_get_drvdata(mdev_parent_dev(mdev));
@@ -638,8 +638,8 @@ static const struct mdev_parent_ops vfio_ccw_mdev_ops = {
.supported_type_groups  = mdev_type_groups,
.create = vfio_ccw_mdev_create,
.remove = vfio_ccw_mdev_remove,
-   .open   = vfio_ccw_mdev_open,
-   .release= vfio_ccw_mdev_release,
+   .open_device= vfio_ccw_mdev_open_device,
+   .close_device   = vfio_ccw_mdev_close_device,
.read   = vfio_ccw_mdev_read,
.write  = vfio_ccw_mdev_write,
.ioctl  = vfio_ccw_mdev_ioctl,
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 122c85c224695e..cee5626fe0a4ef 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1315,7 +1315,7 @@ static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev)
return rc;
 }
 
-static int vfio_ap_mdev_open(struct mdev_device *mdev)
+static int vfio_ap_mdev_open_device(struct mdev_device *mdev)
 {
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
unsigned long events;
@@ -1348,7 +1348,7 @@ static int vfio_ap_mdev_open(struct mdev_device *mdev)
return ret;
 }
 
-static void vfio_ap_mdev_release(struct mdev_device *mdev)
+static void vfio_ap_mdev_close_device(struct mdev_device *mdev)
 {
struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 
@@ -1427,8 +1427,8 @@ static const struct mdev_parent_ops vfio_ap_matrix_ops = {
.mdev_attr_groups   = vfio_ap_mdev_attr_groups,
.create = vfio_ap_mdev_create,
.remove = vfio_ap_mdev_remove,
-   .open   = vfio_ap_mdev_open,
-   .release= vfio_ap_mdev_release,
+   .open_device= vfio_ap_mdev_open_device,
+   .close_device   = vfio_ap_mdev_close_device,
.ioctl  = vfio_ap_mdev_ioctl,
 };
 
-- 
2.32.0
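
The difference between the old per-FD open()/release() callbacks and the open_device()/close_device() semantics can be sketched in user-space C. This is only an illustration under stated assumptions: the `demo_` types and functions are hypothetical, the counter stands in for a notifier registration, and the locking that the core performs around the open count is omitted.

```c
#include <assert.h>
#include <stddef.h>

struct demo_device;

/* Hypothetical ops table with only the two callbacks of interest. */
struct demo_ops {
	int  (*open_device)(struct demo_device *dev);
	void (*close_device)(struct demo_device *dev);
};

struct demo_device {
	const struct demo_ops *ops;
	unsigned int open_count;        /* number of open device FDs */
	int notifier_registrations;     /* stand-in for device-global state */
};

/* Driver callback: must run once per device, not once per FD. */
static int demo_open_device(struct demo_device *dev)
{
	dev->notifier_registrations++;
	return 0;
}

static void demo_close_device(struct demo_device *dev)
{
	dev->notifier_registrations--;
}

static const struct demo_ops demo_dev_ops = {
	.open_device  = demo_open_device,
	.close_device = demo_close_device,
};

/* Called for every new FD; only the first open reaches the driver. */
static int demo_fd_open(struct demo_device *dev)
{
	int ret = 0;

	assert(dev != NULL && dev->ops != NULL);
	dev->open_count++;
	if (dev->open_count == 1 && dev->ops->open_device) {
		ret = dev->ops->open_device(dev);
		if (ret)
			dev->open_count--;
	}
	return ret;
}

/* Called for every FD release; only the last close reaches the driver. */
static void demo_fd_release(struct demo_device *dev)
{
	assert(dev != NULL && dev->open_count > 0);
	if (!--dev->open_count && dev->ops->close_device)
		dev->ops->close_device(dev);
}
```

With two FDs open the driver callback has run once, and the state is torn down only when the second FD is released, which is the behaviour the commit message asks for.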



[Intel-gfx] [PATCH v4 01/14] vfio/samples: Remove module get/put

2021-08-05 Thread Jason Gunthorpe
The patch to move the get/put to core and the patch to convert the samples
to use vfio_device crossed in a way that this was missed. When both
patches are together the samples do not need their own get/put.

Fixes: 437e41368c01 ("vfio/mdpy: Convert to use vfio_register_group_dev()")
Fixes: 681c1615f891 ("vfio/mbochs: Convert to use vfio_register_group_dev()")
Reviewed-by: Cornelia Huck 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 samples/vfio-mdev/mbochs.c | 4 
 samples/vfio-mdev/mdpy.c   | 4 
 2 files changed, 8 deletions(-)

diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index 6c0f229db36a1a..e81b875b4d87b4 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -1274,9 +1274,6 @@ static long mbochs_ioctl(struct vfio_device *vdev, unsigned int cmd,
 
 static int mbochs_open(struct vfio_device *vdev)
 {
-   if (!try_module_get(THIS_MODULE))
-   return -ENODEV;
-
return 0;
 }
 
@@ -1300,7 +1297,6 @@ static void mbochs_close(struct vfio_device *vdev)
mbochs_put_pages(mdev_state);
 
mutex_unlock(&mdev_state->ops_lock);
-   module_put(THIS_MODULE);
 }
 
 static ssize_t
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index 393c9df6f6a010..a7d4ed28d66411 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -611,15 +611,11 @@ static long mdpy_ioctl(struct vfio_device *vdev, unsigned int cmd,
 
 static int mdpy_open(struct vfio_device *vdev)
 {
-   if (!try_module_get(THIS_MODULE))
-   return -ENODEV;
-
return 0;
 }
 
 static void mdpy_close(struct vfio_device *vdev)
 {
-   module_put(THIS_MODULE);
 }
 
 static ssize_t
-- 
2.32.0



[Intel-gfx] [PATCH v4 07/14] vfio/platform: Use open_device() instead of open coding a refcnt scheme

2021-08-05 Thread Jason Gunthorpe
Platform simply wants to run some code when the device is first
opened/last closed. Use the core framework and locking for this.  Aside
from removing a bit of code this narrows the locking scope from a global
lock.

Reviewed-by: Cornelia Huck 
Reviewed-by: Eric Auger 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
Signed-off-by: Yishai Hadas 
---
 drivers/vfio/platform/vfio_platform_common.c  | 95 ---
 drivers/vfio/platform/vfio_platform_private.h |  1 -
 2 files changed, 40 insertions(+), 56 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index bdde8605178cd2..6af7ce7d619c25 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -218,65 +218,52 @@ static int vfio_platform_call_reset(struct vfio_platform_device *vdev,
return -EINVAL;
 }
 
-static void vfio_platform_release(struct vfio_device *core_vdev)
-{
-   struct vfio_platform_device *vdev =
-   container_of(core_vdev, struct vfio_platform_device, vdev);
-
-   mutex_lock(&driver_lock);
-
-   if (!(--vdev->refcnt)) {
-   const char *extra_dbg = NULL;
-   int ret;
-
-   ret = vfio_platform_call_reset(vdev, &extra_dbg);
-   if (ret && vdev->reset_required) {
-   dev_warn(vdev->device, "reset driver is required and reset call failed in release (%d) %s\n",
-ret, extra_dbg ? extra_dbg : "");
-   WARN_ON(1);
-   }
-   pm_runtime_put(vdev->device);
-   vfio_platform_regions_cleanup(vdev);
-   vfio_platform_irq_cleanup(vdev);
-   }
-
-   mutex_unlock(_lock);
-}
-
-static int vfio_platform_open(struct vfio_device *core_vdev)
+static void vfio_platform_close_device(struct vfio_device *core_vdev)
 {
struct vfio_platform_device *vdev =
container_of(core_vdev, struct vfio_platform_device, vdev);
+   const char *extra_dbg = NULL;
int ret;
 
-   mutex_lock(_lock);
-
-   if (!vdev->refcnt) {
-   const char *extra_dbg = NULL;
-
-   ret = vfio_platform_regions_init(vdev);
-   if (ret)
-   goto err_reg;
-
-   ret = vfio_platform_irq_init(vdev);
-   if (ret)
-   goto err_irq;
-
-   ret = pm_runtime_get_sync(vdev->device);
-   if (ret < 0)
-   goto err_rst;
-
-   ret = vfio_platform_call_reset(vdev, &extra_dbg);
-   if (ret && vdev->reset_required) {
-   dev_warn(vdev->device, "reset driver is required and 
reset call failed in open (%d) %s\n",
-ret, extra_dbg ? extra_dbg : "");
-   goto err_rst;
-   }
+   ret = vfio_platform_call_reset(vdev, &extra_dbg);
+   if (WARN_ON(ret && vdev->reset_required)) {
+   dev_warn(
+   vdev->device,
+   "reset driver is required and reset call failed in 
release (%d) %s\n",
+   ret, extra_dbg ? extra_dbg : "");
}
+   pm_runtime_put(vdev->device);
+   vfio_platform_regions_cleanup(vdev);
+   vfio_platform_irq_cleanup(vdev);
+}
 
-   vdev->refcnt++;
+static int vfio_platform_open_device(struct vfio_device *core_vdev)
+{
+   struct vfio_platform_device *vdev =
+   container_of(core_vdev, struct vfio_platform_device, vdev);
+   const char *extra_dbg = NULL;
+   int ret;
 
-   mutex_unlock(_lock);
+   ret = vfio_platform_regions_init(vdev);
+   if (ret)
+   return ret;
+
+   ret = vfio_platform_irq_init(vdev);
+   if (ret)
+   goto err_irq;
+
+   ret = pm_runtime_get_sync(vdev->device);
+   if (ret < 0)
+   goto err_rst;
+
+   ret = vfio_platform_call_reset(vdev, &extra_dbg);
+   if (ret && vdev->reset_required) {
+   dev_warn(
+   vdev->device,
+   "reset driver is required and reset call failed in open 
(%d) %s\n",
+   ret, extra_dbg ? extra_dbg : "");
+   goto err_rst;
+   }
return 0;
 
 err_rst:
@@ -284,8 +271,6 @@ static int vfio_platform_open(struct vfio_device *core_vdev)
vfio_platform_irq_cleanup(vdev);
 err_irq:
vfio_platform_regions_cleanup(vdev);
-err_reg:
-   mutex_unlock(_lock);
return ret;
 }
 
@@ -616,8 +601,8 @@ static int vfio_platform_mmap(struct vfio_device 
*core_vdev, struct vm_area_stru
 
 static const struct vfio_device_ops vfio_platform_ops = {
.name   = "vfio-platform",
-   .open   = vfio_platform_open,
-   .release= vfio_platform_release,
+   .open_device= vfio_platform_open_device,
+   .close_device   = 

[Intel-gfx] [PATCH v4 06/14] vfio/fsl: Move to the device set infrastructure

2021-08-05 Thread Jason Gunthorpe
FSL uses the internal reflck to implement the open_device() functionality,
so conversion to the core code is straightforward.

The decision on which set to be part of is trivially based on the
is_fsl_mc_bus_dprc() and we use a 'struct device *' pointer as the set_id.

The dev_set lock is protecting the interrupts setup. The FSL MC devices
are using MSIs and only the DPRC device is allocating the MSIs from the
MSI domain. The other devices just take interrupts from a pool. The lock
is protecting the access to this pool.

Signed-off-by: Yishai Hadas 
Tested-by: Diana Craciun OSS 
Signed-off-by: Jason Gunthorpe 
---
 drivers/vfio/fsl-mc/vfio_fsl_mc.c | 156 --
 drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c|   6 +-
 drivers/vfio/fsl-mc/vfio_fsl_mc_private.h |   7 -
 3 files changed, 29 insertions(+), 140 deletions(-)

diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c 
b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 122997c61ba450..0ead91bfa83867 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -19,81 +19,10 @@
 
 static struct fsl_mc_driver vfio_fsl_mc_driver;
 
-static DEFINE_MUTEX(reflck_lock);
-
-static void vfio_fsl_mc_reflck_get(struct vfio_fsl_mc_reflck *reflck)
-{
-   kref_get(&reflck->kref);
-}
-
-static void vfio_fsl_mc_reflck_release(struct kref *kref)
-{
-   struct vfio_fsl_mc_reflck *reflck = container_of(kref,
- struct vfio_fsl_mc_reflck,
- kref);
-
-   mutex_destroy(&reflck->lock);
-   kfree(reflck);
-   mutex_unlock(&reflck_lock);
-}
-
-static void vfio_fsl_mc_reflck_put(struct vfio_fsl_mc_reflck *reflck)
-{
-   kref_put_mutex(&reflck->kref, vfio_fsl_mc_reflck_release, &reflck_lock);
-}
-
-static struct vfio_fsl_mc_reflck *vfio_fsl_mc_reflck_alloc(void)
-{
-   struct vfio_fsl_mc_reflck *reflck;
-
-   reflck = kzalloc(sizeof(*reflck), GFP_KERNEL);
-   if (!reflck)
-   return ERR_PTR(-ENOMEM);
-
-   kref_init(&reflck->kref);
-   mutex_init(&reflck->lock);
-
-   return reflck;
-}
-
-static int vfio_fsl_mc_reflck_attach(struct vfio_fsl_mc_device *vdev)
-{
-   int ret = 0;
-
-   mutex_lock(&reflck_lock);
-   if (is_fsl_mc_bus_dprc(vdev->mc_dev)) {
-   vdev->reflck = vfio_fsl_mc_reflck_alloc();
-   ret = PTR_ERR_OR_ZERO(vdev->reflck);
-   } else {
-   struct device *mc_cont_dev = vdev->mc_dev->dev.parent;
-   struct vfio_device *device;
-   struct vfio_fsl_mc_device *cont_vdev;
-
-   device = vfio_device_get_from_dev(mc_cont_dev);
-   if (!device) {
-   ret = -ENODEV;
-   goto unlock;
-   }
-
-   cont_vdev =
-   container_of(device, struct vfio_fsl_mc_device, vdev);
-   if (!cont_vdev || !cont_vdev->reflck) {
-   vfio_device_put(device);
-   ret = -ENODEV;
-   goto unlock;
-   }
-   vfio_fsl_mc_reflck_get(cont_vdev->reflck);
-   vdev->reflck = cont_vdev->reflck;
-   vfio_device_put(device);
-   }
-
-unlock:
-   mutex_unlock(&reflck_lock);
-   return ret;
-}
-
-static int vfio_fsl_mc_regions_init(struct vfio_fsl_mc_device *vdev)
+static int vfio_fsl_mc_open_device(struct vfio_device *core_vdev)
 {
+   struct vfio_fsl_mc_device *vdev =
+   container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
struct fsl_mc_device *mc_dev = vdev->mc_dev;
int count = mc_dev->obj_desc.region_count;
int i;
@@ -136,58 +65,30 @@ static void vfio_fsl_mc_regions_cleanup(struct 
vfio_fsl_mc_device *vdev)
kfree(vdev->regions);
 }
 
-static int vfio_fsl_mc_open(struct vfio_device *core_vdev)
-{
-   struct vfio_fsl_mc_device *vdev =
-   container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
-   int ret = 0;
-
-   mutex_lock(&vdev->reflck->lock);
-   if (!vdev->refcnt) {
-   ret = vfio_fsl_mc_regions_init(vdev);
-   if (ret)
-   goto out;
-   }
-   vdev->refcnt++;
-out:
-   mutex_unlock(&vdev->reflck->lock);
-
-   return ret;
-}
-
-static void vfio_fsl_mc_release(struct vfio_device *core_vdev)
+
+static void vfio_fsl_mc_close_device(struct vfio_device *core_vdev)
 {
struct vfio_fsl_mc_device *vdev =
container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
+   struct fsl_mc_device *mc_dev = vdev->mc_dev;
+   struct device *cont_dev = fsl_mc_cont_dev(&mc_dev->dev);
+   struct fsl_mc_device *mc_cont = to_fsl_mc_device(cont_dev);
int ret;
 
-   mutex_lock(&vdev->reflck->lock);
+   vfio_fsl_mc_regions_cleanup(vdev);
 
-   if (!(--vdev->refcnt)) {
-   struct fsl_mc_device *mc_dev = vdev->mc_dev;
-   struct device *cont_dev = fsl_mc_cont_dev(&mc_dev->dev);
-   struct 

[Intel-gfx] [PATCH v4 10/14] vfio/pci: Reorganize VFIO_DEVICE_PCI_HOT_RESET to use the device set

2021-08-05 Thread Jason Gunthorpe
Like vfio_pci_dev_set_try_reset() this code wants to reset all of the
devices in the "reset group" which is the same membership as the device
set.

Instead of trying to reconstruct the device set from the PCI list go
directly from the device set's device list to execute the reset.

The same basic structure as vfio_pci_dev_set_try_reset() is used. The
'vfio_devices' struct is replaced with the device set linked list and we
simply sweep it multiple times under the lock.

This eliminates a memory allocation and get/put traffic and another
improperly locked test of pci_dev_driver().

Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 drivers/vfio/pci/vfio_pci.c | 213 +++-
 1 file changed, 89 insertions(+), 124 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 0147f04c91b2fb..a4f44ea52fa324 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -223,9 +223,11 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_device 
*vdev)
}
 }
 
+struct vfio_pci_group_info;
 static bool vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set);
 static void vfio_pci_disable(struct vfio_pci_device *vdev);
-static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data);
+static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set,
+ struct vfio_pci_group_info *groups);
 
 /*
  * INTx masking requires the ability to disable INTx signaling via PCI_COMMAND
@@ -643,37 +645,11 @@ static int vfio_pci_fill_devs(struct pci_dev *pdev, void 
*data)
return 0;
 }
 
-struct vfio_pci_group_entry {
-   struct vfio_group *group;
-   int id;
-};
-
 struct vfio_pci_group_info {
int count;
-   struct vfio_pci_group_entry *groups;
+   struct vfio_group **groups;
 };
 
-static int vfio_pci_validate_devs(struct pci_dev *pdev, void *data)
-{
-   struct vfio_pci_group_info *info = data;
-   struct iommu_group *group;
-   int id, i;
-
-   group = iommu_group_get(&pdev->dev);
-   if (!group)
-   return -EPERM;
-
-   id = iommu_group_id(group);
-
-   for (i = 0; i < info->count; i++)
-   if (info->groups[i].id == id)
-   break;
-
-   iommu_group_put(group);
-
-   return (i == info->count) ? -EINVAL : 0;
-}
-
 static bool vfio_pci_dev_below_slot(struct pci_dev *pdev, struct pci_slot 
*slot)
 {
for (; pdev; pdev = pdev->bus->self)
@@ -751,12 +727,6 @@ int vfio_pci_register_dev_region(struct vfio_pci_device 
*vdev,
return 0;
 }
 
-struct vfio_devices {
-   struct vfio_pci_device **devices;
-   int cur_index;
-   int max_index;
-};
-
 static long vfio_pci_ioctl(struct vfio_device *core_vdev,
   unsigned int cmd, unsigned long arg)
 {
@@ -1125,11 +1095,10 @@ static long vfio_pci_ioctl(struct vfio_device 
*core_vdev,
} else if (cmd == VFIO_DEVICE_PCI_HOT_RESET) {
struct vfio_pci_hot_reset hdr;
int32_t *group_fds;
-   struct vfio_pci_group_entry *groups;
+   struct vfio_group **groups;
struct vfio_pci_group_info info;
-   struct vfio_devices devs = { .cur_index = 0 };
bool slot = false;
-   int i, group_idx, mem_idx = 0, count = 0, ret = 0;
+   int group_idx, count = 0, ret = 0;
 
minsz = offsetofend(struct vfio_pci_hot_reset, count);
 
@@ -1196,9 +1165,7 @@ static long vfio_pci_ioctl(struct vfio_device *core_vdev,
break;
}
 
-   groups[group_idx].group = group;
-   groups[group_idx].id =
-   vfio_external_user_iommu_id(group);
+   groups[group_idx] = group;
}
 
kfree(group_fds);
@@ -1210,64 +1177,11 @@ static long vfio_pci_ioctl(struct vfio_device 
*core_vdev,
info.count = hdr.count;
info.groups = groups;
 
-   /*
-* Test whether all the affected devices are contained
-* by the set of groups provided by the user.
-*/
-   ret = vfio_pci_for_each_slot_or_bus(vdev->pdev,
-   vfio_pci_validate_devs,
-   &info, slot);
-   if (ret)
-   goto hot_reset_release;
-
-   devs.max_index = count;
-   devs.devices = kcalloc(count, sizeof(struct vfio_device *),
-  GFP_KERNEL);
-   if (!devs.devices) {
-   ret = -ENOMEM;
-   goto hot_reset_release;
-   }
-
-   /*
-* We need to get memory_lock for each device, but devices
-

[Intel-gfx] [PATCH v4 11/14] vfio/mbochs: Fix close when multiple device FDs are open

2021-08-05 Thread Jason Gunthorpe
mbochs_close() iterates over global device state and frees it. Currently
this is done every time a device FD is closed, but if multiple device FDs
are open this could corrupt other still active FDs.

Change this to use close_device() so it only runs on the last close.

Reviewed-by: Cornelia Huck 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 samples/vfio-mdev/mbochs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index 7b2e12fe70827c..c313ab4d1f4e4e 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -1278,7 +1278,7 @@ static long mbochs_ioctl(struct vfio_device *vdev, 
unsigned int cmd,
return -ENOTTY;
 }
 
-static void mbochs_close(struct vfio_device *vdev)
+static void mbochs_close_device(struct vfio_device *vdev)
 {
struct mdev_state *mdev_state =
container_of(vdev, struct mdev_state, vdev);
@@ -1396,7 +1396,7 @@ static struct attribute_group *mdev_type_groups[] = {
 };
 
 static const struct vfio_device_ops mbochs_dev_ops = {
-   .release = mbochs_close,
+   .close_device = mbochs_close_device,
.read = mbochs_read,
.write = mbochs_write,
.ioctl = mbochs_ioctl,
-- 
2.32.0



[Intel-gfx] [PATCH v4 00/14] Provide core infrastructure for managing open/release

2021-08-05 Thread Jason Gunthorpe
This is in support of Max's series to split vfio-pci. For that to work the
reflck concept embedded in vfio-pci needs to be sharable across all of the
new VFIO PCI drivers which motivated re-examining how this is
implemented.

Another significant issue is how the VFIO PCI core includes code like:

   if (pci_dev_driver(pdev) != &vfio_pci_driver)

Which is not scalable if there are going to be multiple different driver
types.

This series takes the approach of moving the "reflck" mechanism into the
core code as a "device set". Each vfio_device driver can specify how
vfio_devices are grouped into the set using a key and the set comes along
with a set-global mutex. The core code manages creating per-device set
memory and associating it with each vfio_device.

In turn this allows the core code to provide an open/close_device()
operation that is called only for the first/last FD, and is called under
the global device set lock.
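
As an illustration of the first/last FD semantic described above, here is a
minimal userspace sketch. It is not the code from this series: struct dev,
struct dev_set, dev_open_fd(), dev_close_fd() and the demo_* callbacks are
hypothetical names, and a pthread mutex stands in for the set-global mutex.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the device set: one lock shared by its members. */
struct dev_set {
	pthread_mutex_t lock;
};

struct dev;

struct dev_ops {
	int (*open_device)(struct dev *d);   /* runs for the first FD only */
	void (*close_device)(struct dev *d); /* runs for the last FD only */
};

struct dev {
	struct dev_set *set;
	const struct dev_ops *ops;
	unsigned int open_count;             /* protected by set->lock */
};

/* Called once per FD open; driver setup only happens on the 0 -> 1 edge. */
static int dev_open_fd(struct dev *d)
{
	int ret = 0;

	pthread_mutex_lock(&d->set->lock);
	if (d->open_count == 0 && d->ops->open_device)
		ret = d->ops->open_device(d);
	if (!ret)
		d->open_count++;
	pthread_mutex_unlock(&d->set->lock);
	return ret;
}

/* Called once per FD close; driver teardown only happens on the 1 -> 0 edge. */
static void dev_close_fd(struct dev *d)
{
	pthread_mutex_lock(&d->set->lock);
	assert(d->open_count > 0);
	if (--d->open_count == 0 && d->ops->close_device)
		d->ops->close_device(d);
	pthread_mutex_unlock(&d->set->lock);
}

/* Demo driver callbacks that only count how often they run. */
static int demo_setups, demo_teardowns;

static int demo_open_device(struct dev *d)
{
	(void)d;
	demo_setups++;
	return 0;
}

static void demo_close_device(struct dev *d)
{
	(void)d;
	demo_teardowns++;
}

static const struct dev_ops demo_ops = { demo_open_device, demo_close_device };
```

With this shape a driver no longer needs its own refcnt and lock: opening the
same device twice runs demo_open_device() once, and only the second close
runs demo_close_device().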

A review of all the drivers shows that they are either already open coding
the first/last semantic or are buggy and missing it. All drivers are
migrated/fixed to the new open/close_device ops and the unused per-FD
open()/release() ops are deleted.

The special behavior of PCI around the bus/slot "reset group" is recast in
terms of the device set which consolidates the reflck, eliminates two
touches of pci_dev_driver(), and allows the reset mechanism to share
across all VFIO PCI drivers. PCI is changed to acquire devices directly
from the device set instead of trying to work backwards from the struct
pci_device.

Overall a few minor bugs are squashed and quite a bit of code is removed
through consolidation.

v4:
 - Fix use-after-free typo in mbochs error unwind
 - Allow mdevs to work when they don't have open/release ops, for
   bisect-ability
 - Redo the vfio_pci_try_bus_reset() patch, make it dev_set centric
 - Change VFIO_DEVICE_PCI_HOT_RESET to align with the new
   vfio_pci_try_bus_reset() design
v3: https://lore.kernel.org/r/0-v3-6c9e19cc7d44+15613-vfio_reflck_...@nvidia.com
 - Atomic conversion of mbochs_used_mbytes
 - Add missing vfio_uninit_group_dev in error unwind of mbochs
 - Reorganize vfio_assign_device_set()
 - Move the dev_set_list hunks to the introduction of the dev_set
 - Use if instead of ?: in fsl
 - Add a comment about the whole bus reset in vfio_pci_probe()
 - Rename vfio_pci_check_all_devices_bound() to
   vfio_pci_is_device_in_set()
 - Move logic from vfio_pci_try_bus_reset() into vfio_pci_find_reset_target()
v2: https://lore.kernel.org/r/0-v2-b6a5582525c9+ff96-vfio_reflck_...@nvidia.com
 - Reorder fsl and mbochs vfio_uninit_group_dev
 - Fix missing error unwind in mbochs
 - Return 0 from mdev open_device if there is no op
 - Fix style for else {}
 - Spelling fix for singleton
 - Acquire cur_mem under lock
 - Always use error unwind flow for vfio_pci_check_all_devices_bound()
v1: https://lore.kernel.org/r/0-v1-eaf3ccbba33c+1add0-vfio_reflck_...@nvidia.com

Jason Gunthorpe (12):
  vfio/samples: Remove module get/put
  vfio/mbochs: Fix missing error unwind of mbochs_used_mbytes
  vfio: Provide better generic support for open/release vfio_device_ops
  vfio/samples: Delete useless open/close
  vfio/fsl: Move to the device set infrastructure
  vfio/platform: Use open_device() instead of open coding a refcnt
scheme
  vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set
  vfio/pci: Reorganize VFIO_DEVICE_PCI_HOT_RESET to use the device set
  vfio/mbochs: Fix close when multiple device FDs are open
  vfio/ap,ccw: Fix open/close when multiple device FDs are open
  vfio/gvt: Fix open/close when multiple device FDs are open
  vfio: Remove struct vfio_device_ops open/release

Max Gurtovoy (1):
  vfio: Introduce a vfio_uninit_group_dev() API call

Yishai Hadas (1):
  vfio/pci: Move to the device set infrastructure

 Documentation/driver-api/vfio.rst |   4 +-
 drivers/gpu/drm/i915/gvt/kvmgt.c  |   8 +-
 drivers/s390/cio/vfio_ccw_ops.c   |   8 +-
 drivers/s390/crypto/vfio_ap_ops.c |   8 +-
 drivers/vfio/fsl-mc/vfio_fsl_mc.c | 161 +-
 drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c|   6 +-
 drivers/vfio/fsl-mc/vfio_fsl_mc_private.h |   7 -
 drivers/vfio/mdev/vfio_mdev.c |  33 +-
 drivers/vfio/pci/vfio_pci.c   | 539 +++---
 drivers/vfio/pci/vfio_pci_private.h   |   7 -
 drivers/vfio/platform/vfio_platform_common.c  | 102 ++--
 drivers/vfio/platform/vfio_platform_private.h |   1 -
 drivers/vfio/vfio.c   | 142 -
 include/linux/mdev.h  |   9 +-
 include/linux/vfio.h  |  26 +-
 samples/vfio-mdev/mbochs.c|  40 +-
 samples/vfio-mdev/mdpy.c  |  40 +-
 samples/vfio-mdev/mtty.c  |  40 +-
 18 files changed, 509 insertions(+), 672 deletions(-)


base-commit: 3fb1712d85962f81265b5018922a2da13cdf6033
-- 
2.32.0



[Intel-gfx] [PATCH v4 13/14] vfio/gvt: Fix open/close when multiple device FDs are open

2021-08-05 Thread Jason Gunthorpe
The user can open multiple device FDs if it likes; however, the open
function calls vfio_register_notifier() on device global state. Calling
vfio_register_notifier() twice will trigger a WARN_ON from
notifier_chain_register() and the first close will wrongly delete the
notifier and more.

Since these really want the new open/close_device() semantics just change
the function over.

Reviewed-by: Zhenyu Wang 
Reviewed-by: Cornelia Huck 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 drivers/gpu/drm/i915/gvt/kvmgt.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 1ac98f8aba31e6..7efa386449d104 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -885,7 +885,7 @@ static int intel_vgpu_group_notifier(struct notifier_block 
*nb,
return NOTIFY_OK;
 }
 
-static int intel_vgpu_open(struct mdev_device *mdev)
+static int intel_vgpu_open_device(struct mdev_device *mdev)
 {
struct intel_vgpu *vgpu = mdev_get_drvdata(mdev);
struct kvmgt_vdev *vdev = kvmgt_vdev(vgpu);
@@ -1004,7 +1004,7 @@ static void __intel_vgpu_release(struct intel_vgpu *vgpu)
vgpu->handle = 0;
 }
 
-static void intel_vgpu_release(struct mdev_device *mdev)
+static void intel_vgpu_close_device(struct mdev_device *mdev)
 {
struct intel_vgpu *vgpu = mdev_get_drvdata(mdev);
 
@@ -1753,8 +1753,8 @@ static struct mdev_parent_ops intel_vgpu_ops = {
.create = intel_vgpu_create,
.remove = intel_vgpu_remove,
 
-   .open   = intel_vgpu_open,
-   .release= intel_vgpu_release,
+   .open_device= intel_vgpu_open_device,
+   .close_device   = intel_vgpu_close_device,
 
.read   = intel_vgpu_read,
.write  = intel_vgpu_write,
-- 
2.32.0



[Intel-gfx] [PATCH v4 09/14] vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set

2021-08-05 Thread Jason Gunthorpe
vfio_pci_try_bus_reset() is triggering a reset of the entire dev_set if
any device within it has accumulated a needs_reset. This reset can only be
done once all of the drivers operating the PCI devices to be reset are in
a known safe state.

Make this clearer by directly operating on the dev_set instead of the
vfio_pci_device. Rename the function to vfio_pci_dev_set_try_reset().

Use the device list inside the dev_set to check that all drivers are in a
safe state instead of working backwards from the pci_device.

The dev_set->lock directly prevents devices from joining/leaving the set,
or changing their state, which further implies the pci_device cannot
change drivers or that the vfio_device be freed, eliminating the need for
get/put's.

If a pci_device to be reset is not in the dev_set then the reset cannot be
used as we can't know what the state of that driver is. Directly measure
this by checking that every pci_device is in the dev_set - which
effectively proves that VFIO drivers are attached to everything.
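
The membership test described above can be sketched in plain C as follows.
This is only an illustration under assumed names (struct fake_dev,
dev_in_set(), reset_allowed()); the patch itself walks the dev_set's device
list and the PCI bus or slot rather than two arrays.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device; only its address is compared. */
struct fake_dev {
	int id;
};

/* Is this device one of the members of the set? */
static bool dev_in_set(struct fake_dev *const *set, size_t n_set,
		       const struct fake_dev *dev)
{
	for (size_t i = 0; i < n_set; i++)
		if (set[i] == dev)
			return true;
	return false;
}

/*
 * A reset is only allowed when every device that the reset would affect
 * is a member of the set, so no foreign driver is caught by surprise.
 */
static bool reset_allowed(struct fake_dev *const *set, size_t n_set,
			  struct fake_dev *const *affected, size_t n_affected)
{
	for (size_t i = 0; i < n_affected; i++)
		if (!dev_in_set(set, n_set, affected[i]))
			return false;
	return true;
}
```

A single affected device outside the set is enough to refuse the reset, which
is the conservative behavior the commit message asks for.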

Remove the odd interaction around vfio_pci_set_power_state() - have the
only caller avoid its redundant vfio_pci_set_power_state() instead of
avoiding it inside vfio_pci_dev_set_try_reset().

This restructuring corrects a call to pci_dev_driver() without holding the
device_lock() and removes a hard wiring to &vfio_pci_driver.

Signed-off-by: Jason Gunthorpe 
---
 drivers/vfio/pci/vfio_pci.c | 182 +---
 1 file changed, 86 insertions(+), 96 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 5d6db93d6c680f..0147f04c91b2fb 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -223,7 +223,7 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_device 
*vdev)
}
 }
 
-static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev);
+static bool vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set);
 static void vfio_pci_disable(struct vfio_pci_device *vdev);
 static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data);
 
@@ -404,6 +404,9 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
struct vfio_pci_ioeventfd *ioeventfd, *ioeventfd_tmp;
int i, bar;
 
+   /* For needs_reset */
+   lockdep_assert_held(&vdev->vdev.dev_set->lock);
+
/* Stop the device from further DMA */
pci_clear_master(pdev);
 
@@ -487,9 +490,7 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
 out:
pci_disable_device(pdev);
 
-   vfio_pci_try_bus_reset(vdev);
-
-   if (!disable_idle_d3)
+   if (!vfio_pci_dev_set_try_reset(vdev->vdev.dev_set) && !disable_idle_d3)
vfio_pci_set_power_state(vdev, PCI_D3hot);
 }
 
@@ -2145,36 +2146,6 @@ static struct pci_driver vfio_pci_driver = {
.err_handler= _err_handlers,
 };
 
-static int vfio_pci_get_unused_devs(struct pci_dev *pdev, void *data)
-{
-   struct vfio_devices *devs = data;
-   struct vfio_device *device;
-   struct vfio_pci_device *vdev;
-
-   if (devs->cur_index == devs->max_index)
-   return -ENOSPC;
-
-   device = vfio_device_get_from_dev(&pdev->dev);
-   if (!device)
-   return -EINVAL;
-
-   if (pci_dev_driver(pdev) != &vfio_pci_driver) {
-   vfio_device_put(device);
-   return -EBUSY;
-   }
-
-   vdev = container_of(device, struct vfio_pci_device, vdev);
-
-   /* Fault if the device is not unused */
-   if (device->open_count) {
-   vfio_device_put(device);
-   return -EBUSY;
-   }
-
-   devs->devices[devs->cur_index++] = vdev;
-   return 0;
-}
-
 static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data)
 {
struct vfio_devices *devs = data;
@@ -2208,79 +2179,98 @@ static int vfio_pci_try_zap_and_vma_lock_cb(struct 
pci_dev *pdev, void *data)
return 0;
 }
 
+static int vfio_pci_is_device_in_set(struct pci_dev *pdev, void *data)
+{
+   struct vfio_device_set *dev_set = data;
+   struct vfio_device *cur;
+
+   list_for_each_entry(cur, &dev_set->device_list, dev_set_list)
+   if (cur->dev == &pdev->dev)
+   return 0;
+   return -EBUSY;
+}
+
 /*
- * If a bus or slot reset is available for the provided device and:
+ * vfio-core considers a group to be viable and will create a vfio_device even
+ * if some devices are bound to drivers like pci-stub or pcieport. Here we
+ * require all PCI devices to be inside our dev_set since that ensures they 
stay
+ * put and that every driver controlling the device can co-ordinate with the
+ * device reset.
+ *
+ * Returns the pci_dev to pass to pci_reset_bus() if every PCI device to be
+ * reset is inside the dev_set, and pci_reset_bus() can succeed. NULL 
otherwise.
+ */
+static struct pci_dev *
+vfio_pci_dev_set_resettable(struct vfio_device_set *dev_set)
+{
+   struct pci_dev *pdev;
+
+   lockdep_assert_held(&dev_set->lock);
+

[Intel-gfx] [PATCH v4 02/14] vfio/mbochs: Fix missing error unwind of mbochs_used_mbytes

2021-08-05 Thread Jason Gunthorpe
Convert mbochs to use an atomic scheme for this like mtty was changed
into. The atomic fixes various race conditions with probing. Add the
missing error unwind. Also add the missing kfree of mdev_state->pages.
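
The reservation scheme described above can be illustrated in userspace with
C11 atomics. This is a sketch, not the patch: reserve_mbytes() and
release_mbytes() are hypothetical helpers, and
atomic_compare_exchange_weak() is used in place of the kernel's
atomic_try_cmpxchg(). Both update the expected value on failure, so the loop
re-checks the budget against the fresh value.

```c
#include <errno.h>
#include <stdatomic.h>

/* Remaining budget; must be initialized once before use. */
static atomic_int avail_mbytes;

/* Atomically take 'want' out of the budget, or fail without changing it. */
static int reserve_mbytes(int want)
{
	int avail = atomic_load(&avail_mbytes);

	do {
		if (avail < want)
			return -ENOSPC;
		/* On failure 'avail' is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&avail_mbytes, &avail,
					       avail - want));
	return 0;
}

/* Give a reservation back, e.g. on an error unwind path or on remove. */
static void release_mbytes(int amount)
{
	atomic_fetch_add(&avail_mbytes, amount);
}
```

Counting what is still available, rather than what is used, keeps the check
and the update in a single compare-and-exchange, so two concurrent probes
cannot both pass the limit check.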

Fixes: 681c1615f891 ("vfio/mbochs: Convert to use vfio_register_group_dev()")
Reported-by: Cornelia Huck 
Co-developed-by: Alex Williamson 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 samples/vfio-mdev/mbochs.c | 24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index e81b875b4d87b4..3e885be7d076ad 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -129,7 +129,7 @@ static dev_t mbochs_devt;
 static struct class*mbochs_class;
 static struct cdev mbochs_cdev;
 static struct device   mbochs_dev;
-static int mbochs_used_mbytes;
+static atomic_t mbochs_avail_mbytes;
 static const struct vfio_device_ops mbochs_dev_ops;
 
 struct vfio_region_info_ext {
@@ -507,18 +507,22 @@ static int mbochs_reset(struct mdev_state *mdev_state)
 
 static int mbochs_probe(struct mdev_device *mdev)
 {
+   int avail_mbytes = atomic_read(&mbochs_avail_mbytes);
const struct mbochs_type *type =
_types[mdev_get_type_group_id(mdev)];
struct device *dev = mdev_dev(mdev);
struct mdev_state *mdev_state;
int ret = -ENOMEM;
 
-   if (type->mbytes + mbochs_used_mbytes > max_mbytes)
-   return -ENOMEM;
+   do {
+   if (avail_mbytes < type->mbytes)
+   return -ENOSPC;
+   } while (!atomic_try_cmpxchg(&mbochs_avail_mbytes, &avail_mbytes,
+                                avail_mbytes - type->mbytes));
 
mdev_state = kzalloc(sizeof(struct mdev_state), GFP_KERNEL);
if (mdev_state == NULL)
-   return -ENOMEM;
+   goto err_avail;
vfio_init_group_dev(&mdev_state->vdev, &mdev->dev, &mbochs_dev_ops);
 
mdev_state->vconfig = kzalloc(MBOCHS_CONFIG_SPACE_SIZE, GFP_KERNEL);
@@ -549,17 +553,17 @@ static int mbochs_probe(struct mdev_device *mdev)
mbochs_create_config_space(mdev_state);
mbochs_reset(mdev_state);
 
-   mbochs_used_mbytes += type->mbytes;
-
ret = vfio_register_group_dev(&mdev_state->vdev);
if (ret)
goto err_mem;
dev_set_drvdata(&mdev->dev, mdev_state);
return 0;
-
 err_mem:
+   kfree(mdev_state->pages);
kfree(mdev_state->vconfig);
kfree(mdev_state);
+err_avail:
+   atomic_add(type->mbytes, &mbochs_avail_mbytes);
return ret;
 }
 
@@ -567,8 +571,8 @@ static void mbochs_remove(struct mdev_device *mdev)
 {
struct mdev_state *mdev_state = dev_get_drvdata(&mdev->dev);
 
-   mbochs_used_mbytes -= mdev_state->type->mbytes;
vfio_unregister_group_dev(&mdev_state->vdev);
+   atomic_add(mdev_state->type->mbytes, &mbochs_avail_mbytes);
kfree(mdev_state->pages);
kfree(mdev_state->vconfig);
kfree(mdev_state);
@@ -1351,7 +1355,7 @@ static ssize_t available_instances_show(struct mdev_type 
*mtype,
 {
const struct mbochs_type *type =
_types[mtype_get_type_group_id(mtype)];
-   int count = (max_mbytes - mbochs_used_mbytes) / type->mbytes;
+   int count = atomic_read(&mbochs_avail_mbytes) / type->mbytes;
 
return sprintf(buf, "%d\n", count);
 }
@@ -1433,6 +1437,8 @@ static int __init mbochs_dev_init(void)
 {
int ret = 0;
 
+   atomic_set(&mbochs_avail_mbytes, max_mbytes);
+
ret = alloc_chrdev_region(&mbochs_devt, 0, MINORMASK + 1, MBOCHS_NAME);
if (ret < 0) {
pr_err("Error: failed to register mbochs_dev, err: %d\n", ret);
-- 
2.32.0



[Intel-gfx] [PATCH v4 03/14] vfio: Introduce a vfio_uninit_group_dev() API call

2021-08-05 Thread Jason Gunthorpe
From: Max Gurtovoy 

This pairs with vfio_init_group_dev() and allows undoing any state that is
stored in the vfio_device unrelated to registration. Add appropriately
placed calls to all the drivers.

The following patch will use this to add pre-registration state for the
device set.

Signed-off-by: Max Gurtovoy 
Reviewed-by: Cornelia Huck 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 Documentation/driver-api/vfio.rst|  4 ++-
 drivers/vfio/fsl-mc/vfio_fsl_mc.c|  7 ++---
 drivers/vfio/mdev/vfio_mdev.c| 13 +++---
 drivers/vfio/pci/vfio_pci.c  |  6 +++--
 drivers/vfio/platform/vfio_platform_common.c |  7 +++--
 drivers/vfio/vfio.c  |  5 
 include/linux/vfio.h |  1 +
 samples/vfio-mdev/mbochs.c   |  2 ++
 samples/vfio-mdev/mdpy.c | 25 ++
 samples/vfio-mdev/mtty.c | 27 
 10 files changed, 64 insertions(+), 33 deletions(-)

diff --git a/Documentation/driver-api/vfio.rst 
b/Documentation/driver-api/vfio.rst
index 606eed8823ceab..c663b6f978255b 100644
--- a/Documentation/driver-api/vfio.rst
+++ b/Documentation/driver-api/vfio.rst
@@ -255,11 +255,13 @@ vfio_unregister_group_dev() respectively::
void vfio_init_group_dev(struct vfio_device *device,
struct device *dev,
const struct vfio_device_ops *ops);
+   void vfio_uninit_group_dev(struct vfio_device *device);
int vfio_register_group_dev(struct vfio_device *device);
void vfio_unregister_group_dev(struct vfio_device *device);
 
 The driver should embed the vfio_device in its own structure and call
-vfio_init_group_dev() to pre-configure it before going to registration.
+vfio_init_group_dev() to pre-configure it before going to registration
+and call vfio_uninit_group_dev() after completing the un-registration.
 vfio_register_group_dev() indicates to the core to begin tracking the
 iommu_group of the specified dev and register the dev as owned by a VFIO bus
 driver. Once vfio_register_group_dev() returns it is possible for userspace to
diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c 
b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 90cad109583b80..122997c61ba450 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -627,7 +627,7 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
 
ret = vfio_fsl_mc_reflck_attach(vdev);
if (ret)
-   goto out_kfree;
+   goto out_uninit;
 
ret = vfio_fsl_mc_init_device(vdev);
if (ret)
@@ -657,7 +657,8 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
vfio_fsl_uninit_device(vdev);
 out_reflck:
vfio_fsl_mc_reflck_put(vdev->reflck);
-out_kfree:
+out_uninit:
+   vfio_uninit_group_dev(&vdev->vdev);
kfree(vdev);
 out_group_put:
vfio_iommu_group_put(group, dev);
@@ -675,7 +676,7 @@ static int vfio_fsl_mc_remove(struct fsl_mc_device *mc_dev)
dprc_remove_devices(mc_dev, NULL, 0);
vfio_fsl_uninit_device(vdev);
vfio_fsl_mc_reflck_put(vdev->reflck);
-
+   vfio_uninit_group_dev(&vdev->vdev);
kfree(vdev);
vfio_iommu_group_put(mc_dev->dev.iommu_group, dev);
 
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 39ef7489fe4719..a5c77ccb24f70a 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -120,12 +120,16 @@ static int vfio_mdev_probe(struct mdev_device *mdev)
 
vfio_init_group_dev(vdev, >dev, _mdev_dev_ops);
ret = vfio_register_group_dev(vdev);
-   if (ret) {
-   kfree(vdev);
-   return ret;
-   }
+   if (ret)
+   goto out_uninit;
+
dev_set_drvdata(&mdev->dev, vdev);
return 0;
+
+out_uninit:
+   vfio_uninit_group_dev(vdev);
+   kfree(vdev);
+   return ret;
 }
 
 static void vfio_mdev_remove(struct mdev_device *mdev)
@@ -133,6 +137,7 @@ static void vfio_mdev_remove(struct mdev_device *mdev)
struct vfio_device *vdev = dev_get_drvdata(&mdev->dev);
 
vfio_unregister_group_dev(vdev);
+   vfio_uninit_group_dev(vdev);
kfree(vdev);
 }
 
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 318864d5283782..fab3715d60d4ba 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -2022,7 +2022,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const 
struct pci_device_id *id)
 
ret = vfio_pci_reflck_attach(vdev);
if (ret)
-   goto out_free;
+   goto out_uninit;
ret = vfio_pci_vf_init(vdev);
if (ret)
goto out_reflck;
@@ -2059,7 +2059,8 @@ static int vfio_pci_probe(struct pci_dev *pdev, const 
struct pci_device_id *id)
vfio_pci_vf_uninit(vdev);
 out_reflck:

[Intel-gfx] [PATCH v4 04/14] vfio: Provide better generic support for open/release vfio_device_ops

2021-08-05 Thread Jason Gunthorpe
Currently the driver ops have an open/release pair that is called once
each time a device FD is opened or closed. Add an additional set of
open/close_device() ops which are called when the device FD is opened for
the first time and closed for the last time.

An analysis shows that all of the drivers require this semantic. Some are
open coding it as part of their reflck implementation, and some are just
buggy and miss it completely.
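
The first-open/last-close semantic described above can be sketched in plain
user-space C. This is only an illustration under stated assumptions: the
sketch_* names and the struct are hypothetical and are not the kernel's
vfio_device or its ops, and the locking the real code needs around the
counter is left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for a device; not the kernel's struct vfio_device. */
struct sketch_device {
	unsigned int open_count; /* number of device FDs currently open */
	int hw_enabled;          /* state owned by open_device()/close_device() */
	int fail_open;           /* test knob: make open_device() fail */
};

/* Runs only for the first opener. */
static int sketch_open_device(struct sketch_device *dev)
{
	if (dev->fail_open)
		return -1;
	dev->hw_enabled = 1;
	return 0;
}

/* Runs only for the last closer. */
static void sketch_close_device(struct sketch_device *dev)
{
	dev->hw_enabled = 0;
}

/* Per-FD open: calls open_device() only on the 0 -> 1 transition. */
int sketch_fd_open(struct sketch_device *dev)
{
	if (dev->open_count == 0) {
		int ret = sketch_open_device(dev);

		if (ret)
			return ret; /* count stays at 0 on failure */
	}
	dev->open_count++;
	return 0;
}

/* Per-FD close: calls close_device() only on the 1 -> 0 transition. */
void sketch_fd_close(struct sketch_device *dev)
{
	if (dev->open_count == 0)
		return;
	if (--dev->open_count == 0)
		sketch_close_device(dev);
}
```

A second sketch_fd_open() on the same device only bumps the counter, and the
device state is torn down only when the last FD goes away.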

To retain the current semantics PCI and FSL depend on, introduce the idea
of a "device set" which is a grouping of vfio_device's that share the same
lock around opening.

The device set is established by providing a 'set_id' pointer. All
vfio_device's that provide the same pointer will be joined to the same
singleton memory and lock across the whole set. This effectively replaces
the oddly named reflck.

After conversion the set_id will be sourced from:
 - A struct device from a fsl_mc_device (fsl)
 - A struct pci_slot (pci)
 - A struct pci_bus (pci)
 - The struct vfio_device (everything)

The design ensures that the above pointers are live as long as the
vfio_device is registered, so they form reliable unique keys to group
vfio_devices into sets.

This implementation uses xarray instead of searching through the driver
core structures, which simplifies the somewhat tricky locking in this
area.

Following patches convert all the drivers.
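
As a rough illustration of the set_id grouping described above, the sketch
below keys a singleton object on an opaque pointer. It is an assumption-laden
user-space model: a linked list stands in for the xarray, there is no locking,
and every sketch_* name is hypothetical rather than part of the VFIO API.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical device set: one object per distinct set_id pointer. */
struct sketch_dev_set {
	const void *set_id;          /* unique key, compared by address */
	unsigned int device_count;   /* devices currently joined to the set */
	struct sketch_dev_set *next;
};

/* List head; stands in for the xarray used by the real patch. */
static struct sketch_dev_set *sketch_sets;

/* Join the set for set_id, creating it on first use. NULL on error. */
struct sketch_dev_set *sketch_assign_set(const void *set_id)
{
	struct sketch_dev_set *s;

	if (!set_id)
		return NULL;

	for (s = sketch_sets; s; s = s->next) {
		if (s->set_id == set_id) {
			s->device_count++;
			return s;
		}
	}

	s = calloc(1, sizeof(*s));
	if (!s)
		return NULL;
	s->set_id = set_id;
	s->device_count = 1;
	s->next = sketch_sets;
	sketch_sets = s;
	return s;
}

/* Leave the set; the object is freed when the last device leaves. */
void sketch_release_set(struct sketch_dev_set *set)
{
	struct sketch_dev_set **pp;

	if (!set || --set->device_count)
		return;

	for (pp = &sketch_sets; *pp; pp = &(*pp)->next) {
		if (*pp == set) {
			*pp = set->next;
			break;
		}
	}
	free(set);
}
```

Two devices that pass the same pointer end up sharing one object, while a
device that passes its own address gets a set of its own.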

Signed-off-by: Yishai Hadas 
Reviewed-by: Cornelia Huck 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 drivers/vfio/mdev/vfio_mdev.c |  26 +-
 drivers/vfio/vfio.c   | 149 +-
 include/linux/mdev.h  |   2 +
 include/linux/vfio.h  |  21 +
 4 files changed, 174 insertions(+), 24 deletions(-)

diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index a5c77ccb24f70a..e12196ffd48718 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -17,13 +17,33 @@
 
 #include "mdev_private.h"
 
+static int vfio_mdev_open_device(struct vfio_device *core_vdev)
+{
+   struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
+   struct mdev_parent *parent = mdev->type->parent;
+
+   if (unlikely(!parent->ops->open_device))
+   return 0;
+
+   return parent->ops->open_device(mdev);
+}
+
+static void vfio_mdev_close_device(struct vfio_device *core_vdev)
+{
+   struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
+   struct mdev_parent *parent = mdev->type->parent;
+
+   if (likely(parent->ops->close_device))
+   parent->ops->close_device(mdev);
+}
+
 static int vfio_mdev_open(struct vfio_device *core_vdev)
 {
struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
struct mdev_parent *parent = mdev->type->parent;
 
if (unlikely(!parent->ops->open))
-   return -EINVAL;
+   return 0;
 
return parent->ops->open(mdev);
 }
@@ -44,7 +64,7 @@ static long vfio_mdev_unlocked_ioctl(struct vfio_device 
*core_vdev,
struct mdev_parent *parent = mdev->type->parent;
 
if (unlikely(!parent->ops->ioctl))
-   return -EINVAL;
+   return 0;
 
return parent->ops->ioctl(mdev, cmd, arg);
 }
@@ -100,6 +120,8 @@ static void vfio_mdev_request(struct vfio_device 
*core_vdev, unsigned int count)
 
 static const struct vfio_device_ops vfio_mdev_dev_ops = {
.name   = "vfio-mdev",
+   .open_device= vfio_mdev_open_device,
+   .close_device   = vfio_mdev_close_device,
.open   = vfio_mdev_open,
.release= vfio_mdev_release,
.ioctl  = vfio_mdev_unlocked_ioctl,
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index cc375df0fd5dda..9cc17768c42554 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -96,6 +96,79 @@ module_param_named(enable_unsafe_noiommu_mode,
 MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE, no-IOMMU mode.  
This mode provides no device isolation, no DMA translation, no host kernel 
protection, cannot be used for device assignment to virtual machines, requires 
RAWIO permissions, and will taint the kernel.  If you do not know what this is 
for, step away. (default: false)");
 #endif
 
+static DEFINE_XARRAY(vfio_device_set_xa);
+
+int vfio_assign_device_set(struct vfio_device *device, void *set_id)
+{
+   unsigned long idx = (unsigned long)set_id;
+   struct vfio_device_set *new_dev_set;
+   struct vfio_device_set *dev_set;
+
+   if (WARN_ON(!set_id))
+   return -EINVAL;
+
+   /*
+* Atomically acquire a singleton object in the xarray for this set_id
+*/
+   xa_lock(&vfio_device_set_xa);
+   dev_set = xa_load(&vfio_device_set_xa, idx);
+   if (dev_set)
+   goto found_get_ref;
+   xa_unlock(&vfio_device_set_xa);
+
+   new_dev_set = kzalloc(sizeof(*new_dev_set), GFP_KERNEL);
+   if (!new_dev_set)
+   return 

[Intel-gfx] [PATCH v2] drm/i915/gvt: Fix cached atomics setting for Windows VM

2021-08-05 Thread Zhenyu Wang
We've seen a recent regression where running the host and a Windows VM
simultaneously causes a GPU hang or even a crash. It was finally bisected to
commit 58586680ffad ("drm/i915: Disable atomics in L3 for gen9"); the
difference in cached atomics behavior seems to have caused the regression.

This adds a new scratch register handler and adds those registers to the mmio
save/restore list for context switch. No GPU hang is produced with this fix.
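
For readers unfamiliar with the idea, a save/restore list across a context
switch can be modelled roughly as below. This is a hedged user-space sketch,
not the GVT implementation: the two offsets come from the comments in the
diff, while the fake register file, the shadow struct and all sketch_* names
are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_REGS 2

/* Register offsets as given in the patch comments (0xb11c, 0xb008). */
static const uint32_t sketch_reg_list[SKETCH_NUM_REGS] = { 0xb11c, 0xb008 };

/* Fake hardware register file, one slot per list entry. */
static uint32_t sketch_hw[SKETCH_NUM_REGS];

/* Per-owner shadow copy; the host and each VM would have one. */
struct sketch_owner {
	uint32_t shadow[SKETCH_NUM_REGS];
};

/* Map an offset to its list index, or -1 if the register is not tracked. */
static int sketch_reg_index(uint32_t offset)
{
	for (size_t i = 0; i < SKETCH_NUM_REGS; i++)
		if (sketch_reg_list[i] == offset)
			return (int)i;
	return -1;
}

/* Write a tracked register; returns -1 for untracked offsets. */
int sketch_hw_write(uint32_t offset, uint32_t val)
{
	int i = sketch_reg_index(offset);

	if (i < 0)
		return -1;
	sketch_hw[i] = val;
	return 0;
}

/* Read a tracked register; untracked offsets read as 0 in this model. */
uint32_t sketch_hw_read(uint32_t offset)
{
	int i = sketch_reg_index(offset);

	return i < 0 ? 0 : sketch_hw[i];
}

/* Context switch: save live values for prev, load shadows for next. */
void sketch_switch(struct sketch_owner *prev, struct sketch_owner *next)
{
	for (size_t i = 0; i < SKETCH_NUM_REGS; i++) {
		if (prev)
			prev->shadow[i] = sketch_hw[i];
		if (next)
			sketch_hw[i] = next->shadow[i];
	}
}
```

In this model a value the guest wrote to a listed register is parked in its
shadow while the host runs and is written back when the guest is switched in
again, so neither side sees the other's setting.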

Cc: sta...@vger.kernel.org # 5.12+
Cc: "Xu, Terrence" 
Cc: "Bloomfield, Jon" 
Cc: "Ekstrand, Jason" 
Fixes: 58586680ffad ("drm/i915: Disable atomics in L3 for gen9")
Signed-off-by: Zhenyu Wang 
---
 drivers/gpu/drm/i915/gvt/handlers.c | 1 +
 drivers/gpu/drm/i915/gvt/mmio_context.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/gvt/handlers.c 
b/drivers/gpu/drm/i915/gvt/handlers.c
index 06024d321a1a..cde0a477fb49 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -3149,6 +3149,7 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
MMIO_DFH(_MMIO(0xb100), D_BDW, F_CMD_ACCESS, NULL, NULL);
MMIO_DFH(_MMIO(0xb10c), D_BDW, F_CMD_ACCESS, NULL, NULL);
MMIO_D(_MMIO(0xb110), D_BDW);
+   MMIO_D(GEN9_SCRATCH_LNCF1, D_BDW_PLUS);
 
MMIO_F(_MMIO(0x24d0), 48, F_CMD_ACCESS | F_CMD_WRITE_PATCH, 0, 0,
D_BDW_PLUS, NULL, force_nonpriv_write);
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c 
b/drivers/gpu/drm/i915/gvt/mmio_context.c
index b8ac80765461..f776c470914d 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -105,6 +105,8 @@ static struct engine_mmio gen9_engine_mmio_list[] 
__cacheline_aligned = {
{RCS0, COMMON_SLICE_CHICKEN2, 0x, true}, /* 0x7014 */
{RCS0, GEN9_CS_DEBUG_MODE1, 0x, false}, /* 0x20ec */
{RCS0, GEN8_L3SQCREG4, 0, false}, /* 0xb118 */
+   {RCS0, GEN9_SCRATCH1, 0, false}, /* 0xb11c */
+   {RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0x, true}, /* 0xe100 */
{RCS0, HALF_SLICE_CHICKEN2, 0x, true}, /* 0xe180 */
{RCS0, HALF_SLICE_CHICKEN3, 0x, true}, /* 0xe184 */
-- 
2.32.0.rc2



Re: [Intel-gfx] [PATCH v3 09/14] vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set

2021-08-05 Thread Jason Gunthorpe
On Thu, Aug 05, 2021 at 11:33:11AM -0600, Alex Williamson wrote:
> > +static int vfio_pci_is_device_in_set(struct pci_dev *pdev, void *data)
> > +{
> > +   struct vfio_device_set *dev_set = data;
> > +   struct vfio_device *cur;
> > +
> > +   lockdep_assert_held(&dev_set->lock);
> > +
> > +   list_for_each_entry(cur, &dev_set->device_list, dev_set_list)
> > +   if (cur->dev == &pdev->dev)
> > +   return 0;
> > +   return -EBUSY;
> > +}
> > +
> > +static bool vfio_pci_dev_set_needs_reset(struct vfio_device_set *dev_set)
> 
> Slight nit on the name here since we're essentially combining
> needs_reset along with the notion of the device being unused.  I'm not
> sure, maybe "should_reset"?  Otherwise it looks ok.  Thanks,

What I did is add a new function vfio_pci_dev_set_resettable() which
pulls in three parts of logic that can be shared with the
VFIO_DEVICE_PCI_HOT_RESET change in the next patch. That leaves this
function as purely needs_reset.

In turn the VFIO_DEVICE_PCI_HOT_RESET patch gets the same treatment
where it becomes a dev_set centric API just like this.

I'll send it as a v4.

Thanks,
Jason


[Intel-gfx] ✗ Fi.CI.BAT: failure for Provide core infrastructure for managing open/release (rev9)

2021-08-05 Thread Patchwork
== Series Details ==

Series: Provide core infrastructure for managing open/release (rev9)
URL   : https://patchwork.freedesktop.org/series/92556/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10453 -> Patchwork_20778


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20778 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20778, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20778:

### IGT changes ###

 Possible regressions 

  * igt@i915_pm_rps@basic-api:
- fi-rkl-guc: [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10453/fi-rkl-guc/igt@i915_pm_...@basic-api.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-rkl-guc/igt@i915_pm_...@basic-api.html

  * igt@runner@aborted:
- fi-rkl-guc: NOTRUN -> [FAIL][3]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-rkl-guc/igt@run...@aborted.html

  
Known issues
---

  Here are the changes found in Patchwork_20778 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@cs-sdma:
- fi-kbl-7500u:   NOTRUN -> [SKIP][4] ([fdo#109271]) +30 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-7500u/igt@amdgpu/amd_ba...@cs-sdma.html

  * igt@gem_exec_fence@basic-busy@bcs0:
- fi-kbl-soraka:  NOTRUN -> [SKIP][5] ([fdo#109271]) +26 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-soraka/igt@gem_exec_fence@basic-b...@bcs0.html

  * igt@gem_huc_copy@huc-copy:
- fi-kbl-soraka:  NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#2190])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html
- fi-kbl-7500u:   NOTRUN -> [SKIP][7] ([fdo#109271] / [i915#2190])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-7500u/igt@gem_huc_c...@huc-copy.html

  * igt@i915_pm_rpm@basic-rte:
- fi-kbl-7500u:   NOTRUN -> [FAIL][8] ([i915#579])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-7500u/igt@i915_pm_...@basic-rte.html
- fi-kbl-soraka:  NOTRUN -> [FAIL][9] ([i915#579])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-soraka/igt@i915_pm_...@basic-rte.html

  * igt@i915_selftest@live@gt_pm:
- fi-kbl-soraka:  NOTRUN -> [DMESG-FAIL][10] ([i915#1886] / [i915#2291])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-kbl-soraka:  NOTRUN -> [SKIP][11] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-soraka/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
- fi-kbl-7500u:   NOTRUN -> [SKIP][12] ([fdo#109271] / [i915#533])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-7500u/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html
- fi-kbl-soraka:  NOTRUN -> [SKIP][13] ([fdo#109271] / [i915#533])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-soraka/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@hangcheck:
- {fi-hsw-gt1}:   [DMESG-WARN][14] ([i915#3303]) -> [PASS][15]
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10453/fi-hsw-gt1/igt@i915_selftest@l...@hangcheck.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-hsw-gt1/igt@i915_selftest@l...@hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2291]: https://gitlab.freedesktop.org/drm/intel/issues/2291
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#579]: https://gitlab.freedesktop.org/drm/intel/issues/579


Participating hosts (35 -> 34)
--

  Additional (2): fi-kbl-soraka fi-kbl-7500u 
  Missing(3): fi-bdw-samus fi-bsw-cyan bat-jsl-1 



[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Update small joiner ram size

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915: Update small joiner ram size
URL   : https://patchwork.freedesktop.org/series/93410/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10449_full -> Patchwork_20771_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues
---

  Here are the changes found in Patchwork_20771_full that come from known 
issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@preservation-s3@rcs0:
- shard-skl:  [PASS][1] -> [INCOMPLETE][2] ([i915#198])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-skl1/igt@gem_ctx_isolation@preservation...@rcs0.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-skl8/igt@gem_ctx_isolation@preservation...@rcs0.html

  * igt@gem_ctx_persistence@legacy-engines-hostile-preempt:
- shard-snb:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-snb7/igt@gem_ctx_persiste...@legacy-engines-hostile-preempt.html

  * igt@gem_eio@in-flight-contexts-10ms:
- shard-tglb: [PASS][4] -> [TIMEOUT][5] ([i915#3063])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-tglb1/igt@gem_...@in-flight-contexts-10ms.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-tglb2/igt@gem_...@in-flight-contexts-10ms.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][6] -> [TIMEOUT][7] ([i915#2369] / [i915#3063] 
/ [i915#3648])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-tglb5/igt@gem_...@unwedge-stress.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-tglb6/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-none@vecs0:
- shard-apl:  NOTRUN -> [FAIL][8] ([i915#2842] / [i915#3468])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-apl6/igt@gem_exec_fair@basic-n...@vecs0.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
- shard-tglb: [PASS][9] -> [FAIL][10] ([i915#2842])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-tglb2/igt@gem_exec_fair@basic-pace-sh...@rcs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-tglb7/igt@gem_exec_fair@basic-pace-sh...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-kbl:  [PASS][11] -> [SKIP][12] ([fdo#109271])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-kbl6/igt@gem_exec_fair@basic-p...@vcs1.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-kbl3/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][13] -> [FAIL][14] ([i915#2842]) +1 similar 
issue
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-glk7/igt@gem_exec_fair@basic-throt...@rcs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-glk3/igt@gem_exec_fair@basic-throt...@rcs0.html
- shard-iclb: [PASS][15] -> [FAIL][16] ([i915#2842])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-iclb1/igt@gem_exec_fair@basic-throt...@rcs0.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-iclb5/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_render_copy@linear-to-vebox-y-tiled:
- shard-apl:  NOTRUN -> [SKIP][17] ([fdo#109271]) +223 similar 
issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-apl6/igt@gem_render_c...@linear-to-vebox-y-tiled.html

  * igt@gem_userptr_blits@input-checking:
- shard-apl:  NOTRUN -> [DMESG-WARN][18] ([i915#3002])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-apl2/igt@gem_userptr_bl...@input-checking.html

  * igt@gem_userptr_blits@unsync-unmap-cycles:
- shard-iclb: NOTRUN -> [SKIP][19] ([i915#3297])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-iclb7/igt@gem_userptr_bl...@unsync-unmap-cycles.html

  * igt@gen7_exec_parse@oacontrol-tracking:
- shard-iclb: NOTRUN -> [SKIP][20] ([fdo#109289])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-iclb7/igt@gen7_exec_pa...@oacontrol-tracking.html

  * igt@i915_pm_dc@dc6-psr:
- shard-tglb: NOTRUN -> [FAIL][21] ([i915#454])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-tglb7/igt@i915_pm...@dc6-psr.html

  * igt@i915_pm_rpm@basic-rte:
- shard-apl:  NOTRUN -> [FAIL][22] ([i915#579])
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-apl6/igt@i915_pm_...@basic-rte.html

  * igt@i915_pm_rpm@gem-idle:
- shard-tglb: NOTRUN -> [SKIP][23] ([i915#579])
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-tglb7/igt@i915_pm_...@gem-idle.html

  * 

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Provide core infrastructure for managing open/release (rev9)

2021-08-05 Thread Patchwork
== Series Details ==

Series: Provide core infrastructure for managing open/release (rev9)
URL   : https://patchwork.freedesktop.org/series/92556/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
5d99eb3c1b1c vfio/samples: Remove module get/put
-:57: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 31 lines checked
fb9d431ac9b4 vfio/mbochs: Fix missing error unwind of mbochs_used_mbytes
-:12: WARNING:BAD_SIGN_OFF: Co-developed-by: must be immediately followed by 
Signed-off-by:
#12: 
Co-developed-by: Alex Williamson 
Reviewed-by: Christoph Hellwig 
-:103: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 2 warnings, 0 checks, 78 lines checked
14a1353260e5 vfio: Introduce a vfio_uninit_group_dev() API call
2085f40d11f9 vfio: Provide better generic support for open/release 
vfio_device_ops
-:260: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided
#260: FILE: drivers/vfio/vfio.c:1483:
+   fdno = ret = get_unused_fd_flags(O_CLOEXEC);

-:358: CHECK:UNCOMMENTED_DEFINITION: struct mutex definition without comment
#358: FILE: include/linux/vfio.h:25:
+   struct mutex lock;

-:402: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 2 checks, 325 lines checked
abe0ffd61cfa vfio/samples: Delete useless open/close
-:98: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 66 lines checked
5d682856d46f vfio/fsl: Move to the device set infrastructure
-:300: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 256 lines checked
d06f6c571ebe vfio/platform: Use open_device() instead of open coding a refcnt 
scheme
-:51: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#51: FILE: drivers/vfio/platform/vfio_platform_common.c:230:
+   dev_warn(

-:105: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#105: FILE: drivers/vfio/platform/vfio_platform_common.c:261:
+   dev_warn(

-:149: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 2 checks, 120 lines checked
aaf4591d0469 vfio/pci: Move to the device set infrastructure
1d22f1ed155e vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set
-:276: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 231 lines checked
1309c4bd1733 vfio/pci: Reorganize VFIO_DEVICE_PCI_HOT_RESET to use the device 
set
-:21: WARNING:BAD_SIGN_OFF: Non-standard signature: Reviewed-off-by:
#21: 
Reviewed-off-by: Christoph Hellwig 

-:309: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 2 warnings, 0 checks, 274 lines checked
c5854ee895cf vfio/mbochs: Fix close when multiple device FDs are open
-:37: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 16 lines checked
486294d2c582 vfio/ap, ccw: Fix open/close when multiple device FDs are open
-:84: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 52 lines checked
b75c4b452009 vfio/gvt: Fix open/close when multiple device FDs are open
-:52: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 26 lines checked
16a862df1e07 vfio: Remove struct vfio_device_ops open/release
-:143: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 107 lines checked




Re: [Intel-gfx] [PATCH v5 02/20] drm/msm: Fix drm/sched point of no return rules

2021-08-05 Thread Rob Clark
On Thu, Aug 5, 2021 at 3:47 AM Daniel Vetter  wrote:
>
> Originally drm_sched_job_init was the point of no return, after which
> drivers must submit a job. I've split that up, which allows us to fix
> this issue pretty easily.
>
> Only thing we have to take care of is to not skip to error paths after
> that. Other drivers do this the same for out-fence and similar things.
>
> Fixes: 1d8a5ca436ee ("drm/msm: Conversion to drm scheduler")
> Cc: Rob Clark 
> Cc: Rob Clark 
> Cc: Sean Paul 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: linux-arm-...@vger.kernel.org
> Cc: dri-de...@lists.freedesktop.org
> Cc: freedr...@lists.freedesktop.org
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/msm/msm_gem_submit.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> b/drivers/gpu/drm/msm/msm_gem_submit.c
> index 6d6c44f0e1f3..d0ed4ddc509e 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -52,9 +52,6 @@ static struct msm_gem_submit *submit_create(struct 
> drm_device *dev,
> return ERR_PTR(ret);
> }
>
> -   /* FIXME: this is way too early */
> -   drm_sched_job_arm(&submit->base);
> -
> xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
>
> kref_init(&submit->ref);
> @@ -883,6 +880,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
> *data,
>
> submit->user_fence = dma_fence_get(&submit->base.s_fence->finished);
>
> +   /* point of no return, we _have_ to submit no matter what */
> +   drm_sched_job_arm(&submit->base);
> +
> /*
>  * Allocate an id which can be used by WAIT_FENCE ioctl to map back
>  * to the underlying fence.
> @@ -892,17 +892,16 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
> *data,
> if (submit->fence_id < 0) {
> ret = submit->fence_id = 0;
> submit->fence_id = 0;
> -   goto out;
> }
>
> -   if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> +   if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> struct sync_file *sync_file = 
> sync_file_create(submit->user_fence);
> if (!sync_file) {
> ret = -ENOMEM;
> -   goto out;
> +   } else {
> +   fd_install(out_fence_fd, sync_file->file);
> +   args->fence_fd = out_fence_fd;
> }
> -   fd_install(out_fence_fd, sync_file->file);
> -   args->fence_fd = out_fence_fd;

I wonder if instead we should (approximately) undo "drm/msm/submit:
Simplify out-fence-fd handling" so that the point that it could fail
is moved up ahead of the drm_sched_job_arm()?

Also, does the dma_fence_get() work before drm_sched_job_arm()?  From
a quick look, it looks like it won't, but I'm still playing catchup
and haven't had a chance to look at your entire series.  If it doesn't
work before drm_sched_job_arm(), then there is really no way to
prevent an error path between the fence-init and job-submit.

But, prior to your series, wouldn't a failure after
drm_sched_job_init() but before the job is submitted just burn a
fence-id, and otherwise carry on its merry way?
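
To make the ordering idea concrete, here is a rough user-space sketch of
moving the fallible step ahead of the point of no return. All names are
hypothetical and nothing here is the msm or drm_sched API; in particular it
does not answer whether the fence can actually be taken before arming, which
is the open question above.

```c
#include <stddef.h>

/* Hypothetical job state used only for this illustration. */
struct sketch_job {
	int armed;     /* set at the point of no return */
	int submitted; /* set once the job is queued */
	int out_fd;    /* -1 when no out-fence FD was requested */
};

/* Hypothetical fallible reservation; fail != 0 simulates exhaustion. */
static int sketch_reserve_fd(int fail)
{
	return fail ? -1 : 3;
}

/* Returns 0 on success, -1 if a step before arming failed. */
int sketch_submit(struct sketch_job *job, int want_fd, int fail_fd)
{
	int fd = -1;

	/* Fallible work first: an error here can still unwind cleanly. */
	if (want_fd) {
		fd = sketch_reserve_fd(fail_fd);
		if (fd < 0)
			return -1;
	}

	/* Point of no return: nothing below may skip the submit. */
	job->armed = 1;

	/* Only steps that cannot fail remain. */
	job->out_fd = fd;
	job->submitted = 1;
	return 0;
}
```

The invariant the sketch keeps is that an armed job is always submitted,
because every early return sits above the arming step.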

BR,
-R

> }
>
> submit_attach_object_fences(submit);
> --
> 2.32.0
>


Re: [Intel-gfx] [PATCH 4/4] DO_NOT_MERGE: drm/i915/display: Enable PSR2 selective fetch by default

2021-08-05 Thread Souza, Jose
On Thu, 2021-08-05 at 21:26 +0300, Gwan-gyeong Mun wrote:
> 
> On 8/3/21 8:18 PM, Souza, Jose wrote:
> > On Tue, 2021-08-03 at 14:17 +0300, Gwan-gyeong Mun wrote:
> > > 
> > > On 7/31/21 3:10 AM, José Roberto de Souza wrote:
> > > > Only to execute tests with PSR2 selective fetch enabled and check what
> > > > is broken.
> > > > 
> > > > IGT tests known to fail with this:
> > > > - kms_cursor_legacy: all tests that check whether evasion happened; I have
> > > > a fix for it that makes cursor_slowpath() return true for display 12+.
> > > > 
> > > > - kms_psr2_su: The pageflip test, it needs to have the damage clip set
> > > > otherwise it will update the whole screen and the selective blocks
> > > > will not match with expected.
> > > > 
> > > kms_psr2_su is a test case for intel PSR2 HW tracking and kms_psr2_sf is
> > > used as a test for intel PSR2 manual tracking. Is it necessary to modify
> > > kms_psr2_su for testing PSR2 manual tracking?
> > 
> > kms_psr2_su is to test that PSR2 is sending selective updates, just adding 
> > a couple of lines we can make it work with selective fetch.
> > 
> > > > - kms_psr: psr2_*_(mmap_gtt, mmap_cpu, blt and render), all those
> > > > tests should be dropped or skipped for display 12+.
> > > > 
> > > Could you explain in more detail why we need to skip on display 12+?
> > 
> > These are things that would end up calling intel_psr_invalidate/flush().
> > 
> 
> Thanks for the explanation.
> And there is an issue confirmed in local tests, so I leave additional 
> comments.
> > > 
> > > > Signed-off-by: José Roberto de Souza 
> > > > ---
> > > >drivers/gpu/drm/i915/display/intel_psr.c | 9 -
> > > >drivers/gpu/drm/i915/i915_params.h   | 2 +-
> > > >2 files changed, 1 insertion(+), 10 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
> > > > b/drivers/gpu/drm/i915/display/intel_psr.c
> > > > index 894a2d35668a2..e128f0c2aeecc 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_psr.c
> > > > +++ b/drivers/gpu/drm/i915/display/intel_psr.c
> > > > @@ -877,15 +877,6 @@ static bool intel_psr2_config_valid(struct 
> > > > intel_dp *intel_dp,
> > > >return false;
> > > >}
> > > > 
> > > > -/*
> > > > - * We are missing the implementation of some workarounds to enabled 
> > > > PSR2
> > > > - * in Alderlake_P, until ready PSR2 should be kept disabled.
> > > > - */
> > > > -if (IS_ALDERLAKE_P(dev_priv)) {
> > > > -drm_dbg_kms(&dev_priv->drm, "PSR2 is missing the implementation of 
> > > > workarounds\n");
> > > > -return false;
> > > > -}
> > > > -
> > > >if (!transcoder_has_psr2(dev_priv, crtc_state->cpu_transcoder)) {
> > > >drm_dbg_kms(&dev_priv->drm,
> > > >"PSR2 not supported in transcoder %s\n",
> > > > diff --git a/drivers/gpu/drm/i915/i915_params.h 
> > > > b/drivers/gpu/drm/i915/i915_params.h
> > > > index f27eceb82c0f5..8d725b64592d8 100644
> > > > --- a/drivers/gpu/drm/i915/i915_params.h
> > > > +++ b/drivers/gpu/drm/i915/i915_params.h
> > > > @@ -55,7 +55,7 @@ struct drm_printer;
> > > >param(int, enable_fbc, -1, 0600) \
> > > >param(int, enable_psr, -1, 0600) \
> > > >param(bool, psr_safest_params, false, 0400) \
> > > > -param(bool, enable_psr2_sel_fetch, false, 0400) \
> > > > +param(bool, enable_psr2_sel_fetch, true, 0400) \
> If we do not modify this part and do not enable it by default at boot 
> time as shown in the original code below,
> param(bool, enable_psr2_sel_fetch, false, 0400) \
> 
> when we execute the kms_psr2_sf test case of igt, the FIFO underrun as 
> below still occurs.
> 
> i915 :00:02.0: [drm] *ERROR* CPU pipe A FIFO underrun: port,transcoder,
> 
> When PSR2 panel is used, PSR1 is enabled by default when 
> enable_psr2_sel_fetch is not enabled by default.
> When kms_psr2_sf is executed, the mode is changed to PSR2, and when
> kms_psr2_sf terminates, PSR2 is deactivated and PSR1 is re-enabled.
> I suspect the problem occurs at this point.

Was able to reproduce this even with enable_psr2_sel_fetch set to true.
Added some debug messages to intel_psr_exit() and intel_psr_activate() and 
those functions are not called and the underrun still happens.

This could be a recently introduced regression, because I was not seeing this
underrun a few weeks ago.
Anyway, this underrun happens both with these patches and without them (applying
only the changes that allow PSR2 on Alderlake-P in intel_psr2_config_valid()).

> 
> > > >param(int, disable_power_well, -1, 0400) \
> > > >param(int, enable_ips, 1, 0600) \
> > > >param(int, invert_brightness, 0, 0600) \
> > > > 
> > 



Re: [Intel-gfx] [PATCH v5 07/20] drm/panfrost: use scheduler dependency tracking

2021-08-05 Thread Alyssa Rosenzweig
Acked-by: Alyssa Rosenzweig 

On Thu, Aug 05, 2021 at 12:46:52PM +0200, Daniel Vetter wrote:
> Just deletes some code that's now more shared.
> 
> Note that thanks to the split into drm_sched_job_init/arm we can now
> easily pull the _init() part from under the submission lock way ahead
> where we're adding the sync file in-fences as dependencies.
> 
> v2: Correctly clean up the partially set up job, now that job_init()
> and job_arm() are apart (Emma).
> 
> v3: Rebased over renamed functions for adding dependencies
> 
> Acked-by: Emma Anholt 
> Reviewed-by: Steven Price  (v3)
> Signed-off-by: Daniel Vetter 
> Cc: Rob Herring 
> Cc: Tomeu Vizoso 
> Cc: Steven Price 
> Cc: Alyssa Rosenzweig 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org
> Cc: Emma Anholt 
> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 16 ---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 38 -
>  drivers/gpu/drm/panfrost/panfrost_job.h |  5 +---
>  3 files changed, 18 insertions(+), 41 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 1ffaef5ec5ff..16212b6b202e 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -218,7 +218,7 @@ panfrost_copy_in_sync(struct drm_device *dev,
>   if (ret)
>   goto fail;
>  
> - ret = drm_gem_fence_array_add(&job->deps, fence);
> + ret = drm_sched_job_add_dependency(&job->base, fence);
>  
>   if (ret)
>   goto fail;
> @@ -236,7 +236,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
> void *data,
>   struct drm_panfrost_submit *args = data;
>   struct drm_syncobj *sync_out = NULL;
>   struct panfrost_job *job;
> - int ret = 0;
> + int ret = 0, slot;
>  
>   if (!args->jc)
>   return -EINVAL;
> @@ -258,14 +258,20 @@ static int panfrost_ioctl_submit(struct drm_device 
> *dev, void *data,
>  
>   kref_init(&job->refcount);
>  
> - xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
> -
>   job->pfdev = pfdev;
>   job->jc = args->jc;
>   job->requirements = args->requirements;
>   job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
>   job->file_priv = file->driver_priv;
>  
> + slot = panfrost_job_get_slot(job);
> +
> + ret = drm_sched_job_init(&job->base,
> +  &job->file_priv->sched_entity[slot],
> +  NULL);
> + if (ret)
> + goto fail_job_put;
> +
>   ret = panfrost_copy_in_sync(dev, file, args, job);
>   if (ret)
>   goto fail_job;
> @@ -283,6 +289,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
> void *data,
>   drm_syncobj_replace_fence(sync_out, job->render_done_fence);
>  
>  fail_job:
> + drm_sched_job_cleanup(&job->base);
> +fail_job_put:
>   panfrost_job_put(job);
>  fail_out_sync:
>   if (sync_out)
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 4bc962763e1f..a98f507dc779 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -102,7 +102,7 @@ static struct dma_fence *panfrost_fence_create(struct 
> panfrost_device *pfdev, in
>   return &fence->base;
>  }
>  
> -static int panfrost_job_get_slot(struct panfrost_job *job)
> +int panfrost_job_get_slot(struct panfrost_job *job)
>  {
>   /* JS0: fragment jobs.
>* JS1: vertex/tiler jobs
> @@ -242,13 +242,14 @@ static void panfrost_job_hw_submit(struct panfrost_job 
> *job, int js)
>  
>  static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
> int bo_count,
> -   struct xarray *deps)
> +   struct drm_sched_job *job)
>  {
>   int i, ret;
>  
>   for (i = 0; i < bo_count; i++) {
>   /* panfrost always uses write mode in its current uapi */
> - ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
> + ret = drm_sched_job_add_implicit_dependencies(job, bos[i],
> +   true);
>   if (ret)
>   return ret;
>   }
> @@ -269,31 +270,21 @@ static void panfrost_attach_object_fences(struct 
> drm_gem_object **bos,
>  int panfrost_job_push(struct panfrost_job *job)
>  {
>   struct panfrost_device *pfdev = job->pfdev;
> - int slot = panfrost_job_get_slot(job);
> - struct drm_sched_entity *entity = &job->file_priv->sched_entity[slot];
>   struct ww_acquire_ctx acquire_ctx;
>   int ret = 0;
>  
> -
>   ret = drm_gem_lock_reservations(job->bos, job->bo_count,
>   &acquire_ctx);
>   if (ret)
>   return ret;
>  
>   

Re: [Intel-gfx] [PATCH] drm/aperture: Pass DRM driver structure instead of driver name

2021-08-05 Thread Dmitry Baryshkov

On 29/06/2021 16:58, Thomas Zimmermann wrote:

Print the name of the DRM driver when taking over fbdev devices. Makes
the output to dmesg more consistent. Note that the driver name is only
used for printing a string to the kernel log. No UAPI is affected by this
change.
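
A minimal userspace sketch of the interface change described above, assuming a stripped-down stand-in for the driver structure that carries only a name. The names `example_drm_driver` and `example_remove_conflicting_framebuffers`, and the log format, are hypothetical and not the kernel's; the sketch only illustrates that the callee can derive the printed name from the structure it is handed.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the DRM driver structure; only the name matters. */
struct example_drm_driver {
	const char *name;
};

/*
 * Hypothetical helper: takes the requesting driver instead of a string and
 * uses req_driver->name purely to build the log message.
 * Returns 0 on success, -1 if no usable driver was passed.
 */
int example_remove_conflicting_framebuffers(unsigned long long base,
					    unsigned long long size,
					    const struct example_drm_driver *req_driver,
					    char *log, size_t log_len)
{
	if (!req_driver || !req_driver->name)
		return -1;

	snprintf(log, log_len,
		 "%s: removing conflicting framebuffers at 0x%llx+0x%llx",
		 req_driver->name, base, size);
	return 0;
}
```

Because the name is read from the structure, every caller reports itself consistently without maintaining a separate string.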

Signed-off-by: Thomas Zimmermann 
---


[...]


  drivers/gpu/drm/msm/msm_fbdev.c   |  2 +-


Reviewed-by: Dmitry Baryshkov 


  drivers/gpu/drm/nouveau/nouveau_drm.c |  2 +-
  drivers/gpu/drm/qxl/qxl_drv.c |  2 +-
  drivers/gpu/drm/radeon/radeon_drv.c   |  2 +-
  drivers/gpu/drm/rockchip/rockchip_drm_drv.c   |  2 +-
  drivers/gpu/drm/sun4i/sun4i_drv.c |  2 +-
  drivers/gpu/drm/tegra/drm.c   |  2 +-
  drivers/gpu/drm/tiny/cirrus.c |  2 +-
  drivers/gpu/drm/vboxvideo/vbox_drv.c  |  2 +-
  drivers/gpu/drm/vc4/vc4_drv.c |  2 +-
  drivers/gpu/drm/virtio/virtgpu_drv.c  |  2 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c   |  2 +-
  include/drm/drm_aperture.h| 14 +-
  23 files changed, 43 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6f30c525caac..accf9c1b967a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1278,7 +1278,7 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
  #endif
  
  	/* Get rid of things like offb */

-   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, 
"amdgpudrmfb");
+   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, 
&amdgpu_kms_driver);
if (ret)
return ret;
  
diff --git a/drivers/gpu/drm/armada/armada_drv.c b/drivers/gpu/drm/armada/armada_drv.c

index dab0a1f0983b..31925ae3ab72 100644
--- a/drivers/gpu/drm/armada/armada_drv.c
+++ b/drivers/gpu/drm/armada/armada_drv.c
@@ -95,7 +95,7 @@ static int armada_drm_bind(struct device *dev)
}
  
  	/* Remove early framebuffers */

-   ret = drm_aperture_remove_framebuffers(false, "armada-drm-fb");
+   ret = drm_aperture_remove_framebuffers(false, &armada_drm_driver);
if (ret) {
dev_err(dev, "[" DRM_NAME ":%s] can't kick out simple-fb: %d\n",
__func__, ret);
diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c
index 5aa452b4efe6..86d5cd7b6318 100644
--- a/drivers/gpu/drm/ast/ast_drv.c
+++ b/drivers/gpu/drm/ast/ast_drv.c
@@ -100,7 +100,7 @@ static int ast_remove_conflicting_framebuffers(struct 
pci_dev *pdev)
primary = pdev->resource[PCI_ROM_RESOURCE].flags & 
IORESOURCE_ROM_SHADOW;
  #endif
  
-	return drm_aperture_remove_conflicting_framebuffers(base, size, primary, "astdrmfb");

+   return drm_aperture_remove_conflicting_framebuffers(base, size, primary, 
&ast_driver);
  }
  
  static int ast_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)

diff --git a/drivers/gpu/drm/bochs/bochs_drv.c 
b/drivers/gpu/drm/bochs/bochs_drv.c
index c828cadbabff..0d232b44ecd7 100644
--- a/drivers/gpu/drm/bochs/bochs_drv.c
+++ b/drivers/gpu/drm/bochs/bochs_drv.c
@@ -110,7 +110,7 @@ static int bochs_pci_probe(struct pci_dev *pdev,
return -ENOMEM;
}
  
-	ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "bochsdrmfb");

+   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, 
&bochs_driver);
if (ret)
return ret;
  
diff --git a/drivers/gpu/drm/drm_aperture.c b/drivers/gpu/drm/drm_aperture.c

index 9335d9d6cf9a..9ac39cf11694 100644
--- a/drivers/gpu/drm/drm_aperture.c
+++ b/drivers/gpu/drm/drm_aperture.c
@@ -33,6 +33,10 @@
   *
   * .. code-block:: c
   *
+ * static const struct drm_driver example_driver = {
+ * ...
+ * };
+ *
   *static int remove_conflicting_framebuffers(struct pci_dev *pdev)
   *{
   *bool primary = false;
@@ -46,7 +50,7 @@
   *#endif
   *
   *return drm_aperture_remove_conflicting_framebuffers(base, size, 
primary,
- * "example 
driver");
+ * 
&example_driver);
   *}
   *
   *static int probe(struct pci_dev *pdev)
@@ -274,7 +278,7 @@ static void drm_aperture_detach_drivers(resource_size_t 
base, resource_size_t si
   * @base: the aperture's base address in physical memory
   * @size: aperture size in bytes
   * @primary: also kick vga16fb if present
- * @name: requesting driver name
+ * @req_driver: requesting DRM driver
   *
   * This function removes graphics device drivers which use memory range 
described by
   * @base and @size.
@@ -283,7 +287,7 @@ static void drm_aperture_detach_drivers(resource_size_t 
base, resource_size_t si
   * 0 on success, or a negative errno code otherwise
   */
  int drm_aperture_remove_conflicting_framebuffers(resource_size_t base, 
resource_size_t size,
-

[Intel-gfx] ✗ Fi.CI.BAT: failure for remove rcu support from i915_address_space (rev5)

2021-08-05 Thread Patchwork
== Series Details ==

Series: remove rcu support from i915_address_space (rev5)
URL   : https://patchwork.freedesktop.org/series/93314/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10451 -> Patchwork_20777


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20777 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20777, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20777:

### IGT changes ###

 Possible regressions 

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  
Known issues


  Here are the changes found in Patchwork_20777 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_selftest@live@execlists:
- fi-bsw-nick:[PASS][3] -> [INCOMPLETE][4] ([i915#2940])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/fi-bsw-nick/igt@i915_selftest@l...@execlists.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/fi-bsw-nick/igt@i915_selftest@l...@execlists.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-kbl-7500u:   [PASS][5] -> [FAIL][6] ([i915#1372])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html

  * igt@runner@aborted:
- fi-bsw-nick:NOTRUN -> [FAIL][7] ([fdo#109271] / [i915#1436])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/fi-bsw-nick/igt@run...@aborted.html

  
 Possible fixes 

  * igt@kms_chamelium@hdmi-hpd-fast:
- fi-icl-u2:  [DMESG-WARN][8] ([i915#2203] / [i915#2868]) -> 
[PASS][9]
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1372]: https://gitlab.freedesktop.org/drm/intel/issues/1372
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
  [i915#2868]: https://gitlab.freedesktop.org/drm/intel/issues/2868
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940


Participating hosts (40 -> 35)
--

  Missing(5): fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan bat-jsl-1 fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10451 -> Patchwork_20777

  CI-20190529: 20190529
  CI_DRM_10451: 3bea0ad83735904d380d83bcca30557268acf887 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6160: 4287344dd6a39d9036c5fb9a047a7d8f10bee981 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20777: 57adc91192e34f34d12cce813f1991033826e70c @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

57adc91192e3 drm/i915: Stop rcu support for i915_address_space
89f791357ea9 drm/i915: use xa_lock/unlock for fpriv->vm_xa lookups
04ad005ea013 drm/i915: Drop __rcu from gem_context->vm
ba6c948d717d drm/i915: Use i915_gem_context_get_eb_vm in intel_context_set_gem
b2e1515de24d drm/i915: Add i915_gem_context_is_full_ppgtt
07bcc7e00033 drm/i915: Use i915_gem_context_get_eb_vm in ctx_getparam
d23b18f97bd9 drm/i915: Rename i915_gem_context_get_vm_rcu to 
i915_gem_context_get_eb_vm
0b21f453cdfb drm/i915: Drop code to handle set-vm races from execbuf

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/index.html


Re: [Intel-gfx] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init

2021-08-05 Thread Christian König

Am 05.08.21 um 16:07 schrieb Daniel Vetter:

On Thu, Aug 5, 2021 at 3:44 PM Christian König  wrote:

Am 05.08.21 um 12:46 schrieb Daniel Vetter:

This is a very confusingly named function, because not just does it
init an object, it arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
to be moved into drm_sched_job_arm, which made me realize that the
job->id definitely needs to be moved too.

Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.
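
The init/arm split and the cleanup requirement from the changelog above can be sketched in userspace C as follows. This is a toy model under stated assumptions: `toy_job`, `toy_job_init`, `toy_job_arm` and `toy_job_cleanup` are hypothetical names, and the single heap allocation merely stands in for whatever the real scheduler allocates. In this toy version both cleanup paths free the same allocation; the return value only records which path was taken.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical toy job; not the kernel's struct drm_sched_job. */
struct toy_job {
	int *fence;        /* allocated by init, stands in for scheduler state */
	bool armed;        /* set by arm: the point of no return */
	unsigned long id;  /* assigned only when arming */
};

unsigned long toy_next_id = 1;

/* Step 1: allocate resources. May fail, and may be undone by cleanup. */
int toy_job_init(struct toy_job *job)
{
	job->fence = malloc(sizeof(*job->fence));
	if (!job->fence)
		return -1;
	*job->fence = 0;
	job->armed = false;
	job->id = 0;
	return 0;
}

/* Step 2: cannot fail. After this the job is expected to be pushed. */
void toy_job_arm(struct toy_job *job)
{
	job->armed = true;
	job->id = toy_next_id++;
}

/*
 * Safe to call both before and after arm.
 * Returns true if it tore down a job that was never armed.
 */
bool toy_job_cleanup(struct toy_job *job)
{
	bool was_unarmed = !job->armed;

	free(job->fence);
	job->fence = NULL;
	job->armed = false;
	return was_unarmed;
}
```

A driver following this shape can call the init step early, collect dependencies, bail out through cleanup on any error, and arm only once nothing else can fail.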

Acked-by: Melissa Wen 
Cc: Melissa Wen 
Acked-by: Emma Anholt 
Acked-by: Steven Price  (v2)
Reviewed-by: Boris Brezillon  (v5)
Signed-off-by: Daniel Vetter 

At least the amdgpu parts look OK offhand, but I don't think I can judge
the rest.

The thing that really scares me here and that I got wrong a few times
is the cleanup for drm_sched_job at the various points. Can you give
those parts in drm/scheduler/ a full review pls, just to make sure? I
can note that in the tag ofc, I'd just like a bit more confidence here
that it's not busted :-)


I can take another look, but I won't have time for that in the next two 
weeks - vacation and kid starting school.


Christian.




So only Acked-by: Christian König 

Thanks, Daniel


Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Masahiro Yamada 
Cc: Kees Cook 
Cc: Adam Borowski 
Cc: Nick Terrell 
Cc: Mauro Carvalho Chehab 
Cc: Paul Menzel 
Cc: Sami Tolvanen 
Cc: Viresh Kumar 
Cc: Alex Deucher 
Cc: Dave Airlie 
Cc: Nirmoy Das 
Cc: Deepak R Varma 
Cc: Lee Jones 
Cc: Kevin Wang 
Cc: Chen Li 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Dennis Li 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Sonny Jiang 
Cc: Boris Brezillon 
Cc: Tian Tao 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Emma Anholt 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
   drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
   drivers/gpu/drm/lima/lima_sched.c|  2 +
   drivers/gpu/drm/msm/msm_gem_submit.c |  3 ++
   drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
   drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
   drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
   drivers/gpu/drm/scheduler/sched_main.c   | 69 
   drivers/gpu/drm/v3d/v3d_gem.c|  2 +
   include/drm/gpu_scheduler.h  |  7 ++-
   11 files changed, 94 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 139cd3bf1ad6..32e80bc6af22 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
   if (r)
   goto error_unlock;

+ drm_sched_job_arm(&job->base);
+
   /* No memory allocation is allowed while holding the notifier lock.
* The lock is held until amdgpu_cs_submit is finished and fence is
* added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
   if (r)
   return r;

+ drm_sched_job_arm(&job->base);
+
   *f = dma_fence_get(&job->base.s_fence->finished);
   amdgpu_job_free_resources(job);
   

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for remove rcu support from i915_address_space (rev5)

2021-08-05 Thread Patchwork
== Series Details ==

Series: remove rcu support from i915_address_space (rev5)
URL   : https://patchwork.freedesktop.org/series/93314/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-drivers/gpu/drm/i915/gem/i915_gem_context.c:1364:34:expected struct 
i915_address_space *vm
-drivers/gpu/drm/i915/gem/i915_gem_context.c:1364:34:got struct 
i915_address_space [noderef] __rcu *vm
-drivers/gpu/drm/i915/gem/i915_gem_context.c:1364:34: warning: incorrect type 
in argument 1 (different address spaces)
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)




[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for remove rcu support from i915_address_space (rev5)

2021-08-05 Thread Patchwork
== Series Details ==

Series: remove rcu support from i915_address_space (rev5)
URL   : https://patchwork.freedesktop.org/series/93314/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
0b21f453cdfb drm/i915: Drop code to handle set-vm races from execbuf
-:17: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#17: 
References: ccbc1b97948a ("drm/i915/gem: Don't allow changing the VM on running 
contexts (v4)")

-:17: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#17: 
References: ccbc1b97948a ("drm/i915/gem: Don't allow changing the VM on running 
contexts (v4)")

-:46: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 2 warnings, 0 checks, 12 lines checked
d23b18f97bd9 drm/i915: Rename i915_gem_context_get_vm_rcu to 
i915_gem_context_get_eb_vm
-:148: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 80 lines checked
07bcc7e00033 drm/i915: Use i915_gem_context_get_eb_vm in ctx_getparam
-:54: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 23 lines checked
b2e1515de24d drm/i915: Add i915_gem_context_is_full_ppgtt
-:105: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 53 lines checked
ba6c948d717d drm/i915: Use i915_gem_context_get_eb_vm in intel_context_set_gem
-:12: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#12: 
commit ccbc1b97948ab671335e950271e39766729736c3

-:61: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 18 lines checked
04ad005ea013 drm/i915: Drop __rcu from gem_context->vm
-:11: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#11: 
commit ccbc1b97948ab671335e950271e39766729736c3

-:23: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#23: 
  i915_vm_open ofc. This also removes the final caller of context_get_vm_rcu

-:42: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit a4e7ccdac38e ("drm/i915: Move 
context management under GEM")'
#42: 
commit a4e7ccdac38ec8335d9e4e2656c1a041c77feae1

-:345: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 2 errors, 2 warnings, 0 checks, 232 lines checked
89f791357ea9 drm/i915: use xa_lock/unlock for fpriv->vm_xa lookups
-:15: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit aabbe344dc3c ("drm/i915: Use RCU 
for unlocked vm_idr lookup")'
#15: 
commit aabbe344dc3ca5f7d8263a02608ba6179e8a4499

-:52: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 13 lines checked
57adc91192e3 drm/i915: Stop rcu support for i915_address_space
-:11: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#11: 
- i915_dpt has very simple lifetime (somehow we create a display pagetable vm

-:27: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit cf977e18610e ("drm/i915/gem: 
Spring clean debugfs")'
#27: 
commit cf977e18610e66e48c31619e7e0cfa871be9eada

-:35: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit db80a1294c23 ("drm/i915/gem: 
Remove per-client stats from debugfs/i915_gem_objects")'
#35: 
commit db80a1294c231b6ac725085f046bb2931e00c9db

-:47: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#47: 
commit ccbc1b97948ab671335e950271e39766729736c3

-:59: WARNING:TYPO_SPELLING: 'Preceeding' may be misspelled - perhaps 
'Preceding'?
#59: 
  Preceeding patches removed all vestiges of rcu use from gem_ctx->vm
  ^^
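
The GIT_COMMIT_ID errors above ask for references of the form 'commit <12+ chars of sha1> ("title")'. A small sketch of producing that form from a full hash and a subject line follows; `format_commit_ref` is a hypothetical helper written for illustration, not part of checkpatch or dim.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes 'commit <first 12 sha1 chars> ("title")'
 * into out. Returns 0 on success, -1 if the hash is shorter than 12
 * characters or the output buffer is too small.
 */
int format_commit_ref(const char *sha1, const char *title,
		      char *out, size_t out_len)
{
	int n;

	if (strlen(sha1) < 12)
		return -1;

	/* %.12s prints at most the first 12 characters of the hash. */
	n = snprintf(out, out_len, "commit %.12s (\"%s\")", sha1, title);
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	return 0;
}
```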


Re: [Intel-gfx] [PATCH v4 3/4] drm/shmem-helpers: Allocate wc pages on x86

2021-08-05 Thread Thomas Zimmermann

Hi

Am 23.07.21 um 09:36 schrieb Daniel Vetter:


The real fix is to get at the architecture-specific wc allocator, which is
currently not something that's exposed, but hidden within the dma api. I
think having this stick out like this is better than hiding it behind fake
generic code (like we do with drm_clflush, which de facto also only really
works on x86).

Also note that ttm has the exact same ifdef in its page allocator, but it
does fall back to using dma_alloc_coherent on other platforms.


If this fixes a real problem and there's no full solution yet, let's 
take what we have. So if you can extract the essence of this comment 
into a TODO comment that tells how to fix the issue, feel free to add my


Acked-by: Thomas Zimmermann 

Best regards
Thomas


-Daniel


Best regards
Thomas


+
shmem->pages = pages;
return 0;
@@ -203,6 +212,11 @@ static void drm_gem_shmem_put_pages_locked(struct 
drm_gem_shmem_object *shmem)
if (--shmem->pages_use_count > 0)
return;
+#ifdef CONFIG_X86
+   if (shmem->map_wc)
+   set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
+#endif
+
drm_gem_put_pages(obj, shmem->pages,
  shmem->pages_mark_dirty_on_put,
  shmem->pages_mark_accessed_on_put);



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer










Re: [Intel-gfx] [PATCH] drm/i915: Update small joiner ram size

2021-08-05 Thread Navare, Manasi
On Thu, Aug 05, 2021 at 03:49:37PM +0530, Vandita Kulkarni wrote:
> Xelpd supports larger small joiner ram.
> 
> Signed-off-by: Vandita Kulkarni 
> ---
>  drivers/gpu/drm/i915/display/intel_dp.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
> b/drivers/gpu/drm/i915/display/intel_dp.c
> index 75d4ebc66941..d174f0d6e7cd 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> @@ -461,7 +461,9 @@ u32 intel_dp_mode_to_fec_clock(u32 mode_clock)
>  static int
>  small_joiner_ram_size_bits(struct drm_i915_private *i915)
>  {
> - if (DISPLAY_VER(i915) >= 11)
> + if (DISPLAY_VER(i915) >= 13)
> + return 17280 * 8;

Verified from the Bspec, looks good to me.

Reviewed-by: Manasi Navare 

Manasi

> + else if (DISPLAY_VER(i915) >= 11)
>   return 7680 * 8;
>   else
>   return 6144 * 8;
> -- 
> 2.32.0
> 
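
For reference, the selection logic in the patch above can be exercised in userspace with a plain integer in place of the device structure. This is only a sketch: the `display_ver` parameter is an assumption standing in for DISPLAY_VER(i915), while the thresholds and sizes are taken from the diff.

```c
/*
 * Userspace stand-in for the patched helper: display_ver replaces
 * DISPLAY_VER(i915). Sizes are in bits (bytes * 8), as in the diff.
 */
int small_joiner_ram_size_bits(int display_ver)
{
	if (display_ver >= 13)
		return 17280 * 8;
	else if (display_ver >= 11)
		return 7680 * 8;
	else
		return 6144 * 8;
}
```

The order of the checks matters: the >= 13 test has to come first, otherwise display version 13 would be caught by the >= 11 branch.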


Re: [Intel-gfx] [PATCH 4/4] DO_NOT_MERGE: drm/i915/display: Enable PSR2 selective fetch by default

2021-08-05 Thread Gwan-gyeong Mun




On 8/3/21 8:18 PM, Souza, Jose wrote:

On Tue, 2021-08-03 at 14:17 +0300, Gwan-gyeong Mun wrote:


On 7/31/21 3:10 AM, José Roberto de Souza wrote:

Only to execute tests with PSR2 selective fetch enabled and check what
is broken.

IGT tests known to fail with this:
- kms_cursor_legacy: all tests that check whether evasion happened; I have a
fix for it that makes cursor_slowpath() return true for display 12+.

- kms_psr2_su: The pageflip test, it needs to have the damage clip set
otherwise it will update the whole screen and the selective blocks
will not match with expected.


kms_psr2_su is a test case for intel PSR2 HW tracking and kms_psr2_sf is
used as a test for intel PSR2 manual tracking. Is it necessary to modify
kms_psr2_su for testing PSR2 manual tracking?


kms_psr2_su is to test that PSR2 is sending selective updates, just adding a 
couple of lines we can make it work with selective fetch.


- kms_psr: psr2_*_(mmap_gtt, mmap_cpu, blt and render), all those
tests should be dropped or skipped for display 12+.


Could you explain in more detail why we need to skip on display 12+?


These are things that would end up calling intel_psr_invalidate/flush().



Thanks for the explanation.
And there is an issue confirmed in local tests, so I leave additional 
comments.



Signed-off-by: José Roberto de Souza 
---
   drivers/gpu/drm/i915/display/intel_psr.c | 9 -
   drivers/gpu/drm/i915/i915_params.h   | 2 +-
   2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
b/drivers/gpu/drm/i915/display/intel_psr.c
index 894a2d35668a2..e128f0c2aeecc 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -877,15 +877,6 @@ static bool intel_psr2_config_valid(struct intel_dp 
*intel_dp,
   return false;
   }

-/*
- * We are missing the implementation of some workarounds to enabled PSR2
- * in Alderlake_P, until ready PSR2 should be kept disabled.
- */
-if (IS_ALDERLAKE_P(dev_priv)) {
-drm_dbg_kms(&dev_priv->drm, "PSR2 is missing the implementation of 
workarounds\n");
-return false;
-}
-
   if (!transcoder_has_psr2(dev_priv, crtc_state->cpu_transcoder)) {
   drm_dbg_kms(&dev_priv->drm,
   "PSR2 not supported in transcoder %s\n",
diff --git a/drivers/gpu/drm/i915/i915_params.h 
b/drivers/gpu/drm/i915/i915_params.h
index f27eceb82c0f5..8d725b64592d8 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -55,7 +55,7 @@ struct drm_printer;
   param(int, enable_fbc, -1, 0600) \
   param(int, enable_psr, -1, 0600) \
   param(bool, psr_safest_params, false, 0400) \
-param(bool, enable_psr2_sel_fetch, false, 0400) \
+param(bool, enable_psr2_sel_fetch, true, 0400) \
If we do not modify this part and do not enable it by default at boot 
time as shown in the original code below,

param(bool, enable_psr2_sel_fetch, false, 0400) \

when we execute the kms_psr2_sf test case of igt, the FIFO underrun as 
below still occurs.


i915 :00:02.0: [drm] *ERROR* CPU pipe A FIFO underrun: port,transcoder,

When a PSR2 panel is used and enable_psr2_sel_fetch is not enabled by 
default, PSR1 is enabled by default.
When kms_psr2_sf is executed, the mode is changed to PSR2, and when 
kms_psr2_sf terminates, PSR2 is deactivated and PSR1 is re-enabled. 
I suspect the problem occurs at this point.



   param(int, disable_power_well, -1, 0400) \
   param(int, enable_ips, 1, 0600) \
   param(int, invert_brightness, 0, 0600) \





Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-05 Thread Matthew Brost
On Thu, Aug 05, 2021 at 05:10:29PM +0100, Tvrtko Ursulin wrote:
> 
> On 05/08/2021 16:04, Patchwork wrote:
> > *Patch Details*
> > *Series:*   drm/i915: Be more gentle when exiting non-persistent contexts
> > *URL:*  https://patchwork.freedesktop.org/series/93420/
> > 
> > *State:*failure
> > *Details:*
> > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/index.html
> > 
> > 
> > 
> >   CI Bug Log - changes from CI_DRM_10450 -> Patchwork_20775
> > 
> > 
> > Summary
> > 
> > *FAILURE*
> > 
> > Serious unknown changes coming with Patchwork_20775 absolutely need to be
> > verified manually.
> > 
> > If you think the reported changes have nothing to do with the changes
> > introduced in Patchwork_20775, please notify your bug team to allow them
> > to document this new failure mode, which will reduce false positives in CI.
> > 
> > External URL:
> > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/index.html
> > 
> > 
> > Possible new issues
> > 
> > Here are the unknown changes that may have been introduced in
> > Patchwork_20775:
> > 
> > 
> >   IGT changes
> > 
> > 
> > Possible regressions
> > 
> >   * igt@i915_selftest@live@gt_lrc:
> >   o fi-rkl-guc: PASS -> DMESG-WARN
> > 
> > 
> 
> <6> [233.928677] i915: Running intel_lrc_live_selftests/live_lrc_isolation
> <3> [233.988780] i915 :00:02.0: [drm] *ERROR* rcs0 context redzone 
> overwritten!
> 
> Something GuC specific by the look of it, or at least I haven't found the 
> same signature elsewhere. But in any case it is not related to this patch.
> 

Not sure what this is about. Ran this locally on a RKL machine and it
passed just fine for me. Something to keep an eye on as CI gets fully
enabled with GuC submission.

Also BTW, speaking of CI & GuC submission it isn't all that great yet.
Maybe ping me when you have the next rev of this patch and I can run a
series of tests with GuC submission related to banning / persistence.

Matt

> Regards,
> 
> Tvrtko
> 
> > 
> > 
> > Known issues
> > 
> > Here are the changes found in Patchwork_20775 that come from known issues:
> > 
> > 
> >   IGT changes
> > 
> > 
> > Issues hit
> > 
> >   * igt@amdgpu/amd_basic@query-info:
> >   o fi-bsw-kefka: NOTRUN -> SKIP (fdo#109271) +17 similar issues
> >   * igt@gem_exec_fence@basic-busy@bcs0:
> >   o fi-kbl-soraka: NOTRUN -> SKIP (fdo#109271) +26 similar issues
> >   * igt@gem_huc_copy@huc-copy:
> >   o fi-kbl-soraka: NOTRUN -> SKIP (fdo#109271 / i915#2190)
> >   * igt@i915_pm_rpm@basic-rte:
> >   o fi-kbl-soraka: NOTRUN -> FAIL (i915#579)
> >   * igt@i915_selftest@live@gt_pm:
> >   o fi-kbl-soraka: NOTRUN -> DMESG-FAIL (i915#1886 / i915#2291)
> >   * igt@i915_selftest@live@late_gt_pm:
> >   o fi-bsw-nick: PASS -> DMESG-FAIL (i915#2927)
> >   * igt@kms_chamelium@common-hpd-after-suspend:
> >   o fi-kbl-soraka: NOTRUN -> SKIP
> > 
> > 

[Intel-gfx] ✓ Fi.CI.BAT: success for Begin enabling Xe_HP SDV and DG2 platforms (rev9)

2021-08-05 Thread Patchwork
== Series Details ==

Series: Begin enabling Xe_HP SDV and DG2 platforms (rev9)
URL   : https://patchwork.freedesktop.org/series/92135/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10451 -> Patchwork_20776


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20776:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@i915_selftest@live@gt_timelines:
- {fi-ehl-2}: [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/fi-ehl-2/igt@i915_selftest@live@gt_timelines.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/fi-ehl-2/igt@i915_selftest@live@gt_timelines.html

  
Known issues


  Here are the changes found in Patchwork_20776 that come from known issues:

### IGT changes ###

 Possible fixes 

  * igt@kms_chamelium@hdmi-hpd-fast:
- fi-icl-u2:  [DMESG-WARN][3] ([i915#2203] / [i915#2868]) -> 
[PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
  [i915#2868]: https://gitlab.freedesktop.org/drm/intel/issues/2868


Participating hosts (40 -> 34)
--

  Missing(6): fi-ilk-m540 fi-hsw-4200u fi-tgl-1115g4 fi-bsw-cyan 
fi-bdw-samus bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10451 -> Patchwork_20776

  CI-20190529: 20190529
  CI_DRM_10451: 3bea0ad83735904d380d83bcca30557268acf887 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6160: 4287344dd6a39d9036c5fb9a047a7d8f10bee981 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20776: e8ef3eecff4fdf295eeb9d88287bd5fe99f1ad11 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

e8ef3eecff4f drm/i915/dg2: Configure PCON in DP pre-enable path
af46376d4b74 drm/i915/dg2: Maintain backward-compatible nested batch behavior
de625d7a0adc drm/i915/dg2: Add new LRI reg offsets
7c5f8a298512 drm/i915/xehpsdv: Read correct RP_STATE_CAP register
f61fbc6c1d71 drm/i915/xehpsdv: factor out function to read RP_STATE_CAP
0b355c3ad3ca drm/i915/xehpsdv: Add compute DSS type
8b84974c7782 drm/i915/dg2: Report INSTDONE_GEOM values in error state
b8058f4b6221 drm/i915/xehp: Loop over all gslices for INSTDONE processing
515352629f0e drm/i915/dg2: Add support for new DG2-G11 revid 0x5

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/index.html


Re: [Intel-gfx] [PATCH v5 14/20] drm/sched: Don't store self-dependencies

2021-08-05 Thread Christian König

Am 05.08.21 um 15:25 schrieb Daniel Vetter:

On Thu, Aug 5, 2021 at 3:18 PM Christian König  wrote:



Am 05.08.21 um 12:46 schrieb Daniel Vetter:

This is essentially part of drm_sched_dependency_optimized(), which
only amdgpu seems to make use of. Use it a bit more.

This would mean that as-is amdgpu can't use the dependency helpers, at
least not with the current approach amdgpu has for deciding whether a
vm_flush is needed. Since amdgpu also has very special rules around
implicit fencing it can't use those helpers either, and adding a
drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too
onerous. That way the special case handling for amdgpu sticks even
more out and we have higher chances that reviewers that go across all
drivers won't miss it.

Well you should probably drop the sentence about the vm_flush, this is
completely unrelated.

In addition to that I still don't think that this is a good idea.
Dependency handling is something completely driver specific.

E.g. even when you have submitted jobs back to back they still might
need a cache flush in between and that is not only for amdgpu like this.

What you can do is to optimize for this while looking at the fences
later on, and then note that you have done so and which hw fence you
used last instead.

Out of 6 drivers using drm/sched 5 can use this. When we get i915
over, that one will be added to the list. amdgpu can't use any of this
anyway due to the vm_id allocation requirements, which is why I
mention that. Also note that all the callbacks are still there, so you
can just ignore this all and still build your own. Like amdgpu does.


The VMID allocation stuff is rather easy to handle, that's why I noted 
we should remove that sentence.


The problematic stuff is handling the cache flush and pipeline sync 
which you make impossible with this here.



So I'm not sure what exactly your objection is, aside from "this doesn't
fit for amdgpu", which a) I know b) the commit message explains c)
doesn't actually hurt amdgpu in the slightest. And we still get the
benefit that for most drivers it's a nice optimization.


Well exactly that's what I wanted to avoid. We still can use this in 
amdgpu even with the VMID allocation stuff and I still hope to do so.


Can't we add this as a wrapper or similar?

Christian.


-Daniel


Regards,
Christian.


Reviewed-by: Lucas Stach 
Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 
---
   drivers/gpu/drm/scheduler/sched_main.c | 7 +++
   1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index f77456929139..49e507f91ec0 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -660,6 +660,13 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
   if (!fence)
   return 0;

+ /* if it's a fence from us it's guaranteed to be earlier */
+ if (fence->context == job->entity->fence_context ||
+ fence->context == job->entity->fence_context + 1) {
+ dma_fence_put(fence);
+ return 0;
+ }
+
   /* Deduplicate if we already depend on a fence from the same context.
* This lets the size of the array of deps scale with the number of
* engines involved, rather than the number of BOs.
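
The check added in the hunk above can be modelled outside the kernel. What
follows is a minimal user-space sketch, not the scheduler code: struct fence,
struct entity and is_self_dependency() are hypothetical stand-ins, and the
sketch assumes, as the "+ 1" comparison suggests, that an entity owns two
consecutive fence contexts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct dma_fence and struct drm_sched_entity. */
struct fence {
	uint64_t context;
};

struct entity {
	uint64_t fence_context;	/* assumed: first of two consecutive contexts */
};

/* Mirrors the two comparisons in the hunk above. */
static bool is_self_dependency(const struct entity *e, const struct fence *f)
{
	return f->context == e->fence_context ||
	       f->context == e->fence_context + 1;
}

/* Small usage example: contexts 10 and 11 belong to the entity, 12 does not. */
static void self_check(void)
{
	struct entity e = { .fence_context = 10 };
	struct fence own_a = { .context = 10 };
	struct fence own_b = { .context = 11 };
	struct fence other = { .context = 12 };

	assert(is_self_dependency(&e, &own_a));
	assert(is_self_dependency(&e, &own_b));
	assert(!is_self_dependency(&e, &other));
}
```

The sketch only shows which fences would be dropped. It says nothing about the
cache flush and pipeline sync concern raised earlier in this message.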






Re: [Intel-gfx] [PATCH v5 05/20] drm/sched: drop entity parameter from drm_sched_push_job

2021-08-05 Thread Christian König

Am 05.08.21 um 12:46 schrieb Daniel Vetter:

Originally a job was only bound to the queue when we pushed this, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.

v2:
Rebase on top of msm adopting drm/sched

Acked-by: Emma Anholt 
Acked-by: Melissa Wen 
Reviewed-by: Steven Price  (v1)
Reviewed-by: Boris Brezillon  (v1)
Signed-off-by: Daniel Vetter 


Reviewed-by: Christian König 


Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: Emma Anholt 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Alex Deucher 
Cc: Nirmoy Das 
Cc: Dave Airlie 
Cc: Chen Li 
Cc: Lee Jones 
Cc: Deepak R Varma 
Cc: Kevin Wang 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Dennis Li 
Cc: Boris Brezillon 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Rob Clark 
Cc: Sean Paul 
Cc: Melissa Wen 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   | 2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  | 2 +-
  drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 2 +-
  drivers/gpu/drm/lima/lima_gem.c  | 3 +--
  drivers/gpu/drm/lima/lima_sched.c| 5 ++---
  drivers/gpu/drm/lima/lima_sched.h| 3 +--
  drivers/gpu/drm/msm/msm_gem_submit.c | 2 +-
  drivers/gpu/drm/panfrost/panfrost_job.c  | 2 +-
  drivers/gpu/drm/scheduler/sched_entity.c | 6 ++
  drivers/gpu/drm/v3d/v3d_gem.c| 2 +-
  include/drm/gpu_scheduler.h  | 3 +--
  11 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 32e80bc6af22..1d8a914108af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1267,7 +1267,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
  
  	trace_amdgpu_cs_ioctl(job);

amdgpu_vm_bo_trace_cs(>vm, >ticket);
-   drm_sched_entity_push_job(>base, entity);
+   drm_sched_entity_push_job(>base);
  
  	amdgpu_vm_move_to_lru_tail(p->adev, >vm);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c

index 5ddb955d2315..b86099c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
  
  	*f = dma_fence_get(>base.s_fence->finished);

amdgpu_job_free_resources(job);
-   drm_sched_entity_push_job(>base, entity);
+   drm_sched_entity_push_job(>base);
  
  	return 0;

  }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 05f412204118..180bb633d5c5 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity 
*sched_entity,
/* the scheduler holds on to the job now */
kref_get(>refcount);
  
-	drm_sched_entity_push_job(>sched_job, sched_entity);

+   drm_sched_entity_push_job(>sched_job);
  
  out_unlock:

mutex_unlock(>gpu->fence_lock);
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index de62966243cd..c528f40981bb 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct 
lima_submit *submit)
goto err_out2;
}
  
-	fence = lima_sched_context_queue_task(

-   submit->ctx->context + submit->pipe, submit->task);
+   fence = lima_sched_context_queue_task(submit->task);
  
  	for (i = 0; i < submit->nr_bos; i++) {

if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index 38f755580507..e968b5a8f0b0 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -177,13 +177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe,
drm_sched_entity_fini(>base);
  }
  
-struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context,

-   struct lima_sched_task *task)
+struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
  {
struct dma_fence *fence = dma_fence_get(>base.s_fence->finished);
  
  	trace_lima_task_submit(task);

-   drm_sched_entity_push_job(>base, >base);
+   drm_sched_entity_push_job(>base);
return fence;
  }
  
diff --git a/drivers/gpu/drm/lima/lima_sched.h 

Re: [Intel-gfx] [PATCH v5 04/20] drm/sched: Add dependency tracking

2021-08-05 Thread Christian König

Am 05.08.21 um 12:46 schrieb Daniel Vetter:

Instead of just a callback we can just glue in the gem helpers that
panfrost, v3d and lima currently use. There's really not that many
ways to skin this cat.

v2/3: Rebased.

v4: Repaint this shed. The functions are now called _add_dependency()
and _add_implicit_dependency()

Reviewed-by: Boris Brezillon  (v3)
Reviewed-by: Steven Price  (v1)
Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 


Reviewed-by: Christian König 


Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Nirmoy Das 
Cc: Boris Brezillon 
Cc: Luben Tuikov 
Cc: Alex Deucher 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
  drivers/gpu/drm/scheduler/sched_entity.c |  18 +++-
  drivers/gpu/drm/scheduler/sched_main.c   | 104 +++
  include/drm/gpu_scheduler.h  |  33 ++-
  3 files changed, 149 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 89e3f6eaf519..381fbf462ea7 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence 
*f,
job->sched->ops->free_job(job);
  }
  
+static struct dma_fence *

+drm_sched_job_dependency(struct drm_sched_job *job,
+struct drm_sched_entity *entity)
+{
+   if (!xa_empty(>dependencies))
+   return xa_erase(>dependencies, job->last_dependency++);
+
+   if (job->sched->ops->dependency)
+   return job->sched->ops->dependency(job, entity);
+
+   return NULL;
+}
+
  /**
   * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
   *
@@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct 
drm_sched_entity *entity)
struct drm_sched_fence *s_fence = job->s_fence;
  
  		/* Wait for all dependencies to avoid data corruptions */

-   while ((f = job->sched->ops->dependency(job, entity)))
+   while ((f = drm_sched_job_dependency(job, entity)))
dma_fence_wait(f, false);
  
  		drm_sched_fence_scheduled(s_fence);

@@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct 
drm_sched_entity *entity)
   */
  struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity 
*entity)
  {
-   struct drm_gpu_scheduler *sched = entity->rq->sched;
struct drm_sched_job *sched_job;
  
  	sched_job = to_drm_sched_job(spsc_queue_peek(>job_queue));

@@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
return NULL;
  
  	while ((entity->dependency =

-   sched->ops->dependency(sched_job, entity))) {
+   drm_sched_job_dependency(sched_job, entity))) {
trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
  
  		if (drm_sched_entity_add_dependency_cb(entity))

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 454cb6164bdc..f77456929139 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -603,6 +603,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
  
  	INIT_LIST_HEAD(>list);
  
+	xa_init_flags(>dependencies, XA_FLAGS_ALLOC);

+
return 0;
  }
  EXPORT_SYMBOL(drm_sched_job_init);
@@ -637,6 +639,99 @@ void drm_sched_job_arm(struct drm_sched_job *job)
  }
  EXPORT_SYMBOL(drm_sched_job_arm);
  
+/**

+ * drm_sched_job_add_dependency - adds the fence as a job dependency
+ * @job: scheduler job to add the dependencies to
+ * @fence: the dma_fence to add to the list of dependencies.
+ *
+ * Note that @fence is consumed in both the success and error cases.
+ *
+ * Returns:
+ * 0 on success, or an error on failing to expand the array.
+ */
+int drm_sched_job_add_dependency(struct drm_sched_job *job,
+struct dma_fence *fence)
+{
+   struct dma_fence *entry;
+   unsigned long index;
+   u32 id = 0;
+   int ret;
+
+   if (!fence)
+   return 0;
+
+   /* Deduplicate if we already depend on a fence from the same context.
+* This lets the size of the array of deps scale with the number of
+* engines involved, rather than the number of BOs.
+*/
+   xa_for_each(>dependencies, index, entry) {
+   if (entry->context != fence->context)
+   continue;
+
+   if (dma_fence_is_later(fence, entry)) {
+   dma_fence_put(entry);
+   xa_store(>dependencies, index, fence, GFP_KERNEL);
+   } else {
+   dma_fence_put(fence);
+   }
+   return 0;
+   }
+
+   ret = xa_alloc(>dependencies, , fence, xa_limit_32b, 
GFP_KERNEL);
+   if 
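
The quoted diff is cut off at this point in the archive. The deduplication
step it describes (keep one dependency per fence context, and keep the later
one) can still be illustrated on its own. This is a rough user-space sketch
under stated assumptions: a fixed-size array replaces the xarray, a plain
seqno comparison replaces dma_fence_is_later(), reference counting is left
out, and the names struct job, MAX_DEPS and job_add_dependency() are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPS 8	/* hypothetical capacity, for illustration only */

struct fence {
	uint64_t context;
	uint64_t seqno;
};

struct job {
	struct fence deps[MAX_DEPS];
	size_t ndeps;
};

/* Returns 0 on success, -1 if the table is full. */
static int job_add_dependency(struct job *job, struct fence f)
{
	for (size_t i = 0; i < job->ndeps; i++) {
		if (job->deps[i].context != f.context)
			continue;
		/* Same context: keep only the later fence. */
		if (f.seqno > job->deps[i].seqno)
			job->deps[i] = f;
		return 0;
	}

	if (job->ndeps == MAX_DEPS)
		return -1;

	job->deps[job->ndeps++] = f;
	return 0;
}

/* Four fences from two contexts collapse into two dependencies. */
static void self_check(void)
{
	struct job j = { .ndeps = 0 };

	assert(job_add_dependency(&j, (struct fence){ 1, 5 }) == 0);
	assert(job_add_dependency(&j, (struct fence){ 2, 3 }) == 0);
	assert(job_add_dependency(&j, (struct fence){ 1, 9 }) == 0);
	assert(job_add_dependency(&j, (struct fence){ 1, 7 }) == 0);
	assert(j.ndeps == 2);
	assert(j.deps[0].seqno == 9);
	assert(j.deps[1].seqno == 3);
}
```

With this shape the number of stored dependencies grows with the number of
distinct contexts, which matches the comment about scaling with engines rather
than BOs.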

Re: [Intel-gfx] [PATCH v5 03/20] drm/sched: Barriers are needed for entity->last_scheduled

2021-08-05 Thread Christian König




Am 05.08.21 um 12:46 schrieb Daniel Vetter:

It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.

It's definitely not enough on arm.

Fix this properly, which means
- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.

While at it sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.

Note that we really need full barriers here, at first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.

v2: Put smp_rmb() in the right place and fix up comment (Andrey)

Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Steven Price 
Cc: Daniel Vetter 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Boris Brezillon 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
  1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index f7347c284886..89e3f6eaf519 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
dma_fence_set_error(_job->s_fence->finished, -ECANCELED);
  
  	dma_fence_put(entity->last_scheduled);

+
entity->last_scheduled = dma_fence_get(_job->s_fence->finished);
  
+	/*

+* If the queue is empty we allow drm_sched_entity_select_rq() to
+* locklessly access ->last_scheduled. This only works if we set the
+* pointer before we dequeue and if we have a write barrier here.
+*/
+   smp_wmb();
+
spsc_queue_pop(>job_queue);
return sched_job;
  }
@@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity 
*entity)
struct drm_gpu_scheduler *sched;
struct drm_sched_rq *rq;
  
-	if (spsc_queue_count(>job_queue) || !entity->sched_list)

+   /* single possible engine and already selected */
+   if (!entity->sched_list)
+   return;
+
+   /* queue non-empty, stay on the same engine */
+   if (spsc_queue_count(>job_queue))
return;
  
-	fence = READ_ONCE(entity->last_scheduled);

+   /*
+* Only when the queue is empty are we guaranteed that the scheduler
+* thread cannot change ->last_scheduled. To enforce ordering we need
+* a read barrier here. See drm_sched_entity_pop_job() for the other
+* side.
+*/
+   smp_rmb();
+
+   fence = entity->last_scheduled;
+
+   /* stay on the same engine if the previous job hasn't finished */
if (fence && !dma_fence_is_signaled(fence))
return;
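
The pairing described in the two comments above (publish the pointer, barrier,
dequeue on one side; observe the empty queue, barrier, read the pointer on the
other) can be tried in user space. The following is a rough C11 sketch, not
the kernel code: struct entity, pop_job() and peek_last_scheduled() are
hypothetical names, the queue is reduced to a counter, and
atomic_thread_fence() only approximates smp_wmb() and smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct entity {
	atomic_int queue_count;			/* stands in for the job queue */
	_Atomic(const char *) last_scheduled;	/* stands in for the fence pointer */
};

/* Scheduler side: set the pointer first, then dequeue. */
static void pop_job(struct entity *e, const char *finished)
{
	atomic_store_explicit(&e->last_scheduled, finished,
			      memory_order_relaxed);
	/* Plays the role of smp_wmb(): the store above is ordered before the dequeue. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_sub_explicit(&e->queue_count, 1, memory_order_relaxed);
}

/* Submitter side: only trust last_scheduled once the queue is seen empty. */
static const char *peek_last_scheduled(struct entity *e)
{
	if (atomic_load_explicit(&e->queue_count, memory_order_relaxed) != 0)
		return NULL;	/* queue non-empty, pointer may still change */
	/* Plays the role of smp_rmb(): pairs with the fence in pop_job(). */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&e->last_scheduled, memory_order_relaxed);
}

/* Single-threaded walk through both sides. */
static void self_check(void)
{
	static const char fence_name[] = "fence-1";
	struct entity e;

	atomic_init(&e.queue_count, 1);
	atomic_init(&e.last_scheduled, (const char *)NULL);

	assert(peek_last_scheduled(&e) == NULL);	/* one job still queued */
	pop_job(&e, fence_name);
	assert(peek_last_scheduled(&e) == fence_name);	/* queue empty now */
}
```

The self-check is single-threaded, so it exercises the logic but not the
ordering itself. Showing the ordering would need two threads and a weakly
ordered machine.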
  




Re: [Intel-gfx] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init

2021-08-05 Thread Christian König

Am 05.08.21 um 12:46 schrieb Daniel Vetter:

This is a very confusingly named function, because it does not just
init an object, it also arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that were a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
   usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
   to be moved into drm_sched_job_arm, which made me realize that the
   job->id definitely needs to be moved too.

   Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.

Acked-by: Melissa Wen 
Cc: Melissa Wen 
Acked-by: Emma Anholt 
Acked-by: Steven Price  (v2)
Reviewed-by: Boris Brezillon  (v5)
Signed-off-by: Daniel Vetter 


At least the amdgpu parts look ok offhand, but I don't think I can judge
the rest.


So only Acked-by: Christian König 


Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Masahiro Yamada 
Cc: Kees Cook 
Cc: Adam Borowski 
Cc: Nick Terrell 
Cc: Mauro Carvalho Chehab 
Cc: Paul Menzel 
Cc: Sami Tolvanen 
Cc: Viresh Kumar 
Cc: Alex Deucher 
Cc: Dave Airlie 
Cc: Nirmoy Das 
Cc: Deepak R Varma 
Cc: Lee Jones 
Cc: Kevin Wang 
Cc: Chen Li 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Dennis Li 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Sonny Jiang 
Cc: Boris Brezillon 
Cc: Tian Tao 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Emma Anholt 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
  drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
  drivers/gpu/drm/lima/lima_sched.c|  2 +
  drivers/gpu/drm/msm/msm_gem_submit.c |  3 ++
  drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
  drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
  drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
  drivers/gpu/drm/scheduler/sched_main.c   | 69 
  drivers/gpu/drm/v3d/v3d_gem.c|  2 +
  include/drm/gpu_scheduler.h  |  7 ++-
  11 files changed, 94 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 139cd3bf1ad6..32e80bc6af22 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
if (r)
goto error_unlock;
  
+	drm_sched_job_arm(>base);

+
/* No memory allocation is allowed while holding the notifier lock.
 * The lock is held until amdgpu_cs_submit is finished and fence is
 * added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
if (r)
return r;
  
+	drm_sched_job_arm(>base);

+
*f = dma_fence_get(>base.s_fence->finished);
amdgpu_job_free_resources(job);
drm_sched_entity_push_job(>base, entity);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index feb6da1b6ceb..05f412204118 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity 
*sched_entity,
if (ret)
goto out_unlock;
  
+	drm_sched_job_arm(>sched_job);

+
submit->out_fence = dma_fence_get(>sched_job.s_fence->finished);
submit->out_fence_id = 

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Begin enabling Xe_HP SDV and DG2 platforms (rev9)

2021-08-05 Thread Patchwork
== Series Details ==

Series: Begin enabling Xe_HP SDV and DG2 platforms (rev9)
URL   : https://patchwork.freedesktop.org/series/92135/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/display/intel_display.c:1901:21:expected struct 
i915_vma *[assigned] vma
+drivers/gpu/drm/i915/display/intel_display.c:1901:21:got void [noderef] 
__iomem *[assigned] iomem
+drivers/gpu/drm/i915/display/intel_display.c:1901:21: warning: incorrect type 
in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1410:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1410:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1410:34: warning: incorrect type 
in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1392:5: warning: context imbalance in 
'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/intel_ring_submission.c:1268:24: warning: Using plain 
integer as NULL pointer
+drivers/gpu/drm/i915/i915_perf.c:1442:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/i915_perf.c:1496:15: warning: memset with byte count of 
16777216
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative 
(-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative 
(-262080)
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: 

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Begin enabling Xe_HP SDV and DG2 platforms (rev9)

2021-08-05 Thread Patchwork
== Series Details ==

Series: Begin enabling Xe_HP SDV and DG2 platforms (rev9)
URL   : https://patchwork.freedesktop.org/series/92135/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
515352629f0e drm/i915/dg2: Add support for new DG2-G11 revid 0x5
b8058f4b6221 drm/i915/xehp: Loop over all gslices for INSTDONE processing
-:135: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'iter_' - possible 
side-effects?
#135: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:582:
+#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, 
dss_) \
+   for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
+(iter_) < GEN_MAX_SUBSLICES; \
+(iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
+(dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
+   for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))

-:135: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'gslice_' - possible 
side-effects?
#135: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:582:
+#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, 
dss_) \
+   for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
+(iter_) < GEN_MAX_SUBSLICES; \
+(iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
+(dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
+   for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))

-:135: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'dss_' - possible 
side-effects?
#135: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:582:
+#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, 
dss_) \
+   for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
+(iter_) < GEN_MAX_SUBSLICES; \
+(iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
+(dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
+   for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))

total: 0 errors, 0 warnings, 3 checks, 164 lines checked
8b84974c7782 drm/i915/dg2: Report INSTDONE_GEOM values in error state
0b355c3ad3ca drm/i915/xehpsdv: Add compute DSS type
f61fbc6c1d71 drm/i915/xehpsdv: factor out function to read RP_STATE_CAP
7c5f8a298512 drm/i915/xehpsdv: Read correct RP_STATE_CAP register
de625d7a0adc drm/i915/dg2: Add new LRI reg offsets
af46376d4b74 drm/i915/dg2: Maintain backward-compatible nested batch behavior
e8ef3eecff4f drm/i915/dg2: Configure PCON in DP pre-enable path
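
For context on the MACRO_ARG_REUSE checks above: checkpatch raises them
whenever a macro expands one of its arguments more than once, because an
argument with side effects would then run more than once. The sketch below
shows the effect with a reduced version of the index split (iter / N and
iter % N). It is an illustration only: DSS_PER_GSLICE, SPLIT_UNSAFE(),
split_safe() and next_index() are hypothetical names, and the value 4 is an
arbitrary choice, not the driver's constant.

```c
#include <assert.h>

#define DSS_PER_GSLICE 4	/* hypothetical value, for illustration only */

/* Expands iter_ twice, which is the pattern checkpatch points at. */
#define SPLIT_UNSAFE(iter_, gslice_, dss_) \
	((gslice_) = (iter_) / DSS_PER_GSLICE, \
	 (dss_) = (iter_) % DSS_PER_GSLICE)

/* Function form: the argument is evaluated exactly once. */
static void split_safe(int iter, int *gslice, int *dss)
{
	*gslice = iter / DSS_PER_GSLICE;
	*dss = iter % DSS_PER_GSLICE;
}

static int calls;

/* An argument with a side effect: returns the counter, then bumps it. */
static int next_index(void)
{
	return calls++;
}

static void self_check(void)
{
	int gslice, dss;

	calls = 3;
	SPLIT_UNSAFE(next_index(), gslice, dss);
	assert(calls == 5);			/* argument ran twice */
	assert(gslice == 0 && dss == 0);	/* 3 / 4, then 4 % 4 */

	split_safe(3, &gslice, &dss);
	assert(gslice == 0 && dss == 3);	/* the intended split of 3 */
}
```

When the macro arguments are plain loop variables, as they appear to be in the
flagged iterator, the repeated expansion is harmless, which is consistent with
checkpatch reporting these as CHECK rather than as warnings or errors.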




Re: [Intel-gfx] [PATCH v3 09/14] vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set

2021-08-05 Thread Alex Williamson
On Thu, 5 Aug 2021 08:47:01 -0300
Jason Gunthorpe  wrote:

> On Tue, Aug 03, 2021 at 10:52:25AM -0600, Alex Williamson wrote:
> > On Tue, 3 Aug 2021 13:41:52 -0300
> > Jason Gunthorpe  wrote:  
> > > On Tue, Aug 03, 2021 at 10:34:06AM -0600, Alex Williamson wrote:  
> > > > I think the vfio_pci_find_reset_target() function needs to be re-worked
> > > > to just tell us true/false that it's ok to reset the provided device,
> > > > not to anoint an arbitrary target device.  Thanks,
> > > 
> > > Yes, though this logic is confusing, why do we need to check if any
> > > device needs a reset at this point? If we are being asked to reset
> > > vdev shouldn't vdev needs_reset?
> > > 
> > > Or is the function more of a 'synchronize pending reset' kind of
> > > thing?  
> > 
> > Yes, the latter.  For instance think about a multi-function PCI device
> > such as a GPU.  The functions have dramatically different capabilities,
> > some might have function level reset abilities and others not.  We want
> > to be able to trigger a bus reset as the last device of the set is
> > released, no matter the order they're released and no matter the
> > capabilities of the device we're currently processing.  Thanks,  
> 
> I worked on this for a while; I think this is much clearer about what
> this algorithm is trying to do:
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 5d6db93d6c680f..e418bcbb68facc 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -223,7 +223,7 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_device 
> *vdev)
>   }
>  }
>  
> -static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev);
> +static bool vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set);
>  static void vfio_pci_disable(struct vfio_pci_device *vdev);
>  static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void 
> *data);
>  
> @@ -404,6 +404,9 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
>   struct vfio_pci_ioeventfd *ioeventfd, *ioeventfd_tmp;
>   int i, bar;
>  
> + /* For needs_reset */
> + lockdep_assert_held(&vdev->vdev.dev_set->lock);
> +
>   /* Stop the device from further DMA */
>   pci_clear_master(pdev);
>  
> @@ -487,9 +490,7 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
>  out:
>   pci_disable_device(pdev);
>  
> - vfio_pci_try_bus_reset(vdev);
> -
> - if (!disable_idle_d3)
> + if (!vfio_pci_dev_set_try_reset(vdev->vdev.dev_set) && !disable_idle_d3)
>   vfio_pci_set_power_state(vdev, PCI_D3hot);
>  }
>  
> @@ -2145,36 +2146,6 @@ static struct pci_driver vfio_pci_driver = {
>   .err_handler = &vfio_err_handlers,
>  };
>  
> -static int vfio_pci_get_unused_devs(struct pci_dev *pdev, void *data)
> -{
> - struct vfio_devices *devs = data;
> - struct vfio_device *device;
> - struct vfio_pci_device *vdev;
> -
> - if (devs->cur_index == devs->max_index)
> - return -ENOSPC;
> -
> - device = vfio_device_get_from_dev(&pdev->dev);
> - if (!device)
> - return -EINVAL;
> -
> - if (pci_dev_driver(pdev) != &vfio_pci_driver) {
> - vfio_device_put(device);
> - return -EBUSY;
> - }
> -
> - vdev = container_of(device, struct vfio_pci_device, vdev);
> -
> - /* Fault if the device is not unused */
> - if (device->open_count) {
> - vfio_device_put(device);
> - return -EBUSY;
> - }
> -
> - devs->devices[devs->cur_index++] = vdev;
> - return 0;
> -}
> -
>  static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data)
>  {
>   struct vfio_devices *devs = data;
> @@ -2208,79 +2179,86 @@ static int vfio_pci_try_zap_and_vma_lock_cb(struct 
> pci_dev *pdev, void *data)
>   return 0;
>  }
>  
> +static int vfio_pci_is_device_in_set(struct pci_dev *pdev, void *data)
> +{
> + struct vfio_device_set *dev_set = data;
> + struct vfio_device *cur;
> +
> + lockdep_assert_held(&dev_set->lock);
> +
> + list_for_each_entry(cur, &dev_set->device_list, dev_set_list)
> + if (cur->dev == &pdev->dev)
> + return 0;
> + return -EBUSY;
> +}
> +
> +static bool vfio_pci_dev_set_needs_reset(struct vfio_device_set *dev_set)

Slight nit on the name here since we're essentially combining
needs_reset along with the notion of the device being unused.  I'm not
sure, maybe "should_reset"?  Otherwise it looks ok.  Thanks,

Alex
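
To make the naming point concrete, the combined condition can be sketched
in plain userspace C. This is an illustration only: struct sketch_dev and
sketch_dev_set_should_reset() are hypothetical names, not the vfio API, and
an array stands in for the dev_set device list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device state the check reads. */
struct sketch_dev {
	unsigned int open_count;	/* number of open device FDs */
	bool needs_reset;		/* device was left needing a reset */
};

/*
 * True only when every device in the set is unused and at least one of
 * them still needs a reset, i.e. "unused" and "needs_reset" combined.
 */
static bool sketch_dev_set_should_reset(const struct sketch_dev *devs, size_t n)
{
	bool needs_reset = false;
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		/* Any open device vetoes the reset for the whole set. */
		if (devs[i].open_count)
			return false;
		needs_reset |= devs[i].needs_reset;
	}
	return needs_reset;
}
```

An empty set, or a set where nothing is flagged, yields false, so the
caller falls through to the D3hot path.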

> +{
> + struct vfio_pci_device *cur;
> + bool needs_reset = false;
> +
> + list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list) {
> + /* No VFIO device in the set can have an open device FD */
> + if (cur->vdev.open_count)
> + return false;
> + needs_reset |= cur->needs_reset;
> + }
> + return needs_reset;
> +}
> +
>  /*
> - * If a bus or slot reset is available for the provided device and:
> + * If a bus or slot 

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: fix i915_globals_exit() section mismatch error

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915: fix i915_globals_exit() section mismatch error
URL   : https://patchwork.freedesktop.org/series/93398/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10445_full -> Patchwork_20770_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20770_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20770_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20770_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_endless@dispatch@vcs0:
- shard-tglb: NOTRUN -> [INCOMPLETE][1]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-tglb1/igt@gem_exec_endless@dispa...@vcs0.html

  * igt@i915_pm_sseu@full-enable:
- shard-glk:  NOTRUN -> [FAIL][2]
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-glk6/igt@i915_pm_s...@full-enable.html

  
Known issues
---

  Here are the changes found in Patchwork_20770_full that come from known 
issues:

### IGT changes ###

#### Issues hit ####

  * igt@feature_discovery@display-2x:
- shard-iclb: NOTRUN -> [SKIP][3] ([i915#1839])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-iclb2/igt@feature_discov...@display-2x.html

  * igt@gem_create@create-massive:
- shard-snb:  NOTRUN -> [DMESG-WARN][4] ([i915#3002])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-snb2/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_persistence@hostile:
- shard-snb:  NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-snb7/igt@gem_ctx_persiste...@hostile.html

  * igt@gem_ctx_persistence@legacy-engines-hang@render:
- shard-tglb: [PASS][6] -> [FAIL][7] ([i915#2410])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-tglb2/igt@gem_ctx_persistence@legacy-engines-h...@render.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-tglb7/igt@gem_ctx_persistence@legacy-engines-h...@render.html

  * igt@gem_exec_fair@basic-deadline:
- shard-glk:  [PASS][8] -> [FAIL][9] ([i915#2846])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-glk9/igt@gem_exec_f...@basic-deadline.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-glk7/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-apl:  [PASS][10] -> [SKIP][11] ([fdo#109271])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-apl3/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-apl1/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs1:
- shard-iclb: NOTRUN -> [FAIL][12] ([i915#2842])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-iclb2/igt@gem_exec_fair@basic-n...@vcs1.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
- shard-tglb: [PASS][13] -> [FAIL][14] ([i915#2842]) +2 similar 
issues
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-tglb6/igt@gem_exec_fair@basic-pace-sh...@rcs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-tglb7/igt@gem_exec_fair@basic-pace-sh...@rcs0.html

  * igt@gem_pwrite@basic-exhaustion:
- shard-snb:  NOTRUN -> [WARN][15] ([i915#2658])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-snb6/igt@gem_pwr...@basic-exhaustion.html
- shard-kbl:  NOTRUN -> [WARN][16] ([i915#2658])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-kbl2/igt@gem_pwr...@basic-exhaustion.html

  * igt@gen9_exec_parse@batch-invalid-length:
- shard-snb:  NOTRUN -> [SKIP][17] ([fdo#109271]) +269 similar 
issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-snb7/igt@gen9_exec_pa...@batch-invalid-length.html

  * igt@i915_pm_dc@dc5-dpms:
- shard-kbl:  NOTRUN -> [FAIL][18] ([i915#545])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-kbl2/igt@i915_pm...@dc5-dpms.html

  * igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-dp:
- shard-apl:  NOTRUN -> [SKIP][19] ([fdo#109271] / [i915#1937])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-apl1/igt@i915_pm_lpsp@kms-l...@kms-lpsp-dp.html

  * igt@i915_pm_sseu@full-enable:
- shard-skl:  [PASS][20] -> [FAIL][21] ([i915#3650])
   [20]: 

Re: [Intel-gfx] [PATCH v5 4/9] drm/i915/xehpsdv: Add compute DSS type

2021-08-05 Thread Lucas De Marchi

On Thu, Aug 05, 2021 at 09:36:42AM -0700, Matt Roper wrote:

From: Stuart Summers 

Starting in XeHP, the concept of slice has been removed in favor of
DSS (Dual-Subslice) masks for various workload types. These workloads have
been divided into those enabled for geometry and those enabled for compute.

i915 currently maintains a single set of S/SS/EU masks for the device.
The goal of this patch set is to minimize the amount of impact to prior
generations while still giving the user maximum flexibility.

v2:
- Generalize a comment about uapi access to geometry/compute masks; the
  proposed uapi has changed since the comment was first written, and
  will show up in a future series once the userspace code is published.
  (Lucas)

Bspec: 33117, 33118, 20376
Cc: Daniele Ceraolo Spurio 
Cc: Matt Roper 
Cc: Lucas De Marchi 
Signed-off-by: Stuart Summers 
Signed-off-by: Steve Hampson 
Signed-off-by: Matt Roper 
---
drivers/gpu/drm/i915/gt/intel_sseu.c | 66 +---
drivers/gpu/drm/i915/gt/intel_sseu.h |  5 ++-
drivers/gpu/drm/i915/i915_reg.h  |  3 +-
include/uapi/drm/i915_drm.h  |  3 --
4 files changed, 55 insertions(+), 22 deletions(-)
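
The per-slice extraction that get_ss_stride_mask() performs in the diff
below can be sketched in userspace C as follows. This is a minimal sketch:
sketch_low_bits() is a hypothetical stand-in for GENMASK(n - 1, 0), and
max_subslices is passed directly instead of being read from
struct sseu_dev_info.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GENMASK(n - 1, 0): the low n bits set. */
static uint32_t sketch_low_bits(unsigned int n)
{
	assert(n >= 1 && n <= 32);
	return n == 32 ? UINT32_MAX : (UINT32_C(1) << n) - 1;
}

/* Extract slice s's share of a device-wide subslice enable mask. */
static uint32_t sketch_ss_stride_mask(unsigned int max_subslices,
				      unsigned int s, uint32_t ss_en)
{
	/* The whole slice must fit in the 32-bit mask; keeps the shift < 32. */
	assert((s + 1) * max_subslices <= 32);

	return (ss_en >> (s * max_subslices)) & sketch_low_bits(max_subslices);
}
```

Passing g_ss_en | c_ss_en as ss_en gives the combined mask that the patch
stores in sseu->subslice_mask.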

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c 
b/drivers/gpu/drm/i915/gt/intel_sseu.c
index bbd272943c3f..9cf157a2454f 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -46,11 +46,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info 
*sseu, u8 slice)
}

void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
- u32 ss_mask)
+ u8 *subslice_mask, u32 ss_mask)
{
int offset = slice * sseu->ss_stride;

-   memcpy(&sseu->subslice_mask[offset], &ss_mask, sseu->ss_stride);
+   memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride);
}

unsigned int
@@ -100,14 +100,24 @@ static u16 compute_eu_total(const struct sseu_dev_info 
*sseu)
return total;
}

-static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
-   u8 s_en, u32 ss_en, u16 eu_en)
+static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en)
+{
+   u32 ss_mask;
+
+   ss_mask = ss_en >> (s * sseu->max_subslices);
+   ss_mask &= GENMASK(sseu->max_subslices - 1, 0);
+
+   return ss_mask;
+}
+
+static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en,
+   u32 g_ss_en, u32 c_ss_en, u16 eu_en)
{
int s, ss;

-   /* ss_en represents entire subslice mask across all slices */
+   /* g_ss_en/c_ss_en represent entire subslice mask across all slices */
GEM_BUG_ON(sseu->max_slices * sseu->max_subslices >
-  sizeof(ss_en) * BITS_PER_BYTE);
+  sizeof(g_ss_en) * BITS_PER_BYTE);

for (s = 0; s < sseu->max_slices; s++) {
if ((s_en & BIT(s)) == 0)
@@ -115,7 +125,22 @@ static void gen11_compute_sseu_info(struct sseu_dev_info 
*sseu,

sseu->slice_mask |= BIT(s);

-   intel_sseu_set_subslices(sseu, s, ss_en);
+   /*
+* XeHP introduces the concept of compute vs geometry DSS. To
+* reduce variation between GENs around subslice usage, store a
+* mask for both the geometry and compute enabled masks since
+* userspace will need to be able to query these masks
+* independently.  Also compute a total enabled subslice count
+* for the purposes of selecting subslices to use in a
+* particular GEM context.
+*/
+   intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask,
+get_ss_stride_mask(sseu, s, c_ss_en));
+   intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask,
+get_ss_stride_mask(sseu, s, g_ss_en));
+   intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
+get_ss_stride_mask(sseu, s,
+   g_ss_en | c_ss_en));

for (ss = 0; ss < sseu->max_subslices; ss++)
if (intel_sseu_has_subslice(sseu, s, ss))
@@ -129,7 +154,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
{
struct sseu_dev_info *sseu = &gt->info.sseu;
struct intel_uncore *uncore = gt->uncore;
-   u32 dss_en;
+   u32 g_dss_en, c_dss_en = 0;
u16 eu_en = 0;
u8 eu_en_fuse;
u8 s_en;
@@ -145,10 +170,12 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 * across the entire device. Then calculate out the DSS for each
 * workload type within that software slice.
 */
-   if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915))
+   if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915)) {
intel_sseu_set_info(sseu, 1, 32, 16);
-   else
+ 

Re: [Intel-gfx] [PATCH v5 15/20] drm/sched: Check locking in drm_sched_job_await_implicit

2021-08-05 Thread Christian König

On 05.08.21 at 12:47, Daniel Vetter wrote:

You really need to hold the reservation here or all kinds of funny
things can happen between grabbing the dependencies and inserting the
new fences.

Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 


The function name in the subject line should be updated, apart from that 
feel free to add my rb to this patch.


Christian.


---
  drivers/gpu/drm/scheduler/sched_main.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 49e507f91ec0..1abb40b07324 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -715,6 +715,8 @@ int drm_sched_job_add_implicit_dependencies(struct 
drm_sched_job *job,
struct dma_fence **fences;
unsigned int i, fence_count;
  
+	dma_resv_assert_held(obj->resv);
+
if (!write) {
struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
  




Re: [Intel-gfx] [PATCH v5 14/20] drm/sched: Don't store self-dependencies

2021-08-05 Thread Christian König




On 05.08.21 at 12:46, Daniel Vetter wrote:

This is essentially part of drm_sched_dependency_optimized(), which
only amdgpu seems to make use of. Use it a bit more.

This would mean that as-is amdgpu can't use the dependency helpers, at
least not with the current approach amdgpu has for deciding whether a
vm_flush is needed. Since amdgpu also has very special rules around
implicit fencing it can't use those helpers either, and adding a
drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too
onerous. That way the special case handling for amdgpu sticks even
more out and we have higher chances that reviewers that go across all
drivers won't miss it.


Well, you should probably drop the sentence about the vm_flush; it is 
completely unrelated.


In addition to that, I still don't think that this is a good idea. 
Dependency handling is something completely driver specific.


E.g. even when you have submitted jobs back to back, they still might 
need a cache flush in between, and that is not the case only for amdgpu.


What you can do is optimize for this while looking at the fences later 
on, and then note that you have done so and which hw fence you last 
used instead.


Regards,
Christian.



Reviewed-by: Lucas Stach 
Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 
---
  drivers/gpu/drm/scheduler/sched_main.c | 7 +++
  1 file changed, 7 insertions(+)
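
The check this patch adds can be sketched in userspace C as below. It is a
sketch under assumptions: the struct names are hypothetical and carry only
the fields the test reads, and it takes from the diff that an entity's own
fences use two consecutive contexts, fence_context and fence_context + 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins holding only what the check needs. */
struct sketch_fence {
	uint64_t context;
};

struct sketch_entity {
	uint64_t fence_context;	/* first of two consecutive contexts */
};

/* True if the fence was emitted by this entity itself and can be dropped. */
static bool sketch_is_self_dependency(const struct sketch_entity *entity,
				      const struct sketch_fence *fence)
{
	assert(entity != NULL && fence != NULL);

	return fence->context == entity->fence_context ||
	       fence->context == entity->fence_context + 1;
}
```

A caller would drop its fence reference and return early when this is
true, which is what the hunk below does with dma_fence_put().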

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index f77456929139..49e507f91ec0 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -660,6 +660,13 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
if (!fence)
return 0;
  
+	/* if it's a fence from us it's guaranteed to be earlier */
+   if (fence->context == job->entity->fence_context ||
+   fence->context == job->entity->fence_context + 1) {
+   dma_fence_put(fence);
+   return 0;
+   }
+
/* Deduplicate if we already depend on a fence from the same context.
 * This lets the size of the array of deps scale with the number of
 * engines involved, rather than the number of BOs.




Re: [Intel-gfx] [PATCH v5 1/9] drm/i915/dg2: Add support for new DG2-G11 revid 0x5

2021-08-05 Thread Lucas De Marchi

On Thu, Aug 05, 2021 at 09:36:39AM -0700, Matt Roper wrote:

The bspec has been updated with a new revision 0x5 that translates to B1
GT stepping and C0 display stepping.

Bspec: 44477
Signed-off-by: Matt Roper 



Reviewed-by: Lucas De Marchi 

Lucas De Marchi


---
drivers/gpu/drm/i915/intel_step.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/intel_step.c 
b/drivers/gpu/drm/i915/intel_step.c
index b5fb961e1b62..6cf967631395 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -118,6 +118,7 @@ static const struct intel_step_info 
dg2_g10_revid_step_tbl[] = {
static const struct intel_step_info dg2_g11_revid_step_tbl[] = {
[0x0] = { .gt_step = STEP_A0, .display_step = STEP_B0 },
[0x4] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
+   [0x5] = { .gt_step = STEP_B1, .display_step = STEP_C0 },
};

void intel_step_init(struct drm_i915_private *i915)
--
2.25.4



[Intel-gfx] [PATCH v5 7/9] drm/i915/dg2: Add new LRI reg offsets

2021-08-05 Thread Matt Roper
From: Akeem G Abodunrin 

New LRI register offsets were introduced for DG2. This patch adds
those extra registers and creates a new register table for setting offsets
to compare with the HW-generated context image, especially for the gt_lrc
test. It also updates the general purpose register with the scratch offset
for DG2, in order to use it for the live_lrc_fixed selftest.

Cc: Chris P Wilson 
Cc: Prathap Kumar Valsan 
Signed-off-by: Akeem G Abodunrin 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 85 -
 1 file changed, 83 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index bb4af4977920..6ba8daea2f56 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -226,6 +226,40 @@ static const u8 gen12_xcs_offsets[] = {
END
 };
 
+static const u8 dg2_xcs_offsets[] = {
+   NOP(1),
+   LRI(15, POSTED),
+   REG16(0x244),
+   REG(0x034),
+   REG(0x030),
+   REG(0x038),
+   REG(0x03c),
+   REG(0x168),
+   REG(0x140),
+   REG(0x110),
+   REG(0x1c0),
+   REG(0x1c4),
+   REG(0x1c8),
+   REG(0x180),
+   REG16(0x2b4),
+   REG(0x120),
+   REG(0x124),
+
+   NOP(1),
+   LRI(9, POSTED),
+   REG16(0x3a8),
+   REG16(0x28c),
+   REG16(0x288),
+   REG16(0x284),
+   REG16(0x280),
+   REG16(0x27c),
+   REG16(0x278),
+   REG16(0x274),
+   REG16(0x270),
+
+   END
+};
+
 static const u8 gen8_rcs_offsets[] = {
NOP(1),
LRI(14, POSTED),
@@ -525,6 +559,49 @@ static const u8 xehp_rcs_offsets[] = {
END
 };
 
+static const u8 dg2_rcs_offsets[] = {
+   NOP(1),
+   LRI(15, POSTED),
+   REG16(0x244),
+   REG(0x034),
+   REG(0x030),
+   REG(0x038),
+   REG(0x03c),
+   REG(0x168),
+   REG(0x140),
+   REG(0x110),
+   REG(0x1c0),
+   REG(0x1c4),
+   REG(0x1c8),
+   REG(0x180),
+   REG16(0x2b4),
+   REG(0x120),
+   REG(0x124),
+
+   NOP(1),
+   LRI(9, POSTED),
+   REG16(0x3a8),
+   REG16(0x28c),
+   REG16(0x288),
+   REG16(0x284),
+   REG16(0x280),
+   REG16(0x27c),
+   REG16(0x278),
+   REG16(0x274),
+   REG16(0x270),
+
+   LRI(3, POSTED),
+   REG(0x1b0),
+   REG16(0x5a8),
+   REG16(0x5ac),
+
+   NOP(6),
+   LRI(1, 0),
+   REG(0x0c8),
+
+   END
+};
+
 #undef END
 #undef REG16
 #undef REG
@@ -543,7 +620,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs 
*engine)
   !intel_engine_has_relative_mmio(engine));
 
if (engine->class == RENDER_CLASS) {
-   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55))
+   return dg2_rcs_offsets;
+   else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
return xehp_rcs_offsets;
else if (GRAPHICS_VER(engine->i915) >= 12)
return gen12_rcs_offsets;
@@ -554,7 +633,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs 
*engine)
else
return gen8_rcs_offsets;
} else {
-   if (GRAPHICS_VER(engine->i915) >= 12)
+   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55))
+   return dg2_xcs_offsets;
+   else if (GRAPHICS_VER(engine->i915) >= 12)
return gen12_xcs_offsets;
else if (GRAPHICS_VER(engine->i915) >= 9)
return gen9_xcs_offsets;
-- 
2.25.4



[Intel-gfx] [PATCH v5 0/9] Begin enabling Xe_HP SDV and DG2 platforms

2021-08-05 Thread Matt Roper
This series provides some of the initial enablement patches for two
upcoming discrete GPUs:
 * XeHP SDV:  Xe_HP (version 12.50) graphics IP, no display IP
 * DG2:  Xe_HPG (version 12.55) graphics IP, Xe_LPD (version 13) display IP

Both platforms will need additional enablement patches beyond what's
present in this series before they're truly usable, including various
LMEM and GuC work that's already happening separately.  The new
features/functionality that these platforms bring (such as multi-tile
support, dedicated compute engines, etc.) may be referenced in passing
in some of these patches but will be fully enabled in future series.

v2:
 - General rebase and incorporation of r-b's.
 - Re-order intel_gt_info and intel_device_info structures to eliminate
   some unnecessary padding after the size change of
   intel_engine_mask_t.  (Tvrtko)
 - Use 'intel_step' mechanisms for revid->stepping mapping.  (Jani)
 - Drop the DSC patches for now; they need some rework.  (Jani)

v3:
 - About 20 of the patches have landed upstream now.  Rebase and resend
   the rest.  Some of these are already reviewed, but have dependencies
   on other unreviewed patches (e.g., the new engine definitions, the
   initial SNPS PHY support, etc.).

v4:
 - Several more patches have landed upstream; rebase and re-send the
   rest.  Some of the remaining patches are reviewed but still have
   dependencies on non-reviewed patches, so the order is shuffled this
   time to group patches by dependency rather than by xehp vs xehpsdv vs
   dg2.
 - Minor cleanup to "drm/i915/xehp: handle new steering options"
   suggested by Caz.

v5:
 - Rebase remaining patches after several more have landed upstream.
 - Drop the two MOCS patches for now; we need to wait for some prep work
   from Ayaz to land before we apply those.
 - Make a comment about uapi in the compute DSS patch more general; the
   uapi itself will show up in a future series once the corresponding
   userspace driver code is published.  (Lucas)
 - Add an extra patch for a new DG2-G11 stepping that has appeared.

Cc: Rodrigo Vivi 
Cc: Lucas De Marchi 
Cc: James Ausmus 


Akeem G Abodunrin (1):
  drm/i915/dg2: Add new LRI reg offsets

Ankit Nautiyal (1):
  drm/i915/dg2: Configure PCON in DP pre-enable path

Lucas De Marchi (1):
  drm/i915/xehpsdv: factor out function to read RP_STATE_CAP

Matt Roper (5):
  drm/i915/dg2: Add support for new DG2-G11 revid 0x5
  drm/i915/xehp: Loop over all gslices for INSTDONE processing
  drm/i915/dg2: Report INSTDONE_GEOM values in error state
  drm/i915/xehpsdv: Read correct RP_STATE_CAP register
  drm/i915/dg2: Maintain backward-compatible nested batch behavior

Stuart Summers (1):
  drm/i915/xehpsdv: Add compute DSS type

 drivers/gpu/drm/i915/display/intel_ddi.c |  3 +
 drivers/gpu/drm/i915/gt/debugfs_gt_pm.c  |  8 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 55 -
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 15 +++-
 drivers/gpu/drm/i915/gt/intel_lrc.c  | 85 +++-
 drivers/gpu/drm/i915/gt/intel_rps.c  | 19 +++--
 drivers/gpu/drm/i915/gt/intel_rps.h  |  1 +
 drivers/gpu/drm/i915/gt/intel_sseu.c | 66 +++
 drivers/gpu/drm/i915/gt/intel_sseu.h | 12 ++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c  | 39 -
 drivers/gpu/drm/i915/i915_debugfs.c  |  8 +-
 drivers/gpu/drm/i915/i915_gpu_error.c| 36 +++--
 drivers/gpu/drm/i915/i915_reg.h  |  6 +-
 drivers/gpu/drm/i915/intel_step.c|  1 +
 include/uapi/drm/i915_drm.h  |  3 -
 15 files changed, 284 insertions(+), 73 deletions(-)

-- 
2.25.4



[Intel-gfx] [PATCH v5 5/9] drm/i915/xehpsdv: factor out function to read RP_STATE_CAP

2021-08-05 Thread Matt Roper
From: Lucas De Marchi 

Instead of maintaining the same if ladder in 3 different places, add a
function to read RP_STATE_CAP.

Signed-off-by: Lucas De Marchi 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/debugfs_gt_pm.c |  8 +++-
 drivers/gpu/drm/i915/gt/intel_rps.c | 17 -
 drivers/gpu/drm/i915/gt/intel_rps.h |  1 +
 drivers/gpu/drm/i915/i915_debugfs.c |  8 +++-
 4 files changed, 19 insertions(+), 15 deletions(-)
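
For reference, the field layout that gen6_rps_init() decodes from the
value returned by the new helper can be sketched in userspace C as below.
The struct and function names are hypothetical; the byte positions are
taken from the intel_rps.c hunk, where the GEN9_LP layout is the mirror
image of the default one. Values are returned raw, in hardware units.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three decoded fields, in raw HW units. */
struct sketch_rps_freqs {
	unsigned int rp0;
	unsigned int rp1;
	unsigned int min;
};

/* Split an RP_STATE_CAP value into RP0/RP1/min for either layout. */
static void sketch_decode_rp_state_cap(uint32_t cap, bool gen9_lp,
				       struct sketch_rps_freqs *out)
{
	unsigned int lo = (cap >> 0) & 0xff;
	unsigned int mid = (cap >> 8) & 0xff;
	unsigned int hi = (cap >> 16) & 0xff;

	assert(out != NULL);

	/* RP1 sits in the middle byte in both layouts; RP0 and min swap. */
	out->rp0 = gen9_lp ? hi : lo;
	out->rp1 = mid;
	out->min = gen9_lp ? lo : hi;
}
```

Only the register read moves into intel_rps_read_state_cap(); the
layout-dependent decoding stays with the caller.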

diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c 
b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
index d6f5836396f8..f6733f279890 100644
--- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
@@ -309,13 +309,11 @@ static int frequency_show(struct seq_file *m, void 
*unused)
int max_freq;
 
rp_state_limits = intel_uncore_read(uncore, 
GEN6_RP_STATE_LIMITS);
-   if (IS_GEN9_LP(i915)) {
-   rp_state_cap = intel_uncore_read(uncore, 
BXT_RP_STATE_CAP);
+   rp_state_cap = intel_rps_read_state_cap(rps);
+   if (IS_GEN9_LP(i915))
gt_perf_status = intel_uncore_read(uncore, 
BXT_GT_PERF_STATUS);
-   } else {
-   rp_state_cap = intel_uncore_read(uncore, 
GEN6_RP_STATE_CAP);
+   else
gt_perf_status = intel_uncore_read(uncore, 
GEN6_GT_PERF_STATUS);
-   }
 
/* RPSTAT1 is in the GT power well */
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index d812b27835f8..a3e69eba376f 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -996,20 +996,16 @@ int intel_rps_set(struct intel_rps *rps, u8 val)
 static void gen6_rps_init(struct intel_rps *rps)
 {
struct drm_i915_private *i915 = rps_to_i915(rps);
-   struct intel_uncore *uncore = rps_to_uncore(rps);
+   u32 rp_state_cap = intel_rps_read_state_cap(rps);
 
/* All of these values are in units of 50MHz */
 
/* static values from HW: RP0 > RP1 > RPn (min_freq) */
if (IS_GEN9_LP(i915)) {
-   u32 rp_state_cap = intel_uncore_read(uncore, BXT_RP_STATE_CAP);
-
rps->rp0_freq = (rp_state_cap >> 16) & 0xff;
rps->rp1_freq = (rp_state_cap >>  8) & 0xff;
rps->min_freq = (rp_state_cap >>  0) & 0xff;
} else {
-   u32 rp_state_cap = intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
-
rps->rp0_freq = (rp_state_cap >>  0) & 0xff;
rps->rp1_freq = (rp_state_cap >>  8) & 0xff;
rps->min_freq = (rp_state_cap >> 16) & 0xff;
@@ -2140,6 +2136,17 @@ int intel_rps_set_min_frequency(struct intel_rps *rps, 
u32 val)
return set_min_freq(rps, val);
 }
 
+u32 intel_rps_read_state_cap(struct intel_rps *rps)
+{
+   struct drm_i915_private *i915 = rps_to_i915(rps);
+   struct intel_uncore *uncore = rps_to_uncore(rps);
+
+   if (IS_GEN9_LP(i915))
+   return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
+   else
+   return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
+}
+
 /* External interface for intel_ips.ko */
 
 static struct drm_i915_private __rcu *ips_mchdev;
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h 
b/drivers/gpu/drm/i915/gt/intel_rps.h
index 4213bcce1667..11960d64ca82 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.h
+++ b/drivers/gpu/drm/i915/gt/intel_rps.h
@@ -41,6 +41,7 @@ u32 intel_rps_get_rp1_frequency(struct intel_rps *rps);
 u32 intel_rps_get_rpn_frequency(struct intel_rps *rps);
 u32 intel_rps_read_punit_req(struct intel_rps *rps);
 u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps);
+u32 intel_rps_read_state_cap(struct intel_rps *rps);
 
 void gen5_rps_irq_handler(struct intel_rps *rps);
 void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir);
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 44969f5dde50..eec0d349ea6a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -420,13 +420,11 @@ static int i915_frequency_info(struct seq_file *m, void 
*unused)
int max_freq;
 
   rp_state_limits = intel_uncore_read(&dev_priv->uncore, 
GEN6_RP_STATE_LIMITS);
-   if (IS_GEN9_LP(dev_priv)) {
-   rp_state_cap = intel_uncore_read(&dev_priv->uncore, 
BXT_RP_STATE_CAP);
+   rp_state_cap = intel_rps_read_state_cap(rps);
+   if (IS_GEN9_LP(dev_priv))
gt_perf_status = intel_uncore_read(&dev_priv->uncore, 
BXT_GT_PERF_STATUS);
-   } else {
-   rp_state_cap = intel_uncore_read(&dev_priv->uncore, 
GEN6_RP_STATE_CAP);
+   else
gt_perf_status = intel_uncore_read(&dev_priv->uncore, 

[Intel-gfx] [PATCH v5 9/9] drm/i915/dg2: Configure PCON in DP pre-enable path

2021-08-05 Thread Matt Roper
From: Ankit Nautiyal 

Add the functions to configure HDMI2.1 pcon for DG2, before DP link
training.

Signed-off-by: Ankit Nautiyal 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_ddi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c 
b/drivers/gpu/drm/i915/display/intel_ddi.c
index d8162951b78f..e932fd0fe7e2 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2402,6 +2402,7 @@ static void dg2_ddi_pre_enable_dp(struct 
intel_atomic_state *state,
if (!is_mst)
intel_dp_set_power(intel_dp, DP_SET_POWER_D0);
 
+   intel_dp_configure_protocol_converter(intel_dp, crtc_state);
intel_dp_sink_set_decompression_state(intel_dp, crtc_state, true);
/*
 * DDI FEC: "anticipates enabling FEC encoding sets the FEC_READY bit
@@ -2409,6 +2410,8 @@ static void dg2_ddi_pre_enable_dp(struct 
intel_atomic_state *state,
 * training
 */
intel_dp_sink_set_fec_ready(intel_dp, crtc_state);
+   intel_dp_check_frl_training(intel_dp);
+   intel_dp_pcon_dsc_configure(intel_dp, crtc_state);
 
/*
 * 5.h Follow DisplayPort specification training sequence (see notes for
-- 
2.25.4



[Intel-gfx] [PATCH v5 6/9] drm/i915/xehpsdv: Read correct RP_STATE_CAP register

2021-08-05 Thread Matt Roper
The RP_STATE_CAP register is no longer part of the MCHBAR on XEHPSDV; this
register is now a per-tile register at GTTMMADDR offset 0x250014.

Cc: Rodrigo Vivi 
Signed-off-by: Matt Roper 
Signed-off-by: Lucas De Marchi 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/gt/intel_rps.c | 4 +++-
 drivers/gpu/drm/i915/i915_reg.h | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index a3e69eba376f..3489f5f0cac1 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -2141,7 +2141,9 @@ u32 intel_rps_read_state_cap(struct intel_rps *rps)
struct drm_i915_private *i915 = rps_to_i915(rps);
struct intel_uncore *uncore = rps_to_uncore(rps);
 
-   if (IS_GEN9_LP(i915))
+   if (IS_XEHPSDV(i915))
+   return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP);
+   else if (IS_GEN9_LP(i915))
return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
else
return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index f8d3cd11eced..77f6dcaba2b9 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -4115,6 +4115,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   RPN_CAP_MASK REG_GENMASK(23, 16)
 #define BXT_RP_STATE_CAP        _MMIO(0x138170)
 #define GEN9_RP_STATE_LIMITS   _MMIO(0x138148)
+#define XEHPSDV_RP_STATE_CAP   _MMIO(0x250014)
 
 /*
  * Logical Context regs
-- 
2.25.4
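
For readers skimming the archive, the register selection in this patch can be modeled as a small standalone C sketch. The enum and function names are hypothetical and only mirror the if/else chain in intel_rps_read_state_cap(). The XEHPSDV and BXT offsets are taken from the diff; the GEN6 offset is recalled from i915_reg.h and should be treated as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical platform tags for this sketch; i915 uses IS_*() macros. */
enum platform { PLATFORM_GEN6_PLUS, PLATFORM_GEN9_LP, PLATFORM_XEHPSDV };

#define XEHPSDV_RP_STATE_CAP 0x250014u /* per-tile, GTTMMADDR (from the patch) */
#define BXT_RP_STATE_CAP     0x138170u /* from the diff context */
#define GEN6_RP_STATE_CAP    0x145998u /* assumption: MCHBAR mirror + 0x5998 */

/* Mirror the if/else chain: the most specific platform is checked first. */
static uint32_t rp_state_cap_offset(enum platform p)
{
	switch (p) {
	case PLATFORM_XEHPSDV:
		return XEHPSDV_RP_STATE_CAP;
	case PLATFORM_GEN9_LP:
		return BXT_RP_STATE_CAP;
	default:
		return GEN6_RP_STATE_CAP;
	}
}
```

Calling rp_state_cap_offset(PLATFORM_XEHPSDV) yields the new per-tile offset, while every other platform keeps its previous register.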



[Intel-gfx] [PATCH v5 1/9] drm/i915/dg2: Add support for new DG2-G11 revid 0x5

2021-08-05 Thread Matt Roper
The bspec has been updated with a new revision 0x5 that translates to B1
GT stepping and C0 display stepping.

Bspec: 44477
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_step.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/intel_step.c 
b/drivers/gpu/drm/i915/intel_step.c
index b5fb961e1b62..6cf967631395 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -118,6 +118,7 @@ static const struct intel_step_info 
dg2_g10_revid_step_tbl[] = {
 static const struct intel_step_info dg2_g11_revid_step_tbl[] = {
[0x0] = { .gt_step = STEP_A0, .display_step = STEP_B0 },
[0x4] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
+   [0x5] = { .gt_step = STEP_B1, .display_step = STEP_C0 },
 };
 
 void intel_step_init(struct drm_i915_private *i915)
-- 
2.25.4
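
The table above relies on C designated initializers, so revision IDs that are not listed stay zero-initialized. A minimal standalone sketch of such a lookup follows. The enum, struct and function names are hypothetical stand-ins, and returning NULL for unknown IDs is a simplification, since the patch does not show how the driver treats them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for i915's stepping enum; STEP_NONE must be 0. */
enum step { STEP_NONE = 0, STEP_A0, STEP_B0, STEP_B1, STEP_C0 };

struct step_info {
	enum step gt_step;
	enum step display_step;
};

/* Sparse table indexed by revision ID, same entries as dg2_g11_revid_step_tbl. */
static const struct step_info g11_tbl[] = {
	[0x0] = { .gt_step = STEP_A0, .display_step = STEP_B0 },
	[0x4] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
	[0x5] = { .gt_step = STEP_B1, .display_step = STEP_C0 },
};

/*
 * Return the entry for revid, or NULL when the revid is out of range or
 * falls into a gap (zero-initialized, hence STEP_NONE).
 */
static const struct step_info *lookup_step(unsigned int revid)
{
	size_t n = sizeof(g11_tbl) / sizeof(g11_tbl[0]);

	if (revid >= n || g11_tbl[revid].gt_step == STEP_NONE)
		return NULL;
	return &g11_tbl[revid];
}
```

With the new entry, lookup_step(0x5) reports B1 for GT and C0 for display, while the unlisted IDs 0x1 through 0x3 return NULL in this sketch.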



[Intel-gfx] [PATCH v5 8/9] drm/i915/dg2: Maintain backward-compatible nested batch behavior

2021-08-05 Thread Matt Roper
For tgl+, the per-context setting of MI_MODE[12] determines whether
the bits of a nested MI_BATCH_BUFFER_START instruction should be
interpreted in the traditional manner or whether they should
instead use a new tgl+ meaning that breaks backward compatibility, but
allows nesting into 3rd-level batchbuffers.  For previous platforms,
the hardware default for this register bit is to maintain
backward-compatible behavior unless a context intentionally opts into
the new behavior; however Xe_HPG flips the hardware default behavior.

From a SW perspective, we want to maintain the backward-compatible
behavior for userspace, so we'll apply a fake workaround to set it back
to the legacy behavior on platforms where the hardware default is to
break compatibility.  At the moment there is no Linux userspace that
utilizes third-level batchbuffers, so this spares userspace from
needing to make any changes, and using the legacy meaning is the correct
thing to do.  If/when we have userspace consumers that want to utilize
third-level batch nesting, we can provide a context parameter to allow
them to opt-in.
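
As an aside for readers unfamiliar with masked registers: the workaround uses wa_masked_dis(), which implies MI_MODE is written in masked form, where the upper 16 bits of the written value select which of the lower 16 bits are updated. The following standalone sketch models that behavior. The helper names are hypothetical (they are not the i915 macros), and the model is a simplification of what the hardware does.

```c
#include <assert.h>
#include <stdint.h>

#define NESTED_BB_EN (1u << 12) /* MI_MODE[12], TGL_NESTED_BB_EN in the patch */

/* Value to write so that only `bit` is cleared (upper half selects bits). */
static uint32_t masked_bit_disable(uint32_t bit)
{
	return bit << 16;
}

/* Value to write so that only `bit` is set. */
static uint32_t masked_bit_enable(uint32_t bit)
{
	return (bit << 16) | bit;
}

/* Model of a masked write applied to a register with 16 payload bits. */
static uint32_t apply_masked_write(uint32_t reg, uint32_t write)
{
	uint32_t mask = write >> 16;

	return ((reg & ~mask) | (write & mask)) & 0xffffu;
}
```

Writing masked_bit_disable(NESTED_BB_EN) clears bit 12 and leaves every other bit of the register untouched, which is why the workaround needs no read-modify-write cycle.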

Bspec: 45974, 45718
Cc: John Harrison 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 39 +++--
 drivers/gpu/drm/i915/i915_reg.h |  1 +
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index aae609d7d85d..97b3cd81b721 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -644,6 +644,37 @@ static void dg1_ctx_workarounds_init(struct 
intel_engine_cs *engine,
 DG1_HZ_READ_SUPPRESSION_OPTIMIZATION_DISABLE);
 }
 
+static void fakewa_disable_nestedbb_mode(struct intel_engine_cs *engine,
+struct i915_wa_list *wal)
+{
+   /*
+* This is a "fake" workaround defined by software to ensure we
+* maintain reliable, backward-compatible behavior for userspace with
+* regards to how nested MI_BATCH_BUFFER_START commands are handled.
+*
+* The per-context setting of MI_MODE[12] determines whether the bits
+* of a nested MI_BATCH_BUFFER_START instruction should be interpreted
+* in the traditional manner or whether they should instead use a new
+* tgl+ meaning that breaks backward compatibility, but allows nesting
+* into 3rd-level batchbuffers.  When this new capability was first
+* added in TGL, it remained off by default unless a context
+* intentionally opted in to the new behavior.  However Xe_HPG now
+* flips this on by default and requires that we explicitly opt out if
+* we don't want the new behavior.
+*
+* From a SW perspective, we want to maintain the backward-compatible
+* behavior for userspace, so we'll apply a fake workaround to set it
+* back to the legacy behavior on platforms where the hardware default
+* is to break compatibility.  At the moment there is no Linux
+* userspace that utilizes third-level batchbuffers, so this spares
+* userspace from needing to make any changes, and using the legacy
+* meaning is the correct thing to do.  If/when we have userspace
+* consumers that want to utilize third-level batch nesting, we can
+* provide a context parameter to allow them to opt-in.
+*/
+   wa_masked_dis(wal, RING_MI_MODE(engine->mmio_base), TGL_NESTED_BB_EN);
+}
+
 static void
 __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
   struct i915_wa_list *wal,
@@ -651,11 +682,15 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
 {
struct drm_i915_private *i915 = engine->i915;
 
+   wa_init_start(wal, name, engine->name);
+
+   /* Applies to all engines */
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55))
+   fakewa_disable_nestedbb_mode(engine, wal);
+
if (engine->class != RENDER_CLASS)
return;
 
-   wa_init_start(wal, name, engine->name);
-
if (IS_DG1(i915))
dg1_ctx_workarounds_init(engine, wal);
else if (GRAPHICS_VER(i915) == 12)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 77f6dcaba2b9..269685955fbd 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2821,6 +2821,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define MI_MODE        _MMIO(0x209c)
 # define VS_TIMER_DISPATCH (1 << 6)
 # define MI_FLUSH_ENABLE   (1 << 12)
+# define TGL_NESTED_BB_EN  (1 << 12)
 # define ASYNC_FLIP_PERF_DISABLE   (1 << 14)
 # define MODE_IDLE (1 << 9)
 # define STOP_RING  

[Intel-gfx] [PATCH v5 2/9] drm/i915/xehp: Loop over all gslices for INSTDONE processing

2021-08-05 Thread Matt Roper
We no longer have traditional slices on Xe_HP platforms, but the
INSTDONE registers are replicated according to gslice representation
which is similar.  We can mostly re-use the existing instdone code with
just a few modifications:

 * Create an alternate instdone loop macro that will iterate over the
   flat DSS space, but still provide the gslice/dss steering values for
   compatibility with the legacy code.

 * We should allocate INSTDONE storage space according to the maximum
   number of gslices rather than the maximum number of legacy slices to
   ensure we have enough storage space to hold all of the values.  XeHP
   design has 8 gslices, whereas older platforms never had more than 3
   slices.
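
The first bullet can be illustrated with a standalone sketch: walk a flat DSS index and derive the (gslice, dss) steering pair from it. All names here are hypothetical. The 8 gslices come from the text above; the flat space of 32 DSS (hence 4 DSS per gslice) is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_GSLICES     8  /* from the commit message */
#define MAX_DSS         32 /* assumption: flat DSS space of 32 entries */
#define DSS_PER_GSLICE  (MAX_DSS / MAX_GSLICES)

/* Steering pair derived from a flat DSS index. */
struct steering {
	int gslice;
	int dss;
};

static struct steering dss_to_steering(int flat)
{
	struct steering s = { flat / DSS_PER_GSLICE, flat % DSS_PER_GSLICE };

	return s;
}

/*
 * Walk every enabled DSS in the flat mask and count how many fall into
 * each gslice; per_gslice must have room for MAX_GSLICES counters.
 * Returns the total number of enabled DSS.
 */
static int count_enabled(uint32_t dss_mask, int per_gslice[MAX_GSLICES])
{
	int i, flat, total = 0;

	for (i = 0; i < MAX_GSLICES; i++)
		per_gslice[i] = 0;

	for (flat = 0; flat < MAX_DSS; flat++) {
		if (!(dss_mask & (UINT32_C(1) << flat)))
			continue;
		per_gslice[dss_to_steering(flat).gslice]++;
		total++;
	}
	return total;
}
```

The per-gslice array sized by MAX_GSLICES also reflects the second bullet: storage has to follow the gslice count rather than the legacy slice count.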

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 48 +++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 12 -
 drivers/gpu/drm/i915/gt/intel_sseu.h |  7 +++
 drivers/gpu/drm/i915/i915_gpu_error.c| 32 +
 4 files changed, 66 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 0d9105a31d84..58ed67894b3d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1163,16 +1163,16 @@ void intel_engine_get_instdone(const struct 
intel_engine_cs *engine,
u32 mmio_base = engine->mmio_base;
int slice;
int subslice;
+   int iter;
 
memset(instdone, 0, sizeof(*instdone));
 
-   switch (GRAPHICS_VER(i915)) {
-   default:
+   if (GRAPHICS_VER(i915) >= 8) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
 
if (engine->id != RCS0)
-   break;
+   return;
 
instdone->slice_common =
intel_uncore_read(uncore, GEN7_SC_INSTDONE);
@@ -1182,21 +1182,32 @@ void intel_engine_get_instdone(const struct 
intel_engine_cs *engine,
instdone->slice_common_extra[1] =
intel_uncore_read(uncore, GEN12_SC_INSTDONE_EXTRA2);
}
-   for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
-   instdone->sampler[slice][subslice] =
-   read_subslice_reg(engine, slice, subslice,
- GEN7_SAMPLER_INSTDONE);
-   instdone->row[slice][subslice] =
-   read_subslice_reg(engine, slice, subslice,
- GEN7_ROW_INSTDONE);
+
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+   for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) {
+   instdone->sampler[slice][subslice] =
+   read_subslice_reg(engine, slice, subslice,
+ GEN7_SAMPLER_INSTDONE);
+   instdone->row[slice][subslice] =
+   read_subslice_reg(engine, slice, subslice,
+ GEN7_ROW_INSTDONE);
+   }
+   } else {
+   for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
+   instdone->sampler[slice][subslice] =
+   read_subslice_reg(engine, slice, subslice,
+ GEN7_SAMPLER_INSTDONE);
+   instdone->row[slice][subslice] =
+   read_subslice_reg(engine, slice, subslice,
+ GEN7_ROW_INSTDONE);
+   }
}
-   break;
-   case 7:
+   } else if (GRAPHICS_VER(i915) >= 7) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
 
if (engine->id != RCS0)
-   break;
+   return;
 
instdone->slice_common =
intel_uncore_read(uncore, GEN7_SC_INSTDONE);
@@ -1204,22 +1215,15 @@ void intel_engine_get_instdone(const struct 
intel_engine_cs *engine,
intel_uncore_read(uncore, GEN7_SAMPLER_INSTDONE);
instdone->row[0][0] =
intel_uncore_read(uncore, GEN7_ROW_INSTDONE);
-
-   break;
-   case 6:
-   case 5:
-   case 4:
+   } else if (GRAPHICS_VER(i915) >= 4) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
if (engine->id == RCS0)
/* HACK: Using the wrong struct 

[Intel-gfx] [PATCH v5 4/9] drm/i915/xehpsdv: Add compute DSS type

2021-08-05 Thread Matt Roper
From: Stuart Summers 

Starting in XeHP, the concept of slice has been removed in favor of
DSS (Dual-Subslice) masks for various workload types. These workloads have
been divided into those enabled for geometry and those enabled for compute.

i915 currently maintains a single set of S/SS/EU masks for the device.
The goal of this patch set is to minimize the amount of impact to prior
generations while still giving the user maximum flexibility.
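
To make the mask handling concrete, here is a standalone sketch of the per-slice extraction that the patch's get_ss_stride_mask() helper performs, plus the union of geometry and compute masks. The names are hypothetical, and unlike the kernel's GENMASK() the helper here special-cases a 32-bit width to avoid an undefined shift.

```c
#include <assert.h>
#include <stdint.h>

/* Low `width` bits set; handles width == 32 without an undefined shift. */
static uint32_t low_bits(unsigned int width)
{
	assert(width >= 1 && width <= 32);
	return width == 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1;
}

/* Extract slice `s`'s portion of a device-wide subslice enable mask. */
static uint32_t slice_ss_mask(uint32_t ss_en, unsigned int s,
			      unsigned int max_subslices)
{
	unsigned int shift = s * max_subslices;

	assert(shift < 32);
	return (ss_en >> shift) & low_bits(max_subslices);
}

/* Per-slice masks for geometry, compute, and their union. */
struct ss_masks {
	uint32_t geometry;
	uint32_t compute;
	uint32_t any;
};

static struct ss_masks split_masks(uint32_t g_ss_en, uint32_t c_ss_en,
				   unsigned int s, unsigned int max_subslices)
{
	struct ss_masks m = {
		.geometry = slice_ss_mask(g_ss_en, s, max_subslices),
		.compute = slice_ss_mask(c_ss_en, s, max_subslices),
		.any = slice_ss_mask(g_ss_en | c_ss_en, s, max_subslices),
	};

	return m;
}
```

The `any` field corresponds to the combined mask that older code paths keep using, so platforms without a separate compute mask see no change.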

v2:
 - Generalize a comment about uapi access to geometry/compute masks; the
   proposed uapi has changed since the comment was first written, and
   will show up in a future series once the userspace code is published.
   (Lucas)

Bspec: 33117, 33118, 20376
Cc: Daniele Ceraolo Spurio 
Cc: Matt Roper 
Cc: Lucas De Marchi 
Signed-off-by: Stuart Summers 
Signed-off-by: Steve Hampson 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_sseu.c | 66 +---
 drivers/gpu/drm/i915/gt/intel_sseu.h |  5 ++-
 drivers/gpu/drm/i915/i915_reg.h  |  3 +-
 include/uapi/drm/i915_drm.h  |  3 --
 4 files changed, 55 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c 
b/drivers/gpu/drm/i915/gt/intel_sseu.c
index bbd272943c3f..9cf157a2454f 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -46,11 +46,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info 
*sseu, u8 slice)
 }
 
 void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
- u32 ss_mask)
+ u8 *subslice_mask, u32 ss_mask)
 {
int offset = slice * sseu->ss_stride;
 
-   memcpy(&sseu->subslice_mask[offset], &ss_mask, sseu->ss_stride);
+   memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride);
 }
 
 unsigned int
@@ -100,14 +100,24 @@ static u16 compute_eu_total(const struct sseu_dev_info 
*sseu)
return total;
 }
 
-static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
-   u8 s_en, u32 ss_en, u16 eu_en)
+static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en)
+{
+   u32 ss_mask;
+
+   ss_mask = ss_en >> (s * sseu->max_subslices);
+   ss_mask &= GENMASK(sseu->max_subslices - 1, 0);
+
+   return ss_mask;
+}
+
+static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en,
+   u32 g_ss_en, u32 c_ss_en, u16 eu_en)
 {
int s, ss;
 
-   /* ss_en represents entire subslice mask across all slices */
+   /* g_ss_en/c_ss_en represent entire subslice mask across all slices */
GEM_BUG_ON(sseu->max_slices * sseu->max_subslices >
-  sizeof(ss_en) * BITS_PER_BYTE);
+  sizeof(g_ss_en) * BITS_PER_BYTE);
 
for (s = 0; s < sseu->max_slices; s++) {
if ((s_en & BIT(s)) == 0)
@@ -115,7 +125,22 @@ static void gen11_compute_sseu_info(struct sseu_dev_info 
*sseu,
 
sseu->slice_mask |= BIT(s);
 
-   intel_sseu_set_subslices(sseu, s, ss_en);
+   /*
+* XeHP introduces the concept of compute vs geometry DSS. To
+* reduce variation between GENs around subslice usage, store a
+* mask for both the geometry and compute enabled masks since
+* userspace will need to be able to query these masks
+* independently.  Also compute a total enabled subslice count
+* for the purposes of selecting subslices to use in a
+* particular GEM context.
+*/
+   intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask,
+get_ss_stride_mask(sseu, s, c_ss_en));
+   intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask,
+get_ss_stride_mask(sseu, s, g_ss_en));
+   intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
+get_ss_stride_mask(sseu, s,
+   g_ss_en | c_ss_en));
 
for (ss = 0; ss < sseu->max_subslices; ss++)
if (intel_sseu_has_subslice(sseu, s, ss))
@@ -129,7 +154,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 {
struct sseu_dev_info *sseu = &gt->info.sseu;
struct intel_uncore *uncore = gt->uncore;
-   u32 dss_en;
+   u32 g_dss_en, c_dss_en = 0;
u16 eu_en = 0;
u8 eu_en_fuse;
u8 s_en;
@@ -145,10 +170,12 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 * across the entire device. Then calculate out the DSS for each
 * workload type within that software slice.
 */
-   if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915))
+   if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915)) {
intel_sseu_set_info(sseu, 1, 32, 16);
-   else
+   sseu->has_compute_dss = 

[Intel-gfx] [PATCH v5 3/9] drm/i915/dg2: Report INSTDONE_GEOM values in error state

2021-08-05 Thread Matt Roper
Xe_HPG adds some additional INSTDONE_GEOM debug registers; the Mesa team
has indicated that having these reported in the error state would be
useful for debugging GPU hangs.  These registers are replicated per-DSS
with gslice steering.

Cc: Lionel Landwerlin 
Signed-off-by: Matt Roper 
Acked-by: Lionel Landwerlin 
Reviewed-by: Matt Atwood 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c|  7 +++
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  3 +++
 drivers/gpu/drm/i915/i915_gpu_error.c| 10 --
 drivers/gpu/drm/i915/i915_reg.h  |  1 +
 4 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 58ed67894b3d..332efea696a5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1202,6 +1202,13 @@ void intel_engine_get_instdone(const struct 
intel_engine_cs *engine,
  GEN7_ROW_INSTDONE);
}
}
+
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
+   for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice)
+   instdone->geom_svg[slice][subslice] =
+   read_subslice_reg(engine, slice, subslice,
+ XEHPG_INSTDONE_GEOM_SVG);
+   }
} else if (GRAPHICS_VER(i915) >= 7) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 0b4846b01626..bfbfe53c23dd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -69,6 +69,9 @@ struct intel_instdone {
u32 slice_common_extra[2];
u32 sampler[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
u32 row[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
+
+   /* Added in XeHPG */
+   u32 geom_svg[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
 };
 
 /*
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 8230bc3ac8a9..91d5da7b0a2b 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -431,6 +431,7 @@ static void error_print_instdone(struct 
drm_i915_error_state_buf *m,
const struct sseu_dev_info *sseu = &ee->engine->gt->info.sseu;
int slice;
int subslice;
+   int iter;
 
err_printf(m, "  INSTDONE: 0x%08x\n",
   ee->instdone.instdone);
@@ -445,8 +446,6 @@ static void error_print_instdone(struct 
drm_i915_error_state_buf *m,
return;
 
if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 50)) {
-   int iter;
-
for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice)
err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
   slice, subslice,
@@ -471,6 +470,13 @@ static void error_print_instdone(struct 
drm_i915_error_state_buf *m,
if (GRAPHICS_VER(m->i915) < 12)
return;
 
+   if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 55)) {
+   for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice)
+   err_printf(m, "  GEOM_SVGUNIT_INSTDONE[%d][%d]: 0x%08x\n",
+  slice, subslice,
+  ee->instdone.geom_svg[slice][subslice]);
+   }
+
err_printf(m, "  SC_INSTDONE_EXTRA: 0x%08x\n",
   ee->instdone.slice_common_extra[0]);
err_printf(m, "  SC_INSTDONE_EXTRA2: 0x%08x\n",
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 167eaa87501b..8bfd646fc403 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2686,6 +2686,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GEN12_SC_INSTDONE_EXTRA2   _MMIO(0x7108)
 #define GEN7_SAMPLER_INSTDONE  _MMIO(0xe160)
 #define GEN7_ROW_INSTDONE  _MMIO(0xe164)
+#define XEHPG_INSTDONE_GEOM_SVG    _MMIO(0x666c)
 #define MCFG_MCR_SELECTOR  _MMIO(0xfd0)
 #define SF_MCR_SELECTOR    _MMIO(0xfd8)
 #define GEN8_MCR_SELECTOR  _MMIO(0xfdc)
-- 
2.25.4



Re: [Intel-gfx] [PATCH] drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-05 Thread Matthew Brost
On Thu, Aug 05, 2021 at 01:05:09PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin 
> 
> When a non-persistent context exits we currently mark it as banned in
> order to trigger fast termination of any outstanding GPU jobs it may have
> left running.
> 
> In doing so we apply a very strict 1ms limit in which the leftover job
> has to preempt before we issue an engine reset.
> 
> Some workloads are not able to cleanly preempt in that time window and it
> can be argued that it would instead be better to give them a bit more
> grace since avoiding engine resets is generally preferable.
> 
> To achieve this the patch splits handling of banned contexts from simply
> closed non-persistent ones and then applies different timeouts for both
> and also extends the criteria which determines if a request should be
> scheduled back in after preemption or not.
> 
> A 15ms preempt timeout grace is given to exited non-persistent contexts,
> which has been empirically tested to satisfy customers' requirements
> and still provides reasonably quick cleanup post exit.
> 

I think you need to rework your thinking here a bit as this is a very
execlists-specific solution and the GuC needs to be considered.

> v2:
>  * Streamline fast path checks.
> 
> v3:
>  * Simplify by using only schedulable status.
>  * Increase timeout to 20ms.
> 
> v4:
>  * Fix live_execlists selftest.
> 
> v5:
>  * Fix logic in kill_engines.
> 
> v6:
>  * Rebase.
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Chris Wilson 
> Cc: Zhen Han 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 22 +--
>  drivers/gpu/drm/i915/gt/intel_context.c   |  2 ++
>  drivers/gpu/drm/i915/gt/intel_context.h   | 17 +-
>  drivers/gpu/drm/i915/gt/intel_context_types.h |  1 +
>  .../drm/i915/gt/intel_execlists_submission.c  | 11 --
>  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 20 +++--
>  drivers/gpu/drm/i915/i915_request.c   |  2 +-
>  7 files changed, 57 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index cff72679ad7c..21fe5d4057ab 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1065,7 +1065,8 @@ static struct intel_engine_cs *active_engine(struct 
> intel_context *ce)
>   return engine;
>  }
>  
> -static void kill_engines(struct i915_gem_engines *engines, bool ban)
> +static void
> +kill_engines(struct i915_gem_engines *engines, bool ban, bool persistent)
>  {
>   struct i915_gem_engines_iter it;
>   struct intel_context *ce;
> @@ -1079,8 +1080,15 @@ static void kill_engines(struct i915_gem_engines 
> *engines, bool ban)
>*/
>   for_each_gem_engine(ce, engines, it) {
>   struct intel_engine_cs *engine;
> + bool skip = false;
> +
> + if (ban)
> + skip = intel_context_ban(ce, NULL);
> + else if (!persistent)
> + skip = !intel_context_clear_schedulable(ce);

schedulable doesn't hook into the backend at all, while
intel_context_ban does. In the case of GuC submission intel_context_ban
changes the preemption timeout to 1 us and disables scheduling, resulting
in the context getting kicked off the hardware immediately. You likely
need to update intel_context_clear_schedulable to use the same vfunc as
intel_context_ban() but accept an argument for the value of the
preemption timeout. For a ban use a lower value; for clearing
schedulable use a higher value.
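
A rough standalone sketch of that idea follows. Everything in it is hypothetical and is not i915 code: the struct layout, the `revoke` hook and the two timeout constants are made up for illustration. The 1 us figure echoes the GuC ban behavior mentioned above and the 20 ms figure echoes the v3 changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative timeouts only, not values taken from the driver. */
#define BAN_PREEMPT_TIMEOUT_US      1u
#define CLOSE_PREEMPT_TIMEOUT_US    20000u

struct ctx;

/* Hypothetical backend hook shared by both paths. */
struct ctx_ops {
	void (*revoke)(struct ctx *ce, uint32_t preempt_timeout_us);
};

struct ctx {
	const struct ctx_ops *ops;
	bool banned;
	bool schedulable;
	uint32_t preempt_timeout_us; /* what the backend was last told */
};

/* Returns true if the context was already banned. */
static bool ctx_ban(struct ctx *ce)
{
	bool was = ce->banned;

	ce->banned = true;
	ce->schedulable = false;
	if (!was && ce->ops->revoke)
		ce->ops->revoke(ce, BAN_PREEMPT_TIMEOUT_US);
	return was;
}

/* Returns true if the context was schedulable before the call. */
static bool ctx_clear_schedulable(struct ctx *ce)
{
	bool was = ce->schedulable;

	ce->schedulable = false;
	if (was && ce->ops->revoke)
		ce->ops->revoke(ce, CLOSE_PREEMPT_TIMEOUT_US);
	return was;
}

/* Toy backend: just record the requested timeout. */
static void toy_revoke(struct ctx *ce, uint32_t preempt_timeout_us)
{
	ce->preempt_timeout_us = preempt_timeout_us;
}

static const struct ctx_ops toy_ops = { .revoke = toy_revoke };
```

Both entry points reach the backend through the same hook, so an execlists or GuC implementation would only have to honor the timeout it is handed.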

>  
> - if (ban && intel_context_ban(ce, NULL))
> + /* Already previously banned or made non-schedulable? */
> + if (skip)
>   continue;
>  
>   /*
> @@ -1093,7 +1101,7 @@ static void kill_engines(struct i915_gem_engines 
> *engines, bool ban)
>   engine = active_engine(ce);
>  
>   /* First attempt to gracefully cancel the context */
> - if (engine && !__cancel_engine(engine) && ban)
> + if (engine && !__cancel_engine(engine) && (ban || !persistent))
>   /*
>* If we are unable to send a preemptive pulse to bump
>* the context from the GPU, we have to resort to a full
> @@ -1105,8 +1113,6 @@ static void kill_engines(struct i915_gem_engines 
> *engines, bool ban)
>  
>  static void kill_context(struct i915_gem_context *ctx)
>  {
> - bool ban = (!i915_gem_context_is_persistent(ctx) ||
> - !ctx->i915->params.enable_hangcheck);
>   struct i915_gem_engines *pos, *next;
>  
>   spin_lock_irq(&ctx->stale.lock);
> @@ -1119,7 +1125,8 @@ static void kill_context(struct i915_gem_context *ctx)
>  
>   spin_unlock_irq(&ctx->stale.lock);
>  
> - kill_engines(pos, ban);
> + kill_engines(pos, !ctx->i915->params.enable_hangcheck,
> +  

Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-05 Thread Tvrtko Ursulin



On 05/08/2021 16:04, Patchwork wrote:

*Patch Details*
*Series:*   drm/i915: Be more gentle when exiting non-persistent contexts
*URL:*      https://patchwork.freedesktop.org/series/93420/
*State:*    failure
*Details:*  https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/index.html




  CI Bug Log - changes from CI_DRM_10450 -> Patchwork_20775


Summary

*FAILURE*

Serious unknown changes coming with Patchwork_20775 absolutely need to be
verified manually.

If you think the reported changes have nothing to do with the changes
introduced in Patchwork_20775, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.

External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/index.html



Possible new issues

Here are the unknown changes that may have been introduced in 
Patchwork_20775:



  IGT changes


Possible regressions

  * igt@i915_selftest@live@gt_lrc:
      o fi-rkl-guc: PASS -> DMESG-WARN



<6> [233.928677] i915: Running intel_lrc_live_selftests/live_lrc_isolation
<3> [233.988780] i915 :00:02.0: [drm] *ERROR* rcs0 context redzone overwritten!

Something GuC specific by the look of it, or at least I haven't found the same 
signature elsewhere. But in any case it is not related to this patch.

Regards,

Tvrtko




Known issues

Here are the changes found in Patchwork_20775 that come from known issues:


  IGT changes


Issues hit

  * igt@amdgpu/amd_basic@query-info:
      o fi-bsw-kefka: NOTRUN -> SKIP (fdo#109271) +17 similar issues

  * igt@gem_exec_fence@basic-busy@bcs0:
      o fi-kbl-soraka: NOTRUN -> SKIP (fdo#109271) +26 similar issues

  * igt@gem_huc_copy@huc-copy:
      o fi-kbl-soraka: NOTRUN -> SKIP (fdo#109271 / i915#2190)

  * igt@i915_pm_rpm@basic-rte:
      o fi-kbl-soraka: NOTRUN -> FAIL (i915#579)

  * igt@i915_selftest@live@gt_pm:
      o fi-kbl-soraka: NOTRUN -> DMESG-FAIL (i915#1886 / i915#2291)

  * igt@i915_selftest@live@late_gt_pm:
      o fi-bsw-nick: PASS -> DMESG-FAIL (i915#2927)

  * igt@kms_chamelium@common-hpd-after-suspend:
      o fi-kbl-soraka: NOTRUN -> SKIP (fdo#109271 / fdo#111827) +8 similar issues

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
      o fi-kbl-soraka: NOTRUN -> SKIP (fdo#109271 / i915#533)

  * igt@runner@aborted:
      o fi-bsw-nick: NOTRUN -> FAIL (fdo#109271 / i915#1436)

Possible 

Re: [Intel-gfx] [PATCH v3 09/14] vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set

2021-08-05 Thread Jason Gunthorpe
On Tue, Aug 03, 2021 at 10:52:25AM -0600, Alex Williamson wrote:
> On Tue, 3 Aug 2021 13:41:52 -0300
> Jason Gunthorpe  wrote:
> > On Tue, Aug 03, 2021 at 10:34:06AM -0600, Alex Williamson wrote:
> > > I think the vfio_pci_find_reset_target() function needs to be re-worked
> > > to just tell us true/false that it's ok to reset the provided device,
> > > not to anoint an arbitrary target device.  Thanks,  
> > 
> > Yes, though this logic is confusing, why do we need to check if any
> > device needs a reset at this point? If we are being asked to reset
> > vdev shouldn't vdev needs_reset?
> > 
> > Or is the function more of a 'synchronize pending reset' kind of
> > thing?
> 
> Yes, the latter.  For instance think about a multi-function PCI device
> such as a GPU.  The functions have dramatically different capabilities,
> some might have function level reset abilities and others not.  We want
> to be able to trigger a bus reset as the last device of the set is
> released, no matter the order they're released and no matter the
> capabilities of the device we're currently processing.  Thanks,
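
To make the rule described above concrete, here is a small standalone model of the decision: reset the group only when nothing in it is open and at least one member is marked as needing a reset. The struct and function names are hypothetical and much simpler than the vfio structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one device in a reset group. */
struct dev {
	unsigned int open_count; /* open device FDs */
	bool needs_reset;        /* e.g. no function-level reset available */
};

/*
 * A group reset is attempted only when no device in the set is open and
 * at least one of them is marked as needing a reset.
 */
static bool set_needs_reset(const struct dev *devs, size_t n)
{
	bool needs_reset = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (devs[i].open_count)
			return false;
		needs_reset |= devs[i].needs_reset;
	}
	return needs_reset;
}
```

The order of release does not matter in this model: whichever device is closed last sees an all-idle set and triggers the reset on behalf of the others.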

I worked on this for a while; I think this is much clearer about what
this algorithm is trying to do:

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 5d6db93d6c680f..e418bcbb68facc 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -223,7 +223,7 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_device 
*vdev)
}
 }
 
-static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev);
+static bool vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set);
 static void vfio_pci_disable(struct vfio_pci_device *vdev);
 static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data);
 
@@ -404,6 +404,9 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
struct vfio_pci_ioeventfd *ioeventfd, *ioeventfd_tmp;
int i, bar;
 
+   /* For needs_reset */
+   lockdep_assert_held(&vdev->vdev.dev_set->lock);
+
/* Stop the device from further DMA */
pci_clear_master(pdev);
 
@@ -487,9 +490,7 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
 out:
pci_disable_device(pdev);
 
-   vfio_pci_try_bus_reset(vdev);
-
-   if (!disable_idle_d3)
+   if (!vfio_pci_dev_set_try_reset(vdev->vdev.dev_set) && !disable_idle_d3)
vfio_pci_set_power_state(vdev, PCI_D3hot);
 }
 
@@ -2145,36 +2146,6 @@ static struct pci_driver vfio_pci_driver = {
.err_handler = &vfio_err_handlers,
 };
 
-static int vfio_pci_get_unused_devs(struct pci_dev *pdev, void *data)
-{
-   struct vfio_devices *devs = data;
-   struct vfio_device *device;
-   struct vfio_pci_device *vdev;
-
-   if (devs->cur_index == devs->max_index)
-   return -ENOSPC;
-
-   device = vfio_device_get_from_dev(&pdev->dev);
-   if (!device)
-   return -EINVAL;
-
-   if (pci_dev_driver(pdev) != &vfio_pci_driver) {
-   vfio_device_put(device);
-   return -EBUSY;
-   }
-
-   vdev = container_of(device, struct vfio_pci_device, vdev);
-
-   /* Fault if the device is not unused */
-   if (device->open_count) {
-   vfio_device_put(device);
-   return -EBUSY;
-   }
-
-   devs->devices[devs->cur_index++] = vdev;
-   return 0;
-}
-
 static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data)
 {
struct vfio_devices *devs = data;
@@ -2208,79 +2179,86 @@ static int vfio_pci_try_zap_and_vma_lock_cb(struct 
pci_dev *pdev, void *data)
return 0;
 }
 
+static int vfio_pci_is_device_in_set(struct pci_dev *pdev, void *data)
+{
+   struct vfio_device_set *dev_set = data;
+   struct vfio_device *cur;
+
+   lockdep_assert_held(&dev_set->lock);
+
+   list_for_each_entry(cur, &dev_set->device_list, dev_set_list)
+   if (cur->dev == &pdev->dev)
+   return 0;
+   return -EBUSY;
+}
+
+static bool vfio_pci_dev_set_needs_reset(struct vfio_device_set *dev_set)
+{
+   struct vfio_pci_device *cur;
+   bool needs_reset = false;
+
+   list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list) {
+   /* No VFIO device in the set can have an open device FD */
+   if (cur->vdev.open_count)
+   return false;
+   needs_reset |= cur->needs_reset;
+   }
+   return needs_reset;
+}
+
 /*
- * If a bus or slot reset is available for the provided device and:
+ * If a bus or slot reset is available for the provided dev_set and:
  *  - All of the devices affected by that bus or slot reset are unused
- *(!refcnt)
  *  - At least one of the affected devices is marked dirty via
  *needs_reset (such as by lack of FLR support)
- * Then attempt to perform that bus or slot reset.  Callers are required
- * to hold vdev->dev_set->lock, protecting the bus/slot reset group from
- * concurrent opens.  A 

[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/dp: Use max params for older panels

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915/dp: Use max params for older panels
URL   : https://patchwork.freedesktop.org/series/93390/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10445_full -> Patchwork_20769_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues


  Here are the changes found in Patchwork_20769_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@feature_discovery@display-2x:
- shard-iclb: NOTRUN -> [SKIP][1] ([i915#1839])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-iclb3/igt@feature_discov...@display-2x.html

  * igt@gem_ctx_persistence@hostile:
- shard-snb:  NOTRUN -> [SKIP][2] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-snb6/igt@gem_ctx_persiste...@hostile.html

  * igt@gem_ctx_persistence@legacy-engines-hang@render:
- shard-tglb: [PASS][3] -> [FAIL][4] ([i915#2410])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-tglb2/igt@gem_ctx_persistence@legacy-engines-h...@render.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-tglb6/igt@gem_ctx_persistence@legacy-engines-h...@render.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][5] -> [TIMEOUT][6] ([i915#2369] / [i915#3063] 
/ [i915#3648])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-tglb5/igt@gem_...@unwedge-stress.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-tglb8/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
- shard-apl:  NOTRUN -> [FAIL][7] ([i915#2846])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-apl8/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none@vcs1:
- shard-kbl:  [PASS][8] -> [FAIL][9] ([i915#2842])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-kbl3/igt@gem_exec_fair@basic-n...@vcs1.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-kbl4/igt@gem_exec_fair@basic-n...@vcs1.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
- shard-tglb: [PASS][10] -> [FAIL][11] ([i915#2842]) +1 similar 
issue
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-tglb6/igt@gem_exec_fair@basic-pace-sh...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-tglb2/igt@gem_exec_fair@basic-pace-sh...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][12] ([i915#2842]) +1 similar issue
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-iclb4/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_fair@basic-pace@vecs0:
- shard-kbl:  [PASS][13] -> [SKIP][14] ([fdo#109271])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-kbl7/igt@gem_exec_fair@basic-p...@vecs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-kbl6/igt@gem_exec_fair@basic-p...@vecs0.html

  * igt@gem_exec_schedule@u-semaphore-user:
- shard-snb:  NOTRUN -> [SKIP][15] ([fdo#109271]) +209 similar 
issues
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-snb5/igt@gem_exec_sched...@u-semaphore-user.html

  * igt@gem_mmap_gtt@cpuset-medium-copy-odd:
- shard-iclb: [PASS][16] -> [FAIL][17] ([i915#2428])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-iclb3/igt@gem_mmap_...@cpuset-medium-copy-odd.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-iclb7/igt@gem_mmap_...@cpuset-medium-copy-odd.html

  * igt@gem_pwrite@basic-exhaustion:
- shard-snb:  NOTRUN -> [WARN][18] ([i915#2658])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-snb5/igt@gem_pwr...@basic-exhaustion.html

  * igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-dp:
- shard-apl:  NOTRUN -> [SKIP][19] ([fdo#109271] / [i915#1937])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-apl2/igt@i915_pm_lpsp@kms-l...@kms-lpsp-dp.html

  * igt@i915_suspend@forcewake:
- shard-apl:  NOTRUN -> [DMESG-WARN][20] ([i915#180])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-apl8/igt@i915_susp...@forcewake.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-hflip:
- shard-skl:  NOTRUN -> [SKIP][21] ([fdo#109271] / [i915#3777])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-skl1/igt@kms_big...@y-tiled-max-hw-stride-32bpp-rotate-0-hflip.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-180-async-flip:
- shard-skl:  NOTRUN -> [FAIL][22] ([i915#3722])
   [22]: 

Re: [Intel-gfx] [PATCH v4 05/18] drm/i915/dg2: Add SQIDI steering

2021-08-05 Thread Matt Roper
On Wed, Aug 04, 2021 at 01:22:17PM -0700, Lucas De Marchi wrote:
> On Thu, Jul 29, 2021 at 09:59:55AM -0700, Matt Roper wrote:
> > Although DG2_G10 platforms will always have all SQIDI's present and
> > don't need steering for registers in a SQIDI MMIO range, this isn't true
> > for DG2_G11 platforms; only SQIDI's 2 and 3 can be used on those.
> > 
> > We handle SQIDI ranges a bit differently from other types of explicit
> > steering.  The SQIDI ranges belong to either the MCFG unit or the SF
> > unit, both of which have their own dedicated steering registers and do
> > not use the typical 0xFDC steering control that all other types of
> > ranges use.  Thus we only need to worry about picking a valid initial
> > value for the MCFG and SF steering registers (0xFD0 and 0xFD8
> > resepectively) at driver init; they won't change after we set them up so
> 
> respectively
> 
> > we don't need to worry about re-steering them explicitly at runtime.
> > 
> > Given that any SQIDI value should work fine for DG2-G10 and XeHP SDV,
> > while only values of 2 and 3 are valid for DG2-G11, we'll just
> > initialize the MCFG and SF steering registers to a constant value of "2"
> > for all XeHP-based platforms for simplicity --- that will work in all
> > cases.
> > 
> > Bspec: 66534
> > Cc: Radhakrishna Sripada 
> > Signed-off-by: Matt Roper 
> > ---
> > drivers/gpu/drm/i915/gt/intel_workarounds.c | 28 +
> > drivers/gpu/drm/i915/i915_reg.h |  2 ++
> > 2 files changed, 25 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
> > b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index 8717337a6c81..6895b083523d 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -889,17 +889,24 @@ cfl_gt_workarounds_init(struct drm_i915_private 
> > *i915, struct i915_wa_list *wal)
> > GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
> > }
> > 
> > -static void __add_mcr_wa(struct drm_i915_private *i915, struct 
> > i915_wa_list *wal,
> > -unsigned slice, unsigned subslice)
> > +static void __set_mcr_steering(struct i915_wa_list *wal,
> > +  i915_reg_t steering_reg,
> > +  unsigned int slice, unsigned int subslice)
> > {
> > u32 mcr, mcr_mask;
> > 
> > mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
> > mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
> > 
> > -   drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr);
> > +   wa_write_clr_set(wal, steering_reg, mcr_mask, mcr);
> > +}
> > +
> > +static void __add_mcr_wa(struct drm_i915_private *i915, struct 
> > i915_wa_list *wal,
> > +unsigned int slice, unsigned int subslice)
> > +{
> > +   drm_dbg(&i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice);
> 
> maybe we could leave the debug message in __set_mcr_steering() and add
> what steering register we are setting? Up to you.
> 

I've got a separate patch that adds more clear steering debug
information via a drm_printer and then prints it both in the dmesg log
and in a new debugfs node.  The patch depends on some debugfs changes
that haven't shown up yet so I didn't include it here, but I'll rebase
and send it soon if the debugfs changes don't happen first.


Matt

> 
> Reviewed-by: Lucas De Marchi 
> 
> 
> > 
> > -   wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
> > +   __set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice);
> > }
> > 
> > static void
> > @@ -953,7 +960,6 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list 
> > *wal)
> >  * - L3 Bank (fusable)
> >  * - MSLICE (fusable)
> >  * - LNCF (sub-unit within mslice; always present if mslice is present)
> > -* - SQIDI (always on)
> >  *
> >  * We'll do our default/implicit steering based on GSLICE (in the
> >  * sliceid field) and DSS (in the subsliceid field).  If we can
> > @@ -1003,6 +1009,18 @@ xehp_init_mcr(struct intel_gt *gt, struct 
> > i915_wa_list *wal)
> > WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
> > 
> > __add_mcr_wa(i915, wal, slice, subslice);
> > +
> > +   /*
> > +* SQIDI ranges are special because they use different steering
> > +* registers than everything else we work with.  On XeHP SDV and
> > +* DG2-G10, any value in the steering registers will work fine since
> > +* all instances are present, but DG2-G11 only has SQIDI instances at
> > +* ID's 2 and 3, so we need to steer to one of those.  For simplicity
> > +* we'll just steer to a hardcoded "2" since that value will work
> > +* everywhere.
> > +*/
> > +   __set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
> > +   __set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
> > }
> > 
> > static void
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h 
> > b/drivers/gpu/drm/i915/i915_reg.h
> > index f4113e7e8271..39ce6befff52 100644
> > --- 
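
For readers following the steering discussion above, here is a minimal userspace sketch of the clear-and-set computation that __set_mcr_steering() hands to wa_write_clr_set(). The TOY_* macros, their bit positions, and both helper functions are assumptions made for illustration only; they are not the i915 definitions and nothing here touches hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout, chosen for illustration only. */
#define TOY_MCR_SLICE(s)       (((uint32_t)(s) & 0xf) << 27)
#define TOY_MCR_SLICE_MASK     TOY_MCR_SLICE(0xf)
#define TOY_MCR_SUBSLICE(ss)   (((uint32_t)(ss) & 0x7) << 24)
#define TOY_MCR_SUBSLICE_MASK  TOY_MCR_SUBSLICE(0x7)

/* Read-modify-write: clear the masked bits, then OR in the new value. */
static uint32_t toy_clr_set(uint32_t old, uint32_t clr, uint32_t set)
{
	return (old & ~clr) | set;
}

/*
 * Compute the new value of a steering register that currently holds
 * 'old', leaving every bit outside the slice/subslice fields untouched.
 */
static uint32_t toy_set_mcr_steering(uint32_t old, unsigned int slice,
				     unsigned int subslice)
{
	uint32_t mcr, mcr_mask;

	/* Reject values that would not fit the toy field widths. */
	assert(slice <= 0xf && subslice <= 0x7);

	mcr = TOY_MCR_SLICE(slice) | TOY_MCR_SUBSLICE(subslice);
	mcr_mask = TOY_MCR_SLICE_MASK | TOY_MCR_SUBSLICE_MASK;

	return toy_clr_set(old, mcr_mask, mcr);
}
```

Calling toy_set_mcr_steering(old, 0, 2) mirrors the hardcoded "2" used for the MCFG and SF selectors in the patch: only the two steering fields change, whatever the register held before.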

Re: [Intel-gfx] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init

2021-08-05 Thread Daniel Vetter
On Thu, Aug 5, 2021 at 4:47 PM Christian König  wrote:
>
> Am 05.08.21 um 16:07 schrieb Daniel Vetter:
> > On Thu, Aug 5, 2021 at 3:44 PM Christian König  
> > wrote:
> >> Am 05.08.21 um 12:46 schrieb Daniel Vetter:
> >>> This is a very confusingly named function, because not just does it
> >>> init an object, it arms it and provides a point of no return for
> >>> pushing a job into the scheduler. It would be nice if that's a bit
> >>> clearer in the interface.
> >>>
> >>> But the real reason is that I want to push the dependency tracking
> >>> helpers into the scheduler code, and that means drm_sched_job_init
> >>> must be called a lot earlier, without arming the job.
> >>>
> >>> v2:
> >>> - don't change .gitignore (Steven)
> >>> - don't forget v3d (Emma)
> >>>
> >>> v3: Emma noticed that I leak the memory allocated in
> >>> drm_sched_job_init if we bail out before the point of no return in
> >>> subsequent driver patches. To be able to fix this change
> >>> drm_sched_job_cleanup() so it can handle being called both before and
> >>> after drm_sched_job_arm().
> >>>
> >>> Also improve the kerneldoc for this.
> >>>
> >>> v4:
> >>> - Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
> >>> usual (Melissa)
> >>>
> >>> - Christian pointed out that drm_sched_entity_select_rq() also needs
> >>> to be moved into drm_sched_job_arm, which made me realize that the
> >>> job->id definitely needs to be moved too.
> >>>
> >>> Shuffle things to fit between job_init and job_arm.
> >>>
> >>> v5:
> >>> Reshuffle the split between init/arm once more, amdgpu abuses
> >>> drm_sched.ready to signal gpu reset failures. Also document this
> >>> somewhat. (Christian)
> >>>
> >>> v6:
> >>> Rebase on top of the msm drm/sched support. Note that the
> >>> drm_sched_job_init() call is completely misplaced, and hence also the
> >>> split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
> >>> patch will address.
> >>>
> >>> Acked-by: Melissa Wen 
> >>> Cc: Melissa Wen 
> >>> Acked-by: Emma Anholt 
> >>> Acked-by: Steven Price  (v2)
> >>> Reviewed-by: Boris Brezillon  (v5)
> >>> Signed-off-by: Daniel Vetter 
> >> At least the amdgpu parts look ok offhand, but I can't judge the rest I
> >> think.
> > The thing that really scares me here and that I got wrong a few times
> > is the cleanup for drm_sched_job at the various points. Can you give
> > those parts in drm/scheduler/ a full review pls, just to make sure? I
> > can note that in the tag ofc, just like a bit more confidence here
> > that it's not busted :-)
>
> I can take another look, but I won't have time for that in the next two
> weeks - vacation and kid starting school.

Hm ok I'll ask others, since this is kinda needed for the msm fix. At
least the msm design relies on this split being present, so fixing it
without this split here would be a pile of rather pointless work.
-Daniel

> Christian.
>
> >
> >> So only Acked-by: Christian König 
> > Thanks, Daniel
> >
> >>> Cc: Lucas Stach 
> >>> Cc: Russell King 
> >>> Cc: Christian Gmeiner 
> >>> Cc: Qiang Yu 
> >>> Cc: Rob Herring 
> >>> Cc: Tomeu Vizoso 
> >>> Cc: Steven Price 
> >>> Cc: Alyssa Rosenzweig 
> >>> Cc: David Airlie 
> >>> Cc: Daniel Vetter 
> >>> Cc: Sumit Semwal 
> >>> Cc: "Christian König" 
> >>> Cc: Masahiro Yamada 
> >>> Cc: Kees Cook 
> >>> Cc: Adam Borowski 
> >>> Cc: Nick Terrell 
> >>> Cc: Mauro Carvalho Chehab 
> >>> Cc: Paul Menzel 
> >>> Cc: Sami Tolvanen 
> >>> Cc: Viresh Kumar 
> >>> Cc: Alex Deucher 
> >>> Cc: Dave Airlie 
> >>> Cc: Nirmoy Das 
> >>> Cc: Deepak R Varma 
> >>> Cc: Lee Jones 
> >>> Cc: Kevin Wang 
> >>> Cc: Chen Li 
> >>> Cc: Luben Tuikov 
> >>> Cc: "Marek Olšák" 
> >>> Cc: Dennis Li 
> >>> Cc: Maarten Lankhorst 
> >>> Cc: Andrey Grodzovsky 
> >>> Cc: Sonny Jiang 
> >>> Cc: Boris Brezillon 
> >>> Cc: Tian Tao 
> >>> Cc: etna...@lists.freedesktop.org
> >>> Cc: l...@lists.freedesktop.org
> >>> Cc: linux-me...@vger.kernel.org
> >>> Cc: linaro-mm-...@lists.linaro.org
> >>> Cc: Emma Anholt 
> >>> Cc: Rob Clark 
> >>> Cc: Sean Paul 
> >>> Cc: linux-arm-...@vger.kernel.org
> >>> Cc: freedr...@lists.freedesktop.org
> >>> ---
> >>>drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
> >>>drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
> >>>drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
> >>>drivers/gpu/drm/lima/lima_sched.c|  2 +
> >>>drivers/gpu/drm/msm/msm_gem_submit.c |  3 ++
> >>>drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
> >>>drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
> >>>drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
> >>>drivers/gpu/drm/scheduler/sched_main.c   | 69 
> >>>drivers/gpu/drm/v3d/v3d_gem.c|  2 +
> >>>include/drm/gpu_scheduler.h  |  7 ++-
> >>>11 files changed, 94 insertions(+), 22 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> 
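
The init/arm split described in the commit message can be pictured with a small userspace model. This is only a sketch under stated assumptions: every toy_* name is hypothetical, the refcount is a plain integer, and the real scheduler code differs in detail. It shows the one property the v3 and v4 notes care about, namely that cleanup has to work both before and after the point of no return.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_fence {
	unsigned int refcount;	/* stays 0 until the job is armed */
	unsigned int seqno;
};

struct toy_job {
	struct toy_fence *fence;
	unsigned long id;
};

static unsigned long toy_next_id;
static unsigned int toy_next_seqno;
static unsigned int toy_frees;	/* counts fence frees, for checking only */

/* init: allocate only. May fail, and can still be undone. */
static int toy_job_init(struct toy_job *job)
{
	job->fence = calloc(1, sizeof(*job->fence));
	if (!job->fence)
		return -1;
	job->id = 0;
	return 0;
}

/* arm: point of no return. Hands out the job id and fence seqno. */
static void toy_job_arm(struct toy_job *job)
{
	assert(job->fence && job->fence->refcount == 0);
	job->id = ++toy_next_id;
	job->fence->seqno = ++toy_next_seqno;
	job->fence->refcount = 1;
}

/* cleanup: must cope with both an unarmed and an armed job. */
static void toy_job_cleanup(struct toy_job *job)
{
	if (!job->fence)
		return;

	if (job->fence->refcount > 0) {
		/* Armed: the fence is refcounted, so drop our reference. */
		if (--job->fence->refcount == 0) {
			free(job->fence);
			toy_frees++;
		}
	} else {
		/* Never armed: nobody else can hold it, free directly. */
		free(job->fence);
		toy_frees++;
	}
	job->fence = NULL;
}
```

A driver that bails out between toy_job_init() and toy_job_arm() calls toy_job_cleanup() and leaks nothing, which is the leak Emma pointed out in v3.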

Re: [Intel-gfx] [PATCH v5 14/20] drm/sched: Don't store self-dependencies

2021-08-05 Thread Daniel Vetter
On Thu, Aug 5, 2021 at 3:57 PM Christian König  wrote:
> Am 05.08.21 um 15:25 schrieb Daniel Vetter:
> > On Thu, Aug 5, 2021 at 3:18 PM Christian König  
> > wrote:
> >>
> >>
> >> Am 05.08.21 um 12:46 schrieb Daniel Vetter:
> >>> This is essentially part of drm_sched_dependency_optimized(), which
> >>> only amdgpu seems to make use of. Use it a bit more.
> >>>
> >>> This would mean that as-is amdgpu can't use the dependency helpers, at
> >>> least not with the current approach amdgpu has for deciding whether a
> >>> vm_flush is needed. Since amdgpu also has very special rules around
> >>> implicit fencing it can't use those helpers either, and adding a
> >>> drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too
> >>> onerous. That way the special case handling for amdgpu sticks even
> >>> more out and we have higher chances that reviewers that go across all
> >>> drivers won't miss it.
> >> Well you should probably drop the sentence about the vm_flush, this is
> >> completely unrelated.
> >>
> >> Additional to that I still don't think that this is a good idea.
> >> Dependency handling is something completely driver specific.
> >>
> >> E.g. even when you have submitted jobs back to back they still might
> >> need a cache flush in between and that is not only for amdgpu like this.
> >>
> >> What you can do is to optimize for while looking at the fences later on
> >> and then note that you have done so and what the last hw fence is you
> >> used instead.
> > Out of 6 drivers using drm/sched 5 can use this. When we get i915
> > over, that one will be added to the list. amdgpu can't use any of this
> > anyway due to the vm_id allocation requirements, which is why I
> > mention that. Also note that all the callbacks are still there, so you
> > can just ignore this all and still build your own. Like amdgpu does.
>
> The VMID allocation stuff is rather easy to handle, that's why I noted
> we should remove that sentence.
>
> The problematic stuff is handling the cache flush and pipeline sync
> which you make impossible with this here.

Well the vmid is tied to the flush, but yeah the commit message
doesn't make this clear.

> > So I'm not sure what exactly your objection is, aside from "this doesn't
> > fit for amdgpu", which a) I know b) the commit message explains c)
> > doesn't actually hurt amdgpu in the slightest. And we still get the
> > benefit that for most drivers it's a nice optimization.
>
> Well exactly that's what I wanted to avoid. We still can use this in
> amdgpu even with the VMID allocation stuff and I still hope to do so.
>
> Can't we add this as a wrapper or similar?

This patch is not the only thing that will prevent you from using
these helpers, because amdgpu also needs to keep track of all the
fences in the xarray, which these helpers don't - they get cleared out
as we hand them off to the scheduler. So it's more surgery than just
not having this, and I'm honestly not sure it's worth it since you'd
need to duplicate quite a bit more than just the functions to add
dependencies.
-Daniel


> Christian.
>
> > -Daniel
> >
> >> Regards,
> >> Christian.
> >>
> >>> Reviewed-by: Lucas Stach 
> >>> Acked-by: Melissa Wen 
> >>> Signed-off-by: Daniel Vetter 
> >>> Cc: "Christian König" 
> >>> Cc: Daniel Vetter 
> >>> Cc: Luben Tuikov 
> >>> Cc: Andrey Grodzovsky 
> >>> Cc: Alex Deucher 
> >>> ---
> >>>drivers/gpu/drm/scheduler/sched_main.c | 7 +++
> >>>1 file changed, 7 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> >>> b/drivers/gpu/drm/scheduler/sched_main.c
> >>> index f77456929139..49e507f91ec0 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>> @@ -660,6 +660,13 @@ int drm_sched_job_add_dependency(struct 
> >>> drm_sched_job *job,
> >>>if (!fence)
> >>>return 0;
> >>>
> >>> + /* if it's a fence from us it's guaranteed to be earlier */
> >>> + if (fence->context == job->entity->fence_context ||
> >>> + fence->context == job->entity->fence_context + 1) {
> >>> + dma_fence_put(fence);
> >>> + return 0;
> >>> + }
> >>> +
> >>>/* Deduplicate if we already depend on a fence from the same 
> >>> context.
> >>> * This lets the size of the array of deps scale with the number of
> >>> * engines involved, rather than the number of BOs.
> >
>


--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
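
To make the self-dependency check in the hunk above concrete, here is a hedged userspace sketch. All toy_* names, the fixed-size array, and the plain seqno comparison are assumptions for illustration; the kernel uses an xarray and its own fence-ordering helper. The sketch keeps the two rules from the patch: fences from the job's own entity contexts are dropped, and at most one fence per foreign context is stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_DEPS 8

struct toy_fence {
	uint64_t context;
	uint64_t seqno;
};

struct toy_job {
	/* The entity owns two contexts: this one and this one + 1. */
	uint64_t entity_fence_context;
	struct toy_fence deps[TOY_MAX_DEPS];
	size_t ndeps;
};

/* Returns 0 on success (including "nothing to store"), -1 if full. */
static int toy_job_add_dependency(struct toy_job *job, struct toy_fence fence)
{
	size_t i;

	assert(job != NULL);

	/* A fence from our own entity is guaranteed to be earlier. */
	if (fence.context == job->entity_fence_context ||
	    fence.context == job->entity_fence_context + 1)
		return 0;

	/* Deduplicate: keep only the latest fence per foreign context. */
	for (i = 0; i < job->ndeps; i++) {
		if (job->deps[i].context != fence.context)
			continue;
		if (fence.seqno > job->deps[i].seqno)
			job->deps[i] = fence;
		return 0;
	}

	if (job->ndeps == TOY_MAX_DEPS)
		return -1;

	job->deps[job->ndeps++] = fence;
	return 0;
}
```

With this shape the dependency array grows with the number of distinct contexts involved rather than with the number of fences passed in, which is the scaling property the existing comment in the hunk describes.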


[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915: Be more gentle when exiting non-persistent contexts
URL   : https://patchwork.freedesktop.org/series/93420/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10450 -> Patchwork_20775


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20775 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20775, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20775:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10450/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  
Known issues


  Here are the changes found in Patchwork_20775 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@query-info:
- fi-bsw-kefka:   NOTRUN -> [SKIP][3] ([fdo#109271]) +17 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-bsw-kefka/igt@amdgpu/amd_ba...@query-info.html

  * igt@gem_exec_fence@basic-busy@bcs0:
- fi-kbl-soraka:  NOTRUN -> [SKIP][4] ([fdo#109271]) +26 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-kbl-soraka/igt@gem_exec_fence@basic-b...@bcs0.html

  * igt@gem_huc_copy@huc-copy:
- fi-kbl-soraka:  NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#2190])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html

  * igt@i915_pm_rpm@basic-rte:
- fi-kbl-soraka:  NOTRUN -> [FAIL][6] ([i915#579])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-kbl-soraka/igt@i915_pm_...@basic-rte.html

  * igt@i915_selftest@live@gt_pm:
- fi-kbl-soraka:  NOTRUN -> [DMESG-FAIL][7] ([i915#1886] / [i915#2291])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html

  * igt@i915_selftest@live@late_gt_pm:
- fi-bsw-nick:[PASS][8] -> [DMESG-FAIL][9] ([i915#2927])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10450/fi-bsw-nick/igt@i915_selftest@live@late_gt_pm.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-bsw-nick/igt@i915_selftest@live@late_gt_pm.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-kbl-soraka:  NOTRUN -> [SKIP][10] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-kbl-soraka/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
- fi-kbl-soraka:  NOTRUN -> [SKIP][11] ([fdo#109271] / [i915#533])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-kbl-soraka/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html

  * igt@runner@aborted:
- fi-bsw-nick:NOTRUN -> [FAIL][12] ([fdo#109271] / [i915#1436])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-bsw-nick/igt@run...@aborted.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@execlists:
- fi-bsw-kefka:   [INCOMPLETE][13] ([i915#2940]) -> [PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10450/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2291]: https://gitlab.freedesktop.org/drm/intel/issues/2291
  [i915#2927]: https://gitlab.freedesktop.org/drm/intel/issues/2927
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#579]: https://gitlab.freedesktop.org/drm/intel/issues/579


Participating hosts (40 -> 35)
--

  Additional (1): fi-kbl-soraka 
  Missing(6): fi-ilk-m540 

[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/sched dependency handling and implicit sync fixes

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/sched dependency handling and implicit sync fixes
URL   : https://patchwork.freedesktop.org/series/93415/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10450 -> Patchwork_20773


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20773 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20773, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20773/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20773:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@workarounds:
- fi-rkl-guc: [PASS][1] -> [DMESG-FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10450/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20773/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html

  
Known issues


  Here are the changes found in Patchwork_20773 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@query-info:
- fi-bsw-kefka:   NOTRUN -> [SKIP][3] ([fdo#109271]) +17 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20773/fi-bsw-kefka/igt@amdgpu/amd_ba...@query-info.html

  * igt@runner@aborted:
- fi-rkl-guc: NOTRUN -> [FAIL][4] ([i915#3462])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20773/fi-rkl-guc/igt@run...@aborted.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@execlists:
- fi-bsw-kefka:   [INCOMPLETE][5] ([i915#2940]) -> [PASS][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10450/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20773/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940
  [i915#3462]: https://gitlab.freedesktop.org/drm/intel/issues/3462


Participating hosts (40 -> 34)
--

  Missing(6): fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 
fi-bdw-samus bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10450 -> Patchwork_20773

  CI-20190529: 20190529
  CI_DRM_10450: 51d9c8293e8446e921b74d996982ade862fcfa5c @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6160: 4287344dd6a39d9036c5fb9a047a7d8f10bee981 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20773: 2836aa5b3f16292d5043778e200038dc658fa8b1 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

2836aa5b3f16 dma-resv: Give the docs a do-over
c023ae95f5f1 drm/i915: Don't break exclusive fence ordering
9a678298fbf9 drm/i915: delete exclude argument from 
i915_sw_fence_await_reservation
8d1e08eee56b drm/etnaviv: Don't break exclusive fence ordering
0691454484ff drm/msm: Don't break exclusive fence ordering
7768c5a01737 drm/sched: Check locking in drm_sched_job_await_implicit
a0a15e60e8a4 drm/sched: Don't store self-dependencies
c4596cb48171 drm/gem: Delete gem array fencing helpers
914d59644238 drm/msm: Use scheduler dependency handling
94e8ea6bdac4 drm/etnaviv: Use scheduler dependency handling
43177403c5b4 drm/v3d: Use scheduler dependency handling
e94f03075601 drm/v3d: Move drm_sched_job_init to v3d_job_init
1074a5fc4e39 drm/lima: use scheduler dependency tracking
42d75ea77670 drm/panfrost: use scheduler dependency tracking
3ed6103f784e drm/sched: improve docs around drm_sched_entity
4578cd7fba03 drm/sched: drop entity parameter from drm_sched_push_job
bcde98968570 drm/sched: Add dependency tracking
590f3db49271 drm/sched: Barriers are needed for entity->last_scheduled
e8aa4762329c drm/msm: Fix drm/sched point of no return rules
fbee9b54db37 drm/sched: Split drm_sched_job_init

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20773/index.html


[Intel-gfx] ✗ Fi.CI.BUILD: failure for Provide core infrastructure for managing open/release (rev8)

2021-08-05 Thread Patchwork
== Series Details ==

Series: Provide core infrastructure for managing open/release (rev8)
URL   : https://patchwork.freedesktop.org/series/92556/
State : failure

== Summary ==

Applying: vfio/samples: Remove module get/put
Applying: vfio/mbochs: Fix missing error unwind of mbochs_used_mbytes
Applying: vfio: Introduce a vfio_uninit_group_dev() API call
Applying: vfio: Provide better generic support for open/release vfio_device_ops
Applying: vfio/samples: Delete useless open/close
Applying: vfio/fsl: Move to the device set infrastructure
Applying: vfio/platform: Use open_device() instead of open coding a refcnt 
scheme
Applying: vfio/pci: Move to the device set infrastructure
Applying: vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set
Applying: vfio/pci: Reorganize VFIO_DEVICE_PCI_HOT_RESET to use the device set
error: sha1 information is lacking or useless (drivers/vfio/pci/vfio_pci.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0010 vfio/pci: Reorganize VFIO_DEVICE_PCI_HOT_RESET to use the 
device set
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".




Re: [Intel-gfx] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init

2021-08-05 Thread Daniel Vetter
On Thu, Aug 5, 2021 at 3:44 PM Christian König  wrote:
> Am 05.08.21 um 12:46 schrieb Daniel Vetter:
> > This is a very confusingly named function, because not just does it
> > init an object, it arms it and provides a point of no return for
> > pushing a job into the scheduler. It would be nice if that's a bit
> > clearer in the interface.
> >
> > But the real reason is that I want to push the dependency tracking
> > helpers into the scheduler code, and that means drm_sched_job_init
> > must be called a lot earlier, without arming the job.
> >
> > v2:
> > - don't change .gitignore (Steven)
> > - don't forget v3d (Emma)
> >
> > v3: Emma noticed that I leak the memory allocated in
> > drm_sched_job_init if we bail out before the point of no return in
> > subsequent driver patches. To be able to fix this change
> > drm_sched_job_cleanup() so it can handle being called both before and
> > after drm_sched_job_arm().
> >
> > Also improve the kerneldoc for this.
> >
> > v4:
> > - Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
> >usual (Melissa)
> >
> > - Christian pointed out that drm_sched_entity_select_rq() also needs
> >to be moved into drm_sched_job_arm, which made me realize that the
> >job->id definitely needs to be moved too.
> >
> >Shuffle things to fit between job_init and job_arm.
> >
> > v5:
> > Reshuffle the split between init/arm once more, amdgpu abuses
> > drm_sched.ready to signal gpu reset failures. Also document this
> > somewhat. (Christian)
> >
> > v6:
> > Rebase on top of the msm drm/sched support. Note that the
> > drm_sched_job_init() call is completely misplaced, and hence also the
> > split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
> > patch will address.
> >
> > Acked-by: Melissa Wen 
> > Cc: Melissa Wen 
> > Acked-by: Emma Anholt 
> > Acked-by: Steven Price  (v2)
> > Reviewed-by: Boris Brezillon  (v5)
> > Signed-off-by: Daniel Vetter 
>
> At least the amdgpu parts look ok offhand, but I can't judge the rest I
> think.

The thing that really scares me here and that I got wrong a few times
is the cleanup for drm_sched_job at the various points. Can you give
those parts in drm/scheduler/ a full review pls, just to make sure? I
can note that in the tag ofc, just like a bit more confidence here
that it's not busted :-)

> So only Acked-by: Christian König 

Thanks, Daniel

>
> > Cc: Lucas Stach 
> > Cc: Russell King 
> > Cc: Christian Gmeiner 
> > Cc: Qiang Yu 
> > Cc: Rob Herring 
> > Cc: Tomeu Vizoso 
> > Cc: Steven Price 
> > Cc: Alyssa Rosenzweig 
> > Cc: David Airlie 
> > Cc: Daniel Vetter 
> > Cc: Sumit Semwal 
> > Cc: "Christian König" 
> > Cc: Masahiro Yamada 
> > Cc: Kees Cook 
> > Cc: Adam Borowski 
> > Cc: Nick Terrell 
> > Cc: Mauro Carvalho Chehab 
> > Cc: Paul Menzel 
> > Cc: Sami Tolvanen 
> > Cc: Viresh Kumar 
> > Cc: Alex Deucher 
> > Cc: Dave Airlie 
> > Cc: Nirmoy Das 
> > Cc: Deepak R Varma 
> > Cc: Lee Jones 
> > Cc: Kevin Wang 
> > Cc: Chen Li 
> > Cc: Luben Tuikov 
> > Cc: "Marek Olšák" 
> > Cc: Dennis Li 
> > Cc: Maarten Lankhorst 
> > Cc: Andrey Grodzovsky 
> > Cc: Sonny Jiang 
> > Cc: Boris Brezillon 
> > Cc: Tian Tao 
> > Cc: etna...@lists.freedesktop.org
> > Cc: l...@lists.freedesktop.org
> > Cc: linux-me...@vger.kernel.org
> > Cc: linaro-mm-...@lists.linaro.org
> > Cc: Emma Anholt 
> > Cc: Rob Clark 
> > Cc: Sean Paul 
> > Cc: linux-arm-...@vger.kernel.org
> > Cc: freedr...@lists.freedesktop.org
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
> >   drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
> >   drivers/gpu/drm/lima/lima_sched.c|  2 +
> >   drivers/gpu/drm/msm/msm_gem_submit.c |  3 ++
> >   drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
> >   drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
> >   drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
> >   drivers/gpu/drm/scheduler/sched_main.c   | 69 
> >   drivers/gpu/drm/v3d/v3d_gem.c|  2 +
> >   include/drm/gpu_scheduler.h  |  7 ++-
> >   11 files changed, 94 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index 139cd3bf1ad6..32e80bc6af22 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser 
> > *p,
> >   if (r)
> >   goto error_unlock;
> >
> > + drm_sched_job_arm(&job->base);
> > +
> >   /* No memory allocation is allowed while holding the notifier lock.
> >* The lock is held until amdgpu_cs_submit is finished and fence is
> >* added to BOs.
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > index d33e6d97cc89..5ddb955d2315 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > +++ 

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/sched dependency handling and implicit sync fixes

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/sched dependency handling and implicit sync fixes
URL   : https://patchwork.freedesktop.org/series/93415/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
fbee9b54db37 drm/sched: Split drm_sched_job_init
-:237: WARNING:UNSPECIFIED_INT: Prefer 'unsigned int' to bare use of 'unsigned'
#237: FILE: drivers/gpu/drm/scheduler/sched_fence.c:173:
+   unsigned seq;

-:333: WARNING:AVOID_BUG: Avoid crashing the kernel - try using WARN_ON & 
recovery code rather than BUG() or BUG_ON()
#333: FILE: drivers/gpu/drm/scheduler/sched_main.c:623:
+   BUG_ON(!entity);

-:402: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#402: FILE: include/drm/gpu_scheduler.h:391:
+struct drm_sched_fence *drm_sched_fence_alloc(

-:410: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 3 warnings, 1 checks, 249 lines checked
e8aa4762329c drm/msm: Fix drm/sched point of no return rules
-:74: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 39 lines checked
590f3db49271 drm/sched: Barriers are needed for entity->last_scheduled
-:88: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 43 lines checked
bcde98968570 drm/sched: Add dependency tracking
-:195: CHECK:LINE_SPACING: Please don't use multiple blank lines
#195: FILE: drivers/gpu/drm/scheduler/sched_main.c:729:
+
+

-:271: WARNING:TYPO_SPELLING: 'ommitted' may be misspelled - perhaps 'omitted'?
#271: FILE: include/drm/gpu_scheduler.h:244:
+* drm_sched_job_add_implicit_dependencies() this can be ommitted and
 

-:286: CHECK:LINE_SPACING: Please don't use multiple blank lines
#286: FILE: include/drm/gpu_scheduler.h:378:
+
+

-:289: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 2 warnings, 2 checks, 230 lines checked
4578cd7fba03 drm/sched: drop entity parameter from drm_sched_push_job
-:227: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 110 lines checked
3ed6103f784e drm/sched: improve docs around drm_sched_entity
-:17: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit 620e762f9a98 ("drm/scheduler: 
move entity handling into separate file")'
#17: 
  move here: 620e762f9a98 ("drm/scheduler: move entity handling into

-:413: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 346 lines checked
42d75ea77670 drm/panfrost: use scheduler dependency tracking
-:214: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 158 lines checked
1074a5fc4e39 drm/lima: use scheduler dependency tracking
-:118: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 75 lines checked
e94f03075601 drm/v3d: Move drm_sched_job_init to v3d_job_init
-:344: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 288 lines checked
43177403c5b4 drm/v3d: Use scheduler dependency handling
-:207: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 162 lines checked
94e8ea6bdac4 drm/etnaviv: Use scheduler dependency handling
-:13: WARNING:REPEATED_WORD: Possible repeated word: 'to'
#13: 
I wanted to to in the previous round (and did, for all other drivers).

-:122: WARNING:LINE_SPACING: Missing a blank line after declarations
#122: FILE: drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:552:
+   struct dma_fence *in_fence = 
sync_file_get_fence(args->fence_fd);
+   if (!in_fence) {

-:297: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 3 warnings, 0 checks, 243 lines checked
914d59644238 drm/msm: Use scheduler dependency handling
-:132: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 

Re: [Intel-gfx] [PATCH v5 15/20] drm/sched: Check locking in drm_sched_job_await_implicit

2021-08-05 Thread Daniel Vetter
On Thu, Aug 5, 2021 at 3:19 PM Christian König  wrote:
>
> Am 05.08.21 um 12:47 schrieb Daniel Vetter:
> > You really need to hold the reservation here or all kinds of funny
> > things can happen between grabbing the dependencies and inserting the
> > new fences.
> >
> > Acked-by: Melissa Wen 
> > Signed-off-by: Daniel Vetter 
> > Cc: "Christian König" 
> > Cc: Daniel Vetter 
> > Cc: Luben Tuikov 
> > Cc: Andrey Grodzovsky 
> > Cc: Alex Deucher 
>
> The function name in the subject line should be updated, apart from that
> feel free to add my rb to this patch.

Fixed locally and r-b added, I think the later parts of this series
will need to be resent anyway. Thanks for your review.
-Daniel
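
As a rough illustration of the locking rule discussed in this thread (not code
from the series), here is a small user-space C sketch. Every mock_* name is a
hypothetical stand-in; the only idea borrowed from the patch is that the helper
which gathers implicit dependencies asserts that the reservation is held, in
the spirit of dma_resv_assert_held(). The point is that grabbing dependencies
and installing the new fence happen under one lock hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_MAX_DEPS 8

/* Hypothetical stand-in for a dma_resv: a lock flag plus one exclusive fence. */
struct mock_resv {
	bool locked;
	int excl_fence;		/* fence id, 0 means "no fence" */
};

/* Hypothetical stand-in for a scheduler job with a small dependency array. */
struct mock_job {
	int deps[MOCK_MAX_DEPS];
	size_t num_deps;
};

static void mock_resv_lock(struct mock_resv *resv)
{
	assert(!resv->locked);
	resv->locked = true;
}

static void mock_resv_unlock(struct mock_resv *resv)
{
	assert(resv->locked);
	resv->locked = false;
}

/* Plays the role of dma_resv_assert_held(): catch unlocked callers early. */
static void mock_resv_assert_held(const struct mock_resv *resv)
{
	assert(resv->locked);
}

/* Collect the current exclusive fence as a dependency of the job. */
static int mock_job_add_implicit_dependency(struct mock_job *job,
					    const struct mock_resv *resv)
{
	mock_resv_assert_held(resv);

	if (!resv->excl_fence)
		return 0;
	if (job->num_deps == MOCK_MAX_DEPS)
		return -1;

	job->deps[job->num_deps++] = resv->excl_fence;
	return 0;
}

/*
 * Grab the dependencies and install the new fence under one lock hold, so
 * no other submitter can slip a fence in between the two steps.
 */
static int mock_submit(struct mock_job *job, struct mock_resv *resv,
		       int new_fence)
{
	int ret;

	mock_resv_lock(resv);
	ret = mock_job_add_implicit_dependency(job, resv);
	if (!ret)
		resv->excl_fence = new_fence;
	mock_resv_unlock(resv);

	return ret;
}
```

Calling mock_job_add_implicit_dependency() without the lock trips the assert,
which is the kind of early failure the added check is meant to provide.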

>
> Christian.
>
> > ---
> >   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index 49e507f91ec0..1abb40b07324 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -715,6 +715,8 @@ int drm_sched_job_add_implicit_dependencies(struct 
> > drm_sched_job *job,
> >   struct dma_fence **fences;
> >   unsigned int i, fence_count;
> >
> > + dma_resv_assert_held(obj->resv);
> > +
> >   if (!write) {
> >   struct dma_fence *fence = 
> > dma_resv_get_excl_unlocked(obj->resv);
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH v5 14/20] drm/sched: Don't store self-dependencies

2021-08-05 Thread Daniel Vetter
On Thu, Aug 5, 2021 at 3:18 PM Christian König  wrote:
>
>
>
> Am 05.08.21 um 12:46 schrieb Daniel Vetter:
> > This is essentially part of drm_sched_dependency_optimized(), which
> > only amdgpu seems to make use of. Use it a bit more.
> >
> > This would mean that as-is amdgpu can't use the dependency helpers, at
> > least not with the current approach amdgpu has for deciding whether a
> > vm_flush is needed. Since amdgpu also has very special rules around
> > implicit fencing it can't use those helpers either, and adding a
> > drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too
> > onerous. That way the special case handling for amdgpu sticks even
> > more out and we have higher chances that reviewers that go across all
> > drivers won't miss it.
>
> Well you should probably drop the sentence about the vm_flush, this is
> completely unrelated.
>
> Additional to that I still don't think that this is a good idea.
> Dependency handling is something completely driver specific.
>
> E.g. even when you have submitted jobs back to back they still might
> need a cache flush in between and that is not only for amdgpu like this.
>
> What you can do is to optimize for while looking at the fences later on
> and then note that you have done so and what the last hw fence is you
> used instead.

Out of 6 drivers using drm/sched 5 can use this. When we get i915
over, that one will be added to the list. amdgpu can't use any of this
anyway due to the vm_id allocation requirements, which is why I
mention that. Also note that all the callbacks are still there, so you
can just ignore this all and still build your own. Like amdgpu does.

So I'm not sure what exactly your objection is, aside from "this doesn't
fit for amdgpu", which a) I know b) the commit message explains c)
doesn't actually hurt amdgpu in the slightest. And we still get the
benefit that for most drivers it's a nice optimization.
-Daniel
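
For readers following along, the check in the quoted hunk can be modelled in
plain C roughly as below. This is only a sketch: the mock_* types are
hypothetical, and it assumes (as the patch's comparison against fence_context
and fence_context + 1 suggests) that an entity owns two consecutive fence
contexts, so any fence carrying one of them comes from the entity itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_DEPS 8

/* Hypothetical, minimal stand-ins for dma_fence / entity / job. */
struct mock_fence {
	uint64_t context;
	uint64_t seqno;
};

struct mock_entity {
	uint64_t fence_context;	/* assumed to own fence_context and +1 */
};

struct mock_job {
	const struct mock_entity *entity;
	struct mock_fence deps[MOCK_MAX_DEPS];
	size_t num_deps;
};

/* True if the fence was produced by the job's own entity. */
static bool mock_is_self_dependency(const struct mock_job *job,
				    const struct mock_fence *fence)
{
	uint64_t base = job->entity->fence_context;

	return fence->context == base || fence->context == base + 1;
}

/* Store the fence as a dependency unless it is a self-dependency. */
static int mock_job_add_dependency(struct mock_job *job,
				   const struct mock_fence *fence)
{
	if (!fence)
		return 0;

	/* A fence from the same entity is ordered before this job anyway. */
	if (mock_is_self_dependency(job, fence))
		return 0;

	if (job->num_deps == MOCK_MAX_DEPS)
		return -1;

	job->deps[job->num_deps++] = *fence;
	return 0;
}
```

A driver with stricter rules can skip such a helper entirely and keep its own
callbacks, which matches the point made above about amdgpu.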

> Regards,
> Christian.
>
> >
> > Reviewed-by: Lucas Stach 
> > Acked-by: Melissa Wen 
> > Signed-off-by: Daniel Vetter 
> > Cc: "Christian König" 
> > Cc: Daniel Vetter 
> > Cc: Luben Tuikov 
> > Cc: Andrey Grodzovsky 
> > Cc: Alex Deucher 
> > ---
> >   drivers/gpu/drm/scheduler/sched_main.c | 7 +++
> >   1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index f77456929139..49e507f91ec0 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -660,6 +660,13 @@ int drm_sched_job_add_dependency(struct drm_sched_job 
> > *job,
> >   if (!fence)
> >   return 0;
> >
> > + /* if it's a fence from us it's guaranteed to be earlier */
> > + if (fence->context == job->entity->fence_context ||
> > + fence->context == job->entity->fence_context + 1) {
> > + dma_fence_put(fence);
> > + return 0;
> > + }
> > +
> >   /* Deduplicate if we already depend on a fence from the same context.
> >* This lets the size of the array of deps scale with the number of
> >* engines involved, rather than the number of BOs.
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[Intel-gfx] ✗ Fi.CI.BAT: failure for remove rcu support from i915_address_space (rev4)

2021-08-05 Thread Patchwork
== Series Details ==

Series: remove rcu support from i915_address_space (rev4)
URL   : https://patchwork.freedesktop.org/series/93314/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10449 -> Patchwork_20772


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20772 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20772, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20772/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20772:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20772/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  
Known issues


  Here are the changes found in Patchwork_20772 that come from known issues:

### IGT changes ###

#### Possible fixes ####

  * igt@kms_chamelium@dp-crc-fast:
- fi-kbl-7500u:   [FAIL][3] ([i915#1372]) -> [PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20772/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [fdo#112080]: https://bugs.freedesktop.org/show_bug.cgi?id=112080
  [i915#1372]: https://gitlab.freedesktop.org/drm/intel/issues/1372
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#3844]: https://gitlab.freedesktop.org/drm/intel/issues/3844
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#579]: https://gitlab.freedesktop.org/drm/intel/issues/579


Participating hosts (39 -> 35)
--

  Additional (1): fi-jsl-1 
  Missing(5): fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 
fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10449 -> Patchwork_20772

  CI-20190529: 20190529
  CI_DRM_10449: b0b7ea6dcb6afb51059e3ae01afece47c41fd0c1 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6160: 4287344dd6a39d9036c5fb9a047a7d8f10bee981 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20772: be11100f27eef517b2b401b8afe9401e9e599d0f @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

be11100f27ee drm/i915: Stop rcu support for i915_address_space
fea76eb28a60 drm/i915: use xa_lock/unlock for fpriv->vm_xa lookups
c1770ae51252 drm/i915: Drop __rcu from gem_context->vm
47f45eed8f19 drm/i915: Use i915_gem_context_get_eb_vm in intel_context_set_gem
22042c81dd12 drm/i915: Add i915_gem_context_is_full_ppgtt
37d02c39555f drm/i915: Use i915_gem_context_get_eb_vm in ctx_getparam
1bd404a5dddc drm/i915: Rename i915_gem_context_get_vm_rcu to 
i915_gem_context_get_eb_vm
0e6cb5de1e74 drm/i915: Drop code to handle set-vm races from execbuf

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20772/index.html


[Intel-gfx] ✗ Fi.CI.SPARSE: warning for remove rcu support from i915_address_space (rev4)

2021-08-05 Thread Patchwork
== Series Details ==

Series: remove rcu support from i915_address_space (rev4)
URL   : https://patchwork.freedesktop.org/series/93314/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-drivers/gpu/drm/i915/gem/i915_gem_context.c:1364:34:expected struct 
i915_address_space *vm
-drivers/gpu/drm/i915/gem/i915_gem_context.c:1364:34:got struct 
i915_address_space [noderef] __rcu *vm
-drivers/gpu/drm/i915/gem/i915_gem_context.c:1364:34: warning: incorrect type 
in argument 1 (different address spaces)
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)




[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for remove rcu support from i915_address_space (rev4)

2021-08-05 Thread Patchwork
== Series Details ==

Series: remove rcu support from i915_address_space (rev4)
URL   : https://patchwork.freedesktop.org/series/93314/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
0e6cb5de1e74 drm/i915: Drop code to handle set-vm races from execbuf
-:17: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#17: 
References: ccbc1b97948a ("drm/i915/gem: Don't allow changing the VM on running 
contexts (v4)")

-:17: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#17: 
References: ccbc1b97948a ("drm/i915/gem: Don't allow changing the VM on running 
contexts (v4)")

-:46: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 2 warnings, 0 checks, 12 lines checked
1bd404a5dddc drm/i915: Rename i915_gem_context_get_vm_rcu to 
i915_gem_context_get_eb_vm
-:148: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 80 lines checked
37d02c39555f drm/i915: Use i915_gem_context_get_eb_vm in ctx_getparam
-:54: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 23 lines checked
22042c81dd12 drm/i915: Add i915_gem_context_is_full_ppgtt
-:105: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 53 lines checked
47f45eed8f19 drm/i915: Use i915_gem_context_get_eb_vm in intel_context_set_gem
-:12: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#12: 
commit ccbc1b97948ab671335e950271e39766729736c3

-:61: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 18 lines checked
c1770ae51252 drm/i915: Drop __rcu from gem_context->vm
-:11: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#11: 
commit ccbc1b97948ab671335e950271e39766729736c3

-:23: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#23: 
  i915_vm_open ofc. This also removes the final caller of context_get_vm_rcu

-:42: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit a4e7ccdac38e ("drm/i915: Move 
context management under GEM")'
#42: 
commit a4e7ccdac38ec8335d9e4e2656c1a041c77feae1

-:345: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 2 errors, 2 warnings, 0 checks, 232 lines checked
fea76eb28a60 drm/i915: use xa_lock/unlock for fpriv->vm_xa lookups
-:15: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit aabbe344dc3c ("drm/i915: Use RCU 
for unlocked vm_idr lookup")'
#15: 
commit aabbe344dc3ca5f7d8263a02608ba6179e8a4499

-:52: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 13 lines checked
be11100f27ee drm/i915: Stop rcu support for i915_address_space
-:11: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#11: 
- i915_dpt has very simple lifetime (somehow we create a display pagetable vm

-:27: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit cf977e18610e ("drm/i915/gem: 
Spring clean debugfs")'
#27: 
commit cf977e18610e66e48c31619e7e0cfa871be9eada

-:35: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit db80a1294c23 ("drm/i915/gem: 
Remove per-client stats from debugfs/i915_gem_objects")'
#35: 
commit db80a1294c231b6ac725085f046bb2931e00c9db

-:47: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#47: 
commit ccbc1b97948ab671335e950271e39766729736c3

-:59: WARNING:TYPO_SPELLING: 'Preceeding' may be misspelled - perhaps 
'Preceding'?
#59: 
  Preceeding patches removed all vestiges of rcu use from gem_ctx->vm
  ^^


Re: [Intel-gfx] [PATCH v3 07/14] vfio/platform: Use open_device() instead of open coding a refcnt scheme

2021-08-05 Thread Eric Auger
Hi Jason,

On 7/29/21 2:49 AM, Jason Gunthorpe wrote:
> Platform simply wants to run some code when the device is first
> opened/last closed. Use the core framework and locking for this.  Aside
> from removing a bit of code this narrows the locking scope from a global
> lock.
>
> Signed-off-by: Jason Gunthorpe 
> Signed-off-by: Yishai Hadas 
> Reviewed-by: Cornelia Huck 
> Reviewed-by: Christoph Hellwig 
Reviewed-by: Eric Auger 

Thanks

Eric
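
For context, the "first open / last close" behaviour that the patch delegates
to the core can be sketched in user-space C as follows. This is a simplified
model, not the vfio core: all mock_* names are hypothetical, and locking is
left out (the real core would serialize these paths per device).

```c
#include <stddef.h>

struct mock_device;

/* Hypothetical per-driver callbacks, named after the ops in the patch. */
struct mock_device_ops {
	int (*open_device)(struct mock_device *dev);
	void (*close_device)(struct mock_device *dev);
};

struct mock_device {
	const struct mock_device_ops *ops;
	unsigned int open_count;	/* maintained by the core, not the driver */
	int hw_inits;			/* demo counters */
	int hw_cleanups;
};

/* Core side: run open_device() only for the first opener. */
static int mock_core_open(struct mock_device *dev)
{
	if (dev->open_count == 0 && dev->ops->open_device) {
		int ret = dev->ops->open_device(dev);

		if (ret)
			return ret;
	}
	dev->open_count++;
	return 0;
}

/* Core side: run close_device() only when the last opener goes away. */
static void mock_core_close(struct mock_device *dev)
{
	if (dev->open_count == 0)
		return;
	if (--dev->open_count == 0 && dev->ops->close_device)
		dev->ops->close_device(dev);
}

/* Driver side: no refcount and no global lock, just setup and teardown. */
static int mock_platform_open_device(struct mock_device *dev)
{
	dev->hw_inits++;
	return 0;
}

static void mock_platform_close_device(struct mock_device *dev)
{
	dev->hw_cleanups++;
}

static const struct mock_device_ops mock_platform_ops = {
	.open_device = mock_platform_open_device,
	.close_device = mock_platform_close_device,
};
```

With the counting done once in the core, the driver callbacks shrink to plain
init and cleanup, which is the simplification visible in the diff below.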

> ---
>  drivers/vfio/platform/vfio_platform_common.c  | 79 ---
>  drivers/vfio/platform/vfio_platform_private.h |  1 -
>  2 files changed, 32 insertions(+), 48 deletions(-)
>
> diff --git a/drivers/vfio/platform/vfio_platform_common.c 
> b/drivers/vfio/platform/vfio_platform_common.c
> index bdde8605178cd2..6af7ce7d619c25 100644
> --- a/drivers/vfio/platform/vfio_platform_common.c
> +++ b/drivers/vfio/platform/vfio_platform_common.c
> @@ -218,65 +218,52 @@ static int vfio_platform_call_reset(struct 
> vfio_platform_device *vdev,
>   return -EINVAL;
>  }
>  
> -static void vfio_platform_release(struct vfio_device *core_vdev)
> +static void vfio_platform_close_device(struct vfio_device *core_vdev)
>  {
>   struct vfio_platform_device *vdev =
>   container_of(core_vdev, struct vfio_platform_device, vdev);
> + const char *extra_dbg = NULL;
> + int ret;
>  
> - mutex_lock(&driver_lock);
> -
> - if (!(--vdev->refcnt)) {
> - const char *extra_dbg = NULL;
> - int ret;
> -
> - ret = vfio_platform_call_reset(vdev, &extra_dbg);
> - if (ret && vdev->reset_required) {
> - dev_warn(vdev->device, "reset driver is required and 
> reset call failed in release (%d) %s\n",
> -  ret, extra_dbg ? extra_dbg : "");
> - WARN_ON(1);
> - }
> - pm_runtime_put(vdev->device);
> - vfio_platform_regions_cleanup(vdev);
> - vfio_platform_irq_cleanup(vdev);
> + ret = vfio_platform_call_reset(vdev, &extra_dbg);
> + if (WARN_ON(ret && vdev->reset_required)) {
> + dev_warn(
> + vdev->device,
> + "reset driver is required and reset call failed in 
> release (%d) %s\n",
> + ret, extra_dbg ? extra_dbg : "");
>   }
> -
> - mutex_unlock(&driver_lock);
> + pm_runtime_put(vdev->device);
> + vfio_platform_regions_cleanup(vdev);
> + vfio_platform_irq_cleanup(vdev);
>  }
>  
> -static int vfio_platform_open(struct vfio_device *core_vdev)
> +static int vfio_platform_open_device(struct vfio_device *core_vdev)
>  {
>   struct vfio_platform_device *vdev =
>   container_of(core_vdev, struct vfio_platform_device, vdev);
> + const char *extra_dbg = NULL;
>   int ret;
>  
> - mutex_lock(&driver_lock);
> -
> - if (!vdev->refcnt) {
> - const char *extra_dbg = NULL;
> -
> - ret = vfio_platform_regions_init(vdev);
> - if (ret)
> - goto err_reg;
> + ret = vfio_platform_regions_init(vdev);
> + if (ret)
> + return ret;
>  
> - ret = vfio_platform_irq_init(vdev);
> - if (ret)
> - goto err_irq;
> + ret = vfio_platform_irq_init(vdev);
> + if (ret)
> + goto err_irq;
>  
> - ret = pm_runtime_get_sync(vdev->device);
> - if (ret < 0)
> - goto err_rst;
> + ret = pm_runtime_get_sync(vdev->device);
> + if (ret < 0)
> + goto err_rst;
>  
> - ret = vfio_platform_call_reset(vdev, &extra_dbg);
> - if (ret && vdev->reset_required) {
> - dev_warn(vdev->device, "reset driver is required and 
> reset call failed in open (%d) %s\n",
> -  ret, extra_dbg ? extra_dbg : "");
> - goto err_rst;
> - }
> + ret = vfio_platform_call_reset(vdev, &extra_dbg);
> + if (ret && vdev->reset_required) {
> + dev_warn(
> + vdev->device,
> + "reset driver is required and reset call failed in open 
> (%d) %s\n",
> + ret, extra_dbg ? extra_dbg : "");
> + goto err_rst;
>   }
> -
> - vdev->refcnt++;
> -
> - mutex_unlock(&driver_lock);
>   return 0;
>  
>  err_rst:
> @@ -284,8 +271,6 @@ static int vfio_platform_open(struct vfio_device 
> *core_vdev)
>   vfio_platform_irq_cleanup(vdev);
>  err_irq:
>   vfio_platform_regions_cleanup(vdev);
> -err_reg:
> - mutex_unlock(&driver_lock);
>   return ret;
>  }
>  
> @@ -616,8 +601,8 @@ static int vfio_platform_mmap(struct vfio_device 
> *core_vdev, struct vm_area_stru
>  
>  static const struct vfio_device_ops vfio_platform_ops = {
>   .name   = "vfio-platform",
> - .open   = vfio_platform_open,
> - .release= vfio_platform_release,
> + .open_device= vfio_platform_open_device,
> 

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Update small joiner ram size

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915: Update small joiner ram size
URL   : https://patchwork.freedesktop.org/series/93410/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10449 -> Patchwork_20771


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/index.html

Known issues


  Here are the changes found in Patchwork_20771 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [PASS][1] -> [FAIL][2] ([i915#1888])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  
#### Possible fixes ####

  * igt@kms_chamelium@dp-crc-fast:
- fi-kbl-7500u:   [FAIL][3] ([i915#1372]) -> [PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [fdo#112080]: https://bugs.freedesktop.org/show_bug.cgi?id=112080
  [i915#1372]: https://gitlab.freedesktop.org/drm/intel/issues/1372
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#3844]: https://gitlab.freedesktop.org/drm/intel/issues/3844
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#579]: https://gitlab.freedesktop.org/drm/intel/issues/579


Participating hosts (39 -> 35)
--

  Additional (1): fi-jsl-1 
  Missing(5): fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 
fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10449 -> Patchwork_20771

  CI-20190529: 20190529
  CI_DRM_10449: b0b7ea6dcb6afb51059e3ae01afece47c41fd0c1 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6160: 4287344dd6a39d9036c5fb9a047a7d8f10bee981 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20771: 76ad0b5c9154228fa572485a518c25a1e8fe1c4d @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

76ad0b5c9154 drm/i915: Update small joiner ram size

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/index.html


[Intel-gfx] [CI v2] drm/i915: Tweaked Wa_14010685332 for all PCHs

2021-08-05 Thread Anshuman Gupta
The dispcnlunit1_cp_xosc_clkreq clock is observed to stay active on the TGL-H
platform despite the original Wa_14010685332 sequence, which blocks entry to
deeper s0ix states.

The tweaked Wa_14010685332 sequence fixes this issue, so use the tweaked
sequence for every PCH from PCH_CNP onward.

v2:
- removed RKL from comment and simplified condition. [Rodrigo]

Fixes: b896898c7369 ("drm/i915: Tweaked Wa_14010685332 for PCHs used on gen11 
platforms")
Cc: Matt Roper 
Cc: Rodrigo Vivi 
Cc: Imre Deak 
Signed-off-by: Anshuman Gupta 
Reviewed-by: Rodrigo Vivi 
---
 .../drm/i915/display/intel_display_power.c| 16 +++---
 drivers/gpu/drm/i915/i915_irq.c   | 21 ---
 2 files changed, 8 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index 5da293369f30..cce1a926fcc1 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -6329,13 +6329,13 @@ void intel_display_power_suspend_late(struct 
drm_i915_private *i915)
if (DISPLAY_VER(i915) >= 11 || IS_GEMINILAKE(i915) ||
IS_BROXTON(i915)) {
bxt_enable_dc9(i915);
-   /* Tweaked Wa_14010685332:icp,jsp,mcc */
-   if (INTEL_PCH_TYPE(i915) >= PCH_ICP && INTEL_PCH_TYPE(i915) <= 
PCH_MCC)
-   intel_de_rmw(i915, SOUTH_CHICKEN1,
-SBCLK_RUN_REFCLK_DIS, 
SBCLK_RUN_REFCLK_DIS);
} else if (IS_HASWELL(i915) || IS_BROADWELL(i915)) {
hsw_enable_pc8(i915);
}
+
+   /* Tweaked Wa_14010685332:cnp,icp,jsp,mcc,tgp,adp */
+   if (INTEL_PCH_TYPE(i915) >= PCH_CNP && INTEL_PCH_TYPE(i915) < PCH_DG1)
+   intel_de_rmw(i915, SOUTH_CHICKEN1, SBCLK_RUN_REFCLK_DIS, 
SBCLK_RUN_REFCLK_DIS);
 }
 
 void intel_display_power_resume_early(struct drm_i915_private *i915)
@@ -6344,13 +6344,13 @@ void intel_display_power_resume_early(struct 
drm_i915_private *i915)
IS_BROXTON(i915)) {
gen9_sanitize_dc_state(i915);
bxt_disable_dc9(i915);
-   /* Tweaked Wa_14010685332:icp,jsp,mcc */
-   if (INTEL_PCH_TYPE(i915) >= PCH_ICP && INTEL_PCH_TYPE(i915) <= 
PCH_MCC)
-   intel_de_rmw(i915, SOUTH_CHICKEN1, 
SBCLK_RUN_REFCLK_DIS, 0);
-
} else if (IS_HASWELL(i915) || IS_BROADWELL(i915)) {
hsw_disable_pc8(i915);
}
+
+   /* Tweaked Wa_14010685332:cnp,icp,jsp,mcc,tgp,adp */
+   if (INTEL_PCH_TYPE(i915) >= PCH_CNP && INTEL_PCH_TYPE(i915) < PCH_DG1)
+   intel_de_rmw(i915, SOUTH_CHICKEN1, SBCLK_RUN_REFCLK_DIS, 0);
 }
 
 void intel_display_power_suspend(struct drm_i915_private *i915)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 17d336218b67..9bc4f4a8e12e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -3079,24 +3079,6 @@ static void valleyview_irq_reset(struct drm_i915_private 
*dev_priv)
spin_unlock_irq(&dev_priv->irq_lock);
 }
 
-static void cnp_display_clock_wa(struct drm_i915_private *dev_priv)
-{
-   struct intel_uncore *uncore = &dev_priv->uncore;
-
-   /*
-* Wa_14010685332:cnp/cmp,tgp,adp
-* TODO: Clarify which platforms this applies to
-* TODO: Figure out if this workaround can be applied in the s0ix 
suspend/resume handlers as
-* on earlier platforms and whether the workaround is also needed for 
runtime suspend/resume
-*/
-   if (INTEL_PCH_TYPE(dev_priv) == PCH_CNP ||
-   (INTEL_PCH_TYPE(dev_priv) >= PCH_TGP && INTEL_PCH_TYPE(dev_priv) < 
PCH_DG1)) {
-   intel_uncore_rmw(uncore, SOUTH_CHICKEN1, SBCLK_RUN_REFCLK_DIS,
-SBCLK_RUN_REFCLK_DIS);
-   intel_uncore_rmw(uncore, SOUTH_CHICKEN1, SBCLK_RUN_REFCLK_DIS, 
0);
-   }
-}
-
 static void gen8_display_irq_reset(struct drm_i915_private *dev_priv)
 {
struct intel_uncore *uncore = &dev_priv->uncore;
@@ -3130,7 +3112,6 @@ static void gen8_irq_reset(struct drm_i915_private 
*dev_priv)
if (HAS_PCH_SPLIT(dev_priv))
ibx_irq_reset(dev_priv);
 
-   cnp_display_clock_wa(dev_priv);
 }
 
 static void gen11_display_irq_reset(struct drm_i915_private *dev_priv)
@@ -3174,8 +3155,6 @@ static void gen11_display_irq_reset(struct 
drm_i915_private *dev_priv)
 
if (INTEL_PCH_TYPE(dev_priv) >= PCH_ICP)
GEN3_IRQ_RESET(uncore, SDE);
-
-   cnp_display_clock_wa(dev_priv);
 }
 
 static void gen11_irq_reset(struct drm_i915_private *dev_priv)
-- 
2.26.2



[Intel-gfx] [PATCH] drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-05 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

When a non-persistent context exits we currently mark it as banned in
order to trigger fast termination of any outstanding GPU jobs it may have
left running.

In doing so we apply a very strict 1ms limit in which the left over job
has to preempt before we issue an engine reset.

Some workloads are not able to cleanly preempt in that time window and it
can be argued that it would instead be better to give them a bit more
grace since avoiding engine resets is generally preferable.

To achieve this the patch splits handling of banned contexts from simply
closed non-persistent ones and then applies different timeouts for both
and also extends the criteria which determines if a request should be
scheduled back in after preemption or not.

20ms preempt timeout grace is given to exited non-persistent contexts,
a value which has been empirically tested to satisfy customer requirements
and still provides reasonably quick cleanup post exit.

v2:
 * Streamline fast path checks.

v3:
 * Simplify by using only schedulable status.
 * Increase timeout to 20ms.

v4:
 * Fix live_execlists selftest.

v5:
 * Fix logic in kill_engines.

v6:
 * Rebase.

Signed-off-by: Tvrtko Ursulin 
Cc: Chris Wilson 
Cc: Zhen Han 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 22 +--
 drivers/gpu/drm/i915/gt/intel_context.c   |  2 ++
 drivers/gpu/drm/i915/gt/intel_context.h   | 17 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |  1 +
 .../drm/i915/gt/intel_execlists_submission.c  | 11 --
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 20 +++--
 drivers/gpu/drm/i915/i915_request.c   |  2 +-
 7 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index cff72679ad7c..21fe5d4057ab 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1065,7 +1065,8 @@ static struct intel_engine_cs *active_engine(struct 
intel_context *ce)
return engine;
 }
 
-static void kill_engines(struct i915_gem_engines *engines, bool ban)
+static void
+kill_engines(struct i915_gem_engines *engines, bool ban, bool persistent)
 {
struct i915_gem_engines_iter it;
struct intel_context *ce;
@@ -1079,8 +1080,15 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
 */
for_each_gem_engine(ce, engines, it) {
struct intel_engine_cs *engine;
+   bool skip = false;
+
+   if (ban)
+   skip = intel_context_ban(ce, NULL);
+   else if (!persistent)
+   skip = !intel_context_clear_schedulable(ce);
 
-   if (ban && intel_context_ban(ce, NULL))
+   /* Already previously banned or made non-schedulable? */
+   if (skip)
continue;
 
/*
@@ -1093,7 +1101,7 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
engine = active_engine(ce);
 
/* First attempt to gracefully cancel the context */
-   if (engine && !__cancel_engine(engine) && ban)
+   if (engine && !__cancel_engine(engine) && (ban || !persistent))
/*
 * If we are unable to send a preemptive pulse to bump
 * the context from the GPU, we have to resort to a full
@@ -1105,8 +1113,6 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
 
 static void kill_context(struct i915_gem_context *ctx)
 {
-   bool ban = (!i915_gem_context_is_persistent(ctx) ||
-   !ctx->i915->params.enable_hangcheck);
struct i915_gem_engines *pos, *next;
 
spin_lock_irq(&ctx->stale.lock);
@@ -1119,7 +1125,8 @@ static void kill_context(struct i915_gem_context *ctx)
 
spin_unlock_irq(&ctx->stale.lock);
 
-   kill_engines(pos, ban);
+   kill_engines(pos, !ctx->i915->params.enable_hangcheck,
+i915_gem_context_is_persistent(ctx));
 
spin_lock_irq(&ctx->stale.lock);
GEM_BUG_ON(i915_sw_fence_signaled(&pos->fence));
@@ -1165,7 +1172,8 @@ static void engines_idle_release(struct i915_gem_context 
*ctx,
 
 kill:
if (list_empty(&engines->link)) /* raced, already closed */
-   kill_engines(engines, true);
+   kill_engines(engines, true,
+i915_gem_context_is_persistent(ctx));
 
i915_sw_fence_commit(&engines->fence);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 745e84c72c90..bc1701ef1578 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -382,6 +382,8 @@ intel_context_init(struct intel_context *ce, struct 
intel_engine_cs *engine)
ce->ring = NULL;
ce->ring_size = 

Re: [Intel-gfx] [PATCH 24/33] drm/i915/guc: Implement banned contexts for GuC submission

2021-08-05 Thread Tvrtko Ursulin



On 27/07/2021 01:23, Matthew Brost wrote:

When using GuC submission, if a context gets banned disable scheduling
and mark all inflight requests as complete.

Cc: John Harrison 
Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c   |   2 +-
  drivers/gpu/drm/i915/gt/intel_context.h   |  13 ++
  drivers/gpu/drm/i915/gt/intel_context_types.h |   2 +
  drivers/gpu/drm/i915/gt/intel_reset.c |  32 +---
  .../gpu/drm/i915/gt/intel_ring_submission.c   |  20 +++
  drivers/gpu/drm/i915/gt/uc/intel_guc.h|   2 +
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 151 --
  drivers/gpu/drm/i915/i915_trace.h |  10 ++
  8 files changed, 195 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index e3df01a201d7..05c3ee191710 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1084,7 +1084,7 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
for_each_gem_engine(ce, engines, it) {
struct intel_engine_cs *engine;
  
-		if (ban && intel_context_set_banned(ce))

+   if (ban && intel_context_ban(ce, NULL))
continue;
  
  		/*

diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index 2ed9bf5f91a5..814d9277096a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -16,6 +16,7 @@
  #include "intel_engine_types.h"
  #include "intel_ring_types.h"
  #include "intel_timeline_types.h"
+#include "i915_trace.h"
  
  #define CE_TRACE(ce, fmt, ...) do {	\

const struct intel_context *ce__ = (ce);\
@@ -243,6 +244,18 @@ static inline bool intel_context_set_banned(struct 
intel_context *ce)
return test_and_set_bit(CONTEXT_BANNED, &ce->flags);
  }
  
+static inline bool intel_context_ban(struct intel_context *ce,

+struct i915_request *rq)
+{
+   bool ret = intel_context_set_banned(ce);
+
+   trace_intel_context_ban(ce);
+   if (ce->ops->ban)
+   ce->ops->ban(ce, rq);


Do you want to skip this call if already banned?


+
+   return ret;
+}
+
  static inline bool
  intel_context_force_single_submission(const struct intel_context *ce)
  {
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 035108c10b2c..57c19ee3e313 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -35,6 +35,8 @@ struct intel_context_ops {
  
  	int (*alloc)(struct intel_context *ce);
  
+	void (*ban)(struct intel_context *ce, struct i915_request *rq);

+
int (*pre_pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww, 
void **vaddr);
int (*pin)(struct intel_context *ce, void *vaddr);
void (*unpin)(struct intel_context *ce);
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
b/drivers/gpu/drm/i915/gt/intel_reset.c
index 4d281bc8a38c..91200c43951f 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -22,7 +22,6 @@
  #include "intel_reset.h"
  
  #include "uc/intel_guc.h"

-#include "uc/intel_guc_submission.h"
  
  #define RESET_MAX_RETRIES 3
  
@@ -39,21 +38,6 @@ static void rmw_clear_fw(struct intel_uncore *uncore, i915_reg_t reg, u32 clr)

intel_uncore_rmw_fw(uncore, reg, clr, 0);
  }
  
-static void skip_context(struct i915_request *rq)

-{
-   struct intel_context *hung_ctx = rq->context;
-
-   list_for_each_entry_from_rcu(rq, &hung_ctx->timeline->requests, link) {
-   if (!i915_request_is_active(rq))
-   return;
-
-   if (rq->context == hung_ctx) {
-   i915_request_set_error_once(rq, -EIO);
-   __i915_request_skip(rq);
-   }
-   }
-}


More importantly I must be missing something - this code has been moved 
to ring_context_ban - what am I not seeing on the execlists side of things?!


Regards,

Tvrtko


-
  static void client_mark_guilty(struct i915_gem_context *ctx, bool banned)
  {
struct drm_i915_file_private *file_priv = ctx->file_priv;
@@ -88,10 +72,8 @@ static bool mark_guilty(struct i915_request *rq)
bool banned;
int i;
  
-	if (intel_context_is_closed(rq->context)) {

-   intel_context_set_banned(rq->context);
+   if (intel_context_is_closed(rq->context))
return true;
-   }
  
  	rcu_read_lock();

ctx = rcu_dereference(rq->context->gem_context);
@@ -123,11 +105,9 @@ static bool mark_guilty(struct i915_request *rq)
banned = !i915_gem_context_is_recoverable(ctx);
if (time_before(jiffies, prev_hang + CONTEXT_FAST_HANG_JIFFIES))
banned = true;
- 

[Intel-gfx] [PATCH v5 20/20] dma-resv: Give the docs a do-over

2021-08-05 Thread Daniel Vetter
Specifically document the new/clarified rules around how the shared
fences do not have any ordering requirements against the exclusive
fence.

But also document all the things a bit better: given how central
struct dma_resv is to dynamic buffer management, the docs have been very
inadequate.

- Lots more links to other pieces of the puzzle. Unfortunately
  ttm_buffer_object has no docs, so no links :-(

- Explain/complain a bit about dma_resv_locking_ctx(). I still don't
  like that one, but fixing the ttm call chains is going to be
  horrible. Plus we want to plug in real slowpath locking when we do
  that anyway.

- Main part of the patch is some actual docs for struct dma_resv.

Overall I think we still have a lot of bad naming in this area (e.g.
dma_resv.fence is singular, but contains the multiple shared fences),
but I think that's more indicative of how the semantics and rules are
just not great.

Another thing that's really awkward is how chaining exclusive fences
right now means direct dma_resv.exclusive_fence pointer access with an
rcu_assign_pointer. Not so great either.

v2:
- Fix a pile of typos (Matt, Jason)
- Hammer it in that breaking the rules leads to use-after-free issues
  around dma-buf sharing (Christian)

Reviewed-by: Christian König 
Cc: Jason Ekstrand 
Cc: Matthew Auld 
Reviewed-by: Matthew Auld 
Signed-off-by: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/dma-buf/dma-resv.c |  24 ++---
 include/linux/dma-buf.h|   7 +++
 include/linux/dma-resv.h   | 104 +++--
 3 files changed, 124 insertions(+), 11 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index e744fd87c63c..84fbe60629e3 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -48,6 +48,8 @@
  * write operations) or N shared fences (read operations).  The RCU
  * mechanism is used to protect read access to fences from locked
  * write-side updates.
+ *
+ * See struct dma_resv for more details.
  */
 
 DEFINE_WD_CLASS(reservation_ww_class);
@@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini);
  * @num_fences: number of fences we want to add
  *
  * Should be called before dma_resv_add_shared_fence().  Must
- * be called with obj->lock held.
+ * be called with @obj locked through dma_resv_lock().
+ *
+ * Note that the preallocated slots need to be re-reserved if @obj is unlocked
+ * at any time before calling dma_resv_add_shared_fence(). This is validated
+ * when CONFIG_DEBUG_MUTEXES is enabled.
  *
  * RETURNS
  * Zero for success, or -errno
@@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
  * @obj: the reservation object
  * @fence: the shared fence to add
  *
- * Add a fence to a shared slot, obj->lock must be held, and
+ * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
  * dma_resv_reserve_shared() has been called.
+ *
+ * See also &dma_resv.fence for a discussion of the semantics.
  */
 void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
@@ -278,9 +286,11 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
 /**
  * dma_resv_add_excl_fence - Add an exclusive fence.
  * @obj: the reservation object
- * @fence: the shared fence to add
+ * @fence: the exclusive fence to add
  *
- * Add a fence to the exclusive slot.  The obj->lock must be held.
+ * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
+ * Note that this function replaces all fences attached to @obj, see also
+ * &dma_resv.fence_excl for a discussion of the semantics.
  */
 void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
@@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct 
dma_fence *passed_fence)
  * fence
  *
  * Callers are not required to hold specific locks, but maybe hold
- * dma_resv_lock() already
+ * dma_resv_lock() already.
+ *
  * RETURNS
- * true if all fences signaled, else false
+ *
+ * True if all fences signaled, else false.
  */
 bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 {
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 678b2006be78..fc62b5f9980c 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -420,6 +420,13 @@ struct dma_buf {
 * - Dynamic importers should set fences for any access that they can't
 *   disable immediately from their &dma_buf_attach_ops.move_notify
 *   callback.
+*
+* IMPORTANT:
+*
+* All drivers must obey the struct dma_resv rules, specifically the
+* rules for updating fences, see &dma_resv.fence_excl and
+* &dma_resv.fence. If these dependency rules are broken access tracking
+* can be lost resulting in use after free issues.
 */
struct dma_resv *resv;
 
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index e1ca2080a1ff..9100dd3dc21f 100644
--- 

[Intel-gfx] [PATCH v5 19/20] drm/i915: Don't break exclusive fence ordering

2021-08-05 Thread Daniel Vetter
There's only one exclusive slot, and we must not break the ordering.
Adding a new exclusive fence drops all previous fences from the
dma_resv. To avoid violating the signalling order we err on the side of
over-synchronizing by waiting for the existing fences, even if
userspace asked us to ignore them.

A better fix would be to use a dma_fence_chain or _array like e.g.
amdgpu now uses, but it probably makes sense to lift this into
dma-resv.c code as a proper concept, so that drivers don't have to
hack up their own solution each on their own. Hence go with the simple
fix for now.

Another option is the fence import ioctl from Jason:

https://lore.kernel.org/dri-devel/20210610210925.642582-7-ja...@jlekstrand.net/

v2: Improve commit message per Lucas' suggestion.

Cc: Lucas Stach 
Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: "Thomas Hellström" 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 1ed7475de454..25ba2765d27d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2240,6 +2240,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
struct i915_vma *vma = ev->vma;
unsigned int flags = ev->flags;
struct drm_i915_gem_object *obj = vma->obj;
+   bool async, write;
 
assert_vma_held(vma);
 
@@ -2271,7 +2272,10 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
flags &= ~EXEC_OBJECT_ASYNC;
}
 
-   if (err == 0 && !(flags & EXEC_OBJECT_ASYNC)) {
+   async = flags & EXEC_OBJECT_ASYNC;
+   write = flags & EXEC_OBJECT_WRITE;
+
+   if (err == 0 && (!async || write)) {
err = i915_request_await_object
(eb->request, obj, flags & EXEC_OBJECT_WRITE);
}
-- 
2.32.0
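
The changed condition in eb_move_to_gpu() can be read as a small truth table. The following is only a minimal sketch of that predicate, not the driver code: the SKETCH_* flag values and the helper name are made up for illustration, and the real EXEC_OBJECT_* constants come from the i915 uAPI header.

```c
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
#define SKETCH_OBJECT_WRITE (1u << 0)
#define SKETCH_OBJECT_ASYNC (1u << 1)

/*
 * Mirrors the patched condition (!async || write): an object is awaited
 * unless userspace asked for async access AND the access is read-only.
 * A write installs a new exclusive fence, so it is always ordered.
 */
static bool sketch_must_await(unsigned int flags)
{
	bool async = flags & SKETCH_OBJECT_ASYNC;
	bool write = flags & SKETCH_OBJECT_WRITE;

	return !async || write;
}
```

Only the async read-only combination skips the wait; before the patch, an async write skipped it as well.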



[Intel-gfx] [PATCH v5 12/20] drm/msm: Use scheduler dependency handling

2021-08-05 Thread Daniel Vetter
drm_sched_job_init is already at the right place, so this boils down
to deleting code.

Signed-off-by: Daniel Vetter 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/msm/msm_gem.h|  5 -
 drivers/gpu/drm/msm/msm_gem_submit.c | 19 +--
 drivers/gpu/drm/msm/msm_ringbuffer.c | 12 
 3 files changed, 5 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index f9e3ffb2309a..8bf0ac707fd7 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -312,11 +312,6 @@ struct msm_gem_submit {
struct ww_acquire_ctx ticket;
uint32_t seqno; /* Sequence number of the submit on the ring */
 
-   /* Array of struct dma_fence * to block on before submitting this job.
-*/
-   struct xarray deps;
-   unsigned long last_dep;
-
/* Hw fence, which is created when the scheduler executes the job, and
 * is signaled when the hw finishes (via seqno write from cmdstream)
 */
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 96cea0ba4cfd..fb5a2eab27a2 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -52,8 +52,6 @@ static struct msm_gem_submit *submit_create(struct drm_device 
*dev,
return ERR_PTR(ret);
}
 
-   xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
-
kref_init(&submit->ref);
submit->dev = dev;
submit->aspace = queue->ctx->aspace;
@@ -72,8 +70,6 @@ void __msm_gem_submit_destroy(struct kref *kref)
 {
struct msm_gem_submit *submit =
container_of(kref, struct msm_gem_submit, ref);
-   unsigned long index;
-   struct dma_fence *fence;
unsigned i;
 
if (submit->fence_id) {
@@ -82,12 +78,6 @@ void __msm_gem_submit_destroy(struct kref *kref)
mutex_unlock(&submit->queue->lock);
}
 
-   xa_for_each (&submit->deps, index, fence) {
-   dma_fence_put(fence);
-   }
-
-   xa_destroy(&submit->deps);
-
dma_fence_put(submit->user_fence);
dma_fence_put(submit->hw_fence);
 
@@ -343,8 +333,9 @@ static int submit_fence_sync(struct msm_gem_submit *submit, 
bool no_implicit)
if (no_implicit)
continue;
 
-   ret = drm_gem_fence_array_add_implicit(&submit->deps, obj,
-   write);
+   ret = drm_sched_job_add_implicit_dependencies(&submit->base,
+ obj,
+ write);
if (ret)
break;
}
@@ -588,7 +579,7 @@ static struct drm_syncobj **msm_parse_deps(struct 
msm_gem_submit *submit,
if (ret)
break;
 
-   ret = drm_gem_fence_array_add(&submit->deps, fence);
+   ret = drm_sched_job_add_dependency(&submit->base, fence);
if (ret)
break;
 
@@ -798,7 +789,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
goto out_unlock;
}
 
-   ret = drm_gem_fence_array_add(&submit->deps, in_fence);
+   ret = drm_sched_job_add_dependency(&submit->base, in_fence);
if (ret)
goto out_unlock;
}
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c 
b/drivers/gpu/drm/msm/msm_ringbuffer.c
index bd54c1412649..652b1dedd7c1 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -11,17 +11,6 @@ static uint num_hw_submissions = 8;
 MODULE_PARM_DESC(num_hw_submissions, "The max # of jobs to write into 
ringbuffer (default 8)");
 module_param(num_hw_submissions, uint, 0600);
 
-static struct dma_fence *msm_job_dependency(struct drm_sched_job *job,
-   struct drm_sched_entity *s_entity)
-{
-   struct msm_gem_submit *submit = to_msm_submit(job);
-
-   if (!xa_empty(&submit->deps))
-   return xa_erase(&submit->deps, submit->last_dep++);
-
-   return NULL;
-}
-
 static struct dma_fence *msm_job_run(struct drm_sched_job *job)
 {
struct msm_gem_submit *submit = to_msm_submit(job);
@@ -52,7 +41,6 @@ static void msm_job_free(struct drm_sched_job *job)
 }
 
 const struct drm_sched_backend_ops msm_sched_ops = {
-   .dependency = msm_job_dependency,
.run_job = msm_job_run,
.free_job = msm_job_free
 };
-- 
2.32.0



[Intel-gfx] [PATCH v5 18/20] drm/i915: delete exclude argument from i915_sw_fence_await_reservation

2021-08-05 Thread Daniel Vetter
No longer used, the last user disappeared with

commit d07f0e59b2c762584478920cd2d11fba2980a94a
Author: Chris Wilson 
Date:   Fri Oct 28 13:58:44 2016 +0100

drm/i915: Move GEM activity tracking into a common struct reservation_object

Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: "Thomas Hellström" 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/display/intel_display.c | 4 ++--
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c  | 2 +-
 drivers/gpu/drm/i915/i915_sw_fence.c | 6 +-
 drivers/gpu/drm/i915/i915_sw_fence.h | 1 -
 4 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 86b86deca701..0ec736026132 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -11248,7 +11248,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
 */
if (intel_crtc_needs_modeset(crtc_state)) {
ret = 
i915_sw_fence_await_reservation(&state->commit_ready,
- 
old_obj->base.resv, NULL,
+ 
old_obj->base.resv,
  false, 0,
  GFP_KERNEL);
if (ret < 0)
@@ -11282,7 +11282,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
struct dma_fence *fence;
 
ret = i915_sw_fence_await_reservation(&state->commit_ready,
- obj->base.resv, NULL,
+ obj->base.resv,
  false,
  
i915_fence_timeout(dev_priv),
  GFP_KERNEL);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index f0435c6feb68..fde88fa90780 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -104,7 +104,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object 
*obj,
clflush = clflush_work_create(obj);
if (clflush) {
i915_sw_fence_await_reservation(&clflush->base.chain,
-   obj->base.resv, NULL, true,
+   obj->base.resv, true,

i915_fence_timeout(to_i915(obj->base.dev)),
I915_FENCE_GFP);
dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index c589a681da77..91711a46b1c7 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -567,7 +567,6 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
 
 int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
struct dma_resv *resv,
-   const struct dma_fence_ops *exclude,
bool write,
unsigned long timeout,
gfp_t gfp)
@@ -587,9 +586,6 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
return ret;
 
for (i = 0; i < count; i++) {
-   if (shared[i]->ops == exclude)
-   continue;
-
pending = i915_sw_fence_await_dma_fence(fence,
shared[i],
timeout,
@@ -609,7 +605,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
excl = dma_resv_get_excl_unlocked(resv);
}
 
-   if (ret >= 0 && excl && excl->ops != exclude) {
+   if (ret >= 0 && excl) {
pending = i915_sw_fence_await_dma_fence(fence,
excl,
timeout,
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
b/drivers/gpu/drm/i915/i915_sw_fence.h
index 30a863353ee6..6572f01668e4 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -86,7 +86,6 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
 
 int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
struct dma_resv *resv,
-   const struct dma_fence_ops *exclude,
bool write,
unsigned long 

[Intel-gfx] [PATCH v5 16/20] drm/msm: Don't break exclusive fence ordering

2021-08-05 Thread Daniel Vetter
There's only one exclusive slot, and we must not break the ordering.

Adding a new exclusive fence drops all previous fences from the
dma_resv. To avoid violating the signalling order we err on the side of
over-synchronizing by waiting for the existing fences, even if
userspace asked us to ignore them.

A better fix would be to use a dma_fence_chain or _array like e.g.
amdgpu now uses, but
- msm has a synchronous dma_fence_wait for anything from another
  context, so doesn't seem to care much,
- and it probably makes sense to lift this into dma-resv.c code as a
  proper concept, so that drivers don't have to hack up their own
  solution each on their own.

v2: Improve commit message per Lucas' suggestion.

Cc: Lucas Stach 
Signed-off-by: Daniel Vetter 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index fb5a2eab27a2..66633dfd58a2 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -330,7 +330,8 @@ static int submit_fence_sync(struct msm_gem_submit *submit, 
bool no_implicit)
return ret;
}
 
-   if (no_implicit)
+   /* exclusive fences must be ordered */
+   if (no_implicit && !write)
continue;
 
ret = drm_sched_job_add_implicit_dependencies(&submit->base,
-- 
2.32.0



[Intel-gfx] [PATCH v5 17/20] drm/etnaviv: Don't break exclusive fence ordering

2021-08-05 Thread Daniel Vetter
There's only one exclusive slot, and we must not break the ordering.
Adding a new exclusive fence drops all previous fences from the
dma_resv. To avoid violating the signalling order we err on the side of
over-synchronizing by waiting for the existing fences, even if
userspace asked us to ignore them.

A better fix would be to use a dma_fence_chain or _array like e.g.
amdgpu now uses, but it probably makes sense to lift this into
dma-resv.c code as a proper concept, so that drivers don't have to
hack up their own solution each on their own. Hence go with the simple
fix for now.

Another option is the fence import ioctl from Jason:

https://lore.kernel.org/dri-devel/20210610210925.642582-7-ja...@jlekstrand.net/

v2: Improve commit message per Lucas' suggestion.

Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: etna...@lists.freedesktop.org
---
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index e3d43678eb09..8d1703da971a 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -178,19 +178,21 @@ static int submit_fence_sync(struct etnaviv_gem_submit 
*submit)
for (i = 0; i < submit->nr_bos; i++) {
struct etnaviv_gem_submit_bo *bo = &submit->bos[i];
struct dma_resv *robj = bo->obj->base.resv;
+   bool write = bo->flags & ETNA_SUBMIT_BO_WRITE;
 
-   if (!(bo->flags & ETNA_SUBMIT_BO_WRITE)) {
+   if (!(write)) {
ret = dma_resv_reserve_shared(robj, 1);
if (ret)
return ret;
}
 
-   if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
+   /* exclusive fences must be ordered */
+   if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT && !write)
continue;
 
ret = 
drm_sched_job_add_implicit_dependencies(&submit->sched_job,
  &bo->obj->base,
- bo->flags & 
ETNA_SUBMIT_BO_WRITE);
+ write);
if (ret)
return ret;
}
-- 
2.32.0



[Intel-gfx] [PATCH v5 15/20] drm/sched: Check locking in drm_sched_job_await_implicit

2021-08-05 Thread Daniel Vetter
You really need to hold the reservation here or all kinds of funny
things can happen between grabbing the dependencies and inserting the
new fences.

Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 
---
 drivers/gpu/drm/scheduler/sched_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 49e507f91ec0..1abb40b07324 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -715,6 +715,8 @@ int drm_sched_job_add_implicit_dependencies(struct 
drm_sched_job *job,
struct dma_fence **fences;
unsigned int i, fence_count;
 
+   dma_resv_assert_held(obj->resv);
+
if (!write) {
struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
 
-- 
2.32.0



[Intel-gfx] [PATCH v5 13/20] drm/gem: Delete gem array fencing helpers

2021-08-05 Thread Daniel Vetter
Integrated into the scheduler now and all users converted over.

Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/drm_gem.c | 96 ---
 include/drm/drm_gem.h |  5 --
 2 files changed, 101 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 09c820045859..37e2e2820f08 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1272,99 +1272,3 @@ drm_gem_unlock_reservations(struct drm_gem_object 
**objs, int count,
ww_acquire_fini(acquire_ctx);
 }
 EXPORT_SYMBOL(drm_gem_unlock_reservations);
-
-/**
- * drm_gem_fence_array_add - Adds the fence to an array of fences to be
- * waited on, deduplicating fences from the same context.
- *
- * @fence_array: array of dma_fence * for the job to block on.
- * @fence: the dma_fence to add to the list of dependencies.
- *
- * This functions consumes the reference for @fence both on success and error
- * cases.
- *
- * Returns:
- * 0 on success, or an error on failing to expand the array.
- */
-int drm_gem_fence_array_add(struct xarray *fence_array,
-   struct dma_fence *fence)
-{
-   struct dma_fence *entry;
-   unsigned long index;
-   u32 id = 0;
-   int ret;
-
-   if (!fence)
-   return 0;
-
-   /* Deduplicate if we already depend on a fence from the same context.
-* This lets the size of the array of deps scale with the number of
-* engines involved, rather than the number of BOs.
-*/
-   xa_for_each(fence_array, index, entry) {
-   if (entry->context != fence->context)
-   continue;
-
-   if (dma_fence_is_later(fence, entry)) {
-   dma_fence_put(entry);
-   xa_store(fence_array, index, fence, GFP_KERNEL);
-   } else {
-   dma_fence_put(fence);
-   }
-   return 0;
-   }
-
-   ret = xa_alloc(fence_array, &id, fence, xa_limit_32b, GFP_KERNEL);
-   if (ret != 0)
-   dma_fence_put(fence);
-
-   return ret;
-}
-EXPORT_SYMBOL(drm_gem_fence_array_add);
-
-/**
- * drm_gem_fence_array_add_implicit - Adds the implicit dependencies tracked
- * in the GEM object's reservation object to an array of dma_fences for use in
- * scheduling a rendering job.
- *
- * This should be called after drm_gem_lock_reservations() on your array of
- * GEM objects used in the job but before updating the reservations with your
- * own fences.
- *
- * @fence_array: array of dma_fence * for the job to block on.
- * @obj: the gem object to add new dependencies from.
- * @write: whether the job might write the object (so we need to depend on
- * shared fences in the reservation object).
- */
-int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
-struct drm_gem_object *obj,
-bool write)
-{
-   int ret;
-   struct dma_fence **fences;
-   unsigned int i, fence_count;
-
-   if (!write) {
-   struct dma_fence *fence =
-   dma_resv_get_excl_unlocked(obj->resv);
-
-   return drm_gem_fence_array_add(fence_array, fence);
-   }
-
-   ret = dma_resv_get_fences(obj->resv, NULL,
-   &fence_count, &fences);
-   if (ret || !fence_count)
-   return ret;
-
-   for (i = 0; i < fence_count; i++) {
-   ret = drm_gem_fence_array_add(fence_array, fences[i]);
-   if (ret)
-   break;
-   }
-
-   for (; i < fence_count; i++)
-   dma_fence_put(fences[i]);
-   kfree(fences);
-   return ret;
-}
-EXPORT_SYMBOL(drm_gem_fence_array_add_implicit);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 35e7f44c2a75..e55a767188af 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -407,11 +407,6 @@ int drm_gem_lock_reservations(struct drm_gem_object 
**objs, int count,
  struct ww_acquire_ctx *acquire_ctx);
 void drm_gem_unlock_reservations(struct drm_gem_object **objs, int count,
 struct ww_acquire_ctx *acquire_ctx);
-int drm_gem_fence_array_add(struct xarray *fence_array,
-   struct dma_fence *fence);
-int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
-struct drm_gem_object *obj,
-bool write);
 int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
u32 handle, u64 *offset);
 
-- 
2.32.0
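
The per-context deduplication done by the removed drm_gem_fence_array_add() helper can be sketched in plain userspace C. This is a simplified model under stated assumptions, not the kernel implementation: the sketch_* types, the fixed-size array and the plain seqno comparison are all stand-ins. The real helper stores fences in an xarray, compares them with dma_fence_is_later(), and drops a fence reference on every path.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct dma_fence: only the fields the dedup logic reads. */
struct sketch_fence {
	uint64_t context;
	uint64_t seqno;
};

#define SKETCH_MAX_DEPS 8

/* Stand-in for the xarray of dependencies. */
struct sketch_deps {
	struct sketch_fence slots[SKETCH_MAX_DEPS];
	size_t count;
};

/*
 * Add a dependency, keeping at most one fence per context.
 * Returns 0 on success, -1 if the array is full.
 */
static int sketch_deps_add(struct sketch_deps *deps, struct sketch_fence fence)
{
	for (size_t i = 0; i < deps->count; i++) {
		if (deps->slots[i].context != fence.context)
			continue;

		/* Same context: only the later fence needs to be waited on. */
		if (fence.seqno > deps->slots[i].seqno)
			deps->slots[i] = fence;
		return 0;
	}

	if (deps->count == SKETCH_MAX_DEPS)
		return -1;

	deps->slots[deps->count++] = fence;
	return 0;
}
```

As the comment in the removed code notes, this keeps the number of stored dependencies proportional to the number of engines involved rather than the number of buffer objects.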



[Intel-gfx] [PATCH v5 14/20] drm/sched: Don't store self-dependencies

2021-08-05 Thread Daniel Vetter
This is essentially part of drm_sched_dependency_optimized(), which
only amdgpu seems to make use of. Use it a bit more.

This would mean that as-is amdgpu can't use the dependency helpers, at
least not with the current approach amdgpu has for deciding whether a
vm_flush is needed. Since amdgpu also has very special rules around
implicit fencing it can't use those helpers either, and adding a
drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too
onerous. That way the special case handling for amdgpu sticks out even
more and we have higher chances that reviewers who go across all
drivers won't miss it.
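
As a rough illustration of the check this patch adds, here is a minimal
userspace sketch. The toy_* types and names are hypothetical stand-ins for
struct dma_fence and struct drm_sched_entity, not kernel API. It assumes
that an entity owns two consecutive fence contexts, one for its scheduled
fences and the next one for its finished fences, which is what the
"fence_context + 1" comparison in the hunk below relies on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct dma_fence: only the timeline id. */
struct toy_fence {
	uint64_t context;
};

/* Hypothetical stand-in for struct drm_sched_entity. */
struct toy_entity {
	uint64_t fence_context;	/* first of two consecutive contexts */
};

/*
 * Returns true when the fence comes from one of the entity's own two
 * contexts. Such a fence belongs to an earlier job of the same entity,
 * so a new job does not need to store it as a dependency.
 */
static bool toy_is_self_dependency(const struct toy_entity *entity,
				   const struct toy_fence *fence)
{
	return fence->context == entity->fence_context ||
	       fence->context == entity->fence_context + 1;
}
```

In the real helper the caller's fence reference is also dropped with
dma_fence_put() before returning 0; the sketch leaves refcounting out.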

Reviewed-by: Lucas Stach 
Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 
---
 drivers/gpu/drm/scheduler/sched_main.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index f77456929139..49e507f91ec0 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -660,6 +660,13 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
if (!fence)
return 0;
 
+   /* if it's a fence from us it's guaranteed to be earlier */
+   if (fence->context == job->entity->fence_context ||
+   fence->context == job->entity->fence_context + 1) {
+   dma_fence_put(fence);
+   return 0;
+   }
+
/* Deduplicate if we already depend on a fence from the same context.
 * This lets the size of the array of deps scale with the number of
 * engines involved, rather than the number of BOs.
-- 
2.32.0



[Intel-gfx] [PATCH v5 11/20] drm/etnaviv: Use scheduler dependency handling

2021-08-05 Thread Daniel Vetter
We need to pull the drm_sched_job_init much earlier, but that's very
minor surgery.

v2: Actually fix up cleanup paths by calling drm_sched_job_init, which
I wanted to do in the previous round (and did, for all other drivers).
Spotted by Lucas.

v3: Rebase over renamed functions to add dependencies.

Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: etna...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.h|  5 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 60 ++-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 63 +---
 drivers/gpu/drm/etnaviv/etnaviv_sched.h  |  3 +-
 4 files changed, 37 insertions(+), 94 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
index 98e60df882b6..63688e6e4580 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
@@ -80,9 +80,6 @@ struct etnaviv_gem_submit_bo {
u64 va;
struct etnaviv_gem_object *obj;
struct etnaviv_vram_mapping *mapping;
-   struct dma_fence *excl;
-   unsigned int nr_shared;
-   struct dma_fence **shared;
 };
 
 /* Created per submit-ioctl, to track bo's and cmdstream bufs, etc,
@@ -95,7 +92,7 @@ struct etnaviv_gem_submit {
struct etnaviv_file_private *ctx;
struct etnaviv_gpu *gpu;
struct etnaviv_iommu_context *mmu_context, *prev_mmu_context;
-   struct dma_fence *out_fence, *in_fence;
+   struct dma_fence *out_fence;
int out_fence_id;
struct list_head node; /* GPU active submit list */
struct etnaviv_cmdbuf cmdbuf;
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 4dd7d9d541c0..e3d43678eb09 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -188,16 +188,11 @@ static int submit_fence_sync(struct etnaviv_gem_submit 
*submit)
if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
continue;
 
-   if (bo->flags & ETNA_SUBMIT_BO_WRITE) {
-   ret = dma_resv_get_fences(robj, &bo->excl,
- &bo->nr_shared,
- &bo->shared);
-   if (ret)
-   return ret;
-   } else {
-   bo->excl = dma_resv_get_excl_unlocked(robj);
-   }
-
+   ret = drm_sched_job_add_implicit_dependencies(&submit->sched_job,
+ &bo->obj->base,
+ bo->flags & ETNA_SUBMIT_BO_WRITE);
+   if (ret)
+   return ret;
}
 
return ret;
@@ -403,8 +398,6 @@ static void submit_cleanup(struct kref *kref)
 
wake_up_all(>gpu->fence_event);
 
-   if (submit->in_fence)
-   dma_fence_put(submit->in_fence);
if (submit->out_fence) {
/* first remove from IDR, so fence can not be found anymore */
mutex_lock(>gpu->fence_lock);
@@ -529,7 +522,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void 
*data,
ret = etnaviv_cmdbuf_init(priv->cmdbuf_suballoc, >cmdbuf,
  ALIGN(args->stream_size, 8) + 8);
if (ret)
-   goto err_submit_objects;
+   goto err_submit_put;
 
submit->ctx = file->driver_priv;
etnaviv_iommu_context_get(submit->ctx->mmu);
@@ -537,51 +530,62 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void 
*data,
submit->exec_state = args->exec_state;
submit->flags = args->flags;
 
+   ret = drm_sched_job_init(>sched_job,
+>sched_entity[args->pipe],
+submit->ctx);
+   if (ret)
+   goto err_submit_put;
+
ret = submit_lookup_objects(submit, file, bos, args->nr_bos);
if (ret)
-   goto err_submit_objects;
+   goto err_submit_job;
 
if ((priv->mmu_global->version != ETNAVIV_IOMMU_V2) &&
!etnaviv_cmd_validate_one(gpu, stream, args->stream_size / 4,
  relocs, args->nr_relocs)) {
ret = -EINVAL;
-   goto err_submit_objects;
+   goto err_submit_job;
}
 
if (args->flags & ETNA_SUBMIT_FENCE_FD_IN) {
-   submit->in_fence = sync_file_get_fence(args->fence_fd);
-   if (!submit->in_fence) {
+   struct dma_fence *in_fence = 
sync_file_get_fence(args->fence_fd);
+   if (!in_fence) {
ret = -EINVAL;
-   goto err_submit_objects;
+

[Intel-gfx] [PATCH v5 10/20] drm/v3d: Use scheduler dependency handling

2021-08-05 Thread Daniel Vetter
With the prep work out of the way this isn't tricky anymore.

Aside: The chaining of the various jobs is a bit awkward, with the
possibility of failure in bad places. I think with the
drm_sched_job_init/arm split and maybe preloading the
job->dependencies xarray this should be fixable.

v2: Rebase over renamed function names for adding dependencies.

Reviewed-by: Melissa Wen  (v1)
Acked-by: Emma Anholt 
Cc: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: Emma Anholt 
---
 drivers/gpu/drm/v3d/v3d_drv.h   |  5 -
 drivers/gpu/drm/v3d/v3d_gem.c   | 26 +-
 drivers/gpu/drm/v3d/v3d_sched.c | 29 +
 3 files changed, 10 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index c1d433b4cf93..b900a050d5e2 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -234,11 +234,6 @@ struct v3d_job {
struct drm_gem_object **bo;
u32 bo_count;
 
-   /* Array of struct dma_fence * to block on before submitting this job.
-*/
-   struct xarray deps;
-   unsigned long last_dep;
-
/* v3d fence to be signaled by IRQ handler when the job is complete. */
struct dma_fence *irq_fence;
 
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 42587248c54e..a3529809d547 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -259,8 +259,8 @@ v3d_lock_bo_reservations(struct v3d_job *job,
return ret;
 
for (i = 0; i < job->bo_count; i++) {
-   ret = drm_gem_fence_array_add_implicit(&job->deps,
-  job->bo[i], true);
+   ret = drm_sched_job_add_implicit_dependencies(&job->base,
+ job->bo[i], true);
if (ret) {
drm_gem_unlock_reservations(job->bo, job->bo_count,
acquire_ctx);
@@ -356,8 +356,6 @@ static void
 v3d_job_free(struct kref *ref)
 {
struct v3d_job *job = container_of(ref, struct v3d_job, refcount);
-   unsigned long index;
-   struct dma_fence *fence;
int i;
 
for (i = 0; i < job->bo_count; i++) {
@@ -366,11 +364,6 @@ v3d_job_free(struct kref *ref)
}
kvfree(job->bo);
 
-   xa_for_each(&job->deps, index, fence) {
-   dma_fence_put(fence);
-   }
-   xa_destroy(&job->deps);
-
dma_fence_put(job->irq_fence);
dma_fence_put(job->done_fence);
 
@@ -457,7 +450,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
*file_priv,
if (ret < 0)
return ret;
 
-   xa_init_flags(>deps, XA_FLAGS_ALLOC);
ret = drm_sched_job_init(>base, _priv->sched_entity[queue],
 v3d_priv);
if (ret)
@@ -467,7 +459,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
*file_priv,
if (ret == -EINVAL)
goto fail_job;
 
-   ret = drm_gem_fence_array_add(&job->deps, in_fence);
+   ret = drm_sched_job_add_dependency(&job->base, in_fence);
if (ret)
goto fail_job;
 
@@ -477,7 +469,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
*file_priv,
 fail_job:
drm_sched_job_cleanup(>base);
 fail:
-   xa_destroy(>deps);
pm_runtime_put_autosuspend(v3d->drm.dev);
return ret;
 }
@@ -640,8 +631,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
v3d_perfmon_get(bin->base.perfmon);
v3d_push_job(>base);
 
-   ret = drm_gem_fence_array_add(>base.deps,
- 
dma_fence_get(bin->base.done_fence));
+   ret = drm_sched_job_add_dependency(>base.base,
+  
dma_fence_get(bin->base.done_fence));
if (ret)
goto fail_unreserve;
}
@@ -651,7 +642,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
if (clean_job) {
struct dma_fence *render_fence =
dma_fence_get(render->base.done_fence);
-   ret = drm_gem_fence_array_add(_job->deps, render_fence);
+   ret = drm_sched_job_add_dependency(_job->base,
+  render_fence);
if (ret)
goto fail_unreserve;
clean_job->perfmon = render->base.perfmon;
@@ -853,8 +845,8 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
mutex_lock(>sched_lock);
v3d_push_job(>base);
 
-   ret = drm_gem_fence_array_add(_job->deps,
- dma_fence_get(job->base.done_fence));
+   ret = drm_sched_job_add_dependency(_job->base,
+  dma_fence_get(job->base.done_fence));
if (ret)
goto 

[Intel-gfx] [PATCH v5 08/20] drm/lima: use scheduler dependency tracking

2021-08-05 Thread Daniel Vetter
Nothing special going on here.

Aside: reviewing the code, it seems like drm_sched_job_arm() should be
moved into lima_sched_context_queue_task and put under some mutex
together with drm_sched_push_job(). See the kerneldoc for
drm_sched_push_job().

v2: Rebase over renamed functions to add dependencies.

Signed-off-by: Daniel Vetter 
Cc: Qiang Yu 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/lima/lima_gem.c   |  6 --
 drivers/gpu/drm/lima/lima_sched.c | 21 -
 drivers/gpu/drm/lima/lima_sched.h |  3 ---
 3 files changed, 4 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index c528f40981bb..640acc060467 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -267,7 +267,9 @@ static int lima_gem_sync_bo(struct lima_sched_task *task, 
struct lima_bo *bo,
if (explicit)
return 0;
 
-   return drm_gem_fence_array_add_implicit(&task->deps, &bo->base.base, write);
+   return drm_sched_job_add_implicit_dependencies(&task->base,
+  &bo->base.base,
+  write);
 }
 
 static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit)
@@ -285,7 +287,7 @@ static int lima_gem_add_deps(struct drm_file *file, struct 
lima_submit *submit)
if (err)
return err;
 
-   err = drm_gem_fence_array_add(>task->deps, fence);
+   err = drm_sched_job_add_dependency(>task->base, fence);
if (err) {
dma_fence_put(fence);
return err;
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index e968b5a8f0b0..99d5f6f1a882 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -134,24 +134,15 @@ int lima_sched_task_init(struct lima_sched_task *task,
task->num_bos = num_bos;
task->vm = lima_vm_get(vm);
 
-   xa_init_flags(>deps, XA_FLAGS_ALLOC);
-
return 0;
 }
 
 void lima_sched_task_fini(struct lima_sched_task *task)
 {
-   struct dma_fence *fence;
-   unsigned long index;
int i;
 
drm_sched_job_cleanup(>base);
 
-   xa_for_each(>deps, index, fence) {
-   dma_fence_put(fence);
-   }
-   xa_destroy(>deps);
-
if (task->bos) {
for (i = 0; i < task->num_bos; i++)
drm_gem_object_put(>bos[i]->base.base);
@@ -186,17 +177,6 @@ struct dma_fence *lima_sched_context_queue_task(struct 
lima_sched_task *task)
return fence;
 }
 
-static struct dma_fence *lima_sched_dependency(struct drm_sched_job *job,
-  struct drm_sched_entity *entity)
-{
-   struct lima_sched_task *task = to_lima_task(job);
-
-   if (!xa_empty(>deps))
-   return xa_erase(>deps, task->last_dep++);
-
-   return NULL;
-}
-
 static int lima_pm_busy(struct lima_device *ldev)
 {
int ret;
@@ -472,7 +452,6 @@ static void lima_sched_free_job(struct drm_sched_job *job)
 }
 
 static const struct drm_sched_backend_ops lima_sched_ops = {
-   .dependency = lima_sched_dependency,
.run_job = lima_sched_run_job,
.timedout_job = lima_sched_timedout_job,
.free_job = lima_sched_free_job,
diff --git a/drivers/gpu/drm/lima/lima_sched.h 
b/drivers/gpu/drm/lima/lima_sched.h
index ac70006b0e26..6a11764d87b3 100644
--- a/drivers/gpu/drm/lima/lima_sched.h
+++ b/drivers/gpu/drm/lima/lima_sched.h
@@ -23,9 +23,6 @@ struct lima_sched_task {
struct lima_vm *vm;
void *frame;
 
-   struct xarray deps;
-   unsigned long last_dep;
-
struct lima_bo **bos;
int num_bos;
 
-- 
2.32.0



[Intel-gfx] [PATCH v5 06/20] drm/sched: improve docs around drm_sched_entity

2021-08-05 Thread Daniel Vetter
I found a few too many things that are tricky and not documented, so I
started typing.

I found a few more things that looked broken while typing, see the
various FIXMEs in drm_sched_entity.

Also some of the usual logics:
- actually include sched_entity.c declarations, that was lost in the
  move here: 620e762f9a98 ("drm/scheduler: move entity handling into
  separate file")

- Ditch the kerneldoc for internal functions, keep the comments where
  they're describing more than what the function name already implies.

- Switch drm_sched_entity to inline docs.

Acked-by: Melissa Wen 
Reviewed-by: Boris Brezillon  (v1)
Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: "Christian König" 
Cc: Boris Brezillon 
Cc: Steven Price 
Cc: Emma Anholt 
Cc: Lee Jones 
Cc: Andrey Grodzovsky 
---
 Documentation/gpu/drm-mm.rst |   3 +
 drivers/gpu/drm/scheduler/sched_entity.c |  85 -
 include/drm/gpu_scheduler.h  | 145 ++-
 3 files changed, 146 insertions(+), 87 deletions(-)

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index d5a73fa2c9ef..0198fa43d254 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -504,3 +504,6 @@ Scheduler Function References
 
 .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
:export:
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_entity.c
+   :export:
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index e4d33db1eb45..27e1573af96e 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -45,8 +45,14 @@
  * @guilty: atomic_t set to 1 when a job on this queue
  *  is found to be guilty causing a timeout
  *
- * Note: the sched_list should have at least one element to schedule
- *   the entity
+ * Note that the &sched_list must have at least one element to schedule the entity.
+ *
+ * For changing @priority later on at runtime see
+ * drm_sched_entity_set_priority(). For changing the set of schedulers
+ * @sched_list at runtime see drm_sched_entity_modify_sched().
+ *
+ * An entity is cleaned up by calling drm_sched_entity_fini(). See also
+ * drm_sched_entity_destroy().
  *
  * Returns 0 on success or a negative error code on failure.
  */
@@ -92,6 +98,11 @@ EXPORT_SYMBOL(drm_sched_entity_init);
  * @sched_list: the list of new drm scheds which will replace
  *  existing entity->sched_list
  * @num_sched_list: number of drm sched in sched_list
+ *
+ * Note that this must be called under the same common lock for @entity as
+ * drm_sched_job_arm() and drm_sched_entity_push_job(), or the driver needs to
+ * guarantee through some other means that this is never called while new jobs
+ * can be pushed to @entity.
  */
 void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
struct drm_gpu_scheduler **sched_list,
@@ -104,13 +115,6 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity 
*entity,
 }
 EXPORT_SYMBOL(drm_sched_entity_modify_sched);
 
-/**
- * drm_sched_entity_is_idle - Check if entity is idle
- *
- * @entity: scheduler entity
- *
- * Returns true if the entity does not have any unscheduled jobs.
- */
 static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
 {
rmb(); /* for list_empty to work without lock */
@@ -123,13 +127,7 @@ static bool drm_sched_entity_is_idle(struct 
drm_sched_entity *entity)
return false;
 }
 
-/**
- * drm_sched_entity_is_ready - Check if entity is ready
- *
- * @entity: scheduler entity
- *
- * Return true if entity could provide a job.
- */
+/* Return true if entity could provide a job. */
 bool drm_sched_entity_is_ready(struct drm_sched_entity *entity)
 {
if (spsc_queue_peek(>job_queue) == NULL)
@@ -192,14 +190,7 @@ long drm_sched_entity_flush(struct drm_sched_entity 
*entity, long timeout)
 }
 EXPORT_SYMBOL(drm_sched_entity_flush);
 
-/**
- * drm_sched_entity_kill_jobs_cb - helper for drm_sched_entity_kill_jobs
- *
- * @f: signaled fence
- * @cb: our callback structure
- *
- * Signal the scheduler finished fence when the entity in question is killed.
- */
+/* Signal the scheduler finished fence when the entity in question is killed. */
 static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
  struct dma_fence_cb *cb)
 {
@@ -224,14 +215,6 @@ drm_sched_job_dependency(struct drm_sched_job *job,
return NULL;
 }
 
-/**
- * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
- *
- * @entity: entity which is cleaned up
- *
- * Makes sure that all remaining jobs in an entity are killed before it is
- * destroyed.
- */
 static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
 {
struct drm_sched_job *job;
@@ -273,9 +256,11 @@ static void 

[Intel-gfx] [PATCH v5 07/20] drm/panfrost: use scheduler dependency tracking

2021-08-05 Thread Daniel Vetter
Just deletes some code that's now more shared.

Note that thanks to the split into drm_sched_job_init/arm we can now
easily pull the _init() part from under the submission lock way ahead
where we're adding the sync file in-fences as dependencies.
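
The ordering described above can be sketched in plain C. This is a toy
model only: the toy_* names, the state enum and the fail_at parameter are
made up for illustration and do not exist in the kernel. The point is that
everything that can fail (init, adding dependencies) happens first and is
unwound with a cleanup call, while arm and push come last, where a driver
would hold its submission lock.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical job states; the real drm_sched_job has no such field. */
enum toy_state { TOY_NEW, TOY_INITIALIZED, TOY_ARMED, TOY_PUSHED, TOY_CLEANED };

struct toy_job {
	enum toy_state state;
	int ndeps;
};

/* Stand-in for drm_sched_job_init(): may fail, can be undone. */
static int toy_job_init(struct toy_job *job)
{
	job->state = TOY_INITIALIZED;
	job->ndeps = 0;
	return 0;
}

/* Stand-in for drm_sched_job_add_dependency(): may fail. */
static int toy_job_add_dependency(struct toy_job *job, bool fail)
{
	if (fail)
		return -EINVAL;
	job->ndeps++;
	return 0;
}

/* Stand-in for drm_sched_job_cleanup(): unwinds a job that was never armed. */
static void toy_job_cleanup(struct toy_job *job)
{
	job->state = TOY_CLEANED;
}

/*
 * Submit a job with ndeps dependencies. fail_at is the index of the
 * dependency that fails to be added, or -1 if none fails.
 */
static int toy_submit(struct toy_job *job, int ndeps, int fail_at)
{
	int i, ret;

	ret = toy_job_init(job);
	if (ret)
		return ret;

	for (i = 0; i < ndeps; i++) {
		ret = toy_job_add_dependency(job, i == fail_at);
		if (ret)
			goto fail_job;
	}

	/* A driver would take its submission lock around these two steps. */
	job->state = TOY_ARMED;		/* like drm_sched_job_arm() */
	job->state = TOY_PUSHED;	/* like drm_sched_entity_push_job() */
	return 0;

fail_job:
	toy_job_cleanup(job);
	return ret;
}
```

The separate fail_job label mirrors the fail_job/fail_job_put split in the
hunk below: a job that got through init needs the extra cleanup step before
its reference is dropped.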

v2: Correctly clean up the partially set up job, now that job_init()
and job_arm() are apart (Emma).

v3: Rebased over renamed functions for adding dependencies

Acked-by: Emma Anholt 
Reviewed-by: Steven Price  (v3)
Signed-off-by: Daniel Vetter 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Emma Anholt 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 16 ---
 drivers/gpu/drm/panfrost/panfrost_job.c | 38 -
 drivers/gpu/drm/panfrost/panfrost_job.h |  5 +---
 3 files changed, 18 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 1ffaef5ec5ff..16212b6b202e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -218,7 +218,7 @@ panfrost_copy_in_sync(struct drm_device *dev,
if (ret)
goto fail;
 
-   ret = drm_gem_fence_array_add(>deps, fence);
+   ret = drm_sched_job_add_dependency(>base, fence);
 
if (ret)
goto fail;
@@ -236,7 +236,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
struct drm_panfrost_submit *args = data;
struct drm_syncobj *sync_out = NULL;
struct panfrost_job *job;
-   int ret = 0;
+   int ret = 0, slot;
 
if (!args->jc)
return -EINVAL;
@@ -258,14 +258,20 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
 
kref_init(>refcount);
 
-   xa_init_flags(>deps, XA_FLAGS_ALLOC);
-
job->pfdev = pfdev;
job->jc = args->jc;
job->requirements = args->requirements;
job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
job->file_priv = file->driver_priv;
 
+   slot = panfrost_job_get_slot(job);
+
+   ret = drm_sched_job_init(>base,
+>file_priv->sched_entity[slot],
+NULL);
+   if (ret)
+   goto fail_job_put;
+
ret = panfrost_copy_in_sync(dev, file, args, job);
if (ret)
goto fail_job;
@@ -283,6 +289,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
drm_syncobj_replace_fence(sync_out, job->render_done_fence);
 
 fail_job:
+   drm_sched_job_cleanup(>base);
+fail_job_put:
panfrost_job_put(job);
 fail_out_sync:
if (sync_out)
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 4bc962763e1f..a98f507dc779 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -102,7 +102,7 @@ static struct dma_fence *panfrost_fence_create(struct 
panfrost_device *pfdev, in
return >base;
 }
 
-static int panfrost_job_get_slot(struct panfrost_job *job)
+int panfrost_job_get_slot(struct panfrost_job *job)
 {
/* JS0: fragment jobs.
 * JS1: vertex/tiler jobs
@@ -242,13 +242,14 @@ static void panfrost_job_hw_submit(struct panfrost_job 
*job, int js)
 
 static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
  int bo_count,
- struct xarray *deps)
+ struct drm_sched_job *job)
 {
int i, ret;
 
for (i = 0; i < bo_count; i++) {
/* panfrost always uses write mode in its current uapi */
-   ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
+   ret = drm_sched_job_add_implicit_dependencies(job, bos[i],
+ true);
if (ret)
return ret;
}
@@ -269,31 +270,21 @@ static void panfrost_attach_object_fences(struct 
drm_gem_object **bos,
 int panfrost_job_push(struct panfrost_job *job)
 {
struct panfrost_device *pfdev = job->pfdev;
-   int slot = panfrost_job_get_slot(job);
-   struct drm_sched_entity *entity = >file_priv->sched_entity[slot];
struct ww_acquire_ctx acquire_ctx;
int ret = 0;
 
-
ret = drm_gem_lock_reservations(job->bos, job->bo_count,
_ctx);
if (ret)
return ret;
 
mutex_lock(>sched_lock);
-
-   ret = drm_sched_job_init(>base, entity, NULL);
-   if (ret) {
-   mutex_unlock(>sched_lock);
-   goto unlock;
-   }
-
drm_sched_job_arm(>base);
 
job->render_done_fence = 

[Intel-gfx] [PATCH v5 09/20] drm/v3d: Move drm_sched_job_init to v3d_job_init

2021-08-05 Thread Daniel Vetter
Prep work for using the scheduler dependency handling. We need to call
drm_sched_job_init earlier so we can use the new drm_sched_job_await*
functions for dependency handling here.

v2: Slightly better commit message and rebase to include the
drm_sched_job_arm() call (Emma).

v3: Cleanup jobs under construction correctly (Emma)

v4: Rebase over perfmon patch

Reviewed-by: Melissa Wen  (v3)
Acked-by: Emma Anholt 
Cc: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: Emma Anholt 
---
 drivers/gpu/drm/v3d/v3d_drv.h   |  1 +
 drivers/gpu/drm/v3d/v3d_gem.c   | 86 ++---
 drivers/gpu/drm/v3d/v3d_sched.c | 15 +++---
 3 files changed, 44 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 270134779073..c1d433b4cf93 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -379,6 +379,7 @@ int v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
 int v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
  struct drm_file *file_priv);
+void v3d_job_cleanup(struct v3d_job *job);
 void v3d_job_put(struct v3d_job *job);
 void v3d_reset(struct v3d_dev *v3d);
 void v3d_invalidate_caches(struct v3d_dev *v3d);
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 957228bef29c..42587248c54e 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -397,6 +397,12 @@ v3d_render_job_free(struct kref *ref)
v3d_job_free(ref);
 }
 
+void v3d_job_cleanup(struct v3d_job *job)
+{
+   drm_sched_job_cleanup(>base);
+   v3d_job_put(job);
+}
+
 void v3d_job_put(struct v3d_job *job)
 {
kref_put(>refcount, job->free);
@@ -438,9 +444,10 @@ v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
 static int
 v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 struct v3d_job *job, void (*free)(struct kref *ref),
-u32 in_sync)
+u32 in_sync, enum v3d_queue queue)
 {
struct dma_fence *in_fence = NULL;
+   struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
int ret;
 
job->v3d = v3d;
@@ -451,35 +458,33 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
*file_priv,
return ret;
 
xa_init_flags(>deps, XA_FLAGS_ALLOC);
+   ret = drm_sched_job_init(>base, _priv->sched_entity[queue],
+v3d_priv);
+   if (ret)
+   goto fail;
 
ret = drm_syncobj_find_fence(file_priv, in_sync, 0, 0, _fence);
if (ret == -EINVAL)
-   goto fail;
+   goto fail_job;
 
ret = drm_gem_fence_array_add(>deps, in_fence);
if (ret)
-   goto fail;
+   goto fail_job;
 
kref_init(>refcount);
 
return 0;
+fail_job:
+   drm_sched_job_cleanup(>base);
 fail:
xa_destroy(>deps);
pm_runtime_put_autosuspend(v3d->drm.dev);
return ret;
 }
 
-static int
-v3d_push_job(struct v3d_file_priv *v3d_priv,
-struct v3d_job *job, enum v3d_queue queue)
+static void
+v3d_push_job(struct v3d_job *job)
 {
-   int ret;
-
-   ret = drm_sched_job_init(>base, _priv->sched_entity[queue],
-v3d_priv);
-   if (ret)
-   return ret;
-
drm_sched_job_arm(>base);
 
job->done_fence = dma_fence_get(>base.s_fence->finished);
@@ -488,8 +493,6 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
kref_get(>refcount);
 
drm_sched_entity_push_job(>base);
-
-   return 0;
 }
 
 static void
@@ -564,7 +567,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
INIT_LIST_HEAD(>unref_list);
 
ret = v3d_job_init(v3d, file_priv, >base,
-  v3d_render_job_free, args->in_sync_rcl);
+  v3d_render_job_free, args->in_sync_rcl, V3D_RENDER);
if (ret) {
kfree(render);
return ret;
@@ -578,7 +581,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
}
 
ret = v3d_job_init(v3d, file_priv, >base,
-  v3d_job_free, args->in_sync_bcl);
+  v3d_job_free, args->in_sync_bcl, V3D_BIN);
if (ret) {
v3d_job_put(>base);
kfree(bin);
@@ -600,7 +603,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
goto fail;
}
 
-   ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0);
+   ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, 
V3D_CACHE_CLEAN);
if (ret) {
kfree(clean_job);
clean_job = NULL;
@@ -635,9 +638,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
if (bin) {

[Intel-gfx] [PATCH v5 05/20] drm/sched: drop entity parameter from drm_sched_push_job

2021-08-05 Thread Daniel Vetter
Originally a job was only bound to the queue when we pushed this, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.

v2:
Rebase on top of msm adopting drm/sched

Acked-by: Emma Anholt 
Acked-by: Melissa Wen 
Reviewed-by: Steven Price  (v1)
Reviewed-by: Boris Brezillon  (v1)
Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: Emma Anholt 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Alex Deucher 
Cc: Nirmoy Das 
Cc: Dave Airlie 
Cc: Chen Li 
Cc: Lee Jones 
Cc: Deepak R Varma 
Cc: Kevin Wang 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Dennis Li 
Cc: Boris Brezillon 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Rob Clark 
Cc: Sean Paul 
Cc: Melissa Wen 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  | 2 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 2 +-
 drivers/gpu/drm/lima/lima_gem.c  | 3 +--
 drivers/gpu/drm/lima/lima_sched.c| 5 ++---
 drivers/gpu/drm/lima/lima_sched.h| 3 +--
 drivers/gpu/drm/msm/msm_gem_submit.c | 2 +-
 drivers/gpu/drm/panfrost/panfrost_job.c  | 2 +-
 drivers/gpu/drm/scheduler/sched_entity.c | 6 ++
 drivers/gpu/drm/v3d/v3d_gem.c| 2 +-
 include/drm/gpu_scheduler.h  | 3 +--
 11 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 32e80bc6af22..1d8a914108af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1267,7 +1267,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 
trace_amdgpu_cs_ioctl(job);
amdgpu_vm_bo_trace_cs(>vm, >ticket);
-   drm_sched_entity_push_job(>base, entity);
+   drm_sched_entity_push_job(>base);
 
amdgpu_vm_move_to_lru_tail(p->adev, >vm);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 5ddb955d2315..b86099c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
 
*f = dma_fence_get(>base.s_fence->finished);
amdgpu_job_free_resources(job);
-   drm_sched_entity_push_job(>base, entity);
+   drm_sched_entity_push_job(>base);
 
return 0;
 }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 05f412204118..180bb633d5c5 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity 
*sched_entity,
/* the scheduler holds on to the job now */
kref_get(>refcount);
 
-   drm_sched_entity_push_job(>sched_job, sched_entity);
+   drm_sched_entity_push_job(>sched_job);
 
 out_unlock:
mutex_unlock(>gpu->fence_lock);
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index de62966243cd..c528f40981bb 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct 
lima_submit *submit)
goto err_out2;
}
 
-   fence = lima_sched_context_queue_task(
-   submit->ctx->context + submit->pipe, submit->task);
+   fence = lima_sched_context_queue_task(submit->task);
 
for (i = 0; i < submit->nr_bos; i++) {
if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index 38f755580507..e968b5a8f0b0 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -177,13 +177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe,
drm_sched_entity_fini(>base);
 }
 
-struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context 
*context,
-   struct lima_sched_task *task)
+struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
 {
struct dma_fence *fence = dma_fence_get(>base.s_fence->finished);
 
trace_lima_task_submit(task);
-   drm_sched_entity_push_job(>base, >base);
+   drm_sched_entity_push_job(>base);
return fence;
 }
 
diff --git a/drivers/gpu/drm/lima/lima_sched.h 
b/drivers/gpu/drm/lima/lima_sched.h
index 90f03c48ef4a..ac70006b0e26 100644
--- 

[Intel-gfx] [PATCH v5 03/20] drm/sched: Barriers are needed for entity->last_scheduled

2021-08-05 Thread Daniel Vetter
It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.

It's definitely not enough on arm.

Fix this properly, which means
- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.

While at it sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.

Note that we really need full barriers here. At first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.
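
For readers without the kernel tree at hand, the pairing can be modelled in
userspace C11. This is only a sketch under assumptions: the toy_* names are
hypothetical, an atomic counter stands in for the spsc queue state, and C11
release/acquire fences are used as an approximation of smp_wmb()/smp_rmb(),
which are not the same primitives. What it shows is that the fences sit
between the pointer access and the queue-state access on both sides, rather
than on the pointer alone.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct drm_sched_entity. */
struct toy_entity {
	void *last_scheduled;	/* plain pointer, written by the scheduler side */
	atomic_int queue_count;	/* stand-in for the job queue's element count */
};

/* Scheduler side: set the pointer first, then dequeue. */
static void toy_pop_job(struct toy_entity *e, void *finished_fence)
{
	e->last_scheduled = finished_fence;

	/* Order the pointer store before the queue update. */
	atomic_thread_fence(memory_order_release);

	atomic_fetch_sub_explicit(&e->queue_count, 1, memory_order_relaxed);
}

/*
 * Submission side: returns false while the queue is non-empty. Once the
 * queue is seen empty, the fence orders the queue check before the read
 * of the pointer, which is then stored in *out.
 */
static bool toy_peek_last_scheduled(struct toy_entity *e, void **out)
{
	if (atomic_load_explicit(&e->queue_count, memory_order_relaxed))
		return false;

	atomic_thread_fence(memory_order_acquire);

	*out = e->last_scheduled;
	return true;
}
```

As in the patch, the reader only trusts the pointer after it has observed
the empty queue, so the writer must update the pointer before the dequeue.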

v2: Put smp_rmb() in the right place and fix up comment (Andrey)

Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Steven Price 
Cc: Daniel Vetter 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Boris Brezillon 
---
 drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index f7347c284886..89e3f6eaf519 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);
 
dma_fence_put(entity->last_scheduled);
+
entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
 
+   /*
+* If the queue is empty we allow drm_sched_entity_select_rq() to
+* locklessly access ->last_scheduled. This only works if we set the
+* pointer before we dequeue and if we have a write barrier here.
+*/
+   smp_wmb();
+
spsc_queue_pop(&entity->job_queue);
return sched_job;
 }
@@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity 
*entity)
struct drm_gpu_scheduler *sched;
struct drm_sched_rq *rq;
 
-   if (spsc_queue_count(&entity->job_queue) || !entity->sched_list)
+   /* single possible engine and already selected */
+   if (!entity->sched_list)
+   return;
+
+   /* queue non-empty, stay on the same engine */
+   if (spsc_queue_count(&entity->job_queue))
return;
 
-   fence = READ_ONCE(entity->last_scheduled);
+   /*
+* Only when the queue is empty are we guaranteed that the scheduler
+* thread cannot change ->last_scheduled. To enforce ordering we need
+* a read barrier here. See drm_sched_entity_pop_job() for the other
+* side.
+*/
+   smp_rmb();
+
+   fence = entity->last_scheduled;
+
+   /* stay on the same engine if the previous job hasn't finished */
if (fence && !dma_fence_is_signaled(fence))
return;
 
-- 
2.32.0


