Re: [PATCH v10 00/23] drm/i915/vm_bind: Add VM_BIND functionality

2023-04-13 Thread Niranjana Vishwanathapura

On Tue, Jan 17, 2023 at 11:15:46PM -0800, Niranjana Vishwanathapura wrote:

DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
buffer objects (BOs) or sections of a BOs at specified GPU virtual
addresses on a specified address space (VM). Multiple mappings can map
to the same physical pages of an object (aliasing). These mappings (also
referred to as persistent mappings) will be persistent across multiple
GPU submissions (execbuf calls) issued by the UMD, without user having
to provide a list of all required mappings during each submission (as
required by older execbuf mode).

This patch series support VM_BIND version 1, as described by the param
I915_PARAM_VM_BIND_VERSION.

Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
The new execbuf3 ioctl will not have any execlist support and all the
legacy support like relocations etc., are removed.

NOTEs:
* It is based on below VM_BIND design+uapi rfc.
 Documentation/gpu/rfc/i915_vm_bind.rst

* The IGT RFC series is posted as,
 [PATCH i-g-t v10 0/19] vm_bind: Add VM_BIND validation support

v2: Address various review comments
v3: Address review comments and other fixes
v4: Remove vm_unbind out fence uapi which is not supported yet,
   replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()
v5: Render kernel-doc, use PIN_NOEVICT, limit vm_bind support to
   non-recoverable faults
v6: Rebased, minor fixes, add reserved fields to drm_i915_gem_vm_bind,
   add new patch for async vm_unbind support
v7: Rebased, minor cleanups as per review feedback
v8: Rebased, add capture support
v9: Address capture support feedback from v8
v10: Properly handle vma->resource for mappings with capture request

Test-with: 20230118071350.17498-1-niranjana.vishwanathap...@intel.com

Signed-off-by: Niranjana Vishwanathapura 



Hi,

It has become clear that we have a long way towards fully featured 
implementation of VM_BIND in i915.
Examples of the many challenges include integration with display, integration 
with userspace drivers,
a rewrite of all the i915 IGTs to support execbuf3, alignment with DRM GPU VA 
manager[1] etc.

We are stopping further VM_BIND upstreaming efforts in i915 so we can 
accelerate the merge plan
for the new drm/xe driver[2] which has been designed for VM_BIND from the 
beginning.

Since we are not proceeding further with this i915 VM_BIND patch series, the 
MTL support needed for
setting the MOCS and PAT settings in an immutable way at buffer creation time 
has been posted in a
separate series[3] under review.

Thanks for all your feedback on this series which is much appreciated.

Regards,
Niranjana

[1] https://www.spinics.net/lists/nouveau/msg11069.html  
[2] https://www.spinics.net/lists/dri-devel/msg390882.html 
[3] https://patchwork.freedesktop.org/series/115980/



Niranjana Vishwanathapura (23):
 drm/i915/vm_bind: Expose vm lookup function
 drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
 drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
 drm/i915/vm_bind: Support partially mapped vma resource
 drm/i915/vm_bind: Add support to create persistent vma
 drm/i915/vm_bind: Implement bind and unbind of object
 drm/i915/vm_bind: Support for VM private BOs
 drm/i915/vm_bind: Add support to handle object evictions
 drm/i915/vm_bind: Support persistent vma activeness tracking
 drm/i915/vm_bind: Add out fence support
 drm/i915/vm_bind: Abstract out common execbuf functions
 drm/i915/vm_bind: Use common execbuf functions in execbuf path
 drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
 drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
 drm/i915/vm_bind: Expose i915_request_await_bind()
 drm/i915/vm_bind: Handle persistent vmas in execbuf3
 drm/i915/vm_bind: userptr dma-resv changes
 drm/i915/vm_bind: Limit vm_bind mode to non-recoverable contexts
 drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode
 drm/i915/vm_bind: Render VM_BIND documentation
 drm/i915/vm_bind: Async vm_unbind support
 drm/i915/vm_bind: Properly build persistent map sg table
 drm/i915/vm_bind: Support capture of persistent mappings

Documentation/gpu/i915.rst|  78 +-
drivers/gpu/drm/i915/Makefile |   3 +
drivers/gpu/drm/i915/gem/i915_gem_context.c   |  43 +-
drivers/gpu/drm/i915/gem/i915_gem_context.h   |  17 +
drivers/gpu/drm/i915/gem/i915_gem_create.c|  72 +-
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|   6 +
.../gpu/drm/i915/gem/i915_gem_execbuffer.c| 522 +--
.../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 872 ++
.../drm/i915/gem/i915_gem_execbuffer_common.c | 671 ++
.../drm/i915/gem/i915_gem_execbuffer_common.h |  76 ++
drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
drivers/gpu/drm/i915/gem/i915_gem_object.c|   3 +
drivers/gpu/drm/i915/gem/i915_gem_object.h|   2 +
.../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 +
drivers/gpu/drm/i9

Re: [RFC PATCH v3 09/17] drm/i915: Do not support vm_bind mode in execbuf2

2022-08-30 Thread Niranjana Vishwanathapura

On Sat, Aug 27, 2022 at 09:43:55PM +0200, Andi Shyti wrote:

From: Niranjana Vishwanathapura 

Do not support the vm in vm_bind_mode in execbuf2 ioctl.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 5 +
1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index cd75b0ca2555f..f85f10cf9c34b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
if (unlikely(IS_ERR(ctx)))
return PTR_ERR(ctx);

+   if (ctx->vm->vm_bind_mode) {
+   i915_gem_context_put(ctx);
+   return -EOPNOTSUPP;
+   }
+
eb->gem_context = ctx;
if (i915_gem_context_has_full_ppgtt(ctx))
eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;


This should probably be merged with patch #2 that introduces vm_bind_mode uapi.

Niranjana


--
2.34.1



Re: [RFC PATCH v3 04/17] drm/i915: Implement bind and unbind of object

2022-08-30 Thread Niranjana Vishwanathapura

On Tue, Aug 30, 2022 at 06:37:55PM +0100, Matthew Auld wrote:

On 27/08/2022 20:43, Andi Shyti wrote:

From: Niranjana Vishwanathapura 

Implement the bind and unbind of an object at the specified GPU virtual
addresses.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  21 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 322 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |   9 +
 drivers/gpu/drm/i915/i915_driver.c|   1 +
 drivers/gpu/drm/i915/i915_vma.c   |   3 +-
 drivers/gpu/drm/i915/i915_vma.h   |   2 -
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 163 +
 10 files changed, 543 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 522ef9b4aff32..4e1627e96c6e0 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -165,6 +165,7 @@ gem-y += \
gem/i915_gem_ttm_move.o \
gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
+   gem/i915_gem_vm_bind_object.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
new file mode 100644
index 0..ebc493b7dafc1
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include "i915_drv.h"
+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file);
+
+void i915_gem_vm_unbind_vma_all(struct i915_address_space *vm);
+#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
new file mode 100644
index 0..dadd1d4b1761b
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,322 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include "gem/i915_gem_vm_bind.h"
+#include "gem/i915_gem_context.h"
+#include "gt/gen8_engine_cs.h"
+
+#include "i915_drv.h"
+#include "i915_gem_gtt.h"
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
+START, LAST, static inline, i915_vm_bind_it)
+
+#undef START
+#undef LAST
+
+/**
+ * DOC: VM_BIND/UNBIND ioctls
+ *
+ * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
+ * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
+ * specified address space (VM). Multiple mappings can map to the same physical
+ * pages of an object (aliasing). These mappings (also referred to as 
persistent
+ * mappings) will be persistent across multiple GPU submissions (execbuf calls)
+ * issued by the UMD, without user having to provide a list of all required
+ * mappings during each submission (as required by older execbuf mode).
+ *
+ * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
+ * signaling the completion of bind/unbind operation.
+ *
+ * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
+ * User has to opt-in for VM_BIND mode of binding for an address space (VM)
+ * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
+ *
+ * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
+ * are not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be
+ * done asynchronously, when valid out fence is specified.
+ *
+ * VM_BIND locking order is as below.
+ *
+ * 1) vm_bind_lock mutex will protect vm_bind lists. This lock is taken in
+ *vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing 
the
+ *mapping.
+ *
+ *In future, when GPU page faults are supported, we can potentially use a
+ *rwsem instead, so that multiple page fault handlers can take the read
+ *side lock to lookup the mapping and hence can run in parallel.
+ *The older execbuf mode of binding do not need this lock.
+ *
+ * 2) The object&

Re: [RFC PATCH v3 05/17] drm/i915: Support for VM private BOs

2022-08-30 Thread Niranjana Vishwanathapura

On Sat, Aug 27, 2022 at 09:43:51PM +0200, Andi Shyti wrote:

From: Niranjana Vishwanathapura 

Each VM creates a root_obj and shares it with all of its private objects
to use it as dma_resv object. This has a performance advantage as it
requires a single dma_resv object update for all private BOs vs list of
dma_resv objects update for shared BOs, in the execbuf path.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---
drivers/gpu/drm/i915/gem/i915_gem_object_types.h   | 3 +++
drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c | 9 +
drivers/gpu/drm/i915/gt/intel_gtt.c| 4 
drivers/gpu/drm/i915/gt/intel_gtt.h| 2 ++
drivers/gpu/drm/i915/i915_vma.c| 1 +
drivers/gpu/drm/i915/i915_vma_types.h  | 2 ++
6 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 9f6b14ec189a2..46308dcf39e99 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -241,6 +241,9 @@ struct drm_i915_gem_object {

const struct drm_i915_gem_object_ops *ops;

+   /* For VM private BO, points to root_obj in VM. NULL otherwise */
+   struct drm_i915_gem_object *priv_root;
+
struct {
/**
 * @vma.lock: protect the list/tree of vmas
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index dadd1d4b1761b..9ff929f187cfd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -93,6 +93,7 @@ void i915_gem_vm_bind_remove(struct i915_vma *vma, bool 
release_obj)

if (!list_empty(&vma->vm_bind_link)) {
list_del_init(&vma->vm_bind_link);
+   list_del_init(&vma->non_priv_vm_bind_link);
i915_vm_bind_it_remove(vma, &vma->vm->va);

/* Release object */
@@ -219,6 +220,11 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
goto put_obj;
}

+   if (obj->priv_root && obj->priv_root != vm->root_obj) {
+   ret = -EINVAL;
+   goto put_obj;
+   }
+
ret = mutex_lock_interruptible(&vm->vm_bind_lock);
if (ret)
goto put_obj;
@@ -244,6 +250,9 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,

list_add_tail(&vma->vm_bind_link, &vm->vm_bound_list);
i915_vm_bind_it_insert(vma, &vm->va);
+   if (!obj->priv_root)
+   list_add_tail(&vma->non_priv_vm_bind_link,
+ &vm->non_priv_vm_bind_list);

out_ww:
if (ret == -EDEADLK) {
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index cb188377b7bd9..c4f75826213ae 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -177,6 +177,7 @@ int i915_vm_lock_objects(struct i915_address_space *vm,
void i915_address_space_fini(struct i915_address_space *vm)
{
drm_mm_takedown(&vm->mm);
+   i915_gem_object_put(vm->root_obj);
GEM_BUG_ON(!RB_EMPTY_ROOT(&vm->va.rb_root));
mutex_destroy(&vm->vm_bind_lock);
}
@@ -292,6 +293,9 @@ void i915_address_space_init(struct i915_address_space *vm, 
int subclass)
INIT_LIST_HEAD(&vm->vm_bind_list);
INIT_LIST_HEAD(&vm->vm_bound_list);
mutex_init(&vm->vm_bind_lock);
+   INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
+   vm->root_obj = i915_gem_object_create_internal(vm->i915, PAGE_SIZE);
+   GEM_BUG_ON(IS_ERR(vm->root_obj));
}

void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 06a259475816b..9a2665e4ec2e5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -267,6 +267,8 @@ struct i915_address_space {
struct list_head vm_bound_list;
/* @va: tree of persistent vmas */
struct rb_root_cached va;
+   struct list_head non_priv_vm_bind_list;
+   struct drm_i915_gem_object *root_obj;

/* Global GTT */
bool is_ggtt:1;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 092ae4309d8a1..239346e0c07f2 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -236,6 +236,7 @@ vma_create(struct drm_i915_gem_object *obj,
mutex_unlock(&vm->mutex);

INIT_LIST_HEAD(&vma->vm_bind_link);
+   INIT_LIST_HEAD(&vma->non_priv_vm_bind_link);
return vma;

err_unlock:
diff --git a/drivers/gpu/drm/

Re: [RFC PATCH v3 07/17] drm/i915/vm_bind: Handle persistent vmas

2022-08-30 Thread Niranjana Vishwanathapura

On Sat, Aug 27, 2022 at 09:43:53PM +0200, Andi Shyti wrote:

From: Niranjana Vishwanathapura 

Treat VM_BIND vmas as persistent across execbuf ioctl calls and handle
them during the request submission in the execbuff path.

Support eviction by maintaining a list of evicted persistent vmas
for rebinding during next submission.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---
drivers/gpu/drm/i915/gem/i915_gem_object.c|  1 +
.../drm/i915/gem/i915_gem_vm_bind_object.c|  8 +++
drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +
drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 ++
drivers/gpu/drm/i915/i915_gem_gtt.c   | 38 +
drivers/gpu/drm/i915/i915_gem_gtt.h   |  3 +
drivers/gpu/drm/i915/i915_vma.c   | 50 +++--
drivers/gpu/drm/i915/i915_vma.h   | 56 +++
drivers/gpu/drm/i915/i915_vma_types.h | 24 
9 files changed, 169 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 389e9f157ca5e..825dce41f7113 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -38,6 +38,7 @@
#include "i915_gem_mman.h"
#include "i915_gem_object.h"
#include "i915_gem_ttm.h"
+#include "i915_gem_vm_bind.h"
#include "i915_memcpy.h"
#include "i915_trace.h"



Not needed, should be removed.


diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 9ff929f187cfd..3b45529fe8d4c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -91,6 +91,13 @@ void i915_gem_vm_bind_remove(struct i915_vma *vma, bool 
release_obj)
{
lockdep_assert_held(&vma->vm->vm_bind_lock);

+   spin_lock(&vma->vm->vm_rebind_lock);
+   if (!list_empty(&vma->vm_rebind_link))
+   list_del_init(&vma->vm_rebind_link);
+   i915_vma_set_purged(vma);
+   i915_vma_set_freed(vma);
+   spin_unlock(&vma->vm->vm_rebind_lock);
+
if (!list_empty(&vma->vm_bind_link)) {
list_del_init(&vma->vm_bind_link);
list_del_init(&vma->non_priv_vm_bind_link);
@@ -190,6 +197,7 @@ static struct i915_vma *vm_bind_get_vma(struct 
i915_address_space *vm,

vma->start = va->start;
vma->last = va->start + va->length - 1;
+   i915_vma_set_persistent(vma);


This can be set in vma_create() now that it knows whether the vma
is persistent or not.

Niranjana



return vma;
}
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index c4f75826213ae..97cd0089b516d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -296,6 +296,8 @@ void i915_address_space_init(struct i915_address_space *vm, 
int subclass)
INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
vm->root_obj = i915_gem_object_create_internal(vm->i915, PAGE_SIZE);
GEM_BUG_ON(IS_ERR(vm->root_obj));
+   INIT_LIST_HEAD(&vm->vm_rebind_list);
+   spin_lock_init(&vm->vm_rebind_lock);
}

void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 9a2665e4ec2e5..1f3b1967ec175 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -265,6 +265,10 @@ struct i915_address_space {
struct list_head vm_bind_list;
/** @vm_bound_list: List of vm_binding completed */
struct list_head vm_bound_list;
+   /* @vm_rebind_list: list of vmas to be rebinded */
+   struct list_head vm_rebind_list;
+   /* @vm_rebind_lock: protects vm_rebound_list */
+   spinlock_t vm_rebind_lock;
/* @va: tree of persistent vmas */
struct rb_root_cached va;
struct list_head non_priv_vm_bind_list;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 329ff75b80b97..f083724163deb 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -25,6 +25,44 @@
#include "i915_trace.h"
#include "i915_vgpu.h"

+/**
+ * i915_vm_sync() - Wait for all requests on private vmas of a vm to be 
completed
+ * @vm: address space we need to wait for idle
+ *
+ * Waits till all requests of the vm_binded private objs are completed.
+ *
+ * Returns: 0 on success -ve errcode on failure
+ */
+int i915_vm_sync(struct i915_address_space *vm)
+{
+   int ret;
+
+   /* Wait for all requests under this vm to finish */
+   ret = dma_resv_wait_timeout(vm->root_obj->base.resv,
+   DMA_RESV_USAGE_BOOKKEEP, false,

Re: [RFC PATCH v3 08/17] drm/i915/vm_bind: Add out fence support

2022-08-30 Thread Niranjana Vishwanathapura

On Sat, Aug 27, 2022 at 09:43:54PM +0200, Andi Shyti wrote:

From: Niranjana Vishwanathapura 

Add support for handling out fence of vm_bind call.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---
drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  3 +
.../drm/i915/gem/i915_gem_vm_bind_object.c| 82 +++
drivers/gpu/drm/i915/i915_vma.c   |  6 +-
drivers/gpu/drm/i915/i915_vma_types.h |  7 ++
4 files changed, 97 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
index ebc493b7dafc1..d65e6e4fb3972 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -18,4 +18,7 @@ int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void 
*data,
 struct drm_file *file);

void i915_gem_vm_unbind_vma_all(struct i915_address_space *vm);
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence);
+
#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 3b45529fe8d4c..e57b9c492a7f9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -5,6 +5,8 @@

#include 

+#include 
+
#include "gem/i915_gem_vm_bind.h"
#include "gem/i915_gem_context.h"
#include "gt/gen8_engine_cs.h"
@@ -109,6 +111,67 @@ void i915_gem_vm_bind_remove(struct i915_vma *vma, bool 
release_obj)
}
}

+static int i915_vm_bind_add_fence(struct drm_file *file, struct i915_vma *vma,
+ u32 handle, u64 point)
+{
+   struct drm_syncobj *syncobj;
+
+   syncobj = drm_syncobj_find(file, handle);
+   if (!syncobj) {
+   DRM_DEBUG("Invalid syncobj handle provided\n");
+   return -ENOENT;
+   }
+
+   /*
+* For timeline syncobjs we need to preallocate chains for
+* later signaling.
+*/
+   if (point) {
+   vma->vm_bind_fence.chain_fence = dma_fence_chain_alloc();
+   if (!vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_put(syncobj);
+   return -ENOMEM;
+   }
+   } else {
+   vma->vm_bind_fence.chain_fence = NULL;
+   }
+   vma->vm_bind_fence.syncobj = syncobj;
+   vma->vm_bind_fence.value = point;
+
+   return 0;
+}
+
+static void i915_vm_bind_put_fence(struct i915_vma *vma)
+{
+   if (!vma->vm_bind_fence.syncobj)
+   return;
+
+   drm_syncobj_put(vma->vm_bind_fence.syncobj);
+   dma_fence_chain_free(vma->vm_bind_fence.chain_fence);
+}
+
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence)
+{
+   struct drm_syncobj *syncobj = vma->vm_bind_fence.syncobj;
+
+   if (!syncobj)
+   return;
+
+   if (vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_add_point(syncobj,
+ vma->vm_bind_fence.chain_fence,
+ fence, vma->vm_bind_fence.value);
+   /*
+* The chain's ownership is transferred to the
+* timeline.
+*/
+   vma->vm_bind_fence.chain_fence = NULL;
+   } else {
+   drm_syncobj_replace_fence(syncobj, fence);
+   }
+}
+
static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
  struct i915_vma *vma,
  struct drm_i915_gem_vm_unbind *va)
@@ -243,6 +306,15 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
goto unlock_vm;
}

+   if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL) {
+   ret = i915_vm_bind_add_fence(file, vma, va->fence.handle,
+va->fence.value);
+   if (ret)
+   goto put_vma;
+   }
+
+   pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER;


Setting pin_flags should part of patch #4.


+
for_i915_gem_ww(&ww, ret, true) {
retry:
ret = i915_gem_object_lock(vma->obj, &ww);
@@ -267,12 +339,22 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
ret = i915_gem_ww_ctx_backoff(&ww);
if (!ret)
goto retry;
+


Redundent white space. remove.


} else {
/* Hold object reference until vm_unbind */
i915_gem_object_get(vma->obj);
}
}

+   if (va->

Re: [RFC PATCH v3 13/17] drm/i915/vm_bind: userptr dma-resv changes

2022-08-30 Thread Niranjana Vishwanathapura

On Sat, Aug 27, 2022 at 09:43:59PM +0200, Andi Shyti wrote:

From: Niranjana Vishwanathapura 

For persistent (vm_bind) vmas of userptr BOs, handle the user
page pinning by using the i915_gem_object_userptr_submit_init()
/done() functions

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---
.../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 139 ++
drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  10 ++
.../drm/i915/gem/i915_gem_vm_bind_object.c|  16 ++
drivers/gpu/drm/i915/gt/intel_gtt.c   |   2 +
drivers/gpu/drm/i915/gt/intel_gtt.h   |   4 +
drivers/gpu/drm/i915/i915_vma_types.h |   2 +
6 files changed, 142 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index 8e0dde26194e0..72d6771da2113 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -23,6 +23,7 @@
#include "i915_gem_vm_bind.h"
#include "i915_trace.h"

+#define __EXEC3_USERPTR_USED   BIT_ULL(34)
#define __EXEC3_HAS_PIN BIT_ULL(33)
#define __EXEC3_ENGINE_PINNED   BIT_ULL(32)
#define __EXEC3_INTERNAL_FLAGS  (~0ull << 32)
@@ -157,10 +158,45 @@ static void eb_scoop_unbound_vma_all(struct 
i915_address_space *vm)
spin_unlock(&vm->vm_rebind_lock);
}

+static int eb_lookup_persistent_userptr_vmas(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *last_vma = NULL;
+   struct i915_vma *vma;
+   int err;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   list_for_each_entry(vma, &vm->vm_userptr_invalidated_list,
+   vm_userptr_invalidated_link) {
+   list_del_init(&vma->vm_userptr_invalidated_link);
+   err = i915_gem_object_userptr_submit_init(vma->obj);
+   if (err)
+   return err;
+
+   last_vma = vma;
+   }


This should be done under the list lock. As it is a spinlock, we
should scoop them first under that spinlock and call submit_init()
outside that lock.


+
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link)
+   if (i915_gem_object_is_userptr(vma->obj)) {
+   err = i915_gem_object_userptr_submit_init(vma->obj);
+   if (err)
+   return err;
+
+   last_vma = vma;
+   }
+
+   if (last_vma)
+   eb->args->flags |= __EXEC3_USERPTR_USED;
+
+   return 0;
+}
+
static int eb_lookup_vma_all(struct i915_execbuffer *eb)
{
unsigned int i, current_batch = 0;
struct i915_vma *vma;
+   int err = 0;

for (i = 0; i < eb->num_batches; i++) {
vma = eb_find_vma(eb->context->vm, eb->batch_addresses[i]);
@@ -171,6 +207,10 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
++current_batch;
}

+   err = eb_lookup_persistent_userptr_vmas(eb);
+   if (err)
+   return err;
+
eb_scoop_unbound_vma_all(eb->context->vm);

return 0;
@@ -286,33 +326,6 @@ static int eb_validate_persistent_vma_all(struct 
i915_execbuffer *eb)
return ret;
}

-static int eb_validate_vma_all(struct i915_execbuffer *eb)
-{
-   /* only throttle once, even if we didn't need to throttle */
-   for (bool throttle = true;; throttle = false) {
-   int err;
-
-   err = eb_pin_engine(eb, throttle);
-   if (!err)
-   err = eb_lock_vma_all(eb);
-
-   if (!err)
-   err = eb_validate_persistent_vma_all(eb);
-
-   if (!err)
-   return 0;
-
-   if (err != -EDEADLK)
-   return err;
-
-   err = i915_gem_ww_ctx_backoff(&eb->ww);
-   if (err)
-   return err;
-   }
-
-   return 0;
-}
-
/*
 * Using two helper loops for the order of which requests / batches are created
 * and added the to backend. Requests are created in order from the parent to
@@ -360,15 +373,51 @@ static void eb_move_all_persistent_vma_to_active(struct 
i915_execbuffer *eb)

static int eb_move_to_gpu(struct i915_execbuffer *eb)
{
+   int err = 0, j;
+
lockdep_assert_held(&eb->context->vm->vm_bind_lock);
assert_object_held(eb->context->vm->root_obj);

eb_move_all_persistent_vma_to_active(eb);

-   /* Unconditionally flush any chipset caches (for streaming writes). */
-   intel_gt_chipset_flush(eb->gt);
+#ifdef CONFIG_MMU_NOTIFIER
+   if (!err && (eb->args->flags & __EXEC3_USERPTR_USED)) {
+   struct i915_

Re: [Intel-gfx] [RFC PATCH v3 10/17] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl

2022-08-31 Thread Niranjana Vishwanathapura

On Wed, Aug 31, 2022 at 08:38:48AM +0100, Tvrtko Ursulin wrote:


On 27/08/2022 20:43, Andi Shyti wrote:

From: Niranjana Vishwanathapura 

Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate
bind as all required objects binding would have been requested by the
userspace before submitting the execbuf3.

And the legacy support like relocations etc are removed.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 1000 +
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|2 +
 include/uapi/drm/i915_drm.h   |   62 +
 4 files changed, 1065 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 4e1627e96c6e0..38cd1c5bc1a55 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -148,6 +148,7 @@ gem-y += \
gem/i915_gem_dmabuf.o \
gem/i915_gem_domain.o \
gem/i915_gem_execbuffer.o \
+   gem/i915_gem_execbuffer3.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
gem/i915_gem_lmem.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
new file mode 100644
index 0..a3d767cd9f808
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -0,0 +1,1000 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+#include "gt/intel_gt_pm.h"
+#include "gt/intel_ring.h"
+
+#include "i915_drv.h"
+#include "i915_file_private.h"
+#include "i915_gem_context.h"
+#include "i915_gem_ioctls.h"
+#include "i915_gem_vm_bind.h"
+#include "i915_trace.h"
+
+#define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
+#define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
+
+/* Catch emission of unexpected errors for CI! */
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+#undef EINVAL
+#define EINVAL ({ \
+   DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
+   22; \
+})
+#endif
+
+/**
+ * DOC: User command execution with execbuf3 ioctl
+ *
+ * A VM in VM_BIND mode will not support older execbuf mode of binding.
+ * The execbuf ioctl handling in VM_BIND mode differs significantly from the
+ * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+ * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+ * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+ * execlist. Hence, no support for implicit sync.
+ *
+ * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+ * works with execbuf3 ioctl for submission.
+ *
+ * The execbuf3 ioctl directly specifies the batch addresses instead of as
+ * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
+ * support many of the older features like in/out/submit fences, fence array,
+ * default gem context etc. (See struct drm_i915_gem_execbuffer3).
+ *
+ * In VM_BIND mode, VA allocation is completely managed by the user instead of
+ * the i915 driver. Hence all VA assignment, eviction are not applicable in
+ * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+ * be using the i915_vma active reference tracking. It will instead check the
+ * dma-resv object's fence list for that.
+ *
+ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
+ * vma lookup table, implicit sync, vma active reference tracking etc., are not
+ * applicable for execbuf3 ioctl.
+ */
+
+struct eb_fence {
+   struct drm_syncobj *syncobj;
+   struct dma_fence *dma_fence;
+   u64 value;
+   struct dma_fence_chain *chain_fence;
+};
+
+/**
+ * struct i915_execbuffer - execbuf struct for execbuf3
+ * @i915: reference to the i915 instance we run on
+ * @file: drm file reference
+ * args: execbuf3 ioctl structure
+ * @gt: reference to the gt instance ioctl submitted for
+ * @context: logical state for the request
+ * @gem_context: callers context
+ * @requests: requests to be build
+ * @composite_fence: used for excl fence in dma_resv objects when > 1 BB 
submitted
+ * @ww: i915_gem_ww_ctx instance
+ * @num_batches: number of batches submitted
+ * @batch_addresses: addresses corresponds to the submitted batches
+ * @batches: references to the i915_vmas corresponding to the batches
+ */
+struct i915_execbuffer {
+   struct drm_i915_private *i915;
+   stru

Re: [RFC PATCH v3 04/17] drm/i915: Implement bind and unbind of object

2022-08-31 Thread Niranjana Vishwanathapura

On Tue, Aug 30, 2022 at 07:19:17PM +0100, Matthew Auld wrote:

On 27/08/2022 20:43, Andi Shyti wrote:

From: Niranjana Vishwanathapura 

Implement the bind and unbind of an object at the specified GPU virtual
addresses.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  21 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 322 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |   9 +
 drivers/gpu/drm/i915/i915_driver.c|   1 +
 drivers/gpu/drm/i915/i915_vma.c   |   3 +-
 drivers/gpu/drm/i915/i915_vma.h   |   2 -
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 163 +
 10 files changed, 543 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 522ef9b4aff32..4e1627e96c6e0 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -165,6 +165,7 @@ gem-y += \
gem/i915_gem_ttm_move.o \
gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
+   gem/i915_gem_vm_bind_object.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
new file mode 100644
index 0..ebc493b7dafc1
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include "i915_drv.h"
+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file);
+
+void i915_gem_vm_unbind_vma_all(struct i915_address_space *vm);
+#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
new file mode 100644
index 0..dadd1d4b1761b
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,322 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include "gem/i915_gem_vm_bind.h"
+#include "gem/i915_gem_context.h"
+#include "gt/gen8_engine_cs.h"
+
+#include "i915_drv.h"
+#include "i915_gem_gtt.h"
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
+START, LAST, static inline, i915_vm_bind_it)
+
+#undef START
+#undef LAST
+
+/**
+ * DOC: VM_BIND/UNBIND ioctls
+ *
+ * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
+ * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
+ * specified address space (VM). Multiple mappings can map to the same physical
+ * pages of an object (aliasing). These mappings (also referred to as 
persistent
+ * mappings) will be persistent across multiple GPU submissions (execbuf calls)
+ * issued by the UMD, without user having to provide a list of all required
+ * mappings during each submission (as required by older execbuf mode).
+ *
+ * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
+ * signaling the completion of bind/unbind operation.
+ *
+ * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
+ * User has to opt-in for VM_BIND mode of binding for an address space (VM)
+ * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
+ *
+ * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
+ * are not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be
+ * done asynchronously, when valid out fence is specified.
+ *
+ * VM_BIND locking order is as below.
+ *
+ * 1) vm_bind_lock mutex will protect vm_bind lists. This lock is taken in
+ *vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing 
the
+ *mapping.
+ *
+ *In future, when GPU page faults are supported, we can potentially use a
+ *rwsem instead, so that multiple page fault handlers can take the read
+ *side lock to lookup the mapping and hence can run in parallel.
+ *The older execbuf mode of binding do not need this lock.
+ *
+ * 2) The object&

[PATCH] drm/i915: Rename ggtt_view as gtt_view

2022-09-01 Thread Niranjana Vishwanathapura
So far, different views (normal, partial, rotated and remapped)
into the same object are only supported for GGTT mappings.
But with the upcoming VM_BIND feature, PPGTT will also use the
partial view mapping. Hence rename ggtt_view to more generic
gtt_view.

Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/display/intel_display.c  |  2 +-
 drivers/gpu/drm/i915/display/intel_display.h  |  2 +-
 .../drm/i915/display/intel_display_types.h|  2 +-
 drivers/gpu/drm/i915/display/intel_fb.c   | 18 ++---
 drivers/gpu/drm/i915/display/intel_fb_pin.c   |  4 +-
 drivers/gpu/drm/i915/display/intel_fb_pin.h   |  4 +-
 drivers/gpu/drm/i915/display/intel_fbdev.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_domain.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  | 16 ++---
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  4 +-
 drivers/gpu/drm/i915/gt/intel_reset.c |  2 +-
 drivers/gpu/drm/i915/i915_debugfs.c   | 56 +++
 drivers/gpu/drm/i915/i915_drv.h   |  4 +-
 drivers/gpu/drm/i915/i915_gem.c   |  6 +-
 drivers/gpu/drm/i915/i915_vma.c   | 40 +--
 drivers/gpu/drm/i915/i915_vma.h   | 18 ++---
 drivers/gpu/drm/i915/i915_vma_types.h | 42 ++--
 drivers/gpu/drm/i915/selftests/i915_vma.c | 68 +--
 19 files changed, 149 insertions(+), 149 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index be7cff722196..8251f87064f6 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -670,7 +670,7 @@ bool intel_plane_uses_fence(const struct intel_plane_state 
*plane_state)
 
return DISPLAY_VER(dev_priv) < 4 ||
(plane->fbc &&
-plane_state->view.gtt.type == I915_GGTT_VIEW_NORMAL);
+plane_state->view.gtt.type == I915_GTT_VIEW_NORMAL);
 }
 
 /*
diff --git a/drivers/gpu/drm/i915/display/intel_display.h 
b/drivers/gpu/drm/i915/display/intel_display.h
index e895277c4cd9..e322011877bb 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -45,7 +45,7 @@ struct drm_modeset_acquire_ctx;
 struct drm_plane;
 struct drm_plane_state;
 struct i915_address_space;
-struct i915_ggtt_view;
+struct i915_gtt_view;
 struct intel_atomic_state;
 struct intel_crtc;
 struct intel_crtc_state;
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h 
b/drivers/gpu/drm/i915/display/intel_display_types.h
index 0da9b208d56e..01977cd237eb 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -105,7 +105,7 @@ struct intel_fb_view {
 * In the normal view the FB object's backing store sg list is used
 * directly and hence the remap information here is not used.
 */
-   struct i915_ggtt_view gtt;
+   struct i915_gtt_view gtt;
 
/*
 * The GTT view (gtt.type) specific information for each FB color
diff --git a/drivers/gpu/drm/i915/display/intel_fb.c 
b/drivers/gpu/drm/i915/display/intel_fb.c
index b191915ab351..eefa33c555ac 100644
--- a/drivers/gpu/drm/i915/display/intel_fb.c
+++ b/drivers/gpu/drm/i915/display/intel_fb.c
@@ -1395,7 +1395,7 @@ static u32 calc_plane_remap_info(const struct 
intel_framebuffer *fb, int color_p
   plane_view_height_tiles(fb, color_plane, dims, 
y));
}
 
-   if (view->gtt.type == I915_GGTT_VIEW_ROTATED) {
+   if (view->gtt.type == I915_GTT_VIEW_ROTATED) {
drm_WARN_ON(&i915->drm, remap_info->linear);
check_array_bounds(i915, view->gtt.rotated.plane, color_plane);
 
@@ -1420,7 +1420,7 @@ static u32 calc_plane_remap_info(const struct 
intel_framebuffer *fb, int color_p
/* rotate the tile dimensions to match the GTT view */
swap(tile_width, tile_height);
} else {
-   drm_WARN_ON(&i915->drm, view->gtt.type != 
I915_GGTT_VIEW_REMAPPED);
+   drm_WARN_ON(&i915->drm, view->gtt.type != 
I915_GTT_VIEW_REMAPPED);
 
check_array_bounds(i915, view->gtt.remapped.plane, color_plane);
 
@@ -1503,12 +1503,12 @@ calc_plane_normal_size(const struct intel_framebuffer 
*fb, int color_plane,
 }
 
 static void intel_fb_view_init(struct drm_i915_private *i915, struct 
intel_fb_view *view,
-  enum i915_ggtt_view_type view_type)
+  enum i915_gtt_view_type view_type)
 {
memset(view, 0, sizeof(*view));
view->gtt.type = view_type;
 
-   if (view_type == I915_GGTT_VIEW_REMAPPED && IS_ALDERLAKE_P(i915))
+   if (view_type == I915_GTT_VIEW_REMAPPED && IS_ALDERLAKE_P(i915))
vi

Re: [RFC PATCH v3 04/17] drm/i915: Implement bind and unbind of object

2022-09-01 Thread Niranjana Vishwanathapura

On Thu, Sep 01, 2022 at 03:31:13PM +1000, Dave Airlie wrote:

On Sun, 28 Aug 2022 at 05:45, Andi Shyti  wrote:


From: Niranjana Vishwanathapura 

Implement the bind and unbind of an object at the specified GPU virtual
addresses.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  21 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 322 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |   9 +
 drivers/gpu/drm/i915/i915_driver.c|   1 +
 drivers/gpu/drm/i915/i915_vma.c   |   3 +-
 drivers/gpu/drm/i915/i915_vma.h   |   2 -
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 163 +
 10 files changed, 543 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c






diff --git a/drivers/gpu/drm/i915/i915_vma_types.h 
b/drivers/gpu/drm/i915/i915_vma_types.h
index be6e028c3b57d..f746fecae85ed 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -289,6 +289,20 @@ struct i915_vma {
/** This object's place on the active/inactive lists */
struct list_head vm_link;

+   /** @vm_bind_link: node for the vm_bind related lists of vm */
+   struct list_head vm_bind_link;
+
+   /** Interval tree structures for persistent vma */
+
+   /** @rb: node for the interval tree of vm for persistent vmas */
+   struct rb_node rb;
+   /** @start: start endpoint of the rb node */
+   u64 start;
+   /** @last: Last endpoint of the rb node */
+   u64 last;
+   /** @__subtree_last: last in subtree */
+   u64 __subtree_last;


Was a drm_mm node considered for this or was it overkill?



We already have a drm_mm node (i915_vma.node). But currently in i915
driver, VA managment and binding of vmas are tightly coupled.
Ideally we want to decouple it and then use the same drm_mm node for
persistent vma lookup as well, instead of this new interval tree.
But decouple it is not trivial I think it needs to be carefully
done in a separate patch series to not cause any regression.

The new interval/rb tree here is an optimization for fast lookup of
persistent vma (instead of list walk) whether it is bound or not.
Eventually though, with above cleanup we should be able to use the
i915_vma.node for vma lookup (even when it is not bound).

I was briefly discussed in earlier version of this series (though topic
was different there).
https://lists.freedesktop.org/archives/intel-gfx/2022-July/301159.html

Niranjana


Dave.


Re: [Intel-gfx] [RFC PATCH v3 10/17] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl

2022-09-01 Thread Niranjana Vishwanathapura

On Thu, Sep 01, 2022 at 08:58:57AM +0100, Tvrtko Ursulin wrote:



On 01/09/2022 06:09, Niranjana Vishwanathapura wrote:

On Wed, Aug 31, 2022 at 08:38:48AM +0100, Tvrtko Ursulin wrote:


On 27/08/2022 20:43, Andi Shyti wrote:

From: Niranjana Vishwanathapura 

Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate
bind as all required objects binding would have been requested by the
userspace before submitting the execbuf3.

And the legacy support like relocations etc are removed.

Signed-off-by: Niranjana Vishwanathapura 


Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---


[snip]


+static void signal_fence_array(const struct i915_execbuffer *eb,
+   struct dma_fence * const fence)
+{
+    unsigned int n;
+
+    for (n = 0; n < eb->num_fences; n++) {
+    struct drm_syncobj *syncobj;
+    unsigned int flags;
+
+    syncobj = ptr_unpack_bits(eb->fences[n].syncobj, &flags, 2);
+    if (!(flags & I915_TIMELINE_FENCE_SIGNAL))
+    continue;
+
+    if (eb->fences[n].chain_fence) {
+    drm_syncobj_add_point(syncobj,
+  eb->fences[n].chain_fence,
+  fence,
+  eb->fences[n].value);
+    /*
+ * The chain's ownership is transferred to the
+ * timeline.
+ */
+    eb->fences[n].chain_fence = NULL;
+    } else {
+    drm_syncobj_replace_fence(syncobj, fence);
+    }
+    }
+}
Semi-random place to ask - how many of the code here is direct 
copy of existing functions from i915_gem_execbuffer.c? There seems 
to be some 100% copies at least. And then some more with small 
tweaks. Spend some time and try to figure out some code sharing?




During VM_BIND design review, maintainers expressed thought on keeping
execbuf3 completely separate and not touch the legacy execbuf path.


Got a link so this maintainer can see what exactly was said? Just to 
make sure there isn't any misunderstanding on what "completely 
separate" means to different people.


Here is one (search for copypaste/copy-paste)
https://patchwork.freedesktop.org/patch/486608/?series=93447&rev=3
It is hard to search for old discussion threads. May be maintainers
can provide feedback here directly. Dave, Daniel? :)




I also think, execbuf3 should be fully separate. We can do some code
sharing where is a close 100% copy (there is a TODO in cover letter).
There are some changes like the timeline fence array handling here
which looks similar, but the uapi is not exactly the same. Probably,
we should keep them separate and not try to force code sharing at
least at this point.


Okay did not spot that TODO in the cover. But fair since it is RFC to 
be unfinished.


I do however think it should be improved before considering the merge. 
Because looking at the patch, 100% copies are:


for_each_batch_create_order
for_each_batch_add_order
eb_throttle
eb_pin_timeline
eb_pin_engine
eb_put_engine
__free_fence_array
put_fence_array
await_fence_array
signal_fence_array
retire_requests
eb_request_add
eb_requests_get
eb_requests_put
eb_find_context

Quite a lot.

Then there is a bunch of almost same functions which could be shared 
if there weren't two incompatible local struct i915_execbuffer's. 
Especially given when the out fence TODO item gets handled a chunk 
more will also become a 100% copy.




There are difinitely a few which is 100% copies hence should have a
shared code.
But some are not. Like, fence_array stuff though looks very similar,
the uapi structures are different between execbuf3 and legacy execbuf.
The internal flags are also different (eg., __EXEC3_ENGINE_PINNED vs
__EXEC_ENGINE_PINNED) which causes minor differences hence not a
100% copy.

So, I am not convinced if it is worth carrying legacy stuff into
execbuf3 code. I think we need to look at these on a case by case
basis and see if abstracting common functionality to a separate
shared code makes sense or it is better to keep the code separate.

This could be done by having a common struct i915_execbuffer and then 
eb2 and eb3 specific parts which inherit from it. After that is done 
it should be easier to see if it makes sense to do something more and 
how.


I am not a big fan of it. I think we should not try to load the execbuf3
code with the legacy stuff.

Niranjana



Regards,

Tvrtko


Re: [PATCH 02/16] drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()

2022-09-28 Thread Niranjana Vishwanathapura

On Wed, Sep 28, 2022 at 06:39:05PM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

Add function __i915_sw_fence_await_reservation() for
asynchronous wait on a dma-resv object with specified
dma_resv_usage. This is required for async vma unbind
with vm_bind.

Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_sw_fence.c | 28 +---
 drivers/gpu/drm/i915/i915_sw_fence.h | 23 +--
 2 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index cc2a8821d22a..b7a10c374a08 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -7,7 +7,6 @@
 #include 
 #include 
 #include 
-#include 
 #include "i915_sw_fence.h"
 #include "i915_selftest.h"
@@ -569,11 +568,26 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
return ret;
 }
-int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
-   struct dma_resv *resv,
-   bool write,
-   unsigned long timeout,
-   gfp_t gfp)
+/**
+ * __i915_sw_fence_await_reservation() - Setup a fence to wait on a dma-resv
+ * object with specified usage.
+ * @fence: the fence that needs to wait
+ * @resv: dma-resv object
+ * @usage: dma_resv_usage (See enum dma_resv_usage)
+ * @timeout: how long to wait in jiffies
+ * @gfp: allocation mode
+ *
+ * Setup the @fence to asynchronously wait on dma-resv object @resv for usage
+ * @usage to complete before signaling.


s/usage @usage/@usage/ ?


Thanks for the review Matt.
Sure, will fix.




+ *
+ * Returns 0 if there is nothing to wait on, -ve upon error and >0 upon


What does "-ve" mean btw?



-ve error code. Will update

Regards,
Niranjana


Acked-by: Matthew Auld 


+ * successfully setting up the wait.
+ */
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ enum dma_resv_usage usage,
+ unsigned long timeout,
+ gfp_t gfp)
 {
struct dma_resv_iter cursor;
struct dma_fence *f;
@@ -582,7 +596,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
debug_fence_assert(fence);
might_sleep_if(gfpflags_allow_blocking(gfp));
-   dma_resv_iter_begin(&cursor, resv, dma_resv_usage_rw(write));
+   dma_resv_iter_begin(&cursor, resv, usage);
dma_resv_for_each_fence_unlocked(&cursor, f) {
pending = i915_sw_fence_await_dma_fence(fence, f, timeout,
gfp);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
b/drivers/gpu/drm/i915/i915_sw_fence.h
index f752bfc7c6e1..9c4859dc4c0d 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -10,13 +10,13 @@
 #define _I915_SW_FENCE_H_
 #include 
+#include 
 #include 
 #include 
 #include  /* for NOTIFY_DONE */
 #include 
 struct completion;
-struct dma_resv;
 struct i915_sw_fence;
 enum i915_sw_fence_notify {
@@ -89,11 +89,22 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
  unsigned long timeout,
  gfp_t gfp);
-int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
-   struct dma_resv *resv,
-   bool write,
-   unsigned long timeout,
-   gfp_t gfp);
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ enum dma_resv_usage usage,
+ unsigned long timeout,
+ gfp_t gfp);
+
+static inline int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ bool write,
+ unsigned long timeout,
+ gfp_t gfp)
+{
+   return __i915_sw_fence_await_reservation(fence, resv,
+dma_resv_usage_rw(write),
+timeout, gfp);
+}
 bool i915_sw_fence_await(struct i915_sw_fence *fence);
 void i915_sw_fence_complete(struct i915_sw_fence *fence);


Re: [PATCH 03/16] drm/i915/vm_bind: Expose i915_gem_object_max_page_size()

2022-09-28 Thread Niranjana Vishwanathapura

On Wed, Sep 28, 2022 at 06:40:01PM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

Expose i915_gem_object_max_page_size() function non-static
which will be used by the vm_bind feature.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c | 19 ++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  2 ++
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 33673fe7ee0a..4aa7b5582b8e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -15,10 +15,19 @@
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
-static u32 object_max_page_size(struct intel_memory_region **placements,
-   unsigned int n_placements)
+/**
+ * i915_gem_object_max_page_size() - max of min_page_size of the regions
+ * @placements:  list of regions
+ * @n_placements: number of the placements
+ *
+ * Calculates the max of the min_page_size of a list of placements passed in.
+ *
+ * Return: max of the min_page_size


"max of the min_page_size, or I915_GTT_PAGE_SIZE_4K if zero placements."



ok, will update.

Regards,
Niranjana


Acked-by: Matthew Auld 


+ */
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+ unsigned int n_placements)
 {
-   u32 max_page_size = 0;
+   u32 max_page_size = I915_GTT_PAGE_SIZE_4K;
int i;
for (i = 0; i < n_placements; i++) {
@@ -28,7 +37,6 @@ static u32 object_max_page_size(struct intel_memory_region 
**placements,
max_page_size = max_t(u32, max_page_size, mr->min_page_size);
}
-   GEM_BUG_ON(!max_page_size);
return max_page_size;
 }
@@ -99,7 +107,8 @@ __i915_gem_object_create_user_ext(struct drm_i915_private 
*i915, u64 size,
i915_gem_flush_free_objects(i915);
-   size = round_up(size, object_max_page_size(placements, n_placements));
+   size = round_up(size, i915_gem_object_max_page_size(placements,
+   n_placements));
if (size == 0)
return ERR_PTR(-EINVAL);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index a3b7551a57fc..d53d01b1860a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -47,6 +47,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
 }
 void i915_gem_init__objects(struct drm_i915_private *i915);
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+ unsigned int n_placements);
 void i915_objects_module_exit(void);
 int i915_objects_module_init(void);


Re: [PATCH 05/16] drm/i915/vm_bind: Implement bind and unbind of object

2022-09-28 Thread Niranjana Vishwanathapura

On Wed, Sep 28, 2022 at 06:52:21PM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

Add uapi and implement support for bind and unbind of an
object at the specified GPU virtual addresses.

The vm_bind mode is not supported in legacy execbuf2 ioctl.
It will be supported only in the newer execbuf3 ioctl.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   5 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 306 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  17 +
 drivers/gpu/drm/i915/i915_driver.c|   3 +
 drivers/gpu/drm/i915/i915_vma.c   |   1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 112 +++
 10 files changed, 495 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index a26edcdadc21..9bf939ef18ea 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -166,6 +166,7 @@ gem-y += \
gem/i915_gem_ttm_move.o \
gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
+   gem/i915_gem_vm_bind_object.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index cd75b0ca2555..f85f10cf9c34 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
if (unlikely(IS_ERR(ctx)))
return PTR_ERR(ctx);
+   if (ctx->vm->vm_bind_mode) {
+   i915_gem_context_put(ctx);
+   return -EOPNOTSUPP;
+   }
+
eb->gem_context = ctx;
if (i915_gem_context_has_full_ppgtt(ctx))
eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
new file mode 100644
index ..36262a6357b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include 
+
+struct drm_device;
+struct drm_file;
+struct i915_address_space;
+struct i915_vma;
+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file);
+
+void i915_gem_vm_unbind_all(struct i915_address_space *vm);
+
+#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
new file mode 100644
index ..e529162abd2c
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,306 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "gem/i915_gem_context.h"
+#include "gem/i915_gem_vm_bind.h"
+
+#include "gt/intel_gpu_commands.h"
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
+START, LAST, static inline, i915_vm_bind_it)
+
+#undef START
+#undef LAST
+
+/**
+ * DOC: VM_BIND/UNBIND ioctls
+ *
+ * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
+ * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
+ * specified address space (VM). Multiple mappings can map to the same physical
+ * pages of an object (aliasing). These mappings (also referred to as 
persistent
+ * mappings) will be persistent across multiple GPU submissions (execbuf calls)
+ * issued by the UMD, without user having to provide a list of all required
+ * mappings during each submission (as required by older execbuf mode).
+ *
+ * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
+ * signaling the completion of bind/unbind operation.
+ *
+ * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
+ * User has to opt-in for VM_BIND mode of binding for an address space (VM)
+ * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
+ *
+ * VM_BIND/UNBIND ioctl calls executed on different CPU threa

Re: [Intel-gfx] [PATCH 05/16] drm/i915/vm_bind: Implement bind and unbind of object

2022-09-28 Thread Niranjana Vishwanathapura

On Wed, Sep 28, 2022 at 01:06:18PM -0700, Welty, Brian wrote:



On 9/27/2022 11:19 PM, Niranjana Vishwanathapura wrote:

Add uapi and implement support for bind and unbind of an
object at the specified GPU virtual addresses.

The vm_bind mode is not supported in legacy execbuf2 ioctl.
It will be supported only in the newer execbuf3 ioctl.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   5 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 306 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  17 +
 drivers/gpu/drm/i915/i915_driver.c|   3 +
 drivers/gpu/drm/i915/i915_vma.c   |   1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 112 +++
 10 files changed, 495 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index a26edcdadc21..9bf939ef18ea 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -166,6 +166,7 @@ gem-y += \
gem/i915_gem_ttm_move.o \
gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
+   gem/i915_gem_vm_bind_object.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index cd75b0ca2555..f85f10cf9c34 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
if (unlikely(IS_ERR(ctx)))
return PTR_ERR(ctx);
+   if (ctx->vm->vm_bind_mode) {
+   i915_gem_context_put(ctx);
+   return -EOPNOTSUPP;
+   }
+
eb->gem_context = ctx;
if (i915_gem_context_has_full_ppgtt(ctx))
eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
new file mode 100644
index ..36262a6357b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include 
+
+struct drm_device;
+struct drm_file;
+struct i915_address_space;
+struct i915_vma;
+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file);
+
+void i915_gem_vm_unbind_all(struct i915_address_space *vm);
+
+#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
new file mode 100644
index ..e529162abd2c
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,306 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "gem/i915_gem_context.h"
+#include "gem/i915_gem_vm_bind.h"
+
+#include "gt/intel_gpu_commands.h"
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
+START, LAST, static inline, i915_vm_bind_it)
+
+#undef START
+#undef LAST
+
+/**
+ * DOC: VM_BIND/UNBIND ioctls
+ *
+ * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
+ * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
+ * specified address space (VM). Multiple mappings can map to the same physical
+ * pages of an object (aliasing). These mappings (also referred to as 
persistent
+ * mappings) will be persistent across multiple GPU submissions (execbuf calls)
+ * issued by the UMD, without user having to provide a list of all required
+ * mappings during each submission (as required by older execbuf mode).
+ *
+ * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
+ * signaling the completion of bind/unbind operation.
+ *
+ * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
+ * User has to opt-in for VM_BIND mode of binding for an address space (VM)
+ * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
+ *
+ * VM_BIND/UNBIND ioctl calls executed on different CPU threa

Re: [PATCH 05/16] drm/i915/vm_bind: Implement bind and unbind of object

2022-09-29 Thread Niranjana Vishwanathapura

On Thu, Sep 29, 2022 at 11:51:51AM +0100, Matthew Auld wrote:

On 29/09/2022 10:03, Matthew Auld wrote:

On 29/09/2022 06:24, Niranjana Vishwanathapura wrote:

On Wed, Sep 28, 2022 at 06:52:21PM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

Add uapi and implement support for bind and unbind of an
object at the specified GPU virtual addresses.

The vm_bind mode is not supported in legacy execbuf2 ioctl.
It will be supported only in the newer execbuf3 ioctl.

Signed-off-by: Niranjana Vishwanathapura 


Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   5 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c    | 306 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  17 +
 drivers/gpu/drm/i915/i915_driver.c    |   3 +
 drivers/gpu/drm/i915/i915_vma.c   |   1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 112 +++
 10 files changed, 495 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile 
b/drivers/gpu/drm/i915/Makefile

index a26edcdadc21..9bf939ef18ea 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -166,6 +166,7 @@ gem-y += \
 gem/i915_gem_ttm_move.o \
 gem/i915_gem_ttm_pm.o \
 gem/i915_gem_userptr.o \
+    gem/i915_gem_vm_bind_object.o \
 gem/i915_gem_wait.o \
 gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c

index cd75b0ca2555..f85f10cf9c34 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -781,6 +781,11 @@ static int eb_select_context(struct 
i915_execbuffer *eb)

 if (unlikely(IS_ERR(ctx)))
 return PTR_ERR(ctx);
+    if (ctx->vm->vm_bind_mode) {
+    i915_gem_context_put(ctx);
+    return -EOPNOTSUPP;
+    }
+
 eb->gem_context = ctx;
 if (i915_gem_context_has_full_ppgtt(ctx))
 eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h

new file mode 100644
index ..36262a6357b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include 
+
+struct drm_device;
+struct drm_file;
+struct i915_address_space;
+struct i915_vma;
+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file);
+
+void i915_gem_vm_unbind_all(struct i915_address_space *vm);
+
+#endif /* __I915_GEM_VM_BIND_H */
diff --git 
a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

new file mode 100644
index ..e529162abd2c
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,306 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "gem/i915_gem_context.h"
+#include "gem/i915_gem_vm_bind.h"
+
+#include "gt/intel_gpu_commands.h"
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
+ START, LAST, static inline, i915_vm_bind_it)
+
+#undef START
+#undef LAST
+
+/**
+ * DOC: VM_BIND/UNBIND ioctls
+ *
+ * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to 
bind/unbind GEM buffer
+ * objects (BOs) or sections of a BOs at specified GPU 
virtual addresses on a
+ * specified address space (VM). Multiple mappings can map to 
the same physical
+ * pages of an object (aliasing). These mappings (also 
referred to as persistent
+ * mappings) will be persistent across multiple GPU 
submissions (execbuf calls)
+ * issued by the UMD, without user having to provide a list 
of all required
+ * mappings during each submission (as required by older 
execbuf mode).

+ *
+ * The VM_BIND/UNBIND calls allow UMDs to request a timeline 
out fence for

+ * signaling the completion of bind/unbind operation.
+ *
+ * VM_BIND feature is advertised to user via 
I915_PARAM_VM_BIND_VERSION.
+ * User has to opt-in for VM_BIND mode of binding for an 
address space (VM)
+ * during VM creation time via 
I915_VM_CREATE_FLAG

Re: [PATCH 06/16] drm/i915/vm_bind: Support for VM private BOs

2022-09-29 Thread Niranjana Vishwanathapura

On Wed, Sep 28, 2022 at 06:54:27PM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

Each VM creates a root_obj and shares it with all of its private objects
to use it as dma_resv object. This has a performance advantage as it
requires a single dma_resv object update for all private BOs vs list of
dma_resv objects update for shared BOs, in the execbuf path.

VM private BOs can be only mapped on specified VM and cannot be dmabuf
exported. Also, they are supported only in vm_bind mode.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c| 41 ++-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|  6 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  4 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  3 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 ++
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   |  3 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c|  9 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  4 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  2 +
 drivers/gpu/drm/i915/i915_vma.c   |  1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  2 +
 include/uapi/drm/i915_drm.h   | 30 ++
 12 files changed, 106 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 4aa7b5582b8e..692d95ef5d3e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -11,6 +11,7 @@
 #include "pxp/intel_pxp.h"
 #include "i915_drv.h"
+#include "i915_gem_context.h"
 #include "i915_gem_create.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
@@ -252,6 +253,7 @@ struct create_ext {
unsigned int n_placements;
unsigned int placement_mask;
unsigned long flags;
+   u32 vm_id;
 };
 static void repr_placements(char *buf, size_t size,
@@ -401,9 +403,24 @@ static int ext_set_protected(struct i915_user_extension 
__user *base, void *data
return 0;
 }
+static int ext_set_vm_private(struct i915_user_extension __user *base,
+ void *data)
+{
+   struct drm_i915_gem_create_ext_vm_private ext;
+   struct create_ext *ext_data = data;
+
+   if (copy_from_user(&ext, base, sizeof(ext)))
+   return -EFAULT;
+
+   ext_data->vm_id = ext.vm_id;
+
+   return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
+   [I915_GEM_CREATE_EXT_VM_PRIVATE] = ext_set_vm_private,
 };
 /**
@@ -419,6 +436,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
struct drm_i915_private *i915 = to_i915(dev);
struct drm_i915_gem_create_ext *args = data;
struct create_ext ext_data = { .i915 = i915 };
+   struct i915_address_space *vm = NULL;
struct drm_i915_gem_object *obj;
int ret;
@@ -432,6 +450,12 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
if (ret)
return ret;
+   if (ext_data.vm_id) {
+   vm = i915_gem_vm_lookup(file->driver_priv, ext_data.vm_id);
+   if (unlikely(!vm))
+   return -ENOENT;
+   }
+
if (!ext_data.n_placements) {
ext_data.placements[0] =
intel_memory_region_by_type(i915, INTEL_MEMORY_SYSTEM);
@@ -458,8 +482,21 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
ext_data.placements,
ext_data.n_placements,
ext_data.flags);
-   if (IS_ERR(obj))
-   return PTR_ERR(obj);
+   if (IS_ERR(obj)) {
+   ret = PTR_ERR(obj);
+   goto vm_put;
+   }
+
+   if (vm) {
+   obj->base.resv = vm->root_obj->base.resv;
+   obj->priv_root = i915_gem_object_get(vm->root_obj);
+   i915_vm_put(vm);
+   }
return i915_gem_publish(obj, file, &args->size, &args->handle);
+vm_put:
+   if (vm)
+   i915_vm_put(vm);
+
+   return ret;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index f5062d0c6333..6433173c3e84 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -218,6 +218,12 @@ struct dma_buf *i915_gem_prime_export(struct 
drm_gem_object *gem_obj, int flags)
struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
+   if (obj->priv_root) {
+   drm_dbg(obj->b

Re: [PATCH 12/16] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl

2022-09-29 Thread Niranjana Vishwanathapura

On Thu, Sep 29, 2022 at 04:00:47PM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate
bind as all required objects binding would have been requested by the
userspace before submitting the execbuf3.

Legacy features like relocations etc are not supported by execbuf3.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 571 ++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
 drivers/gpu/drm/i915/i915_driver.c|   1 +
 include/uapi/drm/i915_drm.h   |  61 ++
 5 files changed, 636 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index bf952f478555..3473ee5825bb 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -150,6 +150,7 @@ gem-y += \
gem/i915_gem_domain.o \
gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
+   gem/i915_gem_execbuffer3.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
gem/i915_gem_lmem.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
new file mode 100644
index ..92af88bc8deb
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -0,0 +1,571 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+
+#include "i915_drv.h"
+#include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
+#include "i915_gem_ioctls.h"
+#include "i915_gem_vm_bind.h"
+#include "i915_trace.h"
+
+#define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
+#define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
+
+/* Catch emission of unexpected errors for CI! */
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+#undef EINVAL
+#define EINVAL ({ \
+   DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
+   22; \
+})
+#endif
+
+/**
+ * DOC: User command execution with execbuf3 ioctl
+ *
+ * A VM in VM_BIND mode will not support older execbuf mode of binding.
+ * The execbuf ioctl handling in VM_BIND mode differs significantly from the
+ * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+ * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+ * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+ * execlist. Hence, no support for implicit sync.
+ *
+ * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+ * works with execbuf3 ioctl for submission.
+ *
+ * The execbuf3 ioctl directly specifies the batch addresses instead of as
+ * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
+ * support many of the older features like in/out/submit fences, fence array,
+ * default gem context etc. (See struct drm_i915_gem_execbuffer3).
+ *
+ * In VM_BIND mode, VA allocation is completely managed by the user instead of
+ * the i915 driver. Hence all VA assignment, eviction are not applicable in
+ * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+ * be using the i915_vma active reference tracking. It will instead check the
+ * dma-resv object's fence list for that.
+ *
+ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
+ * vma lookup table, implicit sync, vma active reference tracking etc., are not
+ * applicable for execbuf3 ioctl.
+ */
+
+/**
+ * struct i915_execbuffer - execbuf struct for execbuf3
+ * @i915: reference to the i915 instance we run on
+ * @file: drm file reference
+ * args: execbuf3 ioctl structure
+ * @gt: reference to the gt instance ioctl submitted for
+ * @context: logical state for the request
+ * @gem_context: callers context
+ * @requests: requests to be build
+ * @composite_fence: used for excl fence in dma_resv objects when > 1 BB 
submitted
+ * @ww: i915_gem_ww_ctx instance
+ * @num_batches: number of batches submitted
+ * @batch_addresses: addresses corresponds to the submitted batches
+ * @batches: references to the i915_vmas corresponding to the batches
+ */
+struct i915_execbuffer {
+   struct drm_i915_private *i915;
+   struct drm_file *file;
+   struct drm_i915_gem_execbuffer3 *args;
+
+   struct intel_gt *gt;
+   struct intel_context *context;
+   struct i915_gem_context *gem_context;
+
+   struct i915_request 

Re: [PATCH 05/16] drm/i915/vm_bind: Implement bind and unbind of object

2022-09-29 Thread Niranjana Vishwanathapura

On Thu, Sep 29, 2022 at 11:49:30AM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

Add uapi and implement support for bind and unbind of an
object at the specified GPU virtual addresses.

The vm_bind mode is not supported in legacy execbuf2 ioctl.
It will be supported only in the newer execbuf3 ioctl.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   5 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 306 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  17 +
 drivers/gpu/drm/i915/i915_driver.c|   3 +
 drivers/gpu/drm/i915/i915_vma.c   |   1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 112 +++
 10 files changed, 495 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index a26edcdadc21..9bf939ef18ea 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -166,6 +166,7 @@ gem-y += \
gem/i915_gem_ttm_move.o \
gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
+   gem/i915_gem_vm_bind_object.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index cd75b0ca2555..f85f10cf9c34 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
if (unlikely(IS_ERR(ctx)))
return PTR_ERR(ctx);
+   if (ctx->vm->vm_bind_mode) {
+   i915_gem_context_put(ctx);
+   return -EOPNOTSUPP;
+   }
+
eb->gem_context = ctx;
if (i915_gem_context_has_full_ppgtt(ctx))
eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
new file mode 100644
index ..36262a6357b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include 
+
+struct drm_device;
+struct drm_file;
+struct i915_address_space;
+struct i915_vma;
+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file);
+
+void i915_gem_vm_unbind_all(struct i915_address_space *vm);
+
+#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
new file mode 100644
index ..e529162abd2c
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,306 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "gem/i915_gem_context.h"
+#include "gem/i915_gem_vm_bind.h"
+
+#include "gt/intel_gpu_commands.h"
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
+START, LAST, static inline, i915_vm_bind_it)
+
+#undef START
+#undef LAST
+
+/**
+ * DOC: VM_BIND/UNBIND ioctls
+ *
+ * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
+ * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
+ * specified address space (VM). Multiple mappings can map to the same physical
+ * pages of an object (aliasing). These mappings (also referred to as 
persistent
+ * mappings) will be persistent across multiple GPU submissions (execbuf calls)
+ * issued by the UMD, without user having to provide a list of all required
+ * mappings during each submission (as required by older execbuf mode).
+ *
+ * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
+ * signaling the completion of bind/unbind operation.
+ *
+ * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
+ * User has to opt-in for VM_BIND mode of binding for an address space (VM)
+ * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
+ *
+ * VM_BIND/UNBIND ioctl calls executed on different CPU threa

Re: [PATCH 05/16] drm/i915/vm_bind: Implement bind and unbind of object

2022-09-29 Thread Niranjana Vishwanathapura

On Thu, Sep 29, 2022 at 06:28:39PM +0100, Matthew Auld wrote:

On 29/09/2022 17:38, Niranjana Vishwanathapura wrote:

On Thu, Sep 29, 2022 at 11:49:30AM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

Add uapi and implement support for bind and unbind of an
object at the specified GPU virtual addresses.

The vm_bind mode is not supported in legacy execbuf2 ioctl.
It will be supported only in the newer execbuf3 ioctl.

Signed-off-by: Niranjana Vishwanathapura 


Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   5 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c    | 306 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  17 +
 drivers/gpu/drm/i915/i915_driver.c    |   3 +
 drivers/gpu/drm/i915/i915_vma.c   |   1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 112 +++
 10 files changed, 495 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile 
b/drivers/gpu/drm/i915/Makefile

index a26edcdadc21..9bf939ef18ea 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -166,6 +166,7 @@ gem-y += \
 gem/i915_gem_ttm_move.o \
 gem/i915_gem_ttm_pm.o \
 gem/i915_gem_userptr.o \
+    gem/i915_gem_vm_bind_object.o \
 gem/i915_gem_wait.o \
 gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c

index cd75b0ca2555..f85f10cf9c34 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -781,6 +781,11 @@ static int eb_select_context(struct 
i915_execbuffer *eb)

 if (unlikely(IS_ERR(ctx)))
 return PTR_ERR(ctx);
+    if (ctx->vm->vm_bind_mode) {
+    i915_gem_context_put(ctx);
+    return -EOPNOTSUPP;
+    }
+
 eb->gem_context = ctx;
 if (i915_gem_context_has_full_ppgtt(ctx))
 eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h

new file mode 100644
index ..36262a6357b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include 
+
+struct drm_device;
+struct drm_file;
+struct i915_address_space;
+struct i915_vma;
+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file);
+
+void i915_gem_vm_unbind_all(struct i915_address_space *vm);
+
+#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

new file mode 100644
index ..e529162abd2c
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,306 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "gem/i915_gem_context.h"
+#include "gem/i915_gem_vm_bind.h"
+
+#include "gt/intel_gpu_commands.h"
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
+ START, LAST, static inline, i915_vm_bind_it)
+
+#undef START
+#undef LAST
+
+/**
+ * DOC: VM_BIND/UNBIND ioctls
+ *
+ * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind 
GEM buffer
+ * objects (BOs) or sections of a BOs at specified GPU virtual 
addresses on a
+ * specified address space (VM). Multiple mappings can map to 
the same physical
+ * pages of an object (aliasing). These mappings (also referred 
to as persistent
+ * mappings) will be persistent across multiple GPU submissions 
(execbuf calls)
+ * issued by the UMD, without user having to provide a list of 
all required

+ * mappings during each submission (as required by older execbuf mode).
+ *
+ * The VM_BIND/UNBIND calls allow UMDs to request a timeline 
out fence for

+ * signaling the completion of bind/unbind operation.
+ *
+ * VM_BIND feature is advertised to user via 
I915_PARAM_VM_BIND_VERSION.
+ * User has to opt-in for VM_BIND mode of binding for an 
address space (VM)
+ * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND 
extension.

+ *
+ * VM_BIND/UNBI

Re: [PATCH 16/16] drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode

2022-09-30 Thread Niranjana Vishwanathapura

On Fri, Sep 30, 2022 at 11:01:17AM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

Add getparam support for VM_BIND capability version.
Add VM creation time flag to enable vm_bind_mode for the VM.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |  9 +++-
 drivers/gpu/drm/i915/i915_drv.h |  2 ++
 drivers/gpu/drm/i915/i915_getparam.c|  3 +++
 include/uapi/drm/i915_drm.h | 24 -
 4 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index f4e648ec01ed..c20bd6e8aaf8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1808,9 +1808,13 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, 
void *data,
if (!HAS_FULL_PPGTT(i915))
return -ENODEV;
-   if (args->flags)
+   if (args->flags & I915_VM_CREATE_FLAGS_UNKNOWN)
return -EINVAL;
+   if ((args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) &&
+   !HAS_VM_BIND(i915))
+   return -EOPNOTSUPP;
+
ppgtt = i915_ppgtt_create(to_gt(i915), 0);
if (IS_ERR(ppgtt))
return PTR_ERR(ppgtt);
@@ -1828,6 +1832,9 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void 
*data,
if (err)
goto err_put;
+   if (args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND)
+   ppgtt->vm.vm_bind_mode = true;
+
GEM_BUG_ON(id == 0); /* reserved for invalid/unassigned ppgtt */
args->vm_id = id;
return 0;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 84a2f6b16f57..e77393d74c6f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -974,6 +974,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define HAS_ONE_EU_PER_FUSE_BIT(i915)  
(INTEL_INFO(i915)->has_one_eu_per_fuse_bit)
+#define HAS_VM_BIND(dev_priv) (GRAPHICS_VER(dev_priv) >= 12)


s/dev_priv/i915/


Ok, will fix




+
 /* intel_device_info.c */
 static inline struct intel_device_info *
 mkwrite_device_info(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_getparam.c 
b/drivers/gpu/drm/i915/i915_getparam.c
index 342c8ca6414e..f45b3c684bcf 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -175,6 +175,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
case I915_PARAM_PERF_REVISION:
value = i915_perf_ioctl_version();
break;
+   case I915_PARAM_VM_BIND_VERSION:
+   value = HAS_VM_BIND(i915);
+   break;
default:
DRM_DEBUG("Unknown parameter %d\n", param->param);
return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index f3a5b198b3e7..9a033acc254b 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -755,6 +755,27 @@ typedef struct drm_i915_irq_wait {
 /* Query if the kernel supports the I915_USERPTR_PROBE flag. */
 #define I915_PARAM_HAS_USERPTR_PROBE 56
+/*
+ * VM_BIND feature version supported.
+ *
+ * The following versions of VM_BIND have been defined:
+ *
+ * 0: No VM_BIND support.
+ *
+ * 1: In VM_UNBIND calls, the UMD must specify the exact mappings created
+ *previously with VM_BIND, the ioctl will not support unbinding multiple
+ *mappings or splitting them. Similarly, VM_BIND calls will not replace
+ *any existing mappings.
+ *
+ * 2: The restrictions on unbinding partial or multiple mappings is
+ *lifted, Similarly, binding will replace any mappings in the given range.


Should we just remove 2 for now? It looks like 1 is this series.


Yah, will remove.




+ *
+ * See struct drm_i915_gem_vm_bind and struct drm_i915_gem_vm_unbind.
+ *
+ * vm_bind versions are backward compatible.
+ */
+#define I915_PARAM_VM_BIND_VERSION 57
+
 /* Must be kept compact -- no holes and well documented */
 /**
@@ -2622,7 +2643,8 @@ struct drm_i915_gem_vm_control {
/** @extensions: Zero-terminated chain of extensions. */
__u64 extensions;
-   /** @flags: reserved for future usage, currently MBZ */
+#define I915_VM_CREATE_FLAGS_USE_VM_BIND   (1u << 0)


Some kernel-doc for that would be good, even if it's kind of obvious.



Ok, will add comment.

Regards,
Niranjana


Acked-by: Matthew Auld 


+#define I915_VM_CREATE_FLAGS_UNKNOWN   (-(I915_VM_CREATE_FLAGS_USE_VM_BIND << 
1))
__u32 flags;
/** @vm_id: Id of the VM created or to be destroyed */


Re: [PATCH 10/16] drm/i915/vm_bind: Abstract out common execbuf functions

2022-09-30 Thread Niranjana Vishwanathapura

On Fri, Sep 30, 2022 at 11:45:34AM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

The new execbuf3 ioctl path and the legacy execbuf ioctl
paths have many common functionalities.
Abstract out the common execbuf functionalities into a
separate file where possible, thus allowing code sharing.

Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../drm/i915/gem/i915_gem_execbuffer_common.c | 664 ++
 .../drm/i915/gem/i915_gem_execbuffer_common.h |  74 ++
 3 files changed, 739 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 9bf939ef18ea..bf952f478555 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -148,6 +148,7 @@ gem-y += \
gem/i915_gem_create.o \
gem/i915_gem_dmabuf.o \
gem/i915_gem_domain.o \
+   gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
new file mode 100644
index ..a7efd74afc9c
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
@@ -0,0 +1,664 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gt.h"
+#include "gt/intel_gt_pm.h"
+#include "gt/intel_ring.h"
+
+#include "i915_gem_execbuffer_common.h"
+
+#define __EXEC_COMMON_FENCE_WAIT   BIT(0)
+#define __EXEC_COMMON_FENCE_SIGNAL BIT(1)
+
+static struct i915_request *eb_throttle(struct intel_context *ce)
+{
+   struct intel_ring *ring = ce->ring;
+   struct intel_timeline *tl = ce->timeline;
+   struct i915_request *rq;
+
+   /*
+* Completely unscientific finger-in-the-air estimates for suitable
+* maximum user request size (to avoid blocking) and then backoff.
+*/
+   if (intel_ring_update_space(ring) >= PAGE_SIZE)
+   return NULL;
+
+   /*
+* Find a request that after waiting upon, there will be at least half
+* the ring available. The hysteresis allows us to compete for the
+* shared ring and should mean that we sleep less often prior to
+* claiming our resources, but not so long that the ring completely
+* drains before we can submit our next request.
+*/
+   list_for_each_entry(rq, &tl->requests, link) {
+   if (rq->ring != ring)
+   continue;
+
+   if (__intel_ring_space(rq->postfix,
+  ring->emit, ring->size) > ring->size / 2)
+   break;
+   }
+   if (&rq->link == &tl->requests)
+   return NULL; /* weird, we will check again later for real */
+
+   return i915_request_get(rq);
+}
+
+static int eb_pin_timeline(struct intel_context *ce, bool throttle,
+  bool nonblock)
+{
+   struct intel_timeline *tl;
+   struct i915_request *rq = NULL;
+
+   /*
+* Take a local wakeref for preparing to dispatch the execbuf as
+* we expect to access the hardware fairly frequently in the
+* process, and require the engine to be kept awake between accesses.
+* Upon dispatch, we acquire another prolonged wakeref that we hold
+* until the timeline is idle, which in turn releases the wakeref
+* taken on the engine, and the parent device.
+*/
+   tl = intel_context_timeline_lock(ce);
+   if (IS_ERR(tl))
+   return PTR_ERR(tl);
+
+   intel_context_enter(ce);
+   if (throttle)
+   rq = eb_throttle(ce);
+   intel_context_timeline_unlock(tl);
+
+   if (rq) {
+   long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT;
+
+   if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE,
+ timeout) < 0) {
+   i915_request_put(rq);
+
+   /*
+* Error path, cannot use intel_context_timeline_lock as
+* that is user interruptable and this clean up step
+* must be done.
+*/
+   mutex_lock(&ce->timeline->mutex);
+   intel_context_exit(ce);
+   mutex_unlock(&ce->timeline->mutex);
+
+   if (nonblock)
+   return -EWOULDBLOCK;
+   else
+   return -EINT

Re: [PATCH 14/16] drm/i915/vm_bind: Handle persistent vmas in execbuf3

2022-10-01 Thread Niranjana Vishwanathapura

On Fri, Sep 30, 2022 at 10:47:48AM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

Handle persistent (VM_BIND) mappings during the request submission
in the execbuf3 path.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 188 +-
 1 file changed, 187 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index 92af88bc8deb..1aeeff5e8540 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -19,6 +19,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
+#define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
@@ -42,7 +43,9 @@
  * execlist. Hence, no support for implicit sync.
  *
  * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
- * works with execbuf3 ioctl for submission.
+ * works with execbuf3 ioctl for submission. All BOs mapped on that VM (through
+ * VM_BIND call) at the time of execbuf3 call are deemed required for that
+ * submission.
  *
  * The execbuf3 ioctl directly specifies the batch addresses instead of as
  * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
@@ -58,6 +61,13 @@
  * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
  * vma lookup table, implicit sync, vma active reference tracking etc., are not
  * applicable for execbuf3 ioctl.
+ *
+ * During each execbuf submission, request fence is added to all VM_BIND mapped
+ * objects with DMA_RESV_USAGE_BOOKKEEP. The DMA_RESV_USAGE_BOOKKEEP usage will
+ * prevent over sync (See enum dma_resv_usage). Note that DRM_I915_GEM_WAIT and
+ * DRM_I915_GEM_BUSY ioctls do not check for DMA_RESV_USAGE_BOOKKEEP usage and
+ * hence should not be used for end of batch check. Instead, the execbuf3
+ * timeline out fence should be used for end of batch check.
  */
 /**
@@ -127,6 +137,23 @@ eb_find_vma(struct i915_address_space *vm, u64 addr)
return i915_gem_vm_bind_lookup_vma(vm, va);
 }
+static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
+{
+   struct i915_vma *vma, *vn;
+
+   /**
+* Move all unbound vmas back into vm_bind_list so that they are
+* revalidated.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   list_for_each_entry_safe(vma, vn, &vm->vm_rebind_list, vm_rebind_link) {
+   list_del_init(&vma->vm_rebind_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->vm_rebind_lock);
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
@@ -141,14 +168,121 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
++current_batch;
}
+   eb_scoop_unbound_vma_all(eb->context->vm);
+
+   return 0;
+}
+
+static int eb_lock_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma;
+   int err;
+
+   err = i915_gem_object_lock(eb->context->vm->root_obj, &eb->ww);
+   if (err)
+   return err;
+
+   list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+   non_priv_vm_bind_link) {
+   err = i915_gem_object_lock(vma->obj, &eb->ww);
+   if (err)
+   return err;
+   }
+
return 0;
 }
+static void eb_release_persistent_vma_all(struct i915_execbuffer *eb,
+ bool final)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma, *vn;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   if (!(eb->args->flags & __EXEC3_HAS_PIN))
+   return;
+
+   assert_object_held(vm->root_obj);
+
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link)
+   __i915_vma_unpin(vma);
+
+   eb->args->flags &= ~__EXEC3_HAS_PIN;
+   if (!final)
+   return;
+
+   list_for_each_entry_safe(vma, vn, &vm->vm_bind_list, vm_bind_link)
+   if (i915_vma_verify_bind_complete(vma))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list);
+}
+
 static void eb_release_vma_all(struct i915_execbuffer *eb, bool final)
 {
+   eb_release_persistent_vma_all(eb, final);
eb_unpin_engine(eb);
 }
+static int eb_reserve_fence_for_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   

[PATCH v2 02/17] drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()

2022-10-02 Thread Niranjana Vishwanathapura
Add function __i915_sw_fence_await_reservation() for
asynchronous wait on a dma-resv object with specified
dma_resv_usage. This is required for async vma unbind
with vm_bind.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_sw_fence.c | 28 +---
 drivers/gpu/drm/i915/i915_sw_fence.h | 23 +--
 2 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index cc2a8821d22a..ae06d35db056 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -7,7 +7,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "i915_sw_fence.h"
 #include "i915_selftest.h"
@@ -569,11 +568,26 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
return ret;
 }
 
-int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
-   struct dma_resv *resv,
-   bool write,
-   unsigned long timeout,
-   gfp_t gfp)
+/**
+ * __i915_sw_fence_await_reservation() - Setup a fence to wait on a dma-resv
+ * object with specified usage.
+ * @fence: the fence that needs to wait
+ * @resv: dma-resv object
+ * @usage: dma_resv_usage (See enum dma_resv_usage)
+ * @timeout: how long to wait in jiffies
+ * @gfp: allocation mode
+ *
+ * Setup the @fence to asynchronously wait on dma-resv object @resv for
+ * @usage to complete before signaling.
+ *
+ * Returns 0 if there is nothing to wait on, -ve error code upon error
+ * and >0 upon successfully setting up the wait.
+ */
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ enum dma_resv_usage usage,
+ unsigned long timeout,
+ gfp_t gfp)
 {
struct dma_resv_iter cursor;
struct dma_fence *f;
@@ -582,7 +596,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
debug_fence_assert(fence);
might_sleep_if(gfpflags_allow_blocking(gfp));
 
-   dma_resv_iter_begin(&cursor, resv, dma_resv_usage_rw(write));
+   dma_resv_iter_begin(&cursor, resv, usage);
dma_resv_for_each_fence_unlocked(&cursor, f) {
pending = i915_sw_fence_await_dma_fence(fence, f, timeout,
gfp);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
b/drivers/gpu/drm/i915/i915_sw_fence.h
index f752bfc7c6e1..9c4859dc4c0d 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -10,13 +10,13 @@
 #define _I915_SW_FENCE_H_
 
 #include 
+#include 
 #include 
 #include 
 #include  /* for NOTIFY_DONE */
 #include 
 
 struct completion;
-struct dma_resv;
 struct i915_sw_fence;
 
 enum i915_sw_fence_notify {
@@ -89,11 +89,22 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
  unsigned long timeout,
  gfp_t gfp);
 
-int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
-   struct dma_resv *resv,
-   bool write,
-   unsigned long timeout,
-   gfp_t gfp);
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ enum dma_resv_usage usage,
+ unsigned long timeout,
+ gfp_t gfp);
+
+static inline int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ bool write,
+ unsigned long timeout,
+ gfp_t gfp)
+{
+   return __i915_sw_fence_await_reservation(fence, resv,
+dma_resv_usage_rw(write),
+timeout, gfp);
+}
 
 bool i915_sw_fence_await(struct i915_sw_fence *fence);
 void i915_sw_fence_complete(struct i915_sw_fence *fence);
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v2 00/17] drm/i915/vm_bind: Add VM_BIND functionality

2022-10-02 Thread Niranjana Vishwanathapura
DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
buffer objects (BOs) or sections of a BOs at specified GPU virtual
addresses on a specified address space (VM). Multiple mappings can map
to the same physical pages of an object (aliasing). These mappings (also
referred to as persistent mappings) will be persistent across multiple
GPU submissions (execbuf calls) issued by the UMD, without user having
to provide a list of all required mappings during each submission (as
required by older execbuf mode).

This patch series support VM_BIND version 1, as described by the param
I915_PARAM_VM_BIND_VERSION.

Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
The new execbuf3 ioctl will not have any execlist support and all the
legacy support like relocations etc., are removed.

TODOs:
* Support out fence for VM_UNBIND ioctl.
* Async VM_UNBIND support.
* Optimizations.

NOTEs:
* It is based on below VM_BIND design+uapi rfc.
  Documentation/gpu/rfc/i915_vm_bind.rst

* The IGT RFC series is posted as,
  [PATCH i-g-t v2 0/8] vm_bind: Add VM_BIND validation support

v2: Address various review comments

Signed-off-by: Niranjana Vishwanathapura 

Niranjana Vishwanathapura (17):
  drm/i915/vm_bind: Expose vm lookup function
  drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
  drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
  drm/i915/vm_bind: Add support to create persistent vma
  drm/i915/vm_bind: Implement bind and unbind of object
  drm/i915/vm_bind: Support for VM private BOs
  drm/i915/vm_bind: Add support to handle object evictions
  drm/i915/vm_bind: Support persistent vma activeness tracking
  drm/i915/vm_bind: Add out fence support
  drm/i915/vm_bind: Abstract out common execbuf functions
  drm/i915/vm_bind: Use common execbuf functions in execbuf path
  drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
  drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
  drm/i915/vm_bind: Expose i915_request_await_bind()
  drm/i915/vm_bind: Handle persistent vmas in execbuf3
  drm/i915/vm_bind: userptr dma-resv changes
  drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode

 drivers/gpu/drm/i915/Makefile |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  20 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_create.c|  67 +-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|   6 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 516 +--
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 867 ++
 .../drm/i915/gem/i915_gem_execbuffer_common.c | 666 ++
 .../drm/i915/gem/i915_gem_execbuffer_common.h |  74 ++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
 drivers/gpu/drm/i915/gem/i915_gem_object.c|   3 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h|   2 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  17 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  30 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 439 +
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  18 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  27 +
 drivers/gpu/drm/i915/i915_driver.c|   4 +
 drivers/gpu/drm/i915/i915_drv.h   |   2 +
 drivers/gpu/drm/i915/i915_gem_gtt.c   |  39 +
 drivers/gpu/drm/i915/i915_gem_gtt.h   |   3 +
 drivers/gpu/drm/i915/i915_getparam.c  |   3 +
 drivers/gpu/drm/i915/i915_sw_fence.c  |  28 +-
 drivers/gpu/drm/i915/i915_sw_fence.h  |  23 +-
 drivers/gpu/drm/i915/i915_vma.c   | 116 ++-
 drivers/gpu/drm/i915/i915_vma.h   |  57 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  39 +
 include/uapi/drm/i915_drm.h   | 281 +-
 30 files changed, 2842 insertions(+), 519 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v2 01/17] drm/i915/vm_bind: Expose vm lookup function

2022-10-02 Thread Niranjana Vishwanathapura
Make i915_gem_vm_lookup() function non-static as it will be
used by the vm_bind feature.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 11 ++-
 drivers/gpu/drm/i915/gem/i915_gem_context.h |  3 +++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 0bcde53c50c6..f4e648ec01ed 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -346,7 +346,16 @@ static int proto_context_register(struct 
drm_i915_file_private *fpriv,
return ret;
 }
 
-static struct i915_address_space *
+/**
+ * i915_gem_vm_lookup() - looks up for the VM reference given the vm id
+ * @file_priv: the private data associated with the user's file
+ * @id: the VM id
+ *
+ * Finds the VM reference associated to a specific id.
+ *
+ * Returns the VM pointer on success, NULL in case of failure.
+ */
+struct i915_address_space *
 i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id)
 {
struct i915_address_space *vm;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index e5b0f66ea1fe..899fa8f1e0fe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -139,6 +139,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, 
void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file);
 
+struct i915_address_space *
+i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id);
+
 struct i915_gem_context *
 i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
 
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v2 03/17] drm/i915/vm_bind: Expose i915_gem_object_max_page_size()

2022-10-02 Thread Niranjana Vishwanathapura
Expose i915_gem_object_max_page_size() function non-static
which will be used by the vm_bind feature.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c | 18 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  2 ++
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 33673fe7ee0a..5c6e396ab74d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -15,10 +15,18 @@
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
-static u32 object_max_page_size(struct intel_memory_region **placements,
-   unsigned int n_placements)
+/**
+ * i915_gem_object_max_page_size() - max of min_page_size of the regions
+ * @placements:  list of regions
+ * @n_placements: number of the placements
+ *
+ * Returns the largest of min_page_size of the @placements,
+ * or I915_GTT_PAGE_SIZE_4K if @n_placements is 0.
+ */
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+ unsigned int n_placements)
 {
-   u32 max_page_size = 0;
+   u32 max_page_size = I915_GTT_PAGE_SIZE_4K;
int i;
 
for (i = 0; i < n_placements; i++) {
@@ -28,7 +36,6 @@ static u32 object_max_page_size(struct intel_memory_region 
**placements,
max_page_size = max_t(u32, max_page_size, mr->min_page_size);
}
 
-   GEM_BUG_ON(!max_page_size);
return max_page_size;
 }
 
@@ -99,7 +106,8 @@ __i915_gem_object_create_user_ext(struct drm_i915_private 
*i915, u64 size,
 
i915_gem_flush_free_objects(i915);
 
-   size = round_up(size, object_max_page_size(placements, n_placements));
+   size = round_up(size, i915_gem_object_max_page_size(placements,
+   n_placements));
if (size == 0)
return ERR_PTR(-EINVAL);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index a3b7551a57fc..d53d01b1860a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -47,6 +47,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
 }
 
 void i915_gem_init__objects(struct drm_i915_private *i915);
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+ unsigned int n_placements);
 
 void i915_objects_module_exit(void);
 int i915_objects_module_init(void);
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v2 05/17] drm/i915/vm_bind: Implement bind and unbind of object

2022-10-02 Thread Niranjana Vishwanathapura
Add uapi and implement support for bind and unbind of an
object at the specified GPU virtual addresses.

The vm_bind mode is not supported in legacy execbuf2 ioctl.
It will be supported only in the newer execbuf3 ioctl.

v2: On older platforms ctx->vm is not set, check for it.
In vm_bind call, add vma to vm_bind_list.
Add more input validity checks.
Update some documentation.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   5 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 328 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  17 +
 drivers/gpu/drm/i915/i915_driver.c|   3 +
 drivers/gpu/drm/i915/i915_vma.c   |   1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 110 ++
 10 files changed, 515 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index a26edcdadc21..9bf939ef18ea 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -166,6 +166,7 @@ gem-y += \
gem/i915_gem_ttm_move.o \
gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
+   gem/i915_gem_vm_bind_object.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 8f5796cf9c9c..9fb9f6faafd8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
if (unlikely(IS_ERR(ctx)))
return PTR_ERR(ctx);
 
+   if (ctx->vm && ctx->vm->vm_bind_mode) {
+   i915_gem_context_put(ctx);
+   return -EOPNOTSUPP;
+   }
+
eb->gem_context = ctx;
if (i915_gem_context_has_full_ppgtt(ctx))
eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
new file mode 100644
index ..36262a6357b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include 
+
+struct drm_device;
+struct drm_file;
+struct i915_address_space;
+struct i915_vma;
+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file);
+
+void i915_gem_vm_unbind_all(struct i915_address_space *vm);
+
+#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
new file mode 100644
index ..88314e3a99d1
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,328 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "gem/i915_gem_context.h"
+#include "gem/i915_gem_vm_bind.h"
+
+#include "gt/intel_gpu_commands.h"
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
+START, LAST, static inline, i915_vm_bind_it)
+
+#undef START
+#undef LAST
+
+/**
+ * DOC: VM_BIND/UNBIND ioctls
+ *
+ * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
+ * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
+ * specified address space (VM). Multiple mappings can map to the same physical
+ * pages of an object (aliasing). These mappings (also referred to as 
persistent
+ * mappings) will be persistent across multiple GPU submissions (execbuf calls)
+ * issued by the UMD, without user having to provide a list of all required
+ * mappings during each submission (as required by older execbuf mode).
+ *
+ * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
+ * signaling the completion of bind/unbind operation.
+ *
+ * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
+ * User has to opt-in for VM_BIND mode of binding for an address space (VM)
+ * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND exten

[PATCH v2 04/17] drm/i915/vm_bind: Add support to create persistent vma

2022-10-02 Thread Niranjana Vishwanathapura
Add i915_vma_instance_persistent() to create persistent vmas.
Persistent vmas will use i915_gtt_view to support partial binding.

vma_lookup is tied to segment of the object instead of section
of VA space. Hence, it do not support aliasing. ie., multiple
mappings (at different VA) point to the same gtt_view of object.
Skip vma_lookup for persistent vmas to support aliasing.

v2: Remove unused I915_VMA_PERSISTENT definition,
update validity check in i915_vma_compare(),
remove unwanted is_persistent check in release_references().

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/i915_vma.c   | 36 +--
 drivers/gpu/drm/i915/i915_vma.h   | 17 -
 drivers/gpu/drm/i915/i915_vma_types.h |  6 +
 3 files changed, 56 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index f17c09ead7d7..f9da8010021f 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -109,7 +109,8 @@ static void __i915_vma_retire(struct i915_active *ref)
 static struct i915_vma *
 vma_create(struct drm_i915_gem_object *obj,
   struct i915_address_space *vm,
-  const struct i915_gtt_view *view)
+  const struct i915_gtt_view *view,
+  bool skip_lookup_cache)
 {
struct i915_vma *pos = ERR_PTR(-E2BIG);
struct i915_vma *vma;
@@ -196,6 +197,9 @@ vma_create(struct drm_i915_gem_object *obj,
__set_bit(I915_VMA_GGTT_BIT, __i915_vma_flags(vma));
}
 
+   if (skip_lookup_cache)
+   goto skip_rb_insert;
+
rb = NULL;
p = &obj->vma.tree.rb_node;
while (*p) {
@@ -220,6 +224,7 @@ vma_create(struct drm_i915_gem_object *obj,
rb_link_node(&vma->obj_node, rb, p);
rb_insert_color(&vma->obj_node, &obj->vma.tree);
 
+skip_rb_insert:
if (i915_vma_is_ggtt(vma))
/*
 * We put the GGTT vma at the start of the vma-list, followed
@@ -299,7 +304,34 @@ i915_vma_instance(struct drm_i915_gem_object *obj,
 
/* vma_create() will resolve the race if another creates the vma */
if (unlikely(!vma))
-   vma = vma_create(obj, vm, view);
+   vma = vma_create(obj, vm, view, false);
+
+   GEM_BUG_ON(!IS_ERR(vma) && i915_vma_compare(vma, vm, view));
+   return vma;
+}
+
+/**
+ * i915_vma_create_persistent - create a persistent VMA
+ * @obj: parent &struct drm_i915_gem_object to be mapped
+ * @vm: address space in which the mapping is located
+ * @view: additional mapping requirements
+ *
+ * Creates a persistent vma.
+ *
+ * Returns the vma, or an error pointer.
+ */
+struct i915_vma *
+i915_vma_create_persistent(struct drm_i915_gem_object *obj,
+  struct i915_address_space *vm,
+  const struct i915_gtt_view *view)
+{
+   struct i915_vma *vma;
+
+   GEM_BUG_ON(!kref_read(&vm->ref));
+
+   vma = vma_create(obj, vm, view, true);
+   if (!IS_ERR(vma))
+   i915_vma_set_persistent(vma);
 
GEM_BUG_ON(!IS_ERR(vma) && i915_vma_compare(vma, vm, view));
return vma;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index aecd9c64486b..c5378ec2f70a 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -44,6 +44,10 @@ struct i915_vma *
 i915_vma_instance(struct drm_i915_gem_object *obj,
  struct i915_address_space *vm,
  const struct i915_gtt_view *view);
+struct i915_vma *
+i915_vma_create_persistent(struct drm_i915_gem_object *obj,
+  struct i915_address_space *vm,
+  const struct i915_gtt_view *view);
 
 void i915_vma_unpin_and_release(struct i915_vma **p_vma, unsigned int flags);
 #define I915_VMA_RELEASE_MAP BIT(0)
@@ -138,6 +142,16 @@ static inline u32 i915_ggtt_pin_bias(struct i915_vma *vma)
return i915_vm_to_ggtt(vma->vm)->pin_bias;
 }
 
+static inline bool i915_vma_is_persistent(const struct i915_vma *vma)
+{
+   return test_bit(I915_VMA_PERSISTENT_BIT, __i915_vma_flags(vma));
+}
+
+static inline void i915_vma_set_persistent(struct i915_vma *vma)
+{
+   set_bit(I915_VMA_PERSISTENT_BIT, __i915_vma_flags(vma));
+}
+
 static inline struct i915_vma *i915_vma_get(struct i915_vma *vma)
 {
i915_gem_object_get(vma->obj);
@@ -164,7 +178,8 @@ i915_vma_compare(struct i915_vma *vma,
 {
ptrdiff_t cmp;
 
-   GEM_BUG_ON(view && !i915_is_ggtt_or_dpt(vm));
+   GEM_BUG_ON(view && !(i915_is_ggtt_or_dpt(vm) ||
+i915_vma_is_persistent(vma)));
 
cmp = ptrdiff(vma->vm, vm);
if (cmp)
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h 
b/drivers/gpu/drm/i915/i915_vma_types.h
index ec0f6c9f57d0..3144d71

[PATCH v2 16/17] drm/i915/vm_bind: userptr dma-resv changes

2022-10-02 Thread Niranjana Vishwanathapura
For persistent (vm_bind) vmas of userptr BOs, handle the user
page pinning by using the i915_gem_object_userptr_submit_init()
/done() functions

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 80 +++
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   | 17 
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 15 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 +
 drivers/gpu/drm/i915/i915_vma_types.h |  2 +
 6 files changed, 120 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index 5cece16949d8..e5f8eacc0de6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -19,6 +19,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
 
+#define __EXEC3_USERPTR_USED   BIT_ULL(34)
 #define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
@@ -141,6 +142,21 @@ static void eb_scoop_unbound_vma_all(struct 
i915_address_space *vm)
 {
struct i915_vma *vma, *vn;
 
+#ifdef CONFIG_MMU_NOTIFIER
+   /**
+* Move all invalidated userptr vmas back into vm_bind_list so that
+* they are looked up and revalidated.
+*/
+   spin_lock(&vm->userptr_invalidated_lock);
+   list_for_each_entry_safe(vma, vn, &vm->userptr_invalidated_list,
+userptr_invalidated_link) {
+   list_del_init(&vma->userptr_invalidated_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->userptr_invalidated_lock);
+#endif
+
/**
 * Move all unbound vmas back into vm_bind_list so that they are
 * revalidated.
@@ -154,10 +170,47 @@ static void eb_scoop_unbound_vma_all(struct 
i915_address_space *vm)
spin_unlock(&vm->vm_rebind_lock);
 }
 
+static int eb_lookup_persistent_userptr_vmas(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *last_vma = NULL;
+   struct i915_vma *vma;
+   int err;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) {
+   if (!i915_gem_object_is_userptr(vma->obj))
+   continue;
+
+   err = i915_gem_object_userptr_submit_init(vma->obj);
+   if (err)
+   return err;
+
+   /**
+* The above submit_init() call does the object unbind and
+* hence adds vma into vm_rebind_list. Remove it from that
+* list as it is already scooped for revalidation.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   if (!list_empty(&vma->vm_rebind_link))
+   list_del_init(&vma->vm_rebind_link);
+   spin_unlock(&vm->vm_rebind_lock);
+
+   last_vma = vma;
+   }
+
+   if (last_vma)
+   eb->args->flags |= __EXEC3_USERPTR_USED;
+
+   return 0;
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
struct i915_vma *vma;
+   int err = 0;
 
for (i = 0; i < eb->num_batches; i++) {
vma = eb_find_vma(eb->context->vm, eb->batch_addresses[i]);
@@ -170,6 +223,10 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 
eb_scoop_unbound_vma_all(eb->context->vm);
 
+   err = eb_lookup_persistent_userptr_vmas(eb);
+   if (err)
+   return err;
+
return 0;
 }
 
@@ -349,6 +406,29 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
}
}
 
+#ifdef CONFIG_MMU_NOTIFIER
+   /* Check for further userptr invalidations */
+   spin_lock(&vm->userptr_invalidated_lock);
+   if (!list_empty(&vm->userptr_invalidated_list))
+   err = -EAGAIN;
+   spin_unlock(&vm->userptr_invalidated_lock);
+
+   if (!err && (eb->args->flags & __EXEC3_USERPTR_USED)) {
+   read_lock(&eb->i915->mm.notifier_lock);
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) {
+   if (!i915_gem_object_is_userptr(vma->obj))
+   continue;
+
+   err = i915_gem_object_userptr_submit_done(vma->obj);
+   if (err)
+   break;
+   }
+   read_unlock

[PATCH v2 10/17] drm/i915/vm_bind: Abstract out common execbuf functions

2022-10-02 Thread Niranjana Vishwanathapura
The new execbuf3 ioctl path and the legacy execbuf ioctl
paths have many common functionalities.
Abstract out the common execbuf functionalities into a
separate file where possible, thus allowing code sharing.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../drm/i915/gem/i915_gem_execbuffer_common.c | 666 ++
 .../drm/i915/gem/i915_gem_execbuffer_common.h |  74 ++
 3 files changed, 741 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 9bf939ef18ea..bf952f478555 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -148,6 +148,7 @@ gem-y += \
gem/i915_gem_create.o \
gem/i915_gem_dmabuf.o \
gem/i915_gem_domain.o \
+   gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
new file mode 100644
index ..4d1c9ce154b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
@@ -0,0 +1,666 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gt.h"
+#include "gt/intel_gt_pm.h"
+#include "gt/intel_ring.h"
+
+#include "i915_gem_execbuffer_common.h"
+
+#define __EXEC_COMMON_FENCE_WAIT   BIT(0)
+#define __EXEC_COMMON_FENCE_SIGNAL BIT(1)
+
+static struct i915_request *eb_throttle(struct intel_context *ce)
+{
+   struct intel_ring *ring = ce->ring;
+   struct intel_timeline *tl = ce->timeline;
+   struct i915_request *rq;
+
+   /*
+* Completely unscientific finger-in-the-air estimates for suitable
+* maximum user request size (to avoid blocking) and then backoff.
+*/
+   if (intel_ring_update_space(ring) >= PAGE_SIZE)
+   return NULL;
+
+   /*
+* Find a request that after waiting upon, there will be at least half
+* the ring available. The hysteresis allows us to compete for the
+* shared ring and should mean that we sleep less often prior to
+* claiming our resources, but not so long that the ring completely
+* drains before we can submit our next request.
+*/
+   list_for_each_entry(rq, &tl->requests, link) {
+   if (rq->ring != ring)
+   continue;
+
+   if (__intel_ring_space(rq->postfix,
+  ring->emit, ring->size) > ring->size / 2)
+   break;
+   }
+   if (&rq->link == &tl->requests)
+   return NULL; /* weird, we will check again later for real */
+
+   return i915_request_get(rq);
+}
+
+static int eb_pin_timeline(struct intel_context *ce, bool throttle,
+  bool nonblock)
+{
+   struct intel_timeline *tl;
+   struct i915_request *rq = NULL;
+
+   /*
+* Take a local wakeref for preparing to dispatch the execbuf as
+* we expect to access the hardware fairly frequently in the
+* process, and require the engine to be kept awake between accesses.
+* Upon dispatch, we acquire another prolonged wakeref that we hold
+* until the timeline is idle, which in turn releases the wakeref
+* taken on the engine, and the parent device.
+*/
+   tl = intel_context_timeline_lock(ce);
+   if (IS_ERR(tl))
+   return PTR_ERR(tl);
+
+   intel_context_enter(ce);
+   if (throttle)
+   rq = eb_throttle(ce);
+   intel_context_timeline_unlock(tl);
+
+   if (rq) {
+   long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT;
+
+   if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE,
+ timeout) < 0) {
+   i915_request_put(rq);
+
+   /*
+* Error path, cannot use intel_context_timeline_lock as
+* that is user interruptable and this clean up step
+* must be done.
+*/
+   mutex_lock(&ce->timeline->mutex);
+   intel_context_exit(ce);
+   mutex_unlock(&ce->timeline->mutex);
+
+   if (nonblock)
+   return -EWOULDBLOCK;
+   else
+   return -EINTR;
+   }
+   i915_request_put(rq);
+   }
+
+   return 0;
+}
+
+/

[PATCH v2 09/17] drm/i915/vm_bind: Add out fence support

2022-10-02 Thread Niranjana Vishwanathapura
Add support for handling out fence for vm_bind call.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  4 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 81 +++
 drivers/gpu/drm/i915/i915_vma.c   |  7 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  7 ++
 include/uapi/drm/i915_drm.h   | 63 ++-
 5 files changed, 157 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
index 36262a6357b5..b70e900e35ab 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -8,6 +8,7 @@
 
 #include 
 
+struct dma_fence;
 struct drm_device;
 struct drm_file;
 struct i915_address_space;
@@ -23,4 +24,7 @@ int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void 
*data,
 
 void i915_gem_vm_unbind_all(struct i915_address_space *vm);
 
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence);
+
 #endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index b51f8088d914..2df7d529276f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -7,6 +7,8 @@
 
 #include 
 
+#include 
+
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_vm_bind.h"
 
@@ -100,6 +102,75 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, 
bool release_obj)
i915_gem_object_put(vma->obj);
 }
 
+static int i915_vm_bind_add_fence(struct drm_file *file, struct i915_vma *vma,
+ u32 handle, u64 point)
+{
+   struct drm_syncobj *syncobj;
+
+   syncobj = drm_syncobj_find(file, handle);
+   if (!syncobj) {
+   DRM_DEBUG("Invalid syncobj handle provided\n");
+   return -ENOENT;
+   }
+
+   /*
+* For timeline syncobjs we need to preallocate chains for
+* later signaling.
+*/
+   if (point) {
+   vma->vm_bind_fence.chain_fence = dma_fence_chain_alloc();
+   if (!vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_put(syncobj);
+   return -ENOMEM;
+   }
+   } else {
+   vma->vm_bind_fence.chain_fence = NULL;
+   }
+   vma->vm_bind_fence.syncobj = syncobj;
+   vma->vm_bind_fence.value = point;
+
+   return 0;
+}
+
+static void i915_vm_bind_put_fence(struct i915_vma *vma)
+{
+   if (!vma->vm_bind_fence.syncobj)
+   return;
+
+   drm_syncobj_put(vma->vm_bind_fence.syncobj);
+   dma_fence_chain_free(vma->vm_bind_fence.chain_fence);
+}
+
+/**
+ * i915_vm_bind_signal_fence() - Add fence to vm_bind syncobj
+ * @vma: vma mapping requiring signaling
+ * @fence: fence to be added
+ *
+ * Associate specified @fence with the @vma's syncobj to be
+ * signaled after the @fence work completes.
+ */
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence)
+{
+   struct drm_syncobj *syncobj = vma->vm_bind_fence.syncobj;
+
+   if (!syncobj)
+   return;
+
+   if (vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_add_point(syncobj,
+ vma->vm_bind_fence.chain_fence,
+ fence, vma->vm_bind_fence.value);
+   /*
+* The chain's ownership is transferred to the
+* timeline.
+*/
+   vma->vm_bind_fence.chain_fence = NULL;
+   } else {
+   drm_syncobj_replace_fence(syncobj, fence);
+   }
+}
+
 static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
  struct drm_i915_gem_vm_unbind *va)
 {
@@ -237,6 +308,13 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
goto unlock_vm;
}
 
+   if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL) {
+   ret = i915_vm_bind_add_fence(file, vma, va->fence.handle,
+va->fence.value);
+   if (ret)
+   goto put_vma;
+   }
+
pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER;
 
for_i915_gem_ww(&ww, ret, true) {
@@ -265,6 +343,9 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
i915_gem_object_get(vma->obj);
}
 
+   if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL)
+   i915_vm_bind_put_fence(vma);
+put_vma:
if (ret)
i915_vma_destroy(vma);
 unlock_vm:
diff --git a/drivers/gpu/drm/i915/i915_vma.c

[PATCH v2 07/17] drm/i915/vm_bind: Add support to handle object evictions

2022-10-02 Thread Niranjana Vishwanathapura
Support eviction by maintaining a list of evicted persistent vmas
for rebinding during next submission. Ensure the list do not
include persistent vmas that are being purged.

v2: Remove unused I915_VMA_PURGED definition.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../drm/i915/gem/i915_gem_vm_bind_object.c|  6 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 
 drivers/gpu/drm/i915/i915_vma.c   | 19 +++
 drivers/gpu/drm/i915/i915_vma.h   | 10 ++
 drivers/gpu/drm/i915/i915_vma_types.h |  8 
 6 files changed, 49 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 1ade9a5847fa..b51f8088d914 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -85,6 +85,12 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, 
bool release_obj)
 {
lockdep_assert_held(&vma->vm->vm_bind_lock);
 
+   spin_lock(&vma->vm->vm_rebind_lock);
+   if (!list_empty(&vma->vm_rebind_link))
+   list_del_init(&vma->vm_rebind_link);
+   i915_vma_set_purged(vma);
+   spin_unlock(&vma->vm->vm_rebind_lock);
+
list_del_init(&vma->vm_bind_link);
list_del_init(&vma->non_priv_vm_bind_link);
i915_vm_bind_it_remove(vma, &vma->vm->va);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 65276da137b8..fe5902ea4065 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -296,6 +296,8 @@ void i915_address_space_init(struct i915_address_space *vm, 
int subclass)
INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
vm->root_obj = i915_gem_object_create_internal(vm->i915, PAGE_SIZE);
GEM_BUG_ON(IS_ERR(vm->root_obj));
+   INIT_LIST_HEAD(&vm->vm_rebind_list);
+   spin_lock_init(&vm->vm_rebind_lock);
 }
 
 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 4ae5734f7d6b..443d1918ad4e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -265,6 +265,10 @@ struct i915_address_space {
struct list_head vm_bind_list;
/** @vm_bound_list: List of vm_binding completed */
struct list_head vm_bound_list;
+   /* @vm_rebind_list: list of vmas to be rebinded */
+   struct list_head vm_rebind_list;
+   /* @vm_rebind_lock: protects vm_rebound_list */
+   spinlock_t vm_rebind_lock;
/* @va: tree of persistent vmas */
struct rb_root_cached va;
struct list_head non_priv_vm_bind_list;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 5d3d67a4bf47..4c0e933a7ea5 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -241,6 +241,7 @@ vma_create(struct drm_i915_gem_object *obj,
 
INIT_LIST_HEAD(&vma->vm_bind_link);
INIT_LIST_HEAD(&vma->non_priv_vm_bind_link);
+   INIT_LIST_HEAD(&vma->vm_rebind_link);
return vma;
 
 err_unlock:
@@ -1686,6 +1687,14 @@ static void force_unbind(struct i915_vma *vma)
if (!drm_mm_node_allocated(&vma->node))
return;
 
+   /*
+* Persistent vma should have been purged by now.
+* If not, issue a warning and purge it.
+*/
+   if (GEM_WARN_ON(i915_vma_is_persistent(vma) &&
+   !i915_vma_is_purged(vma)))
+   i915_vma_set_purged(vma);
+
atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
WARN_ON(__i915_vma_unbind(vma));
GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
@@ -2047,6 +2056,16 @@ int __i915_vma_unbind(struct i915_vma *vma)
__i915_vma_evict(vma, false);
 
drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
+
+   if (i915_vma_is_persistent(vma)) {
+   spin_lock(&vma->vm->vm_rebind_lock);
+   if (list_empty(&vma->vm_rebind_link) &&
+   !i915_vma_is_purged(vma))
+   list_add_tail(&vma->vm_rebind_link,
+ &vma->vm->vm_rebind_list);
+   spin_unlock(&vma->vm->vm_rebind_lock);
+   }
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index c5378ec2f70a..9a4a7a8dfe5b 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -152,6 +152,16 @@ static inline void i915_vma_set_persistent(struct i915_vma 
*vma)
set_bit(I915_VMA_P

[PATCH v2 08/17] drm/i915/vm_bind: Support persistent vma activeness tracking

2022-10-02 Thread Niranjana Vishwanathapura
Do not use i915_vma activeness tracking for persistent vmas.

As persistent vmas are part of working set for each execbuf
submission on that address space (VM), a persistent vma is
active if the VM active. As vm->root_obj->base.resv will be
updated for each submission on that VM, it correctly
represent whether the VM is active or not.

Add i915_vm_is_active() and i915_vm_sync() functions based
on vm->root_obj->base.resv with DMA_RESV_USAGE_BOOKKEEP
usage. dma-resv fence list will be updated with this usage
during each submission with this VM in the new execbuf3
ioctl path.

Update i915_vma_is_active(), i915_vma_sync() and the
__i915_vma_unbind_async() functions to properly handle
persistent vmas.

Reviewed-by: Andi Shyti 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 39 +
 drivers/gpu/drm/i915/i915_gem_gtt.h |  3 +++
 drivers/gpu/drm/i915/i915_vma.c | 28 +
 drivers/gpu/drm/i915/i915_vma.h | 25 +-
 4 files changed, 83 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 329ff75b80b9..b7d0844de561 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -25,6 +25,45 @@
 #include "i915_trace.h"
 #include "i915_vgpu.h"
 
+/**
+ * i915_vm_sync() - Wait until address space is not in use
+ * @vm: address space
+ *
+ * Waits until all requests using the address space are complete.
+ *
+ * Returns: 0 if success, -ve err code upon failure
+ */
+int i915_vm_sync(struct i915_address_space *vm)
+{
+   int ret;
+
+   /* Wait for all requests under this vm to finish */
+   ret = dma_resv_wait_timeout(vm->root_obj->base.resv,
+   DMA_RESV_USAGE_BOOKKEEP, false,
+   MAX_SCHEDULE_TIMEOUT);
+   if (ret < 0)
+   return ret;
+   else if (ret > 0)
+   return 0;
+   else
+   return -ETIMEDOUT;
+}
+
+/**
+ * i915_vm_is_active() - Check if address space is being used
+ * @vm: address space
+ *
+ * Check if any request using the specified address space is
+ * active.
+ *
+ * Returns: true if address space is active, false otherwise.
+ */
+bool i915_vm_is_active(const struct i915_address_space *vm)
+{
+   return !dma_resv_test_signaled(vm->root_obj->base.resv,
+  DMA_RESV_USAGE_BOOKKEEP);
+}
+
 int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
   struct sg_table *pages)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 8c2f57eb5dda..a5bbdc59d9df 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -51,4 +51,7 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 
 #define PIN_OFFSET_MASKI915_GTT_PAGE_MASK
 
+int i915_vm_sync(struct i915_address_space *vm);
+bool i915_vm_is_active(const struct i915_address_space *vm);
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 4c0e933a7ea5..abf9ca4267cf 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -420,6 +420,24 @@ int i915_vma_wait_for_bind(struct i915_vma *vma)
return err;
 }
 
+/**
+ * i915_vma_sync() - Wait for the vma to be idle
+ * @vma: vma to be tested
+ *
+ * Returns 0 on success and error code on failure
+ */
+int i915_vma_sync(struct i915_vma *vma)
+{
+   int ret;
+
+   /* Wait for the asynchronous bindings and pending GPU reads */
+   ret = i915_active_wait(&vma->active);
+   if (ret || !i915_vma_is_persistent(vma) || i915_vma_is_purged(vma))
+   return ret;
+
+   return i915_vm_sync(vma->vm);
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
 static int i915_vma_verify_bind_complete(struct i915_vma *vma)
 {
@@ -1887,6 +1905,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
int err;
 
assert_object_held(obj);
+   if (i915_vma_is_persistent(vma))
+   return -EINVAL;
 
GEM_BUG_ON(!vma->pages);
 
@@ -2097,6 +2117,14 @@ static struct dma_fence *__i915_vma_unbind_async(struct 
i915_vma *vma)
return ERR_PTR(-EBUSY);
}
 
+   if (i915_vma_is_persistent(vma) &&
+   __i915_sw_fence_await_reservation(&vma->resource->chain,
+ vma->vm->root_obj->base.resv,
+ DMA_RESV_USAGE_BOOKKEEP,
+ i915_fence_timeout(vma->vm->i915),
+ GFP_NOWAIT | __GFP_NOWARN) < 0)
+   return ERR_PTR(-EBUSY);
+
fence = __i915_vma_evict(vma, true);
 
drm_mm_remove_node(&vma->node); /* pairs

[PATCH v2 15/17] drm/i915/vm_bind: Handle persistent vmas in execbuf3

2022-10-02 Thread Niranjana Vishwanathapura
Handle persistent (VM_BIND) mappings during the request submission
in the execbuf3 path.

v2: Ensure requests wait for bindings to complete.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 214 +-
 1 file changed, 213 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index 66b723842e45..5cece16949d8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -19,6 +19,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
 
+#define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
 
@@ -42,7 +43,9 @@
  * execlist. Hence, no support for implicit sync.
  *
  * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
- * works with execbuf3 ioctl for submission.
+ * works with execbuf3 ioctl for submission. All BOs mapped on that VM (through
+ * VM_BIND call) at the time of execbuf3 call are deemed required for that
+ * submission.
  *
  * The execbuf3 ioctl directly specifies the batch addresses instead of as
  * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
@@ -58,6 +61,13 @@
  * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
  * vma lookup table, implicit sync, vma active reference tracking etc., are not
  * applicable for execbuf3 ioctl.
+ *
+ * During each execbuf submission, request fence is added to all VM_BIND mapped
+ * objects with DMA_RESV_USAGE_BOOKKEEP. The DMA_RESV_USAGE_BOOKKEEP usage will
+ * prevent over sync (See enum dma_resv_usage). Note that DRM_I915_GEM_WAIT and
+ * DRM_I915_GEM_BUSY ioctls do not check for DMA_RESV_USAGE_BOOKKEEP usage and
+ * hence should not be used for end of batch check. Instead, the execbuf3
+ * timeline out fence should be used for end of batch check.
  */
 
 /**
@@ -127,6 +137,23 @@ eb_find_vma(struct i915_address_space *vm, u64 addr)
return i915_gem_vm_bind_lookup_vma(vm, va);
 }
 
+static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
+{
+   struct i915_vma *vma, *vn;
+
+   /**
+* Move all unbound vmas back into vm_bind_list so that they are
+* revalidated.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   list_for_each_entry_safe(vma, vn, &vm->vm_rebind_list, vm_rebind_link) {
+   list_del_init(&vma->vm_rebind_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->vm_rebind_lock);
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
@@ -141,14 +168,121 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
++current_batch;
}
 
+   eb_scoop_unbound_vma_all(eb->context->vm);
+
return 0;
 }
 
+static int eb_lock_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma;
+   int err;
+
+   err = i915_gem_object_lock(eb->context->vm->root_obj, &eb->ww);
+   if (err)
+   return err;
+
+   list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+   non_priv_vm_bind_link) {
+   err = i915_gem_object_lock(vma->obj, &eb->ww);
+   if (err)
+   return err;
+   }
+
+   return 0;
+}
+
+static void eb_release_persistent_vma_all(struct i915_execbuffer *eb,
+ bool final)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma, *vn;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   if (!(eb->args->flags & __EXEC3_HAS_PIN))
+   return;
+
+   assert_object_held(vm->root_obj);
+
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link)
+   __i915_vma_unpin(vma);
+
+   eb->args->flags &= ~__EXEC3_HAS_PIN;
+   if (!final)
+   return;
+
+   list_for_each_entry_safe(vma, vn, &vm->vm_bind_list, vm_bind_link)
+   if (i915_vma_verify_bind_complete(vma))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list);
+}
+
 static void eb_release_vma_all(struct i915_execbuffer *eb, bool final)
 {
+   eb_release_persistent_vma_all(eb, final);
eb_unpin_engine(eb);
 }
 
+static int eb_reserve_fence_for_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma;
+ 

[PATCH v2 14/17] drm/i915/vm_bind: Expose i915_request_await_bind()

2022-10-02 Thread Niranjana Vishwanathapura
Rename __i915_request_await_bind() as i915_request_await_bind()
and make it non-static as it will be used in execbuf3 ioctl path.

Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_vma.c | 8 +---
 drivers/gpu/drm/i915/i915_vma.h | 6 ++
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 492a00d09cf3..9e8f30022721 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1889,18 +1889,12 @@ void i915_vma_revoke_mmap(struct i915_vma *vma)
list_del(&vma->obj->userfault_link);
 }
 
-static int
-__i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma)
-{
-   return __i915_request_await_exclusive(rq, &vma->active);
-}
-
 static int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request 
*rq)
 {
int err;
 
/* Wait for the vma to be bound before we start! */
-   err = __i915_request_await_bind(rq, vma);
+   err = i915_request_await_bind(rq, vma);
if (err)
return err;
 
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 04770f8ba815..19e57e12b956 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -54,6 +54,12 @@ void i915_vma_unpin_and_release(struct i915_vma **p_vma, 
unsigned int flags);
 /* do not reserve memory to prevent deadlocks */
 #define __EXEC_OBJECT_NO_RESERVE BIT(31)
 
+static inline int
+i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma)
+{
+   return __i915_request_await_exclusive(rq, &vma->active);
+}
+
 int __must_check _i915_vma_move_to_active(struct i915_vma *vma,
  struct i915_request *rq,
  struct dma_fence *fence,
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v2 06/17] drm/i915/vm_bind: Support for VM private BOs

2022-10-02 Thread Niranjana Vishwanathapura
Each VM creates a root_obj and shares it with all of its private objects
to use it as dma_resv object. This has a performance advantage as it
requires a single dma_resv object update for all private BOs vs list of
dma_resv objects update for shared BOs, in the execbuf path.

VM private BOs can be only mapped on specified VM and cannot be dmabuf
exported. Also, they are supported only in vm_bind mode.

v2: Pad struct drm_i915_gem_create_ext_vm_private for 64bit alignment,
add input validity checks.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c| 49 ++-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|  6 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  4 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  3 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 ++
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   |  3 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c|  9 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  4 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  2 +
 drivers/gpu/drm/i915/i915_vma.c   |  1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  2 +
 include/uapi/drm/i915_drm.h   | 33 +
 12 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 5c6e396ab74d..694d4638ac8b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -11,6 +11,7 @@
 #include "pxp/intel_pxp.h"
 
 #include "i915_drv.h"
+#include "i915_gem_context.h"
 #include "i915_gem_create.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
@@ -251,6 +252,7 @@ struct create_ext {
unsigned int n_placements;
unsigned int placement_mask;
unsigned long flags;
+   u32 vm_id;
 };
 
 static void repr_placements(char *buf, size_t size,
@@ -400,9 +402,32 @@ static int ext_set_protected(struct i915_user_extension 
__user *base, void *data
return 0;
 }
 
+static int ext_set_vm_private(struct i915_user_extension __user *base,
+ void *data)
+{
+   struct drm_i915_gem_create_ext_vm_private ext;
+   struct create_ext *ext_data = data;
+
+   if (copy_from_user(&ext, base, sizeof(ext)))
+   return -EFAULT;
+
+   /* Reserved fields must be 0 */
+   if (ext.rsvd)
+   return -EINVAL;
+
+   /* vm_id 0 is reserved */
+   if (!ext.vm_id)
+   return -ENOENT;
+
+   ext_data->vm_id = ext.vm_id;
+
+   return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
+   [I915_GEM_CREATE_EXT_VM_PRIVATE] = ext_set_vm_private,
 };
 
 /**
@@ -418,6 +443,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
struct drm_i915_private *i915 = to_i915(dev);
struct drm_i915_gem_create_ext *args = data;
struct create_ext ext_data = { .i915 = i915 };
+   struct i915_address_space *vm = NULL;
struct drm_i915_gem_object *obj;
int ret;
 
@@ -431,6 +457,12 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
if (ret)
return ret;
 
+   if (ext_data.vm_id) {
+   vm = i915_gem_vm_lookup(file->driver_priv, ext_data.vm_id);
+   if (unlikely(!vm))
+   return -ENOENT;
+   }
+
if (!ext_data.n_placements) {
ext_data.placements[0] =
intel_memory_region_by_type(i915, INTEL_MEMORY_SYSTEM);
@@ -457,8 +489,21 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
ext_data.placements,
ext_data.n_placements,
ext_data.flags);
-   if (IS_ERR(obj))
-   return PTR_ERR(obj);
+   if (IS_ERR(obj)) {
+   ret = PTR_ERR(obj);
+   goto vm_put;
+   }
+
+   if (vm) {
+   obj->base.resv = vm->root_obj->base.resv;
+   obj->priv_root = i915_gem_object_get(vm->root_obj);
+   i915_vm_put(vm);
+   }
 
return i915_gem_publish(obj, file, &args->size, &args->handle);
+vm_put:
+   if (vm)
+   i915_vm_put(vm);
+
+   return ret;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index f5062d0c6333..6433173c3e84 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -218,6 +218,12 @@ struct dma_buf *i915_gem_prime_export(struct

[PATCH v2 11/17] drm/i915/vm_bind: Use common execbuf functions in execbuf path

2022-10-02 Thread Niranjana Vishwanathapura
Update the execbuf path to use common execbuf functions to
reduce code duplication with the newer execbuf3 path.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 507 ++
 1 file changed, 38 insertions(+), 469 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4673e0812277..eb8491bd32c1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -28,6 +28,7 @@
 #include "i915_file_private.h"
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
 #include "i915_gem_evict.h"
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
@@ -235,13 +236,6 @@ enum {
  * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
  */
 
-struct eb_fence {
-   struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
-   struct dma_fence *dma_fence;
-   u64 value;
-   struct dma_fence_chain *chain_fence;
-};
-
 struct i915_execbuffer {
struct drm_i915_private *i915; /** i915 backpointer */
struct drm_file *file; /** per-file lookup tables and limits */
@@ -2446,164 +2440,29 @@ static const enum intel_engine_id user_ring_map[] = {
[I915_EXEC_VEBOX]   = VECS0
 };
 
-static struct i915_request *eb_throttle(struct i915_execbuffer *eb, struct 
intel_context *ce)
-{
-   struct intel_ring *ring = ce->ring;
-   struct intel_timeline *tl = ce->timeline;
-   struct i915_request *rq;
-
-   /*
-* Completely unscientific finger-in-the-air estimates for suitable
-* maximum user request size (to avoid blocking) and then backoff.
-*/
-   if (intel_ring_update_space(ring) >= PAGE_SIZE)
-   return NULL;
-
-   /*
-* Find a request that after waiting upon, there will be at least half
-* the ring available. The hysteresis allows us to compete for the
-* shared ring and should mean that we sleep less often prior to
-* claiming our resources, but not so long that the ring completely
-* drains before we can submit our next request.
-*/
-   list_for_each_entry(rq, &tl->requests, link) {
-   if (rq->ring != ring)
-   continue;
-
-   if (__intel_ring_space(rq->postfix,
-  ring->emit, ring->size) > ring->size / 2)
-   break;
-   }
-   if (&rq->link == &tl->requests)
-   return NULL; /* weird, we will check again later for real */
-
-   return i915_request_get(rq);
-}
-
-static int eb_pin_timeline(struct i915_execbuffer *eb, struct intel_context 
*ce,
-  bool throttle)
-{
-   struct intel_timeline *tl;
-   struct i915_request *rq = NULL;
-
-   /*
-* Take a local wakeref for preparing to dispatch the execbuf as
-* we expect to access the hardware fairly frequently in the
-* process, and require the engine to be kept awake between accesses.
-* Upon dispatch, we acquire another prolonged wakeref that we hold
-* until the timeline is idle, which in turn releases the wakeref
-* taken on the engine, and the parent device.
-*/
-   tl = intel_context_timeline_lock(ce);
-   if (IS_ERR(tl))
-   return PTR_ERR(tl);
-
-   intel_context_enter(ce);
-   if (throttle)
-   rq = eb_throttle(eb, ce);
-   intel_context_timeline_unlock(tl);
-
-   if (rq) {
-   bool nonblock = eb->file->filp->f_flags & O_NONBLOCK;
-   long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT;
-
-   if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE,
- timeout) < 0) {
-   i915_request_put(rq);
-
-   /*
-* Error path, cannot use intel_context_timeline_lock as
-* that is user interruptable and this clean up step
-* must be done.
-*/
-   mutex_lock(&ce->timeline->mutex);
-   intel_context_exit(ce);
-   mutex_unlock(&ce->timeline->mutex);
-
-   if (nonblock)
-   return -EWOULDBLOCK;
-   else
-   return -EINTR;
-   }
-   i915_request_put(rq);
-   }
-
-   return 0;
-}
-
 static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle)
 {
-   struct intel_context *ce = eb->context, *child;
int err;
-   int i = 0, j = 0;
 
GEM_BUG_ON(eb->args-&g

[PATCH v2 13/17] drm/i915/vm_bind: Update i915_vma_verify_bind_complete()

2022-10-02 Thread Niranjana Vishwanathapura
Ensure i915_vma_verify_bind_complete() handles case where bind
is not initiated. Also make it non static, add documentation
and move it out of CONFIG_DRM_I915_DEBUG_GEM.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/i915_vma.c | 16 +++-
 drivers/gpu/drm/i915/i915_vma.h |  1 +
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index e62d769d8a55..492a00d09cf3 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -439,12 +439,21 @@ int i915_vma_sync(struct i915_vma *vma)
return i915_vm_sync(vma->vm);
 }
 
-#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
-static int i915_vma_verify_bind_complete(struct i915_vma *vma)
+/**
+ * i915_vma_verify_bind_complete() - Check for the bind completion of the vma
+ * @vma: vma to check for bind completion
+ *
+ * Returns: 0 if the vma bind is completed. Error code otherwise.
+ */
+int i915_vma_verify_bind_complete(struct i915_vma *vma)
 {
struct dma_fence *fence = i915_active_fence_get(&vma->active.excl);
int err;
 
+   /* Ensure vma bind is initiated */
+   if (!i915_vma_is_bound(vma, I915_VMA_BIND_MASK))
+   return -EINVAL;
+
if (!fence)
return 0;
 
@@ -457,9 +466,6 @@ static int i915_vma_verify_bind_complete(struct i915_vma 
*vma)
 
return err;
 }
-#else
-#define i915_vma_verify_bind_complete(_vma) 0
-#endif
 
 I915_SELFTEST_EXPORT void
 i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 1cadbf8fdedf..04770f8ba815 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -440,6 +440,7 @@ void i915_vma_make_purgeable(struct i915_vma *vma);
 
 int i915_vma_wait_for_bind(struct i915_vma *vma);
 int i915_vma_sync(struct i915_vma *vma);
+int i915_vma_verify_bind_complete(struct i915_vma *vma);
 
 /**
  * i915_vma_get_current_resource - Get the current resource of the vma
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v2 12/17] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl

2022-10-02 Thread Niranjana Vishwanathapura
Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate
bind as all required objects binding would have been requested by the
userspace before submitting the execbuf3.

Legacy features like relocations etc are not supported by execbuf3.

v2: Add more input validity checks.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 575 ++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
 drivers/gpu/drm/i915/i915_driver.c|   1 +
 include/uapi/drm/i915_drm.h   |  61 ++
 5 files changed, 640 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index bf952f478555..3473ee5825bb 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -150,6 +150,7 @@ gem-y += \
gem/i915_gem_domain.o \
gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
+   gem/i915_gem_execbuffer3.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
gem/i915_gem_lmem.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
new file mode 100644
index ..66b723842e45
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -0,0 +1,575 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+
+#include "i915_drv.h"
+#include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
+#include "i915_gem_ioctls.h"
+#include "i915_gem_vm_bind.h"
+#include "i915_trace.h"
+
+#define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
+#define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
+
+/* Catch emission of unexpected errors for CI! */
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+#undef EINVAL
+#define EINVAL ({ \
+   DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
+   22; \
+})
+#endif
+
+/**
+ * DOC: User command execution with execbuf3 ioctl
+ *
+ * A VM in VM_BIND mode will not support older execbuf mode of binding.
+ * The execbuf ioctl handling in VM_BIND mode differs significantly from the
+ * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+ * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+ * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+ * execlist. Hence, no support for implicit sync.
+ *
+ * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+ * works with execbuf3 ioctl for submission.
+ *
+ * The execbuf3 ioctl directly specifies the batch addresses instead of as
+ * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
+ * support many of the older features like in/out/submit fences, fence array,
+ * default gem context etc. (See struct drm_i915_gem_execbuffer3).
+ *
+ * In VM_BIND mode, VA allocation is completely managed by the user instead of
+ * the i915 driver. Hence all VA assignment, eviction are not applicable in
+ * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+ * be using the i915_vma active reference tracking. It will instead check the
+ * dma-resv object's fence list for that.
+ *
+ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
+ * vma lookup table, implicit sync, vma active reference tracking etc., are not
+ * applicable for execbuf3 ioctl.
+ */
+
+/**
+ * struct i915_execbuffer - execbuf struct for execbuf3
+ * @i915: reference to the i915 instance we run on
+ * @file: drm file reference
+ * args: execbuf3 ioctl structure
+ * @gt: reference to the gt instance ioctl submitted for
+ * @context: logical state for the request
+ * @gem_context: callers context
+ * @requests: requests to be build
+ * @composite_fence: used for excl fence in dma_resv objects when > 1 BB 
submitted
+ * @ww: i915_gem_ww_ctx instance
+ * @num_batches: number of batches submitted
+ * @batch_addresses: addresses corresponds to the submitted batches
+ * @batches: references to the i915_vmas corresponding to the batches
+ */
+struct i915_execbuffer {
+   struct drm_i915_private *i915;
+   struct drm_file *file;
+   struct drm_i915_gem_execbuffer3 *args;
+
+   struct intel_gt *gt;
+   struct intel_context *context;
+   struct i915_gem_context *gem_context;
+
+   struct i915_request *requests[MAX_ENGINE_INSTANCE + 1];
+   struct dma_fence *composite_fence;
+
+   

[PATCH v2 17/17] drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode

2022-10-02 Thread Niranjana Vishwanathapura
Add getparam support for VM_BIND capability version.
Add VM creation time flag to enable vm_bind_mode for the VM.

v2: update kernel-doc

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |  9 -
 drivers/gpu/drm/i915/i915_drv.h |  2 ++
 drivers/gpu/drm/i915/i915_getparam.c|  3 +++
 include/uapi/drm/i915_drm.h | 22 -
 4 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index f4e648ec01ed..c20bd6e8aaf8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1808,9 +1808,13 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, 
void *data,
if (!HAS_FULL_PPGTT(i915))
return -ENODEV;
 
-   if (args->flags)
+   if (args->flags & I915_VM_CREATE_FLAGS_UNKNOWN)
return -EINVAL;
 
+   if ((args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) &&
+   !HAS_VM_BIND(i915))
+   return -EOPNOTSUPP;
+
ppgtt = i915_ppgtt_create(to_gt(i915), 0);
if (IS_ERR(ppgtt))
return PTR_ERR(ppgtt);
@@ -1828,6 +1832,9 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void 
*data,
if (err)
goto err_put;
 
+   if (args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND)
+   ppgtt->vm.vm_bind_mode = true;
+
GEM_BUG_ON(id == 0); /* reserved for invalid/unassigned ppgtt */
args->vm_id = id;
return 0;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 90ed8e6db2fe..826b0f90879f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -977,6 +977,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define HAS_BAR2_SMEM_STOLEN(i915) (!HAS_LMEM(i915) && \
GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
 
+#define HAS_VM_BIND(i915) (GRAPHICS_VER(i915) >= 12)
+
 /* intel_device_info.c */
 static inline struct intel_device_info *
 mkwrite_device_info(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_getparam.c 
b/drivers/gpu/drm/i915/i915_getparam.c
index 342c8ca6414e..f45b3c684bcf 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -175,6 +175,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
case I915_PARAM_PERF_REVISION:
value = i915_perf_ioctl_version();
break;
+   case I915_PARAM_VM_BIND_VERSION:
+   value = HAS_VM_BIND(i915);
+   break;
default:
DRM_DEBUG("Unknown parameter %d\n", param->param);
return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 6e0f6b8d66a3..4ed170b8bae7 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -755,6 +755,22 @@ typedef struct drm_i915_irq_wait {
 /* Query if the kernel supports the I915_USERPTR_PROBE flag. */
 #define I915_PARAM_HAS_USERPTR_PROBE 56
 
+/*
+ * VM_BIND feature version supported.
+ *
+ * The following versions of VM_BIND have been defined:
+ *
+ * 0: No VM_BIND support.
+ *
+ * 1: In VM_UNBIND calls, the UMD must specify the exact mappings created
+ *previously with VM_BIND, the ioctl will not support unbinding multiple
+ *mappings or splitting them. Similarly, VM_BIND calls will not replace
+ *any existing mappings.
+ *
+ * See struct drm_i915_gem_vm_bind and struct drm_i915_gem_vm_unbind.
+ */
+#define I915_PARAM_VM_BIND_VERSION 57
+
 /* Must be kept compact -- no holes and well documented */
 
 /**
@@ -2606,6 +2622,9 @@ struct drm_i915_gem_context_destroy {
  * on the same file. Extensions can be provided to configure exactly how the
  * address space is setup upon creation.
  *
+ * If I915_VM_CREATE_FLAGS_USE_VM_BIND flag is set, VM created will work in
+ * VM_BIND mode.
+ *
  * The id of new VM (bound to the fd) for use with I915_CONTEXT_PARAM_VM is
  * returned in the outparam @id.
  *
@@ -2622,7 +2641,8 @@ struct drm_i915_gem_vm_control {
/** @extensions: Zero-terminated chain of extensions. */
__u64 extensions;
 
-   /** @flags: reserved for future usage, currently MBZ */
+#define I915_VM_CREATE_FLAGS_USE_VM_BIND   (1u << 0)
+#define I915_VM_CREATE_FLAGS_UNKNOWN   (-(I915_VM_CREATE_FLAGS_USE_VM_BIND << 
1))
__u32 flags;
 
/** @vm_id: Id of the VM created or to be destroyed */
-- 
2.21.0.rc0.32.g243a4c7e27



Re: [PATCH v2 10/17] drm/i915/vm_bind: Abstract out common execbuf functions

2022-10-03 Thread Niranjana Vishwanathapura

On Mon, Oct 03, 2022 at 05:53:37PM +0200, Andi Shyti wrote:

Hi Niranjana,

[...]


+   for_each_child(ce, child) {
+   err = intel_context_pin_ww(child, ww);
+   GEM_BUG_ON(err);/* perma-pinned should incr a counter */
+   }
+
+   for_each_child(ce, child) {
+   err = eb_pin_timeline(child, throttle, nonblock);
+   if (err)
+   goto unwind;
+   ++i;
+   }


any reason for having two separate for_each_child here?



This part is ported as is from i915_gem_execbuffer.c.
Probably the author found it easy to unwind in case of error.

Regards,
Niranjana


Andi


+   err = eb_pin_timeline(ce, throttle, nonblock);
+   if (err)
+   goto unwind;
+
+   return 0;
+
+unwind:
+   for_each_child(ce, child) {
+   if (j++ < i) {
+   mutex_lock(&child->timeline->mutex);
+   intel_context_exit(child);
+   mutex_unlock(&child->timeline->mutex);
+   }
+   }
+   for_each_child(ce, child)
+   intel_context_unpin(child);


[...]


Re: [Intel-gfx] [PATCH 12/16] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl

2022-10-03 Thread Niranjana Vishwanathapura

On Thu, Sep 29, 2022 at 09:02:01AM -0700, Niranjana Vishwanathapura wrote:

On Thu, Sep 29, 2022 at 04:00:47PM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate
bind as all required objects binding would have been requested by the
userspace before submitting the execbuf3.

Legacy features like relocations etc are not supported by execbuf3.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
drivers/gpu/drm/i915/Makefile |   1 +
.../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 571 ++
drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
drivers/gpu/drm/i915/i915_driver.c|   1 +
include/uapi/drm/i915_drm.h   |  61 ++
5 files changed, 636 insertions(+)
create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index bf952f478555..3473ee5825bb 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -150,6 +150,7 @@ gem-y += \
gem/i915_gem_domain.o \
gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
+   gem/i915_gem_execbuffer3.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
gem/i915_gem_lmem.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
new file mode 100644
index ..92af88bc8deb
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -0,0 +1,571 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+
+#include "i915_drv.h"
+#include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
+#include "i915_gem_ioctls.h"
+#include "i915_gem_vm_bind.h"
+#include "i915_trace.h"
+
+#define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
+#define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
+
+/* Catch emission of unexpected errors for CI! */
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+#undef EINVAL
+#define EINVAL ({ \
+   DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
+   22; \
+})
+#endif
+
+/**
+ * DOC: User command execution with execbuf3 ioctl
+ *
+ * A VM in VM_BIND mode will not support older execbuf mode of binding.
+ * The execbuf ioctl handling in VM_BIND mode differs significantly from the
+ * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+ * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+ * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+ * execlist. Hence, no support for implicit sync.
+ *
+ * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+ * works with execbuf3 ioctl for submission.
+ *
+ * The execbuf3 ioctl directly specifies the batch addresses instead of as
+ * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
+ * support many of the older features like in/out/submit fences, fence array,
+ * default gem context etc. (See struct drm_i915_gem_execbuffer3).
+ *
+ * In VM_BIND mode, VA allocation is completely managed by the user instead of
+ * the i915 driver. Hence all VA assignment, eviction are not applicable in
+ * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+ * be using the i915_vma active reference tracking. It will instead check the
+ * dma-resv object's fence list for that.
+ *
+ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
+ * vma lookup table, implicit sync, vma active reference tracking etc., are not
+ * applicable for execbuf3 ioctl.
+ */
+
+/**
+ * struct i915_execbuffer - execbuf struct for execbuf3
+ * @i915: reference to the i915 instance we run on
+ * @file: drm file reference
+ * args: execbuf3 ioctl structure
+ * @gt: reference to the gt instance ioctl submitted for
+ * @context: logical state for the request
+ * @gem_context: callers context
+ * @requests: requests to be build
+ * @composite_fence: used for excl fence in dma_resv objects when > 1 BB 
submitted
+ * @ww: i915_gem_ww_ctx instance
+ * @num_batches: number of batches submitted
+ * @batch_addresses: addresses corresponds to the submitted batches
+ * @batches: references to the i915_vmas corresponding to the batches
+ */
+struct i915_execbuffer {
+   struct drm_i915_private *i915;
+   struct drm_file *file;
+   struct drm_i915_gem_execbuffer3 *args;
+
+   struct intel_gt *gt;
+   struct intel_context *context;
+   struct i915_ge

Re: [PATCH v2 15/17] drm/i915/vm_bind: Handle persistent vmas in execbuf3

2022-10-04 Thread Niranjana Vishwanathapura

On Mon, Oct 03, 2022 at 01:18:27PM +0100, Matthew Auld wrote:

On 03/10/2022 07:12, Niranjana Vishwanathapura wrote:

Handle persistent (VM_BIND) mappings during the request submission
in the execbuf3 path.

v2: Ensure requests wait for bindings to complete.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 214 +-
 1 file changed, 213 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index 66b723842e45..5cece16949d8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -19,6 +19,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
+#define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
@@ -42,7 +43,9 @@
  * execlist. Hence, no support for implicit sync.
  *
  * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
- * works with execbuf3 ioctl for submission.
+ * works with execbuf3 ioctl for submission. All BOs mapped on that VM (through
+ * VM_BIND call) at the time of execbuf3 call are deemed required for that
+ * submission.
  *
  * The execbuf3 ioctl directly specifies the batch addresses instead of as
  * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
@@ -58,6 +61,13 @@
  * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
  * vma lookup table, implicit sync, vma active reference tracking etc., are not
  * applicable for execbuf3 ioctl.
+ *
+ * During each execbuf submission, request fence is added to all VM_BIND mapped
+ * objects with DMA_RESV_USAGE_BOOKKEEP. The DMA_RESV_USAGE_BOOKKEEP usage will
+ * prevent over sync (See enum dma_resv_usage). Note that DRM_I915_GEM_WAIT and
+ * DRM_I915_GEM_BUSY ioctls do not check for DMA_RESV_USAGE_BOOKKEEP usage and
+ * hence should not be used for end of batch check. Instead, the execbuf3
+ * timeline out fence should be used for end of batch check.
  */
 /**
@@ -127,6 +137,23 @@ eb_find_vma(struct i915_address_space *vm, u64 addr)
return i915_gem_vm_bind_lookup_vma(vm, va);
 }
+static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
+{
+   struct i915_vma *vma, *vn;
+
+   /**
+* Move all unbound vmas back into vm_bind_list so that they are
+* revalidated.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   list_for_each_entry_safe(vma, vn, &vm->vm_rebind_list, vm_rebind_link) {
+   list_del_init(&vma->vm_rebind_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->vm_rebind_lock);
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
@@ -141,14 +168,121 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
++current_batch;
}
+   eb_scoop_unbound_vma_all(eb->context->vm);
+
return 0;
 }
+static int eb_lock_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma;
+   int err;
+
+   err = i915_gem_object_lock(eb->context->vm->root_obj, &eb->ww);
+   if (err)
+   return err;
+
+   list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+   non_priv_vm_bind_link) {
+   err = i915_gem_object_lock(vma->obj, &eb->ww);
+   if (err)
+   return err;
+   }
+
+   return 0;
+}
+
+static void eb_release_persistent_vma_all(struct i915_execbuffer *eb,
+ bool final)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma, *vn;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   if (!(eb->args->flags & __EXEC3_HAS_PIN))
+   return;
+
+   assert_object_held(vm->root_obj);
+
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link)
+   __i915_vma_unpin(vma);
+
+   eb->args->flags &= ~__EXEC3_HAS_PIN;
+   if (!final)
+   return;
+
+   list_for_each_entry_safe(vma, vn, &vm->vm_bind_list, vm_bind_link)
+   if (i915_vma_verify_bind_complete(vma))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list);
+}
+
 static void eb_release_vma_all(struct i915_execbuffer *eb, bool final)
 {
+   eb_release_persistent_vma_all(eb, final);
eb_unpin_engine(eb);
 }
+static int eb_reserve_fence_for_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   

Re: [PATCH 14/16] drm/i915/vm_bind: Handle persistent vmas in execbuf3

2022-10-04 Thread Niranjana Vishwanathapura

On Mon, Oct 03, 2022 at 09:36:38AM +0100, Matthew Auld wrote:

On 02/10/2022 07:28, Niranjana Vishwanathapura wrote:

On Fri, Sep 30, 2022 at 10:47:48AM +0100, Matthew Auld wrote:

On 28/09/2022 07:19, Niranjana Vishwanathapura wrote:

Handle persistent (VM_BIND) mappings during the request submission
in the execbuf3 path.

Signed-off-by: Niranjana Vishwanathapura 


Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 188 +-
 1 file changed, 187 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

index 92af88bc8deb..1aeeff5e8540 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -19,6 +19,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
+#define __EXEC3_HAS_PIN    BIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED    BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS    (~0ull << 32)
@@ -42,7 +43,9 @@
  * execlist. Hence, no support for implicit sync.
  *
  * The new execbuf3 ioctl only works in VM_BIND mode and the 
VM_BIND mode only

- * works with execbuf3 ioctl for submission.
+ * works with execbuf3 ioctl for submission. All BOs mapped on 
that VM (through
+ * VM_BIND call) at the time of execbuf3 call are deemed 
required for that

+ * submission.
  *
  * The execbuf3 ioctl directly specifies the batch addresses 
instead of as
  * object handles as in execbuf2 ioctl. The execbuf3 ioctl will 
also not

@@ -58,6 +61,13 @@
  * So, a lot of code supporting execbuf2 ioctl, like 
relocations, VA evictions,
  * vma lookup table, implicit sync, vma active reference 
tracking etc., are not

  * applicable for execbuf3 ioctl.
+ *
+ * During each execbuf submission, request fence is added to 
all VM_BIND mapped
+ * objects with DMA_RESV_USAGE_BOOKKEEP. The 
DMA_RESV_USAGE_BOOKKEEP usage will
+ * prevent over sync (See enum dma_resv_usage). Note that 
DRM_I915_GEM_WAIT and
+ * DRM_I915_GEM_BUSY ioctls do not check for 
DMA_RESV_USAGE_BOOKKEEP usage and
+ * hence should not be used for end of batch check. Instead, 
the execbuf3

+ * timeline out fence should be used for end of batch check.
  */
 /**
@@ -127,6 +137,23 @@ eb_find_vma(struct i915_address_space *vm, 
u64 addr)

 return i915_gem_vm_bind_lookup_vma(vm, va);
 }
+static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
+{
+    struct i915_vma *vma, *vn;
+
+    /**
+ * Move all unbound vmas back into vm_bind_list so that they are
+ * revalidated.
+ */
+    spin_lock(&vm->vm_rebind_lock);
+    list_for_each_entry_safe(vma, vn, &vm->vm_rebind_list, 
vm_rebind_link) {

+    list_del_init(&vma->vm_rebind_link);
+    if (!list_empty(&vma->vm_bind_link))
+    list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+    }
+    spin_unlock(&vm->vm_rebind_lock);
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
 unsigned int i, current_batch = 0;
@@ -141,14 +168,121 @@ static int eb_lookup_vma_all(struct 
i915_execbuffer *eb)

 ++current_batch;
 }
+    eb_scoop_unbound_vma_all(eb->context->vm);
+
+    return 0;
+}
+
+static int eb_lock_vma_all(struct i915_execbuffer *eb)
+{
+    struct i915_address_space *vm = eb->context->vm;
+    struct i915_vma *vma;
+    int err;
+
+    err = i915_gem_object_lock(eb->context->vm->root_obj, &eb->ww);
+    if (err)
+    return err;
+
+    list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+    non_priv_vm_bind_link) {
+    err = i915_gem_object_lock(vma->obj, &eb->ww);
+    if (err)
+    return err;
+    }
+
 return 0;
 }
+static void eb_release_persistent_vma_all(struct i915_execbuffer *eb,
+  bool final)
+{
+    struct i915_address_space *vm = eb->context->vm;
+    struct i915_vma *vma, *vn;
+
+    lockdep_assert_held(&vm->vm_bind_lock);
+
+    if (!(eb->args->flags & __EXEC3_HAS_PIN))
+    return;
+
+    assert_object_held(vm->root_obj);
+
+    list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link)
+    __i915_vma_unpin(vma);
+
+    eb->args->flags &= ~__EXEC3_HAS_PIN;
+    if (!final)
+    return;
+
+    list_for_each_entry_safe(vma, vn, &vm->vm_bind_list, vm_bind_link)
+    if (i915_vma_verify_bind_complete(vma))
+    list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list);
+}
+
 static void eb_release_vma_all(struct i915_execbuffer *eb, bool final)
 {
+    eb_release_persistent_vma_all(eb, final);
 eb_unpin_engine(eb);
 }
+static int eb_reserve_fence_for_persistent_vma_all(struct 
i915_execbuffer *eb)

+{
+    struct i915_address_space *vm = eb->context->vm;
+    struct i915_vma *vma;
+    int ret;
+
+    ret = dma_resv_reserve_fences(vm->root_obj->bas

[PATCH v3 01/17] drm/i915/vm_bind: Expose vm lookup function

2022-10-09 Thread Niranjana Vishwanathapura
Make i915_gem_vm_lookup() function non-static as it will be
used by the vm_bind feature.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 11 ++-
 drivers/gpu/drm/i915/gem/i915_gem_context.h |  3 +++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 1e29b1e6d186..793345cbf99e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -346,7 +346,16 @@ static int proto_context_register(struct 
drm_i915_file_private *fpriv,
return ret;
 }
 
-static struct i915_address_space *
+/**
+ * i915_gem_vm_lookup() - looks up for the VM reference given the vm id
+ * @file_priv: the private data associated with the user's file
+ * @id: the VM id
+ *
+ * Finds the VM reference associated to a specific id.
+ *
+ * Returns the VM pointer on success, NULL in case of failure.
+ */
+struct i915_address_space *
 i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id)
 {
struct i915_address_space *vm;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index e5b0f66ea1fe..899fa8f1e0fe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -139,6 +139,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, 
void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file);
 
+struct i915_address_space *
+i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id);
+
 struct i915_gem_context *
 i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
 
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v3 00/17] drm/i915/vm_bind: Add VM_BIND functionality

2022-10-09 Thread Niranjana Vishwanathapura
DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
buffer objects (BOs) or sections of a BOs at specified GPU virtual
addresses on a specified address space (VM). Multiple mappings can map
to the same physical pages of an object (aliasing). These mappings (also
referred to as persistent mappings) will be persistent across multiple
GPU submissions (execbuf calls) issued by the UMD, without user having
to provide a list of all required mappings during each submission (as
required by older execbuf mode).

This patch series support VM_BIND version 1, as described by the param
I915_PARAM_VM_BIND_VERSION.

Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
The new execbuf3 ioctl will not have any execlist support and all the
legacy support like relocations etc., are removed.

TODOs:
* Support out fence for VM_UNBIND ioctl.
* Async VM_UNBIND support.
* Optimizations.

NOTEs:
* It is based on below VM_BIND design+uapi rfc.
  Documentation/gpu/rfc/i915_vm_bind.rst

* The IGT RFC series is posted as,
  [PATCH i-g-t v3 0/11] vm_bind: Add VM_BIND validation support

v2: Address various review comments
v3: Address review comments and other fixes

Signed-off-by: Niranjana Vishwanathapura 

Niranjana Vishwanathapura (17):
  drm/i915/vm_bind: Expose vm lookup function
  drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
  drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
  drm/i915/vm_bind: Add support to create persistent vma
  drm/i915/vm_bind: Implement bind and unbind of object
  drm/i915/vm_bind: Support for VM private BOs
  drm/i915/vm_bind: Add support to handle object evictions
  drm/i915/vm_bind: Support persistent vma activeness tracking
  drm/i915/vm_bind: Add out fence support
  drm/i915/vm_bind: Abstract out common execbuf functions
  drm/i915/vm_bind: Use common execbuf functions in execbuf path
  drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
  drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
  drm/i915/vm_bind: Expose i915_request_await_bind()
  drm/i915/vm_bind: Handle persistent vmas in execbuf3
  drm/i915/vm_bind: userptr dma-resv changes
  drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode

 drivers/gpu/drm/i915/Makefile |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  39 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_create.c|  67 +-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|   6 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 516 +--
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 866 ++
 .../drm/i915/gem/i915_gem_execbuffer_common.c | 666 ++
 .../drm/i915/gem/i915_gem_execbuffer_common.h |  74 ++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
 drivers/gpu/drm/i915/gem/i915_gem_object.c|   3 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h|   2 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  19 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  30 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 433 +
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  17 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  27 +
 drivers/gpu/drm/i915/i915_driver.c|   4 +
 drivers/gpu/drm/i915/i915_drv.h   |   2 +
 drivers/gpu/drm/i915/i915_gem_gtt.c   |  39 +
 drivers/gpu/drm/i915/i915_gem_gtt.h   |   3 +
 drivers/gpu/drm/i915/i915_getparam.c  |   3 +
 drivers/gpu/drm/i915/i915_sw_fence.c  |  28 +-
 drivers/gpu/drm/i915/i915_sw_fence.h  |  23 +-
 drivers/gpu/drm/i915/i915_vma.c   | 128 ++-
 drivers/gpu/drm/i915/i915_vma.h   |  57 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  39 +
 include/uapi/drm/i915_drm.h   | 281 +-
 30 files changed, 2864 insertions(+), 522 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v3 02/17] drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()

2022-10-09 Thread Niranjana Vishwanathapura
Add function __i915_sw_fence_await_reservation() for
asynchronous wait on a dma-resv object with specified
dma_resv_usage. This is required for async vma unbind
with vm_bind.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_sw_fence.c | 28 +---
 drivers/gpu/drm/i915/i915_sw_fence.h | 23 +--
 2 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index cc2a8821d22a..ae06d35db056 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -7,7 +7,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "i915_sw_fence.h"
 #include "i915_selftest.h"
@@ -569,11 +568,26 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
return ret;
 }
 
-int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
-   struct dma_resv *resv,
-   bool write,
-   unsigned long timeout,
-   gfp_t gfp)
+/**
+ * __i915_sw_fence_await_reservation() - Setup a fence to wait on a dma-resv
+ * object with specified usage.
+ * @fence: the fence that needs to wait
+ * @resv: dma-resv object
+ * @usage: dma_resv_usage (See enum dma_resv_usage)
+ * @timeout: how long to wait in jiffies
+ * @gfp: allocation mode
+ *
+ * Setup the @fence to asynchronously wait on dma-resv object @resv for
+ * @usage to complete before signaling.
+ *
+ * Returns 0 if there is nothing to wait on, -ve error code upon error
+ * and >0 upon successfully setting up the wait.
+ */
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ enum dma_resv_usage usage,
+ unsigned long timeout,
+ gfp_t gfp)
 {
struct dma_resv_iter cursor;
struct dma_fence *f;
@@ -582,7 +596,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
debug_fence_assert(fence);
might_sleep_if(gfpflags_allow_blocking(gfp));
 
-   dma_resv_iter_begin(&cursor, resv, dma_resv_usage_rw(write));
+   dma_resv_iter_begin(&cursor, resv, usage);
dma_resv_for_each_fence_unlocked(&cursor, f) {
pending = i915_sw_fence_await_dma_fence(fence, f, timeout,
gfp);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
b/drivers/gpu/drm/i915/i915_sw_fence.h
index f752bfc7c6e1..9c4859dc4c0d 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -10,13 +10,13 @@
 #define _I915_SW_FENCE_H_
 
 #include 
+#include 
 #include 
 #include 
 #include  /* for NOTIFY_DONE */
 #include 
 
 struct completion;
-struct dma_resv;
 struct i915_sw_fence;
 
 enum i915_sw_fence_notify {
@@ -89,11 +89,22 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
  unsigned long timeout,
  gfp_t gfp);
 
-int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
-   struct dma_resv *resv,
-   bool write,
-   unsigned long timeout,
-   gfp_t gfp);
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ enum dma_resv_usage usage,
+ unsigned long timeout,
+ gfp_t gfp);
+
+static inline int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ bool write,
+ unsigned long timeout,
+ gfp_t gfp)
+{
+   return __i915_sw_fence_await_reservation(fence, resv,
+dma_resv_usage_rw(write),
+timeout, gfp);
+}
 
 bool i915_sw_fence_await(struct i915_sw_fence *fence);
 void i915_sw_fence_complete(struct i915_sw_fence *fence);
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v3 06/17] drm/i915/vm_bind: Support for VM private BOs

2022-10-10 Thread Niranjana Vishwanathapura
Each VM creates a root_obj and shares it with all of its private objects
to use it as dma_resv object. This has a performance advantage as it
requires a single dma_resv object update for all private BOs vs list of
dma_resv objects update for shared BOs, in the execbuf path.

VM private BOs can be only mapped on specified VM and cannot be dmabuf
exported. Also, they are supported only in vm_bind mode.

v2: Pad struct drm_i915_gem_create_ext_vm_private for 64bit alignment,
add input validity checks.
v3: Create root_obj only for ppgtt.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 16 +-
 drivers/gpu/drm/i915/gem/i915_gem_create.c| 49 ++-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|  6 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  4 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  3 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 ++
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   |  3 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c|  9 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  3 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  2 +
 drivers/gpu/drm/i915/i915_vma.c   |  1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  2 +
 include/uapi/drm/i915_drm.h   | 33 +
 13 files changed, 131 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 793345cbf99e..5ea7064805f3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -83,6 +83,7 @@
 
 #include "i915_file_private.h"
 #include "i915_gem_context.h"
+#include "i915_gem_internal.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
@@ -1795,6 +1796,7 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void 
*data,
struct drm_i915_private *i915 = to_i915(dev);
struct drm_i915_gem_vm_control *args = data;
struct drm_i915_file_private *file_priv = file->driver_priv;
+   struct drm_i915_gem_object *obj;
struct i915_ppgtt *ppgtt;
u32 id;
int err;
@@ -1817,15 +1819,27 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, 
void *data,
goto err_put;
}
 
+   obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
+   if (IS_ERR(obj)) {
+   err = PTR_ERR(obj);
+   goto err_put;
+   }
+
+   ppgtt->vm.root_obj = obj;
+   ppgtt->vm.vm_bind_mode = true;
+
err = xa_alloc(&file_priv->vm_xa, &id, &ppgtt->vm,
   xa_limit_32b, GFP_KERNEL);
if (err)
-   goto err_put;
+   goto err_root_obj_put;
 
GEM_BUG_ON(id == 0); /* reserved for invalid/unassigned ppgtt */
args->vm_id = id;
return 0;
 
+err_root_obj_put:
+   if (ppgtt->vm.root_obj)
+   i915_gem_object_put(ppgtt->vm.root_obj);
 err_put:
i915_vm_put(&ppgtt->vm);
return err;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 5c6e396ab74d..694d4638ac8b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -11,6 +11,7 @@
 #include "pxp/intel_pxp.h"
 
 #include "i915_drv.h"
+#include "i915_gem_context.h"
 #include "i915_gem_create.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
@@ -251,6 +252,7 @@ struct create_ext {
unsigned int n_placements;
unsigned int placement_mask;
unsigned long flags;
+   u32 vm_id;
 };
 
 static void repr_placements(char *buf, size_t size,
@@ -400,9 +402,32 @@ static int ext_set_protected(struct i915_user_extension 
__user *base, void *data
return 0;
 }
 
+static int ext_set_vm_private(struct i915_user_extension __user *base,
+ void *data)
+{
+   struct drm_i915_gem_create_ext_vm_private ext;
+   struct create_ext *ext_data = data;
+
+   if (copy_from_user(&ext, base, sizeof(ext)))
+   return -EFAULT;
+
+   /* Reserved fields must be 0 */
+   if (ext.rsvd)
+   return -EINVAL;
+
+   /* vm_id 0 is reserved */
+   if (!ext.vm_id)
+   return -ENOENT;
+
+   ext_data->vm_id = ext.vm_id;
+
+   return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
+   [I915_GEM_CREATE_EXT_VM_PRIVATE] = ext_set_vm_private,
 };
 
 /**
@@ -418,6 +443,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
struct drm_i915_private *i915 = to_i915(dev);

[PATCH v3 05/17] drm/i915/vm_bind: Implement bind and unbind of object

2022-10-10 Thread Niranjana Vishwanathapura
Add uapi and implement support for bind and unbind of an
object at the specified GPU virtual addresses.

The vm_bind mode is not supported in legacy execbuf2 ioctl.
It will be supported only in the newer execbuf3 ioctl.

v2: On older platforms ctx->vm is not set, check for it.
In vm_bind call, add vma to vm_bind_list.
Add more input validity checks.
Update some documentation.
v3: In vm_bind call, add vma to vm_bound_list as user can
request a fence and pass to execbuf3 as input fence.
Remove short term pinning with PIN_VALIDATE flag.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   5 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 321 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  17 +
 drivers/gpu/drm/i915/i915_driver.c|   3 +
 drivers/gpu/drm/i915/i915_vma.c   |   1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 110 ++
 10 files changed, 508 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index f8cc1eb52626..ad47e1ce428a 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -166,6 +166,7 @@ gem-y += \
gem/i915_gem_ttm_move.o \
gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
+   gem/i915_gem_vm_bind_object.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 8f5796cf9c9c..9fb9f6faafd8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
if (unlikely(IS_ERR(ctx)))
return PTR_ERR(ctx);
 
+   if (ctx->vm && ctx->vm->vm_bind_mode) {
+   i915_gem_context_put(ctx);
+   return -EOPNOTSUPP;
+   }
+
eb->gem_context = ctx;
if (i915_gem_context_has_full_ppgtt(ctx))
eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
new file mode 100644
index ..36262a6357b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include 
+
+struct drm_device;
+struct drm_file;
+struct i915_address_space;
+struct i915_vma;
+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file);
+
+void i915_gem_vm_unbind_all(struct i915_address_space *vm);
+
+#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
new file mode 100644
index ..1ed57eaa8d79
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,321 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "gem/i915_gem_context.h"
+#include "gem/i915_gem_vm_bind.h"
+
+#include "gt/intel_gpu_commands.h"
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
+START, LAST, static inline, i915_vm_bind_it)
+
+#undef START
+#undef LAST
+
+/**
+ * DOC: VM_BIND/UNBIND ioctls
+ *
+ * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
+ * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
+ * specified address space (VM). Multiple mappings can map to the same physical
+ * pages of an object (aliasing). These mappings (also referred to as 
persistent
+ * mappings) will be persistent across multiple GPU submissions (execbuf calls)
+ * issued by the UMD, without user having to provide a list of all required
+ * mappings during each submission (as required by older execbuf mode).
+ *
+ * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
+ * signaling the completion of bind/unbind operation.
+ *
+ * VM_BIND feature is advertised to user via I915_

[PATCH v3 04/17] drm/i915/vm_bind: Add support to create persistent vma

2022-10-10 Thread Niranjana Vishwanathapura
Add i915_vma_instance_persistent() to create persistent vmas.
Persistent vmas will use i915_gtt_view to support partial binding.

vma_lookup is tied to segment of the object instead of section
of VA space. Hence, it do not support aliasing. ie., multiple
mappings (at different VA) point to the same gtt_view of object.
Skip vma_lookup for persistent vmas to support aliasing.

v2: Remove unused I915_VMA_PERSISTENT definition,
update validity check in i915_vma_compare(),
remove unwanted is_persistent check in release_references().

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/i915_vma.c   | 36 +--
 drivers/gpu/drm/i915/i915_vma.h   | 17 -
 drivers/gpu/drm/i915/i915_vma_types.h |  6 +
 3 files changed, 56 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index f17c09ead7d7..f9da8010021f 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -109,7 +109,8 @@ static void __i915_vma_retire(struct i915_active *ref)
 static struct i915_vma *
 vma_create(struct drm_i915_gem_object *obj,
   struct i915_address_space *vm,
-  const struct i915_gtt_view *view)
+  const struct i915_gtt_view *view,
+  bool skip_lookup_cache)
 {
struct i915_vma *pos = ERR_PTR(-E2BIG);
struct i915_vma *vma;
@@ -196,6 +197,9 @@ vma_create(struct drm_i915_gem_object *obj,
__set_bit(I915_VMA_GGTT_BIT, __i915_vma_flags(vma));
}
 
+   if (skip_lookup_cache)
+   goto skip_rb_insert;
+
rb = NULL;
p = &obj->vma.tree.rb_node;
while (*p) {
@@ -220,6 +224,7 @@ vma_create(struct drm_i915_gem_object *obj,
rb_link_node(&vma->obj_node, rb, p);
rb_insert_color(&vma->obj_node, &obj->vma.tree);
 
+skip_rb_insert:
if (i915_vma_is_ggtt(vma))
/*
 * We put the GGTT vma at the start of the vma-list, followed
@@ -299,7 +304,34 @@ i915_vma_instance(struct drm_i915_gem_object *obj,
 
/* vma_create() will resolve the race if another creates the vma */
if (unlikely(!vma))
-   vma = vma_create(obj, vm, view);
+   vma = vma_create(obj, vm, view, false);
+
+   GEM_BUG_ON(!IS_ERR(vma) && i915_vma_compare(vma, vm, view));
+   return vma;
+}
+
+/**
+ * i915_vma_create_persistent - create a persistent VMA
+ * @obj: parent &struct drm_i915_gem_object to be mapped
+ * @vm: address space in which the mapping is located
+ * @view: additional mapping requirements
+ *
+ * Creates a persistent vma.
+ *
+ * Returns the vma, or an error pointer.
+ */
+struct i915_vma *
+i915_vma_create_persistent(struct drm_i915_gem_object *obj,
+  struct i915_address_space *vm,
+  const struct i915_gtt_view *view)
+{
+   struct i915_vma *vma;
+
+   GEM_BUG_ON(!kref_read(&vm->ref));
+
+   vma = vma_create(obj, vm, view, true);
+   if (!IS_ERR(vma))
+   i915_vma_set_persistent(vma);
 
GEM_BUG_ON(!IS_ERR(vma) && i915_vma_compare(vma, vm, view));
return vma;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index aecd9c64486b..c5378ec2f70a 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -44,6 +44,10 @@ struct i915_vma *
 i915_vma_instance(struct drm_i915_gem_object *obj,
  struct i915_address_space *vm,
  const struct i915_gtt_view *view);
+struct i915_vma *
+i915_vma_create_persistent(struct drm_i915_gem_object *obj,
+  struct i915_address_space *vm,
+  const struct i915_gtt_view *view);
 
 void i915_vma_unpin_and_release(struct i915_vma **p_vma, unsigned int flags);
 #define I915_VMA_RELEASE_MAP BIT(0)
@@ -138,6 +142,16 @@ static inline u32 i915_ggtt_pin_bias(struct i915_vma *vma)
return i915_vm_to_ggtt(vma->vm)->pin_bias;
 }
 
+static inline bool i915_vma_is_persistent(const struct i915_vma *vma)
+{
+   return test_bit(I915_VMA_PERSISTENT_BIT, __i915_vma_flags(vma));
+}
+
+static inline void i915_vma_set_persistent(struct i915_vma *vma)
+{
+   set_bit(I915_VMA_PERSISTENT_BIT, __i915_vma_flags(vma));
+}
+
 static inline struct i915_vma *i915_vma_get(struct i915_vma *vma)
 {
i915_gem_object_get(vma->obj);
@@ -164,7 +178,8 @@ i915_vma_compare(struct i915_vma *vma,
 {
ptrdiff_t cmp;
 
-   GEM_BUG_ON(view && !i915_is_ggtt_or_dpt(vm));
+   GEM_BUG_ON(view && !(i915_is_ggtt_or_dpt(vm) ||
+i915_vma_is_persistent(vma)));
 
cmp = ptrdiff(vma->vm, vm);
if (cmp)
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h 
b/drivers/gpu/drm/i915/i915_vma_types.h
index ec0f6c9f57d0..3144d71

[PATCH v3 16/17] drm/i915/vm_bind: userptr dma-resv changes

2022-10-10 Thread Niranjana Vishwanathapura
For persistent (vm_bind) vmas of userptr BOs, handle the user
page pinning by using the i915_gem_object_userptr_submit_init()
/done() functions

v2: Do not double add vma to vm->userptr_invalidated_list

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 80 +++
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   | 19 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 15 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 +
 drivers/gpu/drm/i915/i915_vma_types.h |  2 +
 6 files changed, 122 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index 7e6abdcb9eec..f6cdccebb801 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -20,6 +20,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
 
+#define __EXEC3_USERPTR_USED   BIT_ULL(34)
 #define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
@@ -142,6 +143,21 @@ static void eb_scoop_unbound_vma_all(struct 
i915_address_space *vm)
 {
struct i915_vma *vma, *vn;
 
+#ifdef CONFIG_MMU_NOTIFIER
+   /**
+* Move all invalidated userptr vmas back into vm_bind_list so that
+* they are looked up and revalidated.
+*/
+   spin_lock(&vm->userptr_invalidated_lock);
+   list_for_each_entry_safe(vma, vn, &vm->userptr_invalidated_list,
+userptr_invalidated_link) {
+   list_del_init(&vma->userptr_invalidated_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->userptr_invalidated_lock);
+#endif
+
/**
 * Move all unbound vmas back into vm_bind_list so that they are
 * revalidated.
@@ -155,10 +171,47 @@ static void eb_scoop_unbound_vma_all(struct 
i915_address_space *vm)
spin_unlock(&vm->vm_rebind_lock);
 }
 
+static int eb_lookup_persistent_userptr_vmas(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *last_vma = NULL;
+   struct i915_vma *vma;
+   int err;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) {
+   if (!i915_gem_object_is_userptr(vma->obj))
+   continue;
+
+   err = i915_gem_object_userptr_submit_init(vma->obj);
+   if (err)
+   return err;
+
+   /**
+* The above submit_init() call does the object unbind and
+* hence adds vma into vm_rebind_list. Remove it from that
+* list as it is already scooped for revalidation.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   if (!list_empty(&vma->vm_rebind_link))
+   list_del_init(&vma->vm_rebind_link);
+   spin_unlock(&vm->vm_rebind_lock);
+
+   last_vma = vma;
+   }
+
+   if (last_vma)
+   eb->args->flags |= __EXEC3_USERPTR_USED;
+
+   return 0;
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
struct i915_vma *vma;
+   int err = 0;
 
for (i = 0; i < eb->num_batches; i++) {
vma = eb_find_vma(eb->context->vm, eb->batch_addresses[i]);
@@ -171,6 +224,10 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 
eb_scoop_unbound_vma_all(eb->context->vm);
 
+   err = eb_lookup_persistent_userptr_vmas(eb);
+   if (err)
+   return err;
+
return 0;
 }
 
@@ -343,6 +400,29 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
}
}
 
+#ifdef CONFIG_MMU_NOTIFIER
+   /* Check for further userptr invalidations */
+   spin_lock(&vm->userptr_invalidated_lock);
+   if (!list_empty(&vm->userptr_invalidated_list))
+   err = -EAGAIN;
+   spin_unlock(&vm->userptr_invalidated_lock);
+
+   if (!err && (eb->args->flags & __EXEC3_USERPTR_USED)) {
+   read_lock(&eb->i915->mm.notifier_lock);
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) {
+   if (!i915_gem_object_is_userptr(vma->obj))
+   continue;
+
+   err = i915_gem_object_userptr_submit_done(vma->obj);
+   if (err)
+

[PATCH v3 14/17] drm/i915/vm_bind: Expose i915_request_await_bind()

2022-10-10 Thread Niranjana Vishwanathapura
Rename __i915_request_await_bind() as i915_request_await_bind()
and make it non-static as it will be used in execbuf3 ioctl path.

Reviewed-by: Andi Shyti 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_vma.c | 8 +---
 drivers/gpu/drm/i915/i915_vma.h | 6 ++
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index e27795e009ab..13dbaae2b518 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1889,18 +1889,12 @@ void i915_vma_revoke_mmap(struct i915_vma *vma)
list_del(&vma->obj->userfault_link);
 }
 
-static int
-__i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma)
-{
-   return __i915_request_await_exclusive(rq, &vma->active);
-}
-
 static int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request 
*rq)
 {
int err;
 
/* Wait for the vma to be bound before we start! */
-   err = __i915_request_await_bind(rq, vma);
+   err = i915_request_await_bind(rq, vma);
if (err)
return err;
 
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 04770f8ba815..19e57e12b956 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -54,6 +54,12 @@ void i915_vma_unpin_and_release(struct i915_vma **p_vma, 
unsigned int flags);
 /* do not reserve memory to prevent deadlocks */
 #define __EXEC_OBJECT_NO_RESERVE BIT(31)
 
+static inline int
+i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma)
+{
+   return __i915_request_await_exclusive(rq, &vma->active);
+}
+
 int __must_check _i915_vma_move_to_active(struct i915_vma *vma,
  struct i915_request *rq,
  struct dma_fence *fence,
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v3 08/17] drm/i915/vm_bind: Support persistent vma activeness tracking

2022-10-10 Thread Niranjana Vishwanathapura
Do not use i915_vma activeness tracking for persistent vmas.

As persistent vmas are part of working set for each execbuf
submission on that address space (VM), a persistent vma is
active if the VM active. As vm->root_obj->base.resv will be
updated for each submission on that VM, it correctly
represent whether the VM is active or not.

Add i915_vm_is_active() and i915_vm_sync() functions based
on vm->root_obj->base.resv with DMA_RESV_USAGE_BOOKKEEP
usage. dma-resv fence list will be updated with this usage
during each submission with this VM in the new execbuf3
ioctl path.

Update i915_vma_is_active(), i915_vma_sync() and the
__i915_vma_unbind_async() functions to properly handle
persistent vmas.

Reviewed-by: Andi Shyti 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 39 +
 drivers/gpu/drm/i915/i915_gem_gtt.h |  3 +++
 drivers/gpu/drm/i915/i915_vma.c | 28 +
 drivers/gpu/drm/i915/i915_vma.h | 25 +-
 4 files changed, 83 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 329ff75b80b9..b7d0844de561 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -25,6 +25,45 @@
 #include "i915_trace.h"
 #include "i915_vgpu.h"
 
+/**
+ * i915_vm_sync() - Wait until address space is not in use
+ * @vm: address space
+ *
+ * Waits until all requests using the address space are complete.
+ *
+ * Returns: 0 if success, -ve err code upon failure
+ */
+int i915_vm_sync(struct i915_address_space *vm)
+{
+   int ret;
+
+   /* Wait for all requests under this vm to finish */
+   ret = dma_resv_wait_timeout(vm->root_obj->base.resv,
+   DMA_RESV_USAGE_BOOKKEEP, false,
+   MAX_SCHEDULE_TIMEOUT);
+   if (ret < 0)
+   return ret;
+   else if (ret > 0)
+   return 0;
+   else
+   return -ETIMEDOUT;
+}
+
+/**
+ * i915_vm_is_active() - Check if address space is being used
+ * @vm: address space
+ *
+ * Check if any request using the specified address space is
+ * active.
+ *
+ * Returns: true if address space is active, false otherwise.
+ */
+bool i915_vm_is_active(const struct i915_address_space *vm)
+{
+   return !dma_resv_test_signaled(vm->root_obj->base.resv,
+  DMA_RESV_USAGE_BOOKKEEP);
+}
+
 int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
   struct sg_table *pages)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 8c2f57eb5dda..a5bbdc59d9df 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -51,4 +51,7 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 
 #define PIN_OFFSET_MASKI915_GTT_PAGE_MASK
 
+int i915_vm_sync(struct i915_address_space *vm);
+bool i915_vm_is_active(const struct i915_address_space *vm);
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index b4be2cbe8382..88c09e885fec 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -420,6 +420,24 @@ int i915_vma_wait_for_bind(struct i915_vma *vma)
return err;
 }
 
+/**
+ * i915_vma_sync() - Wait for the vma to be idle
+ * @vma: vma to be tested
+ *
+ * Returns 0 on success and error code on failure
+ */
+int i915_vma_sync(struct i915_vma *vma)
+{
+   int ret;
+
+   /* Wait for the asynchronous bindings and pending GPU reads */
+   ret = i915_active_wait(&vma->active);
+   if (ret || !i915_vma_is_persistent(vma) || i915_vma_is_purged(vma))
+   return ret;
+
+   return i915_vm_sync(vma->vm);
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
 static int i915_vma_verify_bind_complete(struct i915_vma *vma)
 {
@@ -1887,6 +1905,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
int err;
 
assert_object_held(obj);
+   if (i915_vma_is_persistent(vma))
+   return -EINVAL;
 
GEM_BUG_ON(!vma->pages);
 
@@ -2096,6 +2116,14 @@ static struct dma_fence *__i915_vma_unbind_async(struct 
i915_vma *vma)
return ERR_PTR(-EBUSY);
}
 
+   if (i915_vma_is_persistent(vma) &&
+   __i915_sw_fence_await_reservation(&vma->resource->chain,
+ vma->vm->root_obj->base.resv,
+ DMA_RESV_USAGE_BOOKKEEP,
+ i915_fence_timeout(vma->vm->i915),
+ GFP_NOWAIT | __GFP_NOWARN) < 0)
+   return ERR_PTR(-EBUSY);
+
fence = __i915_vma_evict(vma, true);
 
drm_mm_remove_node(&vma->node); /* pairs

[PATCH v3 17/17] drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode

2022-10-10 Thread Niranjana Vishwanathapura
Add getparam support for VM_BIND capability version.
Add VM creation time flag to enable vm_bind_mode for the VM.

v2: update kernel-doc
v3: create vm->root_obj only upon I915_VM_CREATE_FLAGS_USE_VM_BIND

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 28 ++---
 drivers/gpu/drm/i915/i915_drv.h |  2 ++
 drivers/gpu/drm/i915/i915_getparam.c|  3 +++
 include/uapi/drm/i915_drm.h | 22 +++-
 4 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5ea7064805f3..1e947fb98ec7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1796,7 +1796,6 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void 
*data,
struct drm_i915_private *i915 = to_i915(dev);
struct drm_i915_gem_vm_control *args = data;
struct drm_i915_file_private *file_priv = file->driver_priv;
-   struct drm_i915_gem_object *obj;
struct i915_ppgtt *ppgtt;
u32 id;
int err;
@@ -1804,9 +1803,13 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, 
void *data,
if (!HAS_FULL_PPGTT(i915))
return -ENODEV;
 
-   if (args->flags)
+   if (args->flags & I915_VM_CREATE_FLAGS_UNKNOWN)
return -EINVAL;
 
+   if ((args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) &&
+   !HAS_VM_BIND(i915))
+   return -EOPNOTSUPP;
+
ppgtt = i915_ppgtt_create(to_gt(i915), 0);
if (IS_ERR(ppgtt))
return PTR_ERR(ppgtt);
@@ -1819,20 +1822,27 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, 
void *data,
goto err_put;
}
 
-   obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
-   if (IS_ERR(obj)) {
-   err = PTR_ERR(obj);
-   goto err_put;
-   }
+   if (args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) {
+   struct drm_i915_gem_object *obj;
+
+   obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
+   if (IS_ERR(obj)) {
+   err = PTR_ERR(obj);
+   goto err_put;
+   }
 
-   ppgtt->vm.root_obj = obj;
-   ppgtt->vm.vm_bind_mode = true;
+   ppgtt->vm.root_obj = obj;
+   ppgtt->vm.vm_bind_mode = true;
+   }
 
err = xa_alloc(&file_priv->vm_xa, &id, &ppgtt->vm,
   xa_limit_32b, GFP_KERNEL);
if (err)
goto err_root_obj_put;
 
+   if (args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND)
+   ppgtt->vm.vm_bind_mode = true;
+
GEM_BUG_ON(id == 0); /* reserved for invalid/unassigned ppgtt */
args->vm_id = id;
return 0;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 90ed8e6db2fe..826b0f90879f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -977,6 +977,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define HAS_BAR2_SMEM_STOLEN(i915) (!HAS_LMEM(i915) && \
GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
 
+#define HAS_VM_BIND(i915) (GRAPHICS_VER(i915) >= 12)
+
 /* intel_device_info.c */
 static inline struct intel_device_info *
 mkwrite_device_info(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_getparam.c 
b/drivers/gpu/drm/i915/i915_getparam.c
index 342c8ca6414e..f45b3c684bcf 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -175,6 +175,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
case I915_PARAM_PERF_REVISION:
value = i915_perf_ioctl_version();
break;
+   case I915_PARAM_VM_BIND_VERSION:
+   value = HAS_VM_BIND(i915);
+   break;
default:
DRM_DEBUG("Unknown parameter %d\n", param->param);
return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 33b28a61ffa3..ed26c69b6306 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -771,6 +771,22 @@ typedef struct drm_i915_irq_wait {
 /* Query if the kernel supports the I915_USERPTR_PROBE flag. */
 #define I915_PARAM_HAS_USERPTR_PROBE 56
 
+/*
+ * VM_BIND feature version supported.
+ *
+ * The following versions of VM_BIND have been defined:
+ *
+ * 0: No VM_BIND support.
+ *
+ * 1: In VM_UNBIND calls, the UMD must specify the exact mappings created
+ *previously with VM_BIND, the ioctl will not support unbinding multiple
+ *mappings or splitting them. Similarly, VM_BIND calls will not replace
+ *any

[PATCH v3 07/17] drm/i915/vm_bind: Add support to handle object evictions

2022-10-10 Thread Niranjana Vishwanathapura
Support eviction by maintaining a list of evicted persistent vmas
for rebinding during next submission. Ensure the list do not
include persistent vmas that are being purged.

v2: Remove unused I915_VMA_PURGED definition.
v3: Properly handle __i915_vma_unbind_async() case.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../drm/i915/gem/i915_gem_vm_bind_object.c|  6 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 +++
 drivers/gpu/drm/i915/i915_vma.c   | 31 +--
 drivers/gpu/drm/i915/i915_vma.h   | 10 ++
 drivers/gpu/drm/i915/i915_vma_types.h |  8 +
 6 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 8e3e6ceb9442..c435d49af2c8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -85,6 +85,12 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, 
bool release_obj)
 {
lockdep_assert_held(&vma->vm->vm_bind_lock);
 
+   spin_lock(&vma->vm->vm_rebind_lock);
+   if (!list_empty(&vma->vm_rebind_link))
+   list_del_init(&vma->vm_rebind_link);
+   i915_vma_set_purged(vma);
+   spin_unlock(&vma->vm->vm_rebind_lock);
+
list_del_init(&vma->vm_bind_link);
list_del_init(&vma->non_priv_vm_bind_link);
i915_vm_bind_it_remove(vma, &vma->vm->va);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 422394f8fb40..2fa37f46750b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -295,6 +295,8 @@ void i915_address_space_init(struct i915_address_space *vm, 
int subclass)
INIT_LIST_HEAD(&vm->vm_bound_list);
mutex_init(&vm->vm_bind_lock);
INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
+   INIT_LIST_HEAD(&vm->vm_rebind_list);
+   spin_lock_init(&vm->vm_rebind_lock);
 }
 
 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 4ae5734f7d6b..443d1918ad4e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -265,6 +265,10 @@ struct i915_address_space {
struct list_head vm_bind_list;
/** @vm_bound_list: List of vm_binding completed */
struct list_head vm_bound_list;
+   /* @vm_rebind_list: list of vmas to be rebinded */
+   struct list_head vm_rebind_list;
+   /* @vm_rebind_lock: protects vm_rebound_list */
+   spinlock_t vm_rebind_lock;
/* @va: tree of persistent vmas */
struct rb_root_cached va;
struct list_head non_priv_vm_bind_list;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 5d3d67a4bf47..b4be2cbe8382 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -241,6 +241,7 @@ vma_create(struct drm_i915_gem_object *obj,
 
INIT_LIST_HEAD(&vma->vm_bind_link);
INIT_LIST_HEAD(&vma->non_priv_vm_bind_link);
+   INIT_LIST_HEAD(&vma->vm_rebind_link);
return vma;
 
 err_unlock:
@@ -1686,6 +1687,14 @@ static void force_unbind(struct i915_vma *vma)
if (!drm_mm_node_allocated(&vma->node))
return;
 
+   /*
+* Persistent vma should have been purged by now.
+* If not, issue a warning and purge it.
+*/
+   if (GEM_WARN_ON(i915_vma_is_persistent(vma) &&
+   !i915_vma_is_purged(vma)))
+   i915_vma_set_purged(vma);
+
atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
WARN_ON(__i915_vma_unbind(vma));
GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
@@ -2047,6 +2056,16 @@ int __i915_vma_unbind(struct i915_vma *vma)
__i915_vma_evict(vma, false);
 
drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
+
+   if (i915_vma_is_persistent(vma)) {
+   spin_lock(&vma->vm->vm_rebind_lock);
+   if (list_empty(&vma->vm_rebind_link) &&
+   !i915_vma_is_purged(vma))
+   list_add_tail(&vma->vm_rebind_link,
+ &vma->vm->vm_rebind_list);
+   spin_unlock(&vma->vm->vm_rebind_lock);
+   }
+
return 0;
 }
 
@@ -2059,8 +2078,7 @@ static struct dma_fence *__i915_vma_unbind_async(struct 
i915_vma *vma)
if (!drm_mm_node_allocated(&vma->node))
return NULL;
 
-   if (i915_vma_is_pinned(vma) ||
-   &vma->obj->mm.rsgt->table != vma->resource-&

[PATCH v3 15/17] drm/i915/vm_bind: Handle persistent vmas in execbuf3

2022-10-10 Thread Niranjana Vishwanathapura
Handle persistent (VM_BIND) mappings during the request submission
in the execbuf3 path.

v2: Ensure requests wait for bindings to complete.
v3: Remove short term pinning with PIN_VALIDATE flag.
Individualize fences before adding to dma_resv obj.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 208 +-
 1 file changed, 207 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index 1f38f658066a..7e6abdcb9eec 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -3,6 +3,7 @@
  * Copyright © 2022 Intel Corporation
  */
 
+#include 
 #include 
 #include 
 
@@ -19,6 +20,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
 
+#define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
 
@@ -42,7 +44,9 @@
  * execlist. Hence, no support for implicit sync.
  *
  * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
- * works with execbuf3 ioctl for submission.
+ * works with execbuf3 ioctl for submission. All BOs mapped on that VM (through
+ * VM_BIND call) at the time of execbuf3 call are deemed required for that
+ * submission.
  *
  * The execbuf3 ioctl directly specifies the batch addresses instead of as
  * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
@@ -58,6 +62,13 @@
  * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
  * vma lookup table, implicit sync, vma active reference tracking etc., are not
  * applicable for execbuf3 ioctl.
+ *
+ * During each execbuf submission, request fence is added to all VM_BIND mapped
+ * objects with DMA_RESV_USAGE_BOOKKEEP. The DMA_RESV_USAGE_BOOKKEEP usage will
+ * prevent over sync (See enum dma_resv_usage). Note that DRM_I915_GEM_WAIT and
+ * DRM_I915_GEM_BUSY ioctls do not check for DMA_RESV_USAGE_BOOKKEEP usage and
+ * hence should not be used for end of batch check. Instead, the execbuf3
+ * timeline out fence should be used for end of batch check.
  */
 
 /**
@@ -127,6 +138,23 @@ eb_find_vma(struct i915_address_space *vm, u64 addr)
return i915_gem_vm_bind_lookup_vma(vm, va);
 }
 
+static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
+{
+   struct i915_vma *vma, *vn;
+
+   /**
+* Move all unbound vmas back into vm_bind_list so that they are
+* revalidated.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   list_for_each_entry_safe(vma, vn, &vm->vm_rebind_list, vm_rebind_link) {
+   list_del_init(&vma->vm_rebind_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->vm_rebind_lock);
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
@@ -141,14 +169,108 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
++current_batch;
}
 
+   eb_scoop_unbound_vma_all(eb->context->vm);
+
+   return 0;
+}
+
+static int eb_lock_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma;
+   int err;
+
+   err = i915_gem_object_lock(eb->context->vm->root_obj, &eb->ww);
+   if (err)
+   return err;
+
+   list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+   non_priv_vm_bind_link) {
+   err = i915_gem_object_lock(vma->obj, &eb->ww);
+   if (err)
+   return err;
+   }
+
return 0;
 }
 
+static void eb_release_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma, *vn;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   if (!(eb->args->flags & __EXEC3_HAS_PIN))
+   return;
+
+   assert_object_held(vm->root_obj);
+
+   list_for_each_entry_safe(vma, vn, &vm->vm_bind_list, vm_bind_link)
+   if (i915_vma_verify_bind_complete(vma))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list);
+
+   eb->args->flags &= ~__EXEC3_HAS_PIN;
+}
+
 static void eb_release_vma_all(struct i915_execbuffer *eb)
 {
+   eb_release_persistent_vma_all(eb);
eb_unpin_engine(eb);
 }
 
+static int eb_reserve_fence_for_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   u64 num_fences = 1;
+   struct i915_vma *vma;
+   int re

[PATCH v3 11/17] drm/i915/vm_bind: Use common execbuf functions in execbuf path

2022-10-10 Thread Niranjana Vishwanathapura
Update the execbuf path to use common execbuf functions to
reduce code duplication with the newer execbuf3 path.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 507 ++
 1 file changed, 38 insertions(+), 469 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4673e0812277..eb8491bd32c1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -28,6 +28,7 @@
 #include "i915_file_private.h"
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
 #include "i915_gem_evict.h"
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
@@ -235,13 +236,6 @@ enum {
  * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
  */
 
-struct eb_fence {
-   struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
-   struct dma_fence *dma_fence;
-   u64 value;
-   struct dma_fence_chain *chain_fence;
-};
-
 struct i915_execbuffer {
struct drm_i915_private *i915; /** i915 backpointer */
struct drm_file *file; /** per-file lookup tables and limits */
@@ -2446,164 +2440,29 @@ static const enum intel_engine_id user_ring_map[] = {
[I915_EXEC_VEBOX]   = VECS0
 };
 
-static struct i915_request *eb_throttle(struct i915_execbuffer *eb, struct 
intel_context *ce)
-{
-   struct intel_ring *ring = ce->ring;
-   struct intel_timeline *tl = ce->timeline;
-   struct i915_request *rq;
-
-   /*
-* Completely unscientific finger-in-the-air estimates for suitable
-* maximum user request size (to avoid blocking) and then backoff.
-*/
-   if (intel_ring_update_space(ring) >= PAGE_SIZE)
-   return NULL;
-
-   /*
-* Find a request that after waiting upon, there will be at least half
-* the ring available. The hysteresis allows us to compete for the
-* shared ring and should mean that we sleep less often prior to
-* claiming our resources, but not so long that the ring completely
-* drains before we can submit our next request.
-*/
-   list_for_each_entry(rq, &tl->requests, link) {
-   if (rq->ring != ring)
-   continue;
-
-   if (__intel_ring_space(rq->postfix,
-  ring->emit, ring->size) > ring->size / 2)
-   break;
-   }
-   if (&rq->link == &tl->requests)
-   return NULL; /* weird, we will check again later for real */
-
-   return i915_request_get(rq);
-}
-
-static int eb_pin_timeline(struct i915_execbuffer *eb, struct intel_context 
*ce,
-  bool throttle)
-{
-   struct intel_timeline *tl;
-   struct i915_request *rq = NULL;
-
-   /*
-* Take a local wakeref for preparing to dispatch the execbuf as
-* we expect to access the hardware fairly frequently in the
-* process, and require the engine to be kept awake between accesses.
-* Upon dispatch, we acquire another prolonged wakeref that we hold
-* until the timeline is idle, which in turn releases the wakeref
-* taken on the engine, and the parent device.
-*/
-   tl = intel_context_timeline_lock(ce);
-   if (IS_ERR(tl))
-   return PTR_ERR(tl);
-
-   intel_context_enter(ce);
-   if (throttle)
-   rq = eb_throttle(eb, ce);
-   intel_context_timeline_unlock(tl);
-
-   if (rq) {
-   bool nonblock = eb->file->filp->f_flags & O_NONBLOCK;
-   long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT;
-
-   if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE,
- timeout) < 0) {
-   i915_request_put(rq);
-
-   /*
-* Error path, cannot use intel_context_timeline_lock as
-* that is user interruptable and this clean up step
-* must be done.
-*/
-   mutex_lock(&ce->timeline->mutex);
-   intel_context_exit(ce);
-   mutex_unlock(&ce->timeline->mutex);
-
-   if (nonblock)
-   return -EWOULDBLOCK;
-   else
-   return -EINTR;
-   }
-   i915_request_put(rq);
-   }
-
-   return 0;
-}
-
 static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle)
 {
-   struct intel_context *ce = eb->context, *child;
int err;
-   int i = 0, j = 0;
 
GEM_BUG_ON(eb->args-&g

[PATCH v3 13/17] drm/i915/vm_bind: Update i915_vma_verify_bind_complete()

2022-10-10 Thread Niranjana Vishwanathapura
Ensure i915_vma_verify_bind_complete() handles case where bind
is not initiated. Also make it non static, add documentation
and move it out of CONFIG_DRM_I915_DEBUG_GEM.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/i915_vma.c | 16 +++-
 drivers/gpu/drm/i915/i915_vma.h |  1 +
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index cb8e718ec46e..e27795e009ab 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -439,12 +439,21 @@ int i915_vma_sync(struct i915_vma *vma)
return i915_vm_sync(vma->vm);
 }
 
-#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
-static int i915_vma_verify_bind_complete(struct i915_vma *vma)
+/**
+ * i915_vma_verify_bind_complete() - Check for the bind completion of the vma
+ * @vma: vma to check for bind completion
+ *
+ * Returns: 0 if the vma bind is completed. Error code otherwise.
+ */
+int i915_vma_verify_bind_complete(struct i915_vma *vma)
 {
struct dma_fence *fence = i915_active_fence_get(&vma->active.excl);
int err;
 
+   /* Ensure vma bind is initiated */
+   if (!i915_vma_is_bound(vma, I915_VMA_BIND_MASK))
+   return -EINVAL;
+
if (!fence)
return 0;
 
@@ -457,9 +466,6 @@ static int i915_vma_verify_bind_complete(struct i915_vma 
*vma)
 
return err;
 }
-#else
-#define i915_vma_verify_bind_complete(_vma) 0
-#endif
 
 I915_SELFTEST_EXPORT void
 i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 1cadbf8fdedf..04770f8ba815 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -440,6 +440,7 @@ void i915_vma_make_purgeable(struct i915_vma *vma);
 
 int i915_vma_wait_for_bind(struct i915_vma *vma);
 int i915_vma_sync(struct i915_vma *vma);
+int i915_vma_verify_bind_complete(struct i915_vma *vma);
 
 /**
  * i915_vma_get_current_resource - Get the current resource of the vma
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v3 10/17] drm/i915/vm_bind: Abstract out common execbuf functions

2022-10-10 Thread Niranjana Vishwanathapura
The new execbuf3 ioctl path and the legacy execbuf ioctl
paths have many common functionalities.
Abstract out the common execbuf functionalities into a
separate file where possible, thus allowing code sharing.

Reviewed-by: Andi Shyti 
Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../drm/i915/gem/i915_gem_execbuffer_common.c | 666 ++
 .../drm/i915/gem/i915_gem_execbuffer_common.h |  74 ++
 3 files changed, 741 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index ad47e1ce428a..3564307699ea 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -148,6 +148,7 @@ gem-y += \
gem/i915_gem_create.o \
gem/i915_gem_dmabuf.o \
gem/i915_gem_domain.o \
+   gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
new file mode 100644
index ..4d1c9ce154b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
@@ -0,0 +1,666 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gt.h"
+#include "gt/intel_gt_pm.h"
+#include "gt/intel_ring.h"
+
+#include "i915_gem_execbuffer_common.h"
+
+#define __EXEC_COMMON_FENCE_WAIT   BIT(0)
+#define __EXEC_COMMON_FENCE_SIGNAL BIT(1)
+
+static struct i915_request *eb_throttle(struct intel_context *ce)
+{
+   struct intel_ring *ring = ce->ring;
+   struct intel_timeline *tl = ce->timeline;
+   struct i915_request *rq;
+
+   /*
+* Completely unscientific finger-in-the-air estimates for suitable
+* maximum user request size (to avoid blocking) and then backoff.
+*/
+   if (intel_ring_update_space(ring) >= PAGE_SIZE)
+   return NULL;
+
+   /*
+* Find a request that after waiting upon, there will be at least half
+* the ring available. The hysteresis allows us to compete for the
+* shared ring and should mean that we sleep less often prior to
+* claiming our resources, but not so long that the ring completely
+* drains before we can submit our next request.
+*/
+   list_for_each_entry(rq, &tl->requests, link) {
+   if (rq->ring != ring)
+   continue;
+
+   if (__intel_ring_space(rq->postfix,
+  ring->emit, ring->size) > ring->size / 2)
+   break;
+   }
+   if (&rq->link == &tl->requests)
+   return NULL; /* weird, we will check again later for real */
+
+   return i915_request_get(rq);
+}
+
+static int eb_pin_timeline(struct intel_context *ce, bool throttle,
+  bool nonblock)
+{
+   struct intel_timeline *tl;
+   struct i915_request *rq = NULL;
+
+   /*
+* Take a local wakeref for preparing to dispatch the execbuf as
+* we expect to access the hardware fairly frequently in the
+* process, and require the engine to be kept awake between accesses.
+* Upon dispatch, we acquire another prolonged wakeref that we hold
+* until the timeline is idle, which in turn releases the wakeref
+* taken on the engine, and the parent device.
+*/
+   tl = intel_context_timeline_lock(ce);
+   if (IS_ERR(tl))
+   return PTR_ERR(tl);
+
+   intel_context_enter(ce);
+   if (throttle)
+   rq = eb_throttle(ce);
+   intel_context_timeline_unlock(tl);
+
+   if (rq) {
+   long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT;
+
+   if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE,
+ timeout) < 0) {
+   i915_request_put(rq);
+
+   /*
+* Error path, cannot use intel_context_timeline_lock as
+* that is user interruptable and this clean up step
+* must be done.
+*/
+   mutex_lock(&ce->timeline->mutex);
+   intel_context_exit(ce);
+   mutex_unlock(&ce->timeline->mutex);
+
+   if (nonblock)
+   return -EWOULDBLOCK;
+   else
+   return -EINTR;
+   }
+   i915_request_put(rq);
+   }
+
+  

[PATCH v3 03/17] drm/i915/vm_bind: Expose i915_gem_object_max_page_size()

2022-10-10 Thread Niranjana Vishwanathapura
Expose i915_gem_object_max_page_size() function non-static
which will be used by the vm_bind feature.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c | 18 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  2 ++
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 33673fe7ee0a..5c6e396ab74d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -15,10 +15,18 @@
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
-static u32 object_max_page_size(struct intel_memory_region **placements,
-   unsigned int n_placements)
+/**
+ * i915_gem_object_max_page_size() - max of min_page_size of the regions
+ * @placements:  list of regions
+ * @n_placements: number of the placements
+ *
+ * Returns the largest of min_page_size of the @placements,
+ * or I915_GTT_PAGE_SIZE_4K if @n_placements is 0.
+ */
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+ unsigned int n_placements)
 {
-   u32 max_page_size = 0;
+   u32 max_page_size = I915_GTT_PAGE_SIZE_4K;
int i;
 
for (i = 0; i < n_placements; i++) {
@@ -28,7 +36,6 @@ static u32 object_max_page_size(struct intel_memory_region 
**placements,
max_page_size = max_t(u32, max_page_size, mr->min_page_size);
}
 
-   GEM_BUG_ON(!max_page_size);
return max_page_size;
 }
 
@@ -99,7 +106,8 @@ __i915_gem_object_create_user_ext(struct drm_i915_private 
*i915, u64 size,
 
i915_gem_flush_free_objects(i915);
 
-   size = round_up(size, object_max_page_size(placements, n_placements));
+   size = round_up(size, i915_gem_object_max_page_size(placements,
+   n_placements));
if (size == 0)
return ERR_PTR(-EINVAL);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 6b9ecff42bb5..db3dd0e285c5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -47,6 +47,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
 }
 
 void i915_gem_init__objects(struct drm_i915_private *i915);
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+ unsigned int n_placements);
 
 void i915_objects_module_exit(void);
 int i915_objects_module_init(void);
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v3 12/17] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl

2022-10-10 Thread Niranjana Vishwanathapura
Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate
bind as all required objects binding would have been requested by the
userspace before submitting the execbuf3.

Legacy features like relocations etc are not supported by execbuf3.

v2: Add more input validity checks.
v3: batch_address is a VA (not an array) if num_batches=1,
minor cleanup

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 580 ++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
 drivers/gpu/drm/i915/i915_driver.c|   1 +
 include/uapi/drm/i915_drm.h   |  61 ++
 5 files changed, 645 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 3564307699ea..94520b79e7e7 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -150,6 +150,7 @@ gem-y += \
gem/i915_gem_domain.o \
gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
+   gem/i915_gem_execbuffer3.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
gem/i915_gem_lmem.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
new file mode 100644
index ..1f38f658066a
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -0,0 +1,580 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+
+#include "i915_drv.h"
+#include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
+#include "i915_gem_ioctls.h"
+#include "i915_gem_vm_bind.h"
+#include "i915_trace.h"
+
+#define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
+#define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
+
+/* Catch emission of unexpected errors for CI! */
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+#undef EINVAL
+#define EINVAL ({ \
+   DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
+   22; \
+})
+#endif
+
+/**
+ * DOC: User command execution with execbuf3 ioctl
+ *
+ * A VM in VM_BIND mode will not support older execbuf mode of binding.
+ * The execbuf ioctl handling in VM_BIND mode differs significantly from the
+ * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+ * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+ * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+ * execlist. Hence, no support for implicit sync.
+ *
+ * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+ * works with execbuf3 ioctl for submission.
+ *
+ * The execbuf3 ioctl directly specifies the batch addresses instead of as
+ * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
+ * support many of the older features like in/out/submit fences, fence array,
+ * default gem context etc. (See struct drm_i915_gem_execbuffer3).
+ *
+ * In VM_BIND mode, VA allocation is completely managed by the user instead of
+ * the i915 driver. Hence all VA assignment, eviction are not applicable in
+ * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+ * be using the i915_vma active reference tracking. It will instead check the
+ * dma-resv object's fence list for that.
+ *
+ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
+ * vma lookup table, implicit sync, vma active reference tracking etc., are not
+ * applicable for execbuf3 ioctl.
+ */
+
+/**
+ * struct i915_execbuffer - execbuf struct for execbuf3
+ * @i915: reference to the i915 instance we run on
+ * @file: drm file reference
+ * args: execbuf3 ioctl structure
+ * @gt: reference to the gt instance ioctl submitted for
+ * @context: logical state for the request
+ * @gem_context: callers context
+ * @requests: requests to be build
+ * @composite_fence: used for excl fence in dma_resv objects when > 1 BB 
submitted
+ * @ww: i915_gem_ww_ctx instance
+ * @num_batches: number of batches submitted
+ * @batch_addresses: addresses corresponds to the submitted batches
+ * @batches: references to the i915_vmas corresponding to the batches
+ */
+struct i915_execbuffer {
+   struct drm_i915_private *i915;
+   struct drm_file *file;
+   struct drm_i915_gem_execbuffer3 *args;
+
+   struct intel_gt *gt;
+   struct intel_context *context;
+   struct i915_gem_context *gem_context;
+
+   struct i915_request *requests[MAX_ENGINE

[PATCH v3 09/17] drm/i915/vm_bind: Add out fence support

2022-10-10 Thread Niranjana Vishwanathapura
Add support for handling out fence for vm_bind call.

v2: Reset vma->vm_bind_fence.syncobj to NULL at the end
of vm_bind call.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  4 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 82 +++
 drivers/gpu/drm/i915/i915_vma.c   |  7 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  7 ++
 include/uapi/drm/i915_drm.h   | 63 +-
 5 files changed, 158 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
index 36262a6357b5..b70e900e35ab 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -8,6 +8,7 @@
 
 #include 
 
+struct dma_fence;
 struct drm_device;
 struct drm_file;
 struct i915_address_space;
@@ -23,4 +24,7 @@ int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void 
*data,
 
 void i915_gem_vm_unbind_all(struct i915_address_space *vm);
 
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence);
+
 #endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index c435d49af2c8..526d3a6bf671 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -7,6 +7,8 @@
 
 #include 
 
+#include 
+
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_vm_bind.h"
 
@@ -100,6 +102,76 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, 
bool release_obj)
i915_gem_object_put(vma->obj);
 }
 
+static int i915_vm_bind_add_fence(struct drm_file *file, struct i915_vma *vma,
+ u32 handle, u64 point)
+{
+   struct drm_syncobj *syncobj;
+
+   syncobj = drm_syncobj_find(file, handle);
+   if (!syncobj) {
+   DRM_DEBUG("Invalid syncobj handle provided\n");
+   return -ENOENT;
+   }
+
+   /*
+* For timeline syncobjs we need to preallocate chains for
+* later signaling.
+*/
+   if (point) {
+   vma->vm_bind_fence.chain_fence = dma_fence_chain_alloc();
+   if (!vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_put(syncobj);
+   return -ENOMEM;
+   }
+   } else {
+   vma->vm_bind_fence.chain_fence = NULL;
+   }
+   vma->vm_bind_fence.syncobj = syncobj;
+   vma->vm_bind_fence.value = point;
+
+   return 0;
+}
+
+static void i915_vm_bind_put_fence(struct i915_vma *vma)
+{
+   if (!vma->vm_bind_fence.syncobj)
+   return;
+
+   drm_syncobj_put(vma->vm_bind_fence.syncobj);
+   dma_fence_chain_free(vma->vm_bind_fence.chain_fence);
+   vma->vm_bind_fence.syncobj = NULL;
+}
+
+/**
+ * i915_vm_bind_signal_fence() - Add fence to vm_bind syncobj
+ * @vma: vma mapping requiring signaling
+ * @fence: fence to be added
+ *
+ * Associate specified @fence with the @vma's syncobj to be
+ * signaled after the @fence work completes.
+ */
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence)
+{
+   struct drm_syncobj *syncobj = vma->vm_bind_fence.syncobj;
+
+   if (!syncobj)
+   return;
+
+   if (vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_add_point(syncobj,
+ vma->vm_bind_fence.chain_fence,
+ fence, vma->vm_bind_fence.value);
+   /*
+* The chain's ownership is transferred to the
+* timeline.
+*/
+   vma->vm_bind_fence.chain_fence = NULL;
+   } else {
+   drm_syncobj_replace_fence(syncobj, fence);
+   }
+}
+
 static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
  struct drm_i915_gem_vm_unbind *va)
 {
@@ -237,6 +309,13 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
goto unlock_vm;
}
 
+   if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL) {
+   ret = i915_vm_bind_add_fence(file, vma, va->fence.handle,
+va->fence.value);
+   if (ret)
+   goto put_vma;
+   }
+
pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER | PIN_VALIDATE;
 
for_i915_gem_ww(&ww, ret, true) {
@@ -258,6 +337,9 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
i915_gem_object_get(vma->obj);
}
 
+   if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL)
+   i915_vm_bind_

Re: [PATCH v3 07/17] drm/i915/vm_bind: Add support to handle object evictions

2022-10-10 Thread Niranjana Vishwanathapura

On Mon, Oct 10, 2022 at 02:30:49PM +0100, Matthew Auld wrote:

On 10/10/2022 07:58, Niranjana Vishwanathapura wrote:

Support eviction by maintaining a list of evicted persistent vmas
for rebinding during next submission. Ensure the list do not
include persistent vmas that are being purged.

v2: Remove unused I915_VMA_PURGED definition.
v3: Properly handle __i915_vma_unbind_async() case.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../drm/i915/gem/i915_gem_vm_bind_object.c|  6 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 +++
 drivers/gpu/drm/i915/i915_vma.c   | 31 +--
 drivers/gpu/drm/i915/i915_vma.h   | 10 ++
 drivers/gpu/drm/i915/i915_vma_types.h |  8 +
 6 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 8e3e6ceb9442..c435d49af2c8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -85,6 +85,12 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, 
bool release_obj)
 {
lockdep_assert_held(&vma->vm->vm_bind_lock);
+   spin_lock(&vma->vm->vm_rebind_lock);
+   if (!list_empty(&vma->vm_rebind_link))
+   list_del_init(&vma->vm_rebind_link);
+   i915_vma_set_purged(vma);
+   spin_unlock(&vma->vm->vm_rebind_lock);
+
list_del_init(&vma->vm_bind_link);
list_del_init(&vma->non_priv_vm_bind_link);
i915_vm_bind_it_remove(vma, &vma->vm->va);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 422394f8fb40..2fa37f46750b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -295,6 +295,8 @@ void i915_address_space_init(struct i915_address_space *vm, 
int subclass)
INIT_LIST_HEAD(&vm->vm_bound_list);
mutex_init(&vm->vm_bind_lock);
INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
+   INIT_LIST_HEAD(&vm->vm_rebind_list);
+   spin_lock_init(&vm->vm_rebind_lock);
 }
 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 4ae5734f7d6b..443d1918ad4e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -265,6 +265,10 @@ struct i915_address_space {
struct list_head vm_bind_list;
/** @vm_bound_list: List of vm_binding completed */
struct list_head vm_bound_list;
+   /* @vm_rebind_list: list of vmas to be rebinded */
+   struct list_head vm_rebind_list;
+   /* @vm_rebind_lock: protects vm_rebound_list */
+   spinlock_t vm_rebind_lock;
/* @va: tree of persistent vmas */
struct rb_root_cached va;
struct list_head non_priv_vm_bind_list;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 5d3d67a4bf47..b4be2cbe8382 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -241,6 +241,7 @@ vma_create(struct drm_i915_gem_object *obj,
INIT_LIST_HEAD(&vma->vm_bind_link);
INIT_LIST_HEAD(&vma->non_priv_vm_bind_link);
+   INIT_LIST_HEAD(&vma->vm_rebind_link);
return vma;
 err_unlock:
@@ -1686,6 +1687,14 @@ static void force_unbind(struct i915_vma *vma)
if (!drm_mm_node_allocated(&vma->node))
return;
+   /*
+* Persistent vma should have been purged by now.
+* If not, issue a warning and purge it.
+*/
+   if (GEM_WARN_ON(i915_vma_is_persistent(vma) &&
+   !i915_vma_is_purged(vma)))
+   i915_vma_set_purged(vma);
+
atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
WARN_ON(__i915_vma_unbind(vma));
GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
@@ -2047,6 +2056,16 @@ int __i915_vma_unbind(struct i915_vma *vma)
__i915_vma_evict(vma, false);
drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
+
+   if (i915_vma_is_persistent(vma)) {
+   spin_lock(&vma->vm->vm_rebind_lock);
+   if (list_empty(&vma->vm_rebind_link) &&
+   !i915_vma_is_purged(vma))
+   list_add_tail(&vma->vm_rebind_link,
+ &vma->vm->vm_rebind_list);
+   spin_unlock(&vma->vm->vm_rebind_lock);
+   }
+
return 0;
 }
@@ -2059,8 +2078,7 @@ static struct dma_fence *__i915_vma_unbind_async(struct 
i915_vma *vma)
if (!drm_mm_node_allocated(&vma->node))
return NULL;
-   i

Re: [PATCH v3 07/17] drm/i915/vm_bind: Add support to handle object evictions

2022-10-10 Thread Niranjana Vishwanathapura

On Mon, Oct 10, 2022 at 06:15:02PM +0100, Matthew Auld wrote:

On 10/10/2022 17:11, Niranjana Vishwanathapura wrote:

On Mon, Oct 10, 2022 at 02:30:49PM +0100, Matthew Auld wrote:

On 10/10/2022 07:58, Niranjana Vishwanathapura wrote:

Support eviction by maintaining a list of evicted persistent vmas
for rebinding during next submission. Ensure the list do not
include persistent vmas that are being purged.

v2: Remove unused I915_VMA_PURGED definition.
v3: Properly handle __i915_vma_unbind_async() case.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 


Signed-off-by: Andi Shyti 
---
 .../drm/i915/gem/i915_gem_vm_bind_object.c    |  6 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 +++
 drivers/gpu/drm/i915/i915_vma.c   | 31 +--
 drivers/gpu/drm/i915/i915_vma.h   | 10 ++
 drivers/gpu/drm/i915/i915_vma_types.h |  8 +
 6 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

index 8e3e6ceb9442..c435d49af2c8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -85,6 +85,12 @@ static void i915_gem_vm_bind_remove(struct 
i915_vma *vma, bool release_obj)

 {
 lockdep_assert_held(&vma->vm->vm_bind_lock);
+    spin_lock(&vma->vm->vm_rebind_lock);
+    if (!list_empty(&vma->vm_rebind_link))
+    list_del_init(&vma->vm_rebind_link);
+    i915_vma_set_purged(vma);
+    spin_unlock(&vma->vm->vm_rebind_lock);
+
 list_del_init(&vma->vm_bind_link);
 list_del_init(&vma->non_priv_vm_bind_link);
 i915_vm_bind_it_remove(vma, &vma->vm->va);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c

index 422394f8fb40..2fa37f46750b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -295,6 +295,8 @@ void i915_address_space_init(struct 
i915_address_space *vm, int subclass)

 INIT_LIST_HEAD(&vm->vm_bound_list);
 mutex_init(&vm->vm_bind_lock);
 INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
+    INIT_LIST_HEAD(&vm->vm_rebind_list);
+    spin_lock_init(&vm->vm_rebind_lock);
 }
 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h

index 4ae5734f7d6b..443d1918ad4e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -265,6 +265,10 @@ struct i915_address_space {
 struct list_head vm_bind_list;
 /** @vm_bound_list: List of vm_binding completed */
 struct list_head vm_bound_list;
+    /* @vm_rebind_list: list of vmas to be rebinded */
+    struct list_head vm_rebind_list;
+    /* @vm_rebind_lock: protects vm_rebound_list */
+    spinlock_t vm_rebind_lock;
 /* @va: tree of persistent vmas */
 struct rb_root_cached va;
 struct list_head non_priv_vm_bind_list;
diff --git a/drivers/gpu/drm/i915/i915_vma.c 
b/drivers/gpu/drm/i915/i915_vma.c

index 5d3d67a4bf47..b4be2cbe8382 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -241,6 +241,7 @@ vma_create(struct drm_i915_gem_object *obj,
 INIT_LIST_HEAD(&vma->vm_bind_link);
 INIT_LIST_HEAD(&vma->non_priv_vm_bind_link);
+    INIT_LIST_HEAD(&vma->vm_rebind_link);
 return vma;
 err_unlock:
@@ -1686,6 +1687,14 @@ static void force_unbind(struct i915_vma *vma)
 if (!drm_mm_node_allocated(&vma->node))
 return;
+    /*
+ * Persistent vma should have been purged by now.
+ * If not, issue a warning and purge it.
+ */
+    if (GEM_WARN_ON(i915_vma_is_persistent(vma) &&
+    !i915_vma_is_purged(vma)))
+    i915_vma_set_purged(vma);
+
 atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
 WARN_ON(__i915_vma_unbind(vma));
 GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
@@ -2047,6 +2056,16 @@ int __i915_vma_unbind(struct i915_vma *vma)
 __i915_vma_evict(vma, false);
 drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
+
+    if (i915_vma_is_persistent(vma)) {
+    spin_lock(&vma->vm->vm_rebind_lock);
+    if (list_empty(&vma->vm_rebind_link) &&
+    !i915_vma_is_purged(vma))
+    list_add_tail(&vma->vm_rebind_link,
+  &vma->vm->vm_rebind_list);
+    spin_unlock(&vma->vm->vm_rebind_lock);
+    }
+
 return 0;
 }
@@ -2059,8 +2078,7 @@ static struct dma_fence 
*__i915_vma_unbind_async(struct i915_vma *vma)

 if (!drm_mm_node_allocated(&vma->node))
 return NULL;
-    if (i915_vma_is_pinned(vma) ||
-    &vma->obj->mm.rsgt->table != vma->res

Re: [PATCH v3 07/17] drm/i915/vm_bind: Add support to handle object evictions

2022-10-10 Thread Niranjana Vishwanathapura

On Mon, Oct 10, 2022 at 05:43:47PM -0700, Niranjana Vishwanathapura wrote:

On Mon, Oct 10, 2022 at 06:15:02PM +0100, Matthew Auld wrote:

On 10/10/2022 17:11, Niranjana Vishwanathapura wrote:

On Mon, Oct 10, 2022 at 02:30:49PM +0100, Matthew Auld wrote:

On 10/10/2022 07:58, Niranjana Vishwanathapura wrote:

Support eviction by maintaining a list of evicted persistent vmas
for rebinding during next submission. Ensure the list do not
include persistent vmas that are being purged.

v2: Remove unused I915_VMA_PURGED definition.
v3: Properly handle __i915_vma_unbind_async() case.

Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 


Signed-off-by: Andi Shyti 
---
 .../drm/i915/gem/i915_gem_vm_bind_object.c    |  6 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 +++
 drivers/gpu/drm/i915/i915_vma.c   | 31 +--
 drivers/gpu/drm/i915/i915_vma.h   | 10 ++
 drivers/gpu/drm/i915/i915_vma_types.h |  8 +
 6 files changed, 59 insertions(+), 2 deletions(-)

diff --git 
a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

index 8e3e6ceb9442..c435d49af2c8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -85,6 +85,12 @@ static void i915_gem_vm_bind_remove(struct 
i915_vma *vma, bool release_obj)

 {
 lockdep_assert_held(&vma->vm->vm_bind_lock);
+    spin_lock(&vma->vm->vm_rebind_lock);
+    if (!list_empty(&vma->vm_rebind_link))
+    list_del_init(&vma->vm_rebind_link);
+    i915_vma_set_purged(vma);
+    spin_unlock(&vma->vm->vm_rebind_lock);
+
 list_del_init(&vma->vm_bind_link);
 list_del_init(&vma->non_priv_vm_bind_link);
 i915_vm_bind_it_remove(vma, &vma->vm->va);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c

index 422394f8fb40..2fa37f46750b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -295,6 +295,8 @@ void i915_address_space_init(struct 
i915_address_space *vm, int subclass)

 INIT_LIST_HEAD(&vm->vm_bound_list);
 mutex_init(&vm->vm_bind_lock);
 INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
+    INIT_LIST_HEAD(&vm->vm_rebind_list);
+    spin_lock_init(&vm->vm_rebind_lock);
 }
 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h

index 4ae5734f7d6b..443d1918ad4e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -265,6 +265,10 @@ struct i915_address_space {
 struct list_head vm_bind_list;
 /** @vm_bound_list: List of vm_binding completed */
 struct list_head vm_bound_list;
+    /* @vm_rebind_list: list of vmas to be rebinded */
+    struct list_head vm_rebind_list;
+    /* @vm_rebind_lock: protects vm_rebound_list */
+    spinlock_t vm_rebind_lock;
 /* @va: tree of persistent vmas */
 struct rb_root_cached va;
 struct list_head non_priv_vm_bind_list;
diff --git a/drivers/gpu/drm/i915/i915_vma.c 
b/drivers/gpu/drm/i915/i915_vma.c

index 5d3d67a4bf47..b4be2cbe8382 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -241,6 +241,7 @@ vma_create(struct drm_i915_gem_object *obj,
 INIT_LIST_HEAD(&vma->vm_bind_link);
 INIT_LIST_HEAD(&vma->non_priv_vm_bind_link);
+    INIT_LIST_HEAD(&vma->vm_rebind_link);
 return vma;
 err_unlock:
@@ -1686,6 +1687,14 @@ static void force_unbind(struct i915_vma *vma)
 if (!drm_mm_node_allocated(&vma->node))
 return;
+    /*
+ * Persistent vma should have been purged by now.
+ * If not, issue a warning and purge it.
+ */
+    if (GEM_WARN_ON(i915_vma_is_persistent(vma) &&
+    !i915_vma_is_purged(vma)))
+    i915_vma_set_purged(vma);
+
 atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
 WARN_ON(__i915_vma_unbind(vma));
 GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
@@ -2047,6 +2056,16 @@ int __i915_vma_unbind(struct i915_vma *vma)
 __i915_vma_evict(vma, false);
 drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
+
+    if (i915_vma_is_persistent(vma)) {
+    spin_lock(&vma->vm->vm_rebind_lock);
+    if (list_empty(&vma->vm_rebind_link) &&
+    !i915_vma_is_purged(vma))
+    list_add_tail(&vma->vm_rebind_link,
+  &vma->vm->vm_rebind_list);
+    spin_unlock(&vma->vm->vm_rebind_lock);
+    }
+
 return 0;
 }
@@ -2059,8 +2078,7 @@ static struct dma_fence 
*__i915_vma_unbind_async(struct i915_vma *vma)

 if (!drm_mm_node_allocated(&vma->node))
 return NULL;
-    if (i915_vma_is_pinn

Re: [PATCH v3 12/17] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl

2022-10-12 Thread Niranjana Vishwanathapura

On Wed, Oct 12, 2022 at 11:32:18AM +0100, Matthew Auld wrote:

On 10/10/2022 07:58, Niranjana Vishwanathapura wrote:

Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate
bind as all required objects binding would have been requested by the
userspace before submitting the execbuf3.

Legacy features like relocations etc are not supported by execbuf3.

v2: Add more input validity checks.
v3: batch_address is a VA (not an array) if num_batches=1,
minor cleanup

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 580 ++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
 drivers/gpu/drm/i915/i915_driver.c|   1 +
 include/uapi/drm/i915_drm.h   |  61 ++
 5 files changed, 645 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 3564307699ea..94520b79e7e7 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -150,6 +150,7 @@ gem-y += \
gem/i915_gem_domain.o \
gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
+   gem/i915_gem_execbuffer3.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
gem/i915_gem_lmem.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
new file mode 100644
index ..1f38f658066a
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -0,0 +1,580 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+
+#include "i915_drv.h"
+#include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
+#include "i915_gem_ioctls.h"
+#include "i915_gem_vm_bind.h"
+#include "i915_trace.h"
+
+#define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
+#define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
+
+/* Catch emission of unexpected errors for CI! */
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+#undef EINVAL
+#define EINVAL ({ \
+   DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
+   22; \
+})
+#endif
+
+/**
+ * DOC: User command execution with execbuf3 ioctl
+ *
+ * A VM in VM_BIND mode will not support older execbuf mode of binding.
+ * The execbuf ioctl handling in VM_BIND mode differs significantly from the
+ * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+ * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+ * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+ * execlist. Hence, no support for implicit sync.
+ *
+ * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+ * works with execbuf3 ioctl for submission.
+ *
+ * The execbuf3 ioctl directly specifies the batch addresses instead of as
+ * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
+ * support many of the older features like in/out/submit fences, fence array,
+ * default gem context etc. (See struct drm_i915_gem_execbuffer3).
+ *
+ * In VM_BIND mode, VA allocation is completely managed by the user instead of
+ * the i915 driver. Hence all VA assignment, eviction are not applicable in
+ * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+ * be using the i915_vma active reference tracking. It will instead check the
+ * dma-resv object's fence list for that.
+ *
+ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
+ * vma lookup table, implicit sync, vma active reference tracking etc., are not
+ * applicable for execbuf3 ioctl.


I guess EXEC_CAPTURE is also now gone? Is that expected? We ofc don't 
have the list of objects so perhaps doesn't make sense any more. Just 
double checking...


Yah, instead will have a VM_BIND_CAPTURE flag for user to request for capture
during vm_bind call. It is not implemented yet.

Niranjana



Re: [PATCH v3 06/17] drm/i915/vm_bind: Support for VM private BOs

2022-10-13 Thread Niranjana Vishwanathapura

On Tue, Oct 11, 2022 at 06:41:23PM +0100, Matthew Auld wrote:

On 11/10/2022 17:27, Matthew Auld wrote:

On 10/10/2022 07:58, Niranjana Vishwanathapura wrote:

Each VM creates a root_obj and shares it with all of its private objects
to use it as dma_resv object. This has a performance advantage as it
requires a single dma_resv object update for all private BOs vs list of
dma_resv objects update for shared BOs, in the execbuf path.

VM private BOs can be only mapped on specified VM and cannot be dmabuf
exported. Also, they are supported only in vm_bind mode.

v2: Pad struct drm_i915_gem_create_ext_vm_private for 64bit alignment,
 add input validity checks.
v3: Create root_obj only for ppgtt.

Signed-off-by: Niranjana Vishwanathapura 


Signed-off-by: Andi Shyti 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 16 +-
  drivers/gpu/drm/i915/gem/i915_gem_create.c    | 49 ++-
  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |  6 +++
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  4 ++
  drivers/gpu/drm/i915/gem/i915_gem_object.c    |  3 ++
  .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 ++
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c   |  3 ++
  .../drm/i915/gem/i915_gem_vm_bind_object.c    |  9 
  drivers/gpu/drm/i915/gt/intel_gtt.c   |  3 ++
  drivers/gpu/drm/i915/gt/intel_gtt.h   |  2 +
  drivers/gpu/drm/i915/i915_vma.c   |  1 +
  drivers/gpu/drm/i915/i915_vma_types.h |  2 +
  include/uapi/drm/i915_drm.h   | 33 +
  13 files changed, 131 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c

index 793345cbf99e..5ea7064805f3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -83,6 +83,7 @@
  #include "i915_file_private.h"
  #include "i915_gem_context.h"
+#include "i915_gem_internal.h"
  #include "i915_trace.h"
  #include "i915_user_extensions.h"
@@ -1795,6 +1796,7 @@ int i915_gem_vm_create_ioctl(struct 
drm_device *dev, void *data,

  struct drm_i915_private *i915 = to_i915(dev);
  struct drm_i915_gem_vm_control *args = data;
  struct drm_i915_file_private *file_priv = file->driver_priv;
+    struct drm_i915_gem_object *obj;
  struct i915_ppgtt *ppgtt;
  u32 id;
  int err;
@@ -1817,15 +1819,27 @@ int i915_gem_vm_create_ioctl(struct 
drm_device *dev, void *data,

  goto err_put;
  }
+    obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
+    if (IS_ERR(obj)) {
+    err = PTR_ERR(obj);
+    goto err_put;
+    }
+
+    ppgtt->vm.root_obj = obj;
+    ppgtt->vm.vm_bind_mode = true;


Won't this temporarily break execbuf2? Only in the final patch does 
this depend on the new flag? Perhaps the patch split could be 
improved, or maybe we can just keep this as false here, until the 
final patch? Could also maybe also keep root_obj = NULL, until the 
last patch also?


Yah, it was expected to be set in the final patch.
Ok, will move creating the root_obj part also to final patch.
Also, will remove vm.vm_bind_mode flag altogether. Instead the
'vm.root_obj != NULL' check will tell if VM is in vm_bind mode.




+
  err = xa_alloc(&file_priv->vm_xa, &id, &ppgtt->vm,
 xa_limit_32b, GFP_KERNEL);
  if (err)
-    goto err_put;
+    goto err_root_obj_put;
  GEM_BUG_ON(id == 0); /* reserved for invalid/unassigned ppgtt */
  args->vm_id = id;
  return 0;
+err_root_obj_put:
+    if (ppgtt->vm.root_obj)
+    i915_gem_object_put(ppgtt->vm.root_obj);
  err_put:
  i915_vm_put(&ppgtt->vm);
  return err;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c

index 5c6e396ab74d..694d4638ac8b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -11,6 +11,7 @@
  #include "pxp/intel_pxp.h"
  #include "i915_drv.h"
+#include "i915_gem_context.h"
  #include "i915_gem_create.h"
  #include "i915_trace.h"
  #include "i915_user_extensions.h"
@@ -251,6 +252,7 @@ struct create_ext {
  unsigned int n_placements;
  unsigned int placement_mask;
  unsigned long flags;
+    u32 vm_id;
  };
  static void repr_placements(char *buf, size_t size,
@@ -400,9 +402,32 @@ static int ext_set_protected(struct 
i915_user_extension __user *base, void *data

  return 0;
  }
+static int ext_set_vm_private(struct i915_user_extension __user *base,
+  void *data)
+{
+    struct drm_i915_gem_create_ext_vm_private ext;
+    struct create_ext *ext_data = data;
+
+    if (copy_from_user(&ext, base, sizeof(ext)))
+    return -EFAULT;
+
+    /* Reserved fields must be 0 */
+    if (ext.rsvd)
+    return -EINVAL;
+

Re: [Intel-gfx] [PATCH v3 09/17] drm/i915/vm_bind: Add out fence support

2022-10-13 Thread Niranjana Vishwanathapura

On Sun, Oct 09, 2022 at 11:58:18PM -0700, Niranjana Vishwanathapura wrote:

Add support for handling out fence for vm_bind call.

v2: Reset vma->vm_bind_fence.syncobj to NULL at the end
   of vm_bind call.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  4 +
.../drm/i915/gem/i915_gem_vm_bind_object.c| 82 +++
drivers/gpu/drm/i915/i915_vma.c   |  7 +-
drivers/gpu/drm/i915/i915_vma_types.h |  7 ++
include/uapi/drm/i915_drm.h   | 63 +-
5 files changed, 158 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
index 36262a6357b5..b70e900e35ab 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -8,6 +8,7 @@

#include 

+struct dma_fence;
struct drm_device;
struct drm_file;
struct i915_address_space;
@@ -23,4 +24,7 @@ int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void 
*data,

void i915_gem_vm_unbind_all(struct i915_address_space *vm);

+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence);
+
#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index c435d49af2c8..526d3a6bf671 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -7,6 +7,8 @@

#include 

+#include 
+
#include "gem/i915_gem_context.h"
#include "gem/i915_gem_vm_bind.h"

@@ -100,6 +102,76 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, 
bool release_obj)
i915_gem_object_put(vma->obj);
}

+static int i915_vm_bind_add_fence(struct drm_file *file, struct i915_vma *vma,
+ u32 handle, u64 point)
+{
+   struct drm_syncobj *syncobj;
+
+   syncobj = drm_syncobj_find(file, handle);
+   if (!syncobj) {
+   DRM_DEBUG("Invalid syncobj handle provided\n");
+   return -ENOENT;
+   }
+
+   /*
+* For timeline syncobjs we need to preallocate chains for
+* later signaling.
+*/
+   if (point) {
+   vma->vm_bind_fence.chain_fence = dma_fence_chain_alloc();
+   if (!vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_put(syncobj);
+   return -ENOMEM;
+   }
+   } else {
+   vma->vm_bind_fence.chain_fence = NULL;
+   }
+   vma->vm_bind_fence.syncobj = syncobj;
+   vma->vm_bind_fence.value = point;
+
+   return 0;
+}
+
+static void i915_vm_bind_put_fence(struct i915_vma *vma)
+{
+   if (!vma->vm_bind_fence.syncobj)
+   return;
+
+   drm_syncobj_put(vma->vm_bind_fence.syncobj);
+   dma_fence_chain_free(vma->vm_bind_fence.chain_fence);
+   vma->vm_bind_fence.syncobj = NULL;
+}
+
+/**
+ * i915_vm_bind_signal_fence() - Add fence to vm_bind syncobj
+ * @vma: vma mapping requiring signaling
+ * @fence: fence to be added
+ *
+ * Associate specified @fence with the @vma's syncobj to be
+ * signaled after the @fence work completes.
+ */
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence)
+{
+   struct drm_syncobj *syncobj = vma->vm_bind_fence.syncobj;
+
+   if (!syncobj)
+   return;
+
+   if (vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_add_point(syncobj,
+ vma->vm_bind_fence.chain_fence,
+ fence, vma->vm_bind_fence.value);
+   /*
+* The chain's ownership is transferred to the
+* timeline.
+*/
+   vma->vm_bind_fence.chain_fence = NULL;
+   } else {
+   drm_syncobj_replace_fence(syncobj, fence);
+   }
+}
+
static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
  struct drm_i915_gem_vm_unbind *va)
{
@@ -237,6 +309,13 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
goto unlock_vm;
}

+   if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL) {
+   ret = i915_vm_bind_add_fence(file, vma, va->fence.handle,
+va->fence.value);
+   if (ret)
+   goto put_vma;
+   }
+
pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER | PIN_VALIDATE;

for_i915_gem_ww(&ww, ret, true) {
@@ -258,6 +337,9 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
i915_gem_object_get(vma->obj);
}

+   if (va-&g

[PATCH v4 02/17] drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()

2022-10-18 Thread Niranjana Vishwanathapura
Add function __i915_sw_fence_await_reservation() for
asynchronous wait on a dma-resv object with specified
dma_resv_usage. This is required for async vma unbind
with vm_bind.

Reviewed-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_sw_fence.c | 28 +---
 drivers/gpu/drm/i915/i915_sw_fence.h | 23 +--
 2 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index cc2a8821d22a..ae06d35db056 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -7,7 +7,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "i915_sw_fence.h"
 #include "i915_selftest.h"
@@ -569,11 +568,26 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
return ret;
 }
 
-int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
-   struct dma_resv *resv,
-   bool write,
-   unsigned long timeout,
-   gfp_t gfp)
+/**
+ * __i915_sw_fence_await_reservation() - Setup a fence to wait on a dma-resv
+ * object with specified usage.
+ * @fence: the fence that needs to wait
+ * @resv: dma-resv object
+ * @usage: dma_resv_usage (See enum dma_resv_usage)
+ * @timeout: how long to wait in jiffies
+ * @gfp: allocation mode
+ *
+ * Setup the @fence to asynchronously wait on dma-resv object @resv for
+ * @usage to complete before signaling.
+ *
+ * Returns 0 if there is nothing to wait on, -ve error code upon error
+ * and >0 upon successfully setting up the wait.
+ */
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ enum dma_resv_usage usage,
+ unsigned long timeout,
+ gfp_t gfp)
 {
struct dma_resv_iter cursor;
struct dma_fence *f;
@@ -582,7 +596,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
debug_fence_assert(fence);
might_sleep_if(gfpflags_allow_blocking(gfp));
 
-   dma_resv_iter_begin(&cursor, resv, dma_resv_usage_rw(write));
+   dma_resv_iter_begin(&cursor, resv, usage);
dma_resv_for_each_fence_unlocked(&cursor, f) {
pending = i915_sw_fence_await_dma_fence(fence, f, timeout,
gfp);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
b/drivers/gpu/drm/i915/i915_sw_fence.h
index f752bfc7c6e1..9c4859dc4c0d 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -10,13 +10,13 @@
 #define _I915_SW_FENCE_H_
 
 #include 
+#include 
 #include 
 #include 
 #include  /* for NOTIFY_DONE */
 #include 
 
 struct completion;
-struct dma_resv;
 struct i915_sw_fence;
 
 enum i915_sw_fence_notify {
@@ -89,11 +89,22 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
  unsigned long timeout,
  gfp_t gfp);
 
-int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
-   struct dma_resv *resv,
-   bool write,
-   unsigned long timeout,
-   gfp_t gfp);
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ enum dma_resv_usage usage,
+ unsigned long timeout,
+ gfp_t gfp);
+
+static inline int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ bool write,
+ unsigned long timeout,
+ gfp_t gfp)
+{
+   return __i915_sw_fence_await_reservation(fence, resv,
+dma_resv_usage_rw(write),
+timeout, gfp);
+}
 
 bool i915_sw_fence_await(struct i915_sw_fence *fence);
 void i915_sw_fence_complete(struct i915_sw_fence *fence);
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v4 00/17] drm/i915/vm_bind: Add VM_BIND functionality

2022-10-18 Thread Niranjana Vishwanathapura
DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
buffer objects (BOs) or sections of a BOs at specified GPU virtual
addresses on a specified address space (VM). Multiple mappings can map
to the same physical pages of an object (aliasing). These mappings (also
referred to as persistent mappings) will be persistent across multiple
GPU submissions (execbuf calls) issued by the UMD, without user having
to provide a list of all required mappings during each submission (as
required by older execbuf mode).

This patch series support VM_BIND version 1, as described by the param
I915_PARAM_VM_BIND_VERSION.

Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
The new execbuf3 ioctl will not have any execlist support and all the
legacy support like relocations etc., are removed.

TODOs:
* Support out fence for VM_UNBIND ioctl.
* Async VM_UNBIND support.
* Optimizations.

NOTEs:
* It is based on below VM_BIND design+uapi rfc.
  Documentation/gpu/rfc/i915_vm_bind.rst

* The IGT RFC series is posted as,
  [PATCH i-g-t v4 0/11] vm_bind: Add VM_BIND validation support

v2: Address various review comments
v3: Address review comments and other fixes
v4: Remove vm_unbind out fence uapi which is not supported yet,
replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()

Signed-off-by: Niranjana Vishwanathapura 

Niranjana Vishwanathapura (17):
  drm/i915/vm_bind: Expose vm lookup function
  drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
  drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
  drm/i915/vm_bind: Add support to create persistent vma
  drm/i915/vm_bind: Implement bind and unbind of object
  drm/i915/vm_bind: Support for VM private BOs
  drm/i915/vm_bind: Add support to handle object evictions
  drm/i915/vm_bind: Support persistent vma activeness tracking
  drm/i915/vm_bind: Add out fence support
  drm/i915/vm_bind: Abstract out common execbuf functions
  drm/i915/vm_bind: Use common execbuf functions in execbuf path
  drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
  drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
  drm/i915/vm_bind: Expose i915_request_await_bind()
  drm/i915/vm_bind: Handle persistent vmas in execbuf3
  drm/i915/vm_bind: userptr dma-resv changes
  drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode

 drivers/gpu/drm/i915/Makefile |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  37 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |  17 +
 drivers/gpu/drm/i915/gem/i915_gem_create.c|  72 +-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|   6 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 516 +--
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 866 ++
 .../drm/i915/gem/i915_gem_execbuffer_common.c | 666 ++
 .../drm/i915/gem/i915_gem_execbuffer_common.h |  74 ++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
 drivers/gpu/drm/i915/gem/i915_gem_object.c|   3 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h|   2 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  19 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  30 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 433 +
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  17 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  19 +
 drivers/gpu/drm/i915/i915_driver.c|   4 +
 drivers/gpu/drm/i915/i915_drv.h   |   2 +
 drivers/gpu/drm/i915/i915_gem_gtt.c   |  39 +
 drivers/gpu/drm/i915/i915_gem_gtt.h   |   3 +
 drivers/gpu/drm/i915/i915_getparam.c  |   3 +
 drivers/gpu/drm/i915/i915_sw_fence.c  |  28 +-
 drivers/gpu/drm/i915/i915_sw_fence.h  |  23 +-
 drivers/gpu/drm/i915/i915_vma.c   | 128 ++-
 drivers/gpu/drm/i915/i915_vma.h   |  57 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  39 +
 include/uapi/drm/i915_drm.h   | 271 +-
 29 files changed, 2860 insertions(+), 522 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v4 04/17] drm/i915/vm_bind: Add support to create persistent vma

2022-10-18 Thread Niranjana Vishwanathapura
Add i915_vma_instance_persistent() to create persistent vmas.
Persistent vmas will use i915_gtt_view to support partial binding.

vma_lookup is tied to segment of the object instead of section
of VA space. Hence, it do not support aliasing. ie., multiple
mappings (at different VA) point to the same gtt_view of object.
Skip vma_lookup for persistent vmas to support aliasing.

v2: Remove unused I915_VMA_PERSISTENT definition,
update validity check in i915_vma_compare(),
remove unwanted is_persistent check in release_references().

Reviewed-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/i915_vma.c   | 36 +--
 drivers/gpu/drm/i915/i915_vma.h   | 17 -
 drivers/gpu/drm/i915/i915_vma_types.h |  6 +
 3 files changed, 56 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index c39488eb9eeb..529d97318f00 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -109,7 +109,8 @@ static void __i915_vma_retire(struct i915_active *ref)
 static struct i915_vma *
 vma_create(struct drm_i915_gem_object *obj,
   struct i915_address_space *vm,
-  const struct i915_gtt_view *view)
+  const struct i915_gtt_view *view,
+  bool skip_lookup_cache)
 {
struct i915_vma *pos = ERR_PTR(-E2BIG);
struct i915_vma *vma;
@@ -196,6 +197,9 @@ vma_create(struct drm_i915_gem_object *obj,
__set_bit(I915_VMA_GGTT_BIT, __i915_vma_flags(vma));
}
 
+   if (skip_lookup_cache)
+   goto skip_rb_insert;
+
rb = NULL;
p = &obj->vma.tree.rb_node;
while (*p) {
@@ -220,6 +224,7 @@ vma_create(struct drm_i915_gem_object *obj,
rb_link_node(&vma->obj_node, rb, p);
rb_insert_color(&vma->obj_node, &obj->vma.tree);
 
+skip_rb_insert:
if (i915_vma_is_ggtt(vma))
/*
 * We put the GGTT vma at the start of the vma-list, followed
@@ -299,7 +304,34 @@ i915_vma_instance(struct drm_i915_gem_object *obj,
 
/* vma_create() will resolve the race if another creates the vma */
if (unlikely(!vma))
-   vma = vma_create(obj, vm, view);
+   vma = vma_create(obj, vm, view, false);
+
+   GEM_BUG_ON(!IS_ERR(vma) && i915_vma_compare(vma, vm, view));
+   return vma;
+}
+
+/**
+ * i915_vma_create_persistent - create a persistent VMA
+ * @obj: parent &struct drm_i915_gem_object to be mapped
+ * @vm: address space in which the mapping is located
+ * @view: additional mapping requirements
+ *
+ * Creates a persistent vma.
+ *
+ * Returns the vma, or an error pointer.
+ */
+struct i915_vma *
+i915_vma_create_persistent(struct drm_i915_gem_object *obj,
+  struct i915_address_space *vm,
+  const struct i915_gtt_view *view)
+{
+   struct i915_vma *vma;
+
+   GEM_BUG_ON(!kref_read(&vm->ref));
+
+   vma = vma_create(obj, vm, view, true);
+   if (!IS_ERR(vma))
+   i915_vma_set_persistent(vma);
 
GEM_BUG_ON(!IS_ERR(vma) && i915_vma_compare(vma, vm, view));
return vma;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index aecd9c64486b..c5378ec2f70a 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -44,6 +44,10 @@ struct i915_vma *
 i915_vma_instance(struct drm_i915_gem_object *obj,
  struct i915_address_space *vm,
  const struct i915_gtt_view *view);
+struct i915_vma *
+i915_vma_create_persistent(struct drm_i915_gem_object *obj,
+  struct i915_address_space *vm,
+  const struct i915_gtt_view *view);
 
 void i915_vma_unpin_and_release(struct i915_vma **p_vma, unsigned int flags);
 #define I915_VMA_RELEASE_MAP BIT(0)
@@ -138,6 +142,16 @@ static inline u32 i915_ggtt_pin_bias(struct i915_vma *vma)
return i915_vm_to_ggtt(vma->vm)->pin_bias;
 }
 
+static inline bool i915_vma_is_persistent(const struct i915_vma *vma)
+{
+   return test_bit(I915_VMA_PERSISTENT_BIT, __i915_vma_flags(vma));
+}
+
+static inline void i915_vma_set_persistent(struct i915_vma *vma)
+{
+   set_bit(I915_VMA_PERSISTENT_BIT, __i915_vma_flags(vma));
+}
+
 static inline struct i915_vma *i915_vma_get(struct i915_vma *vma)
 {
i915_gem_object_get(vma->obj);
@@ -164,7 +178,8 @@ i915_vma_compare(struct i915_vma *vma,
 {
ptrdiff_t cmp;
 
-   GEM_BUG_ON(view && !i915_is_ggtt_or_dpt(vm));
+   GEM_BUG_ON(view && !(i915_is_ggtt_or_dpt(vm) ||
+i915_vma_is_persistent(vma)));
 
cmp = ptrdiff(vma->vm, vm);
if (cmp)
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h 
b/drivers/gpu/drm/i915/i915_vma_types.h
index ec0f6c9f57d0..3144d71

[PATCH v4 05/17] drm/i915/vm_bind: Implement bind and unbind of object

2022-10-18 Thread Niranjana Vishwanathapura
Add uapi and implement support for bind and unbind of an
object at the specified GPU virtual addresses.

The vm_bind mode is not supported in legacy execbuf2 ioctl.
It will be supported only in the newer execbuf3 ioctl.

v2: On older platforms ctx->vm is not set, check for it.
In vm_bind call, add vma to vm_bind_list.
Add more input validity checks.
Update some documentation.
v3: In vm_bind call, add vma to vm_bound_list as user can
request a fence and pass to execbuf3 as input fence.
Remove short term pinning with PIN_VALIDATE flag.
v4: Replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode().

Reviewed-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |  15 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   5 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 321 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |   9 +
 drivers/gpu/drm/i915/i915_driver.c|   3 +
 drivers/gpu/drm/i915/i915_vma.c   |   1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 110 ++
 11 files changed, 515 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 2535593ab379..bfb59b8f8a43 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -166,6 +166,7 @@ gem-y += \
gem/i915_gem_ttm_move.o \
gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
+   gem/i915_gem_vm_bind_object.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index 899fa8f1e0fe..e8b41aa8f8c4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -139,6 +139,21 @@ int i915_gem_context_setparam_ioctl(struct drm_device 
*dev, void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file);
 
+/**
+ * i915_gem_vm_is_vm_bind_mode() - Check if address space is in vm_bind mode
+ * @vm: the address space
+ *
+ * Returns:
+ * true: @vm is in vm_bind mode; allows only vm_bind method of binding.
+ * false: @vm is not in vm_bind mode; allows only legacy execbuff method
+ *of binding.
+ */
+static inline bool i915_gem_vm_is_vm_bind_mode(struct i915_address_space *vm)
+{
+   /* No support to enable vm_bind mode yet */
+   return false;
+}
+
 struct i915_address_space *
 i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 1160723c9d2d..c5bc9f6e887f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
if (unlikely(IS_ERR(ctx)))
return PTR_ERR(ctx);
 
+   if (ctx->vm && i915_gem_vm_is_vm_bind_mode(ctx->vm)) {
+   i915_gem_context_put(ctx);
+   return -EOPNOTSUPP;
+   }
+
eb->gem_context = ctx;
if (i915_gem_context_has_full_ppgtt(ctx))
eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
new file mode 100644
index ..36262a6357b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include 
+
+struct drm_device;
+struct drm_file;
+struct i915_address_space;
+struct i915_vma;
+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file);
+
+void i915_gem_vm_unbind_all(struct i915_address_space *vm);
+
+#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
new file mode 100644
index ..8428338547ac
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,321 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyrigh

[PATCH v4 01/17] drm/i915/vm_bind: Expose vm lookup function

2022-10-18 Thread Niranjana Vishwanathapura
Make i915_gem_vm_lookup() function non-static as it will be
used by the vm_bind feature.

Reviewed-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 11 ++-
 drivers/gpu/drm/i915/gem/i915_gem_context.h |  3 +++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 1e29b1e6d186..793345cbf99e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -346,7 +346,16 @@ static int proto_context_register(struct 
drm_i915_file_private *fpriv,
return ret;
 }
 
-static struct i915_address_space *
+/**
+ * i915_gem_vm_lookup() - looks up for the VM reference given the vm id
+ * @file_priv: the private data associated with the user's file
+ * @id: the VM id
+ *
+ * Finds the VM reference associated to a specific id.
+ *
+ * Returns the VM pointer on success, NULL in case of failure.
+ */
+struct i915_address_space *
 i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id)
 {
struct i915_address_space *vm;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index e5b0f66ea1fe..899fa8f1e0fe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -139,6 +139,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, 
void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file);
 
+struct i915_address_space *
+i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id);
+
 struct i915_gem_context *
 i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
 
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v4 11/17] drm/i915/vm_bind: Use common execbuf functions in execbuf path

2022-10-18 Thread Niranjana Vishwanathapura
Update the execbuf path to use common execbuf functions to
reduce code duplication with the newer execbuf3 path.

Reviewed-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 507 ++
 1 file changed, 38 insertions(+), 469 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 43f29acfbec9..749c3c80e02d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -28,6 +28,7 @@
 #include "i915_file_private.h"
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
 #include "i915_gem_evict.h"
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
@@ -235,13 +236,6 @@ enum {
  * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
  */
 
-struct eb_fence {
-   struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
-   struct dma_fence *dma_fence;
-   u64 value;
-   struct dma_fence_chain *chain_fence;
-};
-
 struct i915_execbuffer {
struct drm_i915_private *i915; /** i915 backpointer */
struct drm_file *file; /** per-file lookup tables and limits */
@@ -2446,164 +2440,29 @@ static const enum intel_engine_id user_ring_map[] = {
[I915_EXEC_VEBOX]   = VECS0
 };
 
-static struct i915_request *eb_throttle(struct i915_execbuffer *eb, struct 
intel_context *ce)
-{
-   struct intel_ring *ring = ce->ring;
-   struct intel_timeline *tl = ce->timeline;
-   struct i915_request *rq;
-
-   /*
-* Completely unscientific finger-in-the-air estimates for suitable
-* maximum user request size (to avoid blocking) and then backoff.
-*/
-   if (intel_ring_update_space(ring) >= PAGE_SIZE)
-   return NULL;
-
-   /*
-* Find a request that after waiting upon, there will be at least half
-* the ring available. The hysteresis allows us to compete for the
-* shared ring and should mean that we sleep less often prior to
-* claiming our resources, but not so long that the ring completely
-* drains before we can submit our next request.
-*/
-   list_for_each_entry(rq, &tl->requests, link) {
-   if (rq->ring != ring)
-   continue;
-
-   if (__intel_ring_space(rq->postfix,
-  ring->emit, ring->size) > ring->size / 2)
-   break;
-   }
-   if (&rq->link == &tl->requests)
-   return NULL; /* weird, we will check again later for real */
-
-   return i915_request_get(rq);
-}
-
-static int eb_pin_timeline(struct i915_execbuffer *eb, struct intel_context 
*ce,
-  bool throttle)
-{
-   struct intel_timeline *tl;
-   struct i915_request *rq = NULL;
-
-   /*
-* Take a local wakeref for preparing to dispatch the execbuf as
-* we expect to access the hardware fairly frequently in the
-* process, and require the engine to be kept awake between accesses.
-* Upon dispatch, we acquire another prolonged wakeref that we hold
-* until the timeline is idle, which in turn releases the wakeref
-* taken on the engine, and the parent device.
-*/
-   tl = intel_context_timeline_lock(ce);
-   if (IS_ERR(tl))
-   return PTR_ERR(tl);
-
-   intel_context_enter(ce);
-   if (throttle)
-   rq = eb_throttle(eb, ce);
-   intel_context_timeline_unlock(tl);
-
-   if (rq) {
-   bool nonblock = eb->file->filp->f_flags & O_NONBLOCK;
-   long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT;
-
-   if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE,
- timeout) < 0) {
-   i915_request_put(rq);
-
-   /*
-* Error path, cannot use intel_context_timeline_lock as
-* that is user interruptable and this clean up step
-* must be done.
-*/
-   mutex_lock(&ce->timeline->mutex);
-   intel_context_exit(ce);
-   mutex_unlock(&ce->timeline->mutex);
-
-   if (nonblock)
-   return -EWOULDBLOCK;
-   else
-   return -EINTR;
-   }
-   i915_request_put(rq);
-   }
-
-   return 0;
-}
-
 static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle)
 {
-   struct intel_context *ce = eb->context, *child;
int err;
-   int i = 0, j = 0;
 
GEM_BUG_ON(eb->arg

[PATCH v4 07/17] drm/i915/vm_bind: Add support to handle object evictions

2022-10-18 Thread Niranjana Vishwanathapura
Support eviction by maintaining a list of evicted persistent vmas
for rebinding during next submission. Ensure the list do not
include persistent vmas that are being purged.

v2: Remove unused I915_VMA_PURGED definition.
v3: Properly handle __i915_vma_unbind_async() case.

Reviewed-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../drm/i915/gem/i915_gem_vm_bind_object.c|  6 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 +++
 drivers/gpu/drm/i915/i915_vma.c   | 31 +--
 drivers/gpu/drm/i915/i915_vma.h   | 10 ++
 drivers/gpu/drm/i915/i915_vma_types.h |  8 +
 6 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 05e8d3d14894..3ea3cb3ed97e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -85,6 +85,12 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, 
bool release_obj)
 {
lockdep_assert_held(&vma->vm->vm_bind_lock);
 
+   spin_lock(&vma->vm->vm_rebind_lock);
+   if (!list_empty(&vma->vm_rebind_link))
+   list_del_init(&vma->vm_rebind_link);
+   i915_vma_set_purged(vma);
+   spin_unlock(&vma->vm->vm_rebind_lock);
+
list_del_init(&vma->vm_bind_link);
list_del_init(&vma->non_priv_vm_bind_link);
i915_vm_bind_it_remove(vma, &vma->vm->va);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 74c3557e5bc4..ebf8fc3a4603 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -290,6 +290,8 @@ void i915_address_space_init(struct i915_address_space *vm, 
int subclass)
INIT_LIST_HEAD(&vm->vm_bound_list);
mutex_init(&vm->vm_bind_lock);
INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
+   INIT_LIST_HEAD(&vm->vm_rebind_list);
+   spin_lock_init(&vm->vm_rebind_lock);
 }
 
 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 36791503fe86..384d1ee7c68d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -266,6 +266,10 @@ struct i915_address_space {
struct list_head vm_bind_list;
/** @vm_bound_list: List of vm_binding completed */
struct list_head vm_bound_list;
+   /* @vm_rebind_list: list of vmas to be rebinded */
+   struct list_head vm_rebind_list;
+   /* @vm_rebind_lock: protects vm_rebound_list */
+   spinlock_t vm_rebind_lock;
/* @va: tree of persistent vmas */
struct rb_root_cached va;
struct list_head non_priv_vm_bind_list;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 0ffa24bc0954..249697ae1186 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -241,6 +241,7 @@ vma_create(struct drm_i915_gem_object *obj,
 
INIT_LIST_HEAD(&vma->vm_bind_link);
INIT_LIST_HEAD(&vma->non_priv_vm_bind_link);
+   INIT_LIST_HEAD(&vma->vm_rebind_link);
return vma;
 
 err_unlock:
@@ -1681,6 +1682,14 @@ static void force_unbind(struct i915_vma *vma)
if (!drm_mm_node_allocated(&vma->node))
return;
 
+   /*
+* Persistent vma should have been purged by now.
+* If not, issue a warning and purge it.
+*/
+   if (GEM_WARN_ON(i915_vma_is_persistent(vma) &&
+   !i915_vma_is_purged(vma)))
+   i915_vma_set_purged(vma);
+
atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
WARN_ON(__i915_vma_unbind(vma));
GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
@@ -2042,6 +2051,16 @@ int __i915_vma_unbind(struct i915_vma *vma)
__i915_vma_evict(vma, false);
 
drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
+
+   if (i915_vma_is_persistent(vma)) {
+   spin_lock(&vma->vm->vm_rebind_lock);
+   if (list_empty(&vma->vm_rebind_link) &&
+   !i915_vma_is_purged(vma))
+   list_add_tail(&vma->vm_rebind_link,
+ &vma->vm->vm_rebind_list);
+   spin_unlock(&vma->vm->vm_rebind_lock);
+   }
+
return 0;
 }
 
@@ -2054,8 +2073,7 @@ static struct dma_fence *__i915_vma_unbind_async(struct 
i915_vma *vma)
if (!drm_mm_node_allocated(&vma->node))
return NULL;
 
-   if (i915_vma_is_pinned(vma) ||
-   &vma->obj->mm.rsgt->table != vma->resource-&

[PATCH v4 16/17] drm/i915/vm_bind: userptr dma-resv changes

2022-10-18 Thread Niranjana Vishwanathapura
For persistent (vm_bind) vmas of userptr BOs, handle the user
page pinning by using the i915_gem_object_userptr_submit_init()
/done() functions

v2: Do not double add vma to vm->userptr_invalidated_list

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 80 +++
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   | 19 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 15 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 +
 drivers/gpu/drm/i915/i915_vma_types.h |  2 +
 6 files changed, 122 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index 8120e4c6b7da..3f1157dd7fc2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -20,6 +20,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
 
+#define __EXEC3_USERPTR_USED   BIT_ULL(34)
 #define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
@@ -142,6 +143,21 @@ static void eb_scoop_unbound_vma_all(struct 
i915_address_space *vm)
 {
struct i915_vma *vma, *vn;
 
+#ifdef CONFIG_MMU_NOTIFIER
+   /**
+* Move all invalidated userptr vmas back into vm_bind_list so that
+* they are looked up and revalidated.
+*/
+   spin_lock(&vm->userptr_invalidated_lock);
+   list_for_each_entry_safe(vma, vn, &vm->userptr_invalidated_list,
+userptr_invalidated_link) {
+   list_del_init(&vma->userptr_invalidated_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->userptr_invalidated_lock);
+#endif
+
/**
 * Move all unbound vmas back into vm_bind_list so that they are
 * revalidated.
@@ -155,10 +171,47 @@ static void eb_scoop_unbound_vma_all(struct 
i915_address_space *vm)
spin_unlock(&vm->vm_rebind_lock);
 }
 
+static int eb_lookup_persistent_userptr_vmas(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *last_vma = NULL;
+   struct i915_vma *vma;
+   int err;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) {
+   if (!i915_gem_object_is_userptr(vma->obj))
+   continue;
+
+   err = i915_gem_object_userptr_submit_init(vma->obj);
+   if (err)
+   return err;
+
+   /**
+* The above submit_init() call does the object unbind and
+* hence adds vma into vm_rebind_list. Remove it from that
+* list as it is already scooped for revalidation.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   if (!list_empty(&vma->vm_rebind_link))
+   list_del_init(&vma->vm_rebind_link);
+   spin_unlock(&vm->vm_rebind_lock);
+
+   last_vma = vma;
+   }
+
+   if (last_vma)
+   eb->args->flags |= __EXEC3_USERPTR_USED;
+
+   return 0;
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
struct i915_vma *vma;
+   int err = 0;
 
for (i = 0; i < eb->num_batches; i++) {
vma = eb_find_vma(eb->context->vm, eb->batch_addresses[i]);
@@ -171,6 +224,10 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 
eb_scoop_unbound_vma_all(eb->context->vm);
 
+   err = eb_lookup_persistent_userptr_vmas(eb);
+   if (err)
+   return err;
+
return 0;
 }
 
@@ -343,6 +400,29 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
}
}
 
+#ifdef CONFIG_MMU_NOTIFIER
+   /* Check for further userptr invalidations */
+   spin_lock(&vm->userptr_invalidated_lock);
+   if (!list_empty(&vm->userptr_invalidated_list))
+   err = -EAGAIN;
+   spin_unlock(&vm->userptr_invalidated_lock);
+
+   if (!err && (eb->args->flags & __EXEC3_USERPTR_USED)) {
+   read_lock(&eb->i915->mm.notifier_lock);
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) {
+   if (!i915_gem_object_is_userptr(vma->obj))
+   continue;
+
+   err = i915_gem_object_userptr_submit_done(vma->obj);
+   if (err)
+

[PATCH v4 03/17] drm/i915/vm_bind: Expose i915_gem_object_max_page_size()

2022-10-18 Thread Niranjana Vishwanathapura
Expose i915_gem_object_max_page_size() function non-static
which will be used by the vm_bind feature.

Reviewed-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c | 18 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  2 ++
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 33673fe7ee0a..5c6e396ab74d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -15,10 +15,18 @@
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
-static u32 object_max_page_size(struct intel_memory_region **placements,
-   unsigned int n_placements)
+/**
+ * i915_gem_object_max_page_size() - max of min_page_size of the regions
+ * @placements:  list of regions
+ * @n_placements: number of the placements
+ *
+ * Returns the largest of min_page_size of the @placements,
+ * or I915_GTT_PAGE_SIZE_4K if @n_placements is 0.
+ */
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+ unsigned int n_placements)
 {
-   u32 max_page_size = 0;
+   u32 max_page_size = I915_GTT_PAGE_SIZE_4K;
int i;
 
for (i = 0; i < n_placements; i++) {
@@ -28,7 +36,6 @@ static u32 object_max_page_size(struct intel_memory_region 
**placements,
max_page_size = max_t(u32, max_page_size, mr->min_page_size);
}
 
-   GEM_BUG_ON(!max_page_size);
return max_page_size;
 }
 
@@ -99,7 +106,8 @@ __i915_gem_object_create_user_ext(struct drm_i915_private 
*i915, u64 size,
 
i915_gem_flush_free_objects(i915);
 
-   size = round_up(size, object_max_page_size(placements, n_placements));
+   size = round_up(size, i915_gem_object_max_page_size(placements,
+   n_placements));
if (size == 0)
return ERR_PTR(-EINVAL);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 6b9ecff42bb5..db3dd0e285c5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -47,6 +47,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
 }
 
 void i915_gem_init__objects(struct drm_i915_private *i915);
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+ unsigned int n_placements);
 
 void i915_objects_module_exit(void);
 int i915_objects_module_init(void);
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v4 09/17] drm/i915/vm_bind: Add out fence support

2022-10-18 Thread Niranjana Vishwanathapura
Add support for handling out fence for vm_bind call.

v2: Reset vma->vm_bind_fence.syncobj to NULL at the end
of vm_bind call.
v3: Remove vm_unbind out fence uapi which is not supported yet.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  4 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 82 +++
 drivers/gpu/drm/i915/i915_vma.c   |  7 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  7 ++
 include/uapi/drm/i915_drm.h   | 49 ++-
 5 files changed, 146 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
index 36262a6357b5..b70e900e35ab 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -8,6 +8,7 @@
 
 #include 
 
+struct dma_fence;
 struct drm_device;
 struct drm_file;
 struct i915_address_space;
@@ -23,4 +24,7 @@ int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void 
*data,
 
 void i915_gem_vm_unbind_all(struct i915_address_space *vm);
 
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence);
+
 #endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 3ea3cb3ed97e..63889ba00183 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -7,6 +7,8 @@
 
 #include 
 
+#include 
+
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_vm_bind.h"
 
@@ -100,6 +102,76 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, 
bool release_obj)
i915_gem_object_put(vma->obj);
 }
 
+static int i915_vm_bind_add_fence(struct drm_file *file, struct i915_vma *vma,
+ u32 handle, u64 point)
+{
+   struct drm_syncobj *syncobj;
+
+   syncobj = drm_syncobj_find(file, handle);
+   if (!syncobj) {
+   DRM_DEBUG("Invalid syncobj handle provided\n");
+   return -ENOENT;
+   }
+
+   /*
+* For timeline syncobjs we need to preallocate chains for
+* later signaling.
+*/
+   if (point) {
+   vma->vm_bind_fence.chain_fence = dma_fence_chain_alloc();
+   if (!vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_put(syncobj);
+   return -ENOMEM;
+   }
+   } else {
+   vma->vm_bind_fence.chain_fence = NULL;
+   }
+   vma->vm_bind_fence.syncobj = syncobj;
+   vma->vm_bind_fence.value = point;
+
+   return 0;
+}
+
+static void i915_vm_bind_put_fence(struct i915_vma *vma)
+{
+   if (!vma->vm_bind_fence.syncobj)
+   return;
+
+   drm_syncobj_put(vma->vm_bind_fence.syncobj);
+   dma_fence_chain_free(vma->vm_bind_fence.chain_fence);
+   vma->vm_bind_fence.syncobj = NULL;
+}
+
+/**
+ * i915_vm_bind_signal_fence() - Add fence to vm_bind syncobj
+ * @vma: vma mapping requiring signaling
+ * @fence: fence to be added
+ *
+ * Associate specified @fence with the @vma's syncobj to be
+ * signaled after the @fence work completes.
+ */
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence)
+{
+   struct drm_syncobj *syncobj = vma->vm_bind_fence.syncobj;
+
+   if (!syncobj)
+   return;
+
+   if (vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_add_point(syncobj,
+ vma->vm_bind_fence.chain_fence,
+ fence, vma->vm_bind_fence.value);
+   /*
+* The chain's ownership is transferred to the
+* timeline.
+*/
+   vma->vm_bind_fence.chain_fence = NULL;
+   } else {
+   drm_syncobj_replace_fence(syncobj, fence);
+   }
+}
+
 static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
  struct drm_i915_gem_vm_unbind *va)
 {
@@ -237,6 +309,13 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
goto unlock_vm;
}
 
+   if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL) {
+   ret = i915_vm_bind_add_fence(file, vma, va->fence.handle,
+va->fence.value);
+   if (ret)
+   goto put_vma;
+   }
+
pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER | PIN_VALIDATE;
 
for_i915_gem_ww(&ww, ret, true) {
@@ -258,6 +337,9 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
i915_gem_object_get(vma->obj);
}
 
+   if (va-&g

[PATCH v4 12/17] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl

2022-10-18 Thread Niranjana Vishwanathapura
Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate
bind as all required objects binding would have been requested by the
userspace before submitting the execbuf3.

Legacy features like relocations etc are not supported by execbuf3.

v2: Add more input validity checks.
v3: batch_address is a VA (not an array) if num_batches=1,
minor cleanup
v4: replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 580 ++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
 drivers/gpu/drm/i915/i915_driver.c|   1 +
 include/uapi/drm/i915_drm.h   |  61 ++
 5 files changed, 645 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 8d76bb888dc3..6a801684d569 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -150,6 +150,7 @@ gem-y += \
gem/i915_gem_domain.o \
gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
+   gem/i915_gem_execbuffer3.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
gem/i915_gem_lmem.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
new file mode 100644
index ..a9b4cc44bf66
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -0,0 +1,580 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+
+#include "i915_drv.h"
+#include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
+#include "i915_gem_ioctls.h"
+#include "i915_gem_vm_bind.h"
+#include "i915_trace.h"
+
+#define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
+#define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
+
+/* Catch emission of unexpected errors for CI! */
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+#undef EINVAL
+#define EINVAL ({ \
+   DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
+   22; \
+})
+#endif
+
+/**
+ * DOC: User command execution with execbuf3 ioctl
+ *
+ * A VM in VM_BIND mode will not support older execbuf mode of binding.
+ * The execbuf ioctl handling in VM_BIND mode differs significantly from the
+ * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+ * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+ * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+ * execlist. Hence, no support for implicit sync.
+ *
+ * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+ * works with execbuf3 ioctl for submission.
+ *
+ * The execbuf3 ioctl directly specifies the batch addresses instead of as
+ * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
+ * support many of the older features like in/out/submit fences, fence array,
+ * default gem context etc. (See struct drm_i915_gem_execbuffer3).
+ *
+ * In VM_BIND mode, VA allocation is completely managed by the user instead of
+ * the i915 driver. Hence all VA assignment, eviction are not applicable in
+ * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+ * be using the i915_vma active reference tracking. It will instead check the
+ * dma-resv object's fence list for that.
+ *
+ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
+ * vma lookup table, implicit sync, vma active reference tracking etc., are not
+ * applicable for execbuf3 ioctl.
+ */
+
+/**
+ * struct i915_execbuffer - execbuf struct for execbuf3
+ * @i915: reference to the i915 instance we run on
+ * @file: drm file reference
+ * args: execbuf3 ioctl structure
+ * @gt: reference to the gt instance ioctl submitted for
+ * @context: logical state for the request
+ * @gem_context: callers context
+ * @requests: requests to be build
+ * @composite_fence: used for excl fence in dma_resv objects when > 1 BB 
submitted
+ * @ww: i915_gem_ww_ctx instance
+ * @num_batches: number of batches submitted
+ * @batch_addresses: addresses corresponds to the submitted batches
+ * @batches: references to the i915_vmas corresponding to the batches
+ */
+struct i915_execbuffer {
+   struct drm_i915_private *i915;
+   struct drm_file *file;
+   struct drm_i915_gem_execbuffer3 *args;
+
+   struct intel_gt *gt;
+   struct intel_context *context;
+   struct i915

[PATCH v4 15/17] drm/i915/vm_bind: Handle persistent vmas in execbuf3

2022-10-18 Thread Niranjana Vishwanathapura
Handle persistent (VM_BIND) mappings during the request submission
in the execbuf3 path.

v2: Ensure requests wait for bindings to complete.
v3: Remove short term pinning with PIN_VALIDATE flag.
Individualize fences before adding to dma_resv obj.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 208 +-
 1 file changed, 207 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index a9b4cc44bf66..8120e4c6b7da 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -3,6 +3,7 @@
  * Copyright © 2022 Intel Corporation
  */
 
+#include 
 #include 
 #include 
 
@@ -19,6 +20,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
 
+#define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
 
@@ -42,7 +44,9 @@
  * execlist. Hence, no support for implicit sync.
  *
  * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
- * works with execbuf3 ioctl for submission.
+ * works with execbuf3 ioctl for submission. All BOs mapped on that VM (through
+ * VM_BIND call) at the time of execbuf3 call are deemed required for that
+ * submission.
  *
  * The execbuf3 ioctl directly specifies the batch addresses instead of as
  * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
@@ -58,6 +62,13 @@
  * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
  * vma lookup table, implicit sync, vma active reference tracking etc., are not
  * applicable for execbuf3 ioctl.
+ *
+ * During each execbuf submission, request fence is added to all VM_BIND mapped
+ * objects with DMA_RESV_USAGE_BOOKKEEP. The DMA_RESV_USAGE_BOOKKEEP usage will
+ * prevent over sync (See enum dma_resv_usage). Note that DRM_I915_GEM_WAIT and
+ * DRM_I915_GEM_BUSY ioctls do not check for DMA_RESV_USAGE_BOOKKEEP usage and
+ * hence should not be used for end of batch check. Instead, the execbuf3
+ * timeline out fence should be used for end of batch check.
  */
 
 /**
@@ -127,6 +138,23 @@ eb_find_vma(struct i915_address_space *vm, u64 addr)
return i915_gem_vm_bind_lookup_vma(vm, va);
 }
 
+static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
+{
+   struct i915_vma *vma, *vn;
+
+   /**
+* Move all unbound vmas back into vm_bind_list so that they are
+* revalidated.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   list_for_each_entry_safe(vma, vn, &vm->vm_rebind_list, vm_rebind_link) {
+   list_del_init(&vma->vm_rebind_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->vm_rebind_lock);
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
@@ -141,14 +169,108 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
++current_batch;
}
 
+   eb_scoop_unbound_vma_all(eb->context->vm);
+
+   return 0;
+}
+
+static int eb_lock_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma;
+   int err;
+
+   err = i915_gem_object_lock(eb->context->vm->root_obj, &eb->ww);
+   if (err)
+   return err;
+
+   list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+   non_priv_vm_bind_link) {
+   err = i915_gem_object_lock(vma->obj, &eb->ww);
+   if (err)
+   return err;
+   }
+
return 0;
 }
 
+static void eb_release_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma, *vn;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   if (!(eb->args->flags & __EXEC3_HAS_PIN))
+   return;
+
+   assert_object_held(vm->root_obj);
+
+   list_for_each_entry_safe(vma, vn, &vm->vm_bind_list, vm_bind_link)
+   if (i915_vma_verify_bind_complete(vma))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list);
+
+   eb->args->flags &= ~__EXEC3_HAS_PIN;
+}
+
 static void eb_release_vma_all(struct i915_execbuffer *eb)
 {
+   eb_release_persistent_vma_all(eb);
eb_unpin_engine(eb);
 }
 
+static int eb_reserve_fence_for_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   u64 num_fences = 1;
+   struct i915_vma *vma;
+   int re

[PATCH v4 13/17] drm/i915/vm_bind: Update i915_vma_verify_bind_complete()

2022-10-18 Thread Niranjana Vishwanathapura
Ensure i915_vma_verify_bind_complete() handles case where bind
is not initiated. Also make it non static, add documentation
and move it out of CONFIG_DRM_I915_DEBUG_GEM.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/i915_vma.c | 16 +++-
 drivers/gpu/drm/i915/i915_vma.h |  1 +
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index eaa13e9ba966..4975fc662c86 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -439,12 +439,21 @@ int i915_vma_sync(struct i915_vma *vma)
return i915_vm_sync(vma->vm);
 }
 
-#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
-static int i915_vma_verify_bind_complete(struct i915_vma *vma)
+/**
+ * i915_vma_verify_bind_complete() - Check for the bind completion of the vma
+ * @vma: vma to check for bind completion
+ *
+ * Returns: 0 if the vma bind is completed. Error code otherwise.
+ */
+int i915_vma_verify_bind_complete(struct i915_vma *vma)
 {
struct dma_fence *fence = i915_active_fence_get(&vma->active.excl);
int err;
 
+   /* Ensure vma bind is initiated */
+   if (!i915_vma_is_bound(vma, I915_VMA_BIND_MASK))
+   return -EINVAL;
+
if (!fence)
return 0;
 
@@ -457,9 +466,6 @@ static int i915_vma_verify_bind_complete(struct i915_vma 
*vma)
 
return err;
 }
-#else
-#define i915_vma_verify_bind_complete(_vma) 0
-#endif
 
 I915_SELFTEST_EXPORT void
 i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 1cadbf8fdedf..04770f8ba815 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -440,6 +440,7 @@ void i915_vma_make_purgeable(struct i915_vma *vma);
 
 int i915_vma_wait_for_bind(struct i915_vma *vma);
 int i915_vma_sync(struct i915_vma *vma);
+int i915_vma_verify_bind_complete(struct i915_vma *vma);
 
 /**
  * i915_vma_get_current_resource - Get the current resource of the vma
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v4 14/17] drm/i915/vm_bind: Expose i915_request_await_bind()

2022-10-18 Thread Niranjana Vishwanathapura
Rename __i915_request_await_bind() as i915_request_await_bind()
and make it non-static as it will be used in execbuf3 ioctl path.

Reviewed-by: Andi Shyti 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_vma.c | 8 +---
 drivers/gpu/drm/i915/i915_vma.h | 6 ++
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 4975fc662c86..ab89e907f2eb 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1884,18 +1884,12 @@ void i915_vma_revoke_mmap(struct i915_vma *vma)
list_del(&vma->obj->userfault_link);
 }
 
-static int
-__i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma)
-{
-   return __i915_request_await_exclusive(rq, &vma->active);
-}
-
 static int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request 
*rq)
 {
int err;
 
/* Wait for the vma to be bound before we start! */
-   err = __i915_request_await_bind(rq, vma);
+   err = i915_request_await_bind(rq, vma);
if (err)
return err;
 
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 04770f8ba815..19e57e12b956 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -54,6 +54,12 @@ void i915_vma_unpin_and_release(struct i915_vma **p_vma, 
unsigned int flags);
 /* do not reserve memory to prevent deadlocks */
 #define __EXEC_OBJECT_NO_RESERVE BIT(31)
 
+static inline int
+i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma)
+{
+   return __i915_request_await_exclusive(rq, &vma->active);
+}
+
 int __must_check _i915_vma_move_to_active(struct i915_vma *vma,
  struct i915_request *rq,
  struct dma_fence *fence,
-- 
2.21.0.rc0.32.g243a4c7e27



[PATCH v4 06/17] drm/i915/vm_bind: Support for VM private BOs

2022-10-18 Thread Niranjana Vishwanathapura
Each VM creates a root_obj and shares it with all of its private objects
to use it as dma_resv object. This has a performance advantage as it
requires a single dma_resv object update for all private BOs vs list of
dma_resv objects update for shared BOs, in the execbuf path.

VM private BOs can be only mapped on specified VM and cannot be dmabuf
exported. Also, they are supported only in vm_bind mode.

v2: Pad struct drm_i915_gem_create_ext_vm_private for 64bit alignment,
add input validity checks.
v3: Create root_obj only for ppgtt.
v4: Fix releasing of obj->priv_root. Do not create vm->root_obj yet.
Allow vm private object creation only in vm_bind mode.
Replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode().

Reviewed-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_create.c| 54 ++-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|  6 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  4 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  3 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c|  9 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  1 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  2 +
 drivers/gpu/drm/i915/i915_vma.c   |  1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  2 +
 include/uapi/drm/i915_drm.h   | 33 
 12 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 793345cbf99e..76c6419b7ae0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -83,6 +83,7 @@
 
 #include "i915_file_private.h"
 #include "i915_gem_context.h"
+#include "i915_gem_internal.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 5c6e396ab74d..62648341780b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -11,6 +11,7 @@
 #include "pxp/intel_pxp.h"
 
 #include "i915_drv.h"
+#include "i915_gem_context.h"
 #include "i915_gem_create.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
@@ -251,6 +252,7 @@ struct create_ext {
unsigned int n_placements;
unsigned int placement_mask;
unsigned long flags;
+   u32 vm_id;
 };
 
 static void repr_placements(char *buf, size_t size,
@@ -400,9 +402,32 @@ static int ext_set_protected(struct i915_user_extension 
__user *base, void *data
return 0;
 }
 
+static int ext_set_vm_private(struct i915_user_extension __user *base,
+ void *data)
+{
+   struct drm_i915_gem_create_ext_vm_private ext;
+   struct create_ext *ext_data = data;
+
+   if (copy_from_user(&ext, base, sizeof(ext)))
+   return -EFAULT;
+
+   /* Reserved fields must be 0 */
+   if (ext.rsvd)
+   return -EINVAL;
+
+   /* vm_id 0 is reserved */
+   if (!ext.vm_id)
+   return -ENOENT;
+
+   ext_data->vm_id = ext.vm_id;
+
+   return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
+   [I915_GEM_CREATE_EXT_VM_PRIVATE] = ext_set_vm_private,
 };
 
 /**
@@ -418,6 +443,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
struct drm_i915_private *i915 = to_i915(dev);
struct drm_i915_gem_create_ext *args = data;
struct create_ext ext_data = { .i915 = i915 };
+   struct i915_address_space *vm = NULL;
struct drm_i915_gem_object *obj;
int ret;
 
@@ -431,6 +457,17 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
if (ret)
return ret;
 
+   if (ext_data.vm_id) {
+   vm = i915_gem_vm_lookup(file->driver_priv, ext_data.vm_id);
+   if (unlikely(!vm))
+   return -ENOENT;
+
+   if (!i915_gem_vm_is_vm_bind_mode(vm)) {
+   ret = -EINVAL;
+   goto vm_put;
+   }
+   }
+
if (!ext_data.n_placements) {
ext_data.placements[0] =
intel_memory_region_by_type(i915, INTEL_MEMORY_SYSTEM);
@@ -457,8 +494,21 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
ext_data.placements,
ext_data.n_placements,

[PATCH v4 17/17] drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode

2022-10-18 Thread Niranjana Vishwanathapura
Add getparam support for VM_BIND capability version.
Add VM creation time flag to enable vm_bind_mode for the VM.

v2: update kernel-doc
v3: create vm->root_obj only upon I915_VM_CREATE_FLAGS_USE_VM_BIND
v4: replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()

Reviewed-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 25 +++--
 drivers/gpu/drm/i915/gem/i915_gem_context.h |  3 +--
 drivers/gpu/drm/i915/gt/intel_gtt.c |  2 ++
 drivers/gpu/drm/i915/i915_drv.h |  2 ++
 drivers/gpu/drm/i915/i915_getparam.c|  3 +++
 include/uapi/drm/i915_drm.h | 22 +-
 6 files changed, 52 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 76c6419b7ae0..0376adbbeecc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1803,9 +1803,13 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, 
void *data,
if (!HAS_FULL_PPGTT(i915))
return -ENODEV;
 
-   if (args->flags)
+   if (args->flags & I915_VM_CREATE_FLAGS_UNKNOWN)
return -EINVAL;
 
+   if ((args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) &&
+   !HAS_VM_BIND(i915))
+   return -EOPNOTSUPP;
+
ppgtt = i915_ppgtt_create(to_gt(i915), 0);
if (IS_ERR(ppgtt))
return PTR_ERR(ppgtt);
@@ -1818,15 +1822,32 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, 
void *data,
goto err_put;
}
 
+   if (args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) {
+   struct drm_i915_gem_object *obj;
+
+   obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
+   if (IS_ERR(obj)) {
+   err = PTR_ERR(obj);
+   goto err_put;
+   }
+
+   ppgtt->vm.root_obj = obj;
+   }
+
err = xa_alloc(&file_priv->vm_xa, &id, &ppgtt->vm,
   xa_limit_32b, GFP_KERNEL);
if (err)
-   goto err_put;
+   goto err_root_obj_put;
 
GEM_BUG_ON(id == 0); /* reserved for invalid/unassigned ppgtt */
args->vm_id = id;
return 0;
 
+err_root_obj_put:
+   if (ppgtt->vm.root_obj) {
+   i915_gem_object_put(ppgtt->vm.root_obj);
+   ppgtt->vm.root_obj = NULL;
+   }
 err_put:
i915_vm_put(&ppgtt->vm);
return err;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index e8b41aa8f8c4..b53aef2853cb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -150,8 +150,7 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device 
*dev, void *data,
  */
 static inline bool i915_gem_vm_is_vm_bind_mode(struct i915_address_space *vm)
 {
-   /* No support to enable vm_bind mode yet */
-   return false;
+   return !!vm->root_obj;
 }
 
 struct i915_address_space *
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 50648ab9214a..ae66fdd4bce9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -178,6 +178,8 @@ int i915_vm_lock_objects(struct i915_address_space *vm,
 void i915_address_space_fini(struct i915_address_space *vm)
 {
drm_mm_takedown(&vm->mm);
+   if (vm->root_obj)
+   i915_gem_object_put(vm->root_obj);
GEM_BUG_ON(!RB_EMPTY_ROOT(&vm->va.rb_root));
mutex_destroy(&vm->vm_bind_lock);
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7c64f8a17493..f4e7f3d4aff9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -972,6 +972,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define HAS_BAR2_SMEM_STOLEN(i915) (!HAS_LMEM(i915) && \
GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
 
+#define HAS_VM_BIND(i915) (GRAPHICS_VER(i915) >= 12)
+
 /* intel_device_info.c */
 static inline struct intel_device_info *
 mkwrite_device_info(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_getparam.c 
b/drivers/gpu/drm/i915/i915_getparam.c
index 342c8ca6414e..f45b3c684bcf 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -175,6 +175,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
case I915_PARAM_PERF_REVISION:
value = i915_perf_ioctl_version();
break;
+   case I915_PARAM_VM_BIND_VERSION:
+   value = HAS_VM_BIND(i915);
+   break;
default:
DRM

[PATCH v4 08/17] drm/i915/vm_bind: Support persistent vma activeness tracking

2022-10-18 Thread Niranjana Vishwanathapura
Do not use i915_vma activeness tracking for persistent vmas.

As persistent vmas are part of working set for each execbuf
submission on that address space (VM), a persistent vma is
active if the VM active. As vm->root_obj->base.resv will be
updated for each submission on that VM, it correctly
represent whether the VM is active or not.

Add i915_vm_is_active() and i915_vm_sync() functions based
on vm->root_obj->base.resv with DMA_RESV_USAGE_BOOKKEEP
usage. dma-resv fence list will be updated with this usage
during each submission with this VM in the new execbuf3
ioctl path.

Update i915_vma_is_active(), i915_vma_sync() and the
__i915_vma_unbind_async() functions to properly handle
persistent vmas.

Reviewed-by: Andi Shyti 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 39 +
 drivers/gpu/drm/i915/i915_gem_gtt.h |  3 +++
 drivers/gpu/drm/i915/i915_vma.c | 28 +
 drivers/gpu/drm/i915/i915_vma.h | 25 +-
 4 files changed, 83 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7bd1861ddbdf..f4a3fc54fe66 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -25,6 +25,45 @@
 #include "i915_trace.h"
 #include "i915_vgpu.h"
 
+/**
+ * i915_vm_sync() - Wait until address space is not in use
+ * @vm: address space
+ *
+ * Waits until all requests using the address space are complete.
+ *
+ * Returns: 0 if success, -ve err code upon failure
+ */
+int i915_vm_sync(struct i915_address_space *vm)
+{
+   int ret;
+
+   /* Wait for all requests under this vm to finish */
+   ret = dma_resv_wait_timeout(vm->root_obj->base.resv,
+   DMA_RESV_USAGE_BOOKKEEP, false,
+   MAX_SCHEDULE_TIMEOUT);
+   if (ret < 0)
+   return ret;
+   else if (ret > 0)
+   return 0;
+   else
+   return -ETIMEDOUT;
+}
+
+/**
+ * i915_vm_is_active() - Check if address space is being used
+ * @vm: address space
+ *
+ * Check if any request using the specified address space is
+ * active.
+ *
+ * Returns: true if address space is active, false otherwise.
+ */
+bool i915_vm_is_active(const struct i915_address_space *vm)
+{
+   return !dma_resv_test_signaled(vm->root_obj->base.resv,
+  DMA_RESV_USAGE_BOOKKEEP);
+}
+
 int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
   struct sg_table *pages)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 8c2f57eb5dda..a5bbdc59d9df 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -51,4 +51,7 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 
 #define PIN_OFFSET_MASKI915_GTT_PAGE_MASK
 
+int i915_vm_sync(struct i915_address_space *vm);
+bool i915_vm_is_active(const struct i915_address_space *vm);
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 249697ae1186..04abdb92c2b2 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -420,6 +420,24 @@ int i915_vma_wait_for_bind(struct i915_vma *vma)
return err;
 }
 
+/**
+ * i915_vma_sync() - Wait for the vma to be idle
+ * @vma: vma to be tested
+ *
+ * Returns 0 on success and error code on failure
+ */
+int i915_vma_sync(struct i915_vma *vma)
+{
+   int ret;
+
+   /* Wait for the asynchronous bindings and pending GPU reads */
+   ret = i915_active_wait(&vma->active);
+   if (ret || !i915_vma_is_persistent(vma) || i915_vma_is_purged(vma))
+   return ret;
+
+   return i915_vm_sync(vma->vm);
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
 static int i915_vma_verify_bind_complete(struct i915_vma *vma)
 {
@@ -1882,6 +1900,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
int err;
 
assert_object_held(obj);
+   if (i915_vma_is_persistent(vma))
+   return -EINVAL;
 
GEM_BUG_ON(!vma->pages);
 
@@ -2091,6 +2111,14 @@ static struct dma_fence *__i915_vma_unbind_async(struct 
i915_vma *vma)
return ERR_PTR(-EBUSY);
}
 
+   if (i915_vma_is_persistent(vma) &&
+   __i915_sw_fence_await_reservation(&vma->resource->chain,
+ vma->vm->root_obj->base.resv,
+ DMA_RESV_USAGE_BOOKKEEP,
+ i915_fence_timeout(vma->vm->i915),
+ GFP_NOWAIT | __GFP_NOWARN) < 0)
+   return ERR_PTR(-EBUSY);
+
fence = __i915_vma_evict(vma, true);
 
drm_mm_remove_node(&vma->node); /* pairs

[PATCH v4 10/17] drm/i915/vm_bind: Abstract out common execbuf functions

2022-10-18 Thread Niranjana Vishwanathapura
The new execbuf3 ioctl path and the legacy execbuf ioctl
paths have many common functionalities.
Abstract out the common execbuf functionalities into a
separate file where possible, thus allowing code sharing.

Reviewed-by: Andi Shyti 
Acked-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../drm/i915/gem/i915_gem_execbuffer_common.c | 666 ++
 .../drm/i915/gem/i915_gem_execbuffer_common.h |  74 ++
 3 files changed, 741 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index bfb59b8f8a43..8d76bb888dc3 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -148,6 +148,7 @@ gem-y += \
gem/i915_gem_create.o \
gem/i915_gem_dmabuf.o \
gem/i915_gem_domain.o \
+   gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
new file mode 100644
index ..4d1c9ce154b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
@@ -0,0 +1,666 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gt.h"
+#include "gt/intel_gt_pm.h"
+#include "gt/intel_ring.h"
+
+#include "i915_gem_execbuffer_common.h"
+
+#define __EXEC_COMMON_FENCE_WAIT   BIT(0)
+#define __EXEC_COMMON_FENCE_SIGNAL BIT(1)
+
+static struct i915_request *eb_throttle(struct intel_context *ce)
+{
+   struct intel_ring *ring = ce->ring;
+   struct intel_timeline *tl = ce->timeline;
+   struct i915_request *rq;
+
+   /*
+* Completely unscientific finger-in-the-air estimates for suitable
+* maximum user request size (to avoid blocking) and then backoff.
+*/
+   if (intel_ring_update_space(ring) >= PAGE_SIZE)
+   return NULL;
+
+   /*
+* Find a request that after waiting upon, there will be at least half
+* the ring available. The hysteresis allows us to compete for the
+* shared ring and should mean that we sleep less often prior to
+* claiming our resources, but not so long that the ring completely
+* drains before we can submit our next request.
+*/
+   list_for_each_entry(rq, &tl->requests, link) {
+   if (rq->ring != ring)
+   continue;
+
+   if (__intel_ring_space(rq->postfix,
+  ring->emit, ring->size) > ring->size / 2)
+   break;
+   }
+   if (&rq->link == &tl->requests)
+   return NULL; /* weird, we will check again later for real */
+
+   return i915_request_get(rq);
+}
+
+static int eb_pin_timeline(struct intel_context *ce, bool throttle,
+  bool nonblock)
+{
+   struct intel_timeline *tl;
+   struct i915_request *rq = NULL;
+
+   /*
+* Take a local wakeref for preparing to dispatch the execbuf as
+* we expect to access the hardware fairly frequently in the
+* process, and require the engine to be kept awake between accesses.
+* Upon dispatch, we acquire another prolonged wakeref that we hold
+* until the timeline is idle, which in turn releases the wakeref
+* taken on the engine, and the parent device.
+*/
+   tl = intel_context_timeline_lock(ce);
+   if (IS_ERR(tl))
+   return PTR_ERR(tl);
+
+   intel_context_enter(ce);
+   if (throttle)
+   rq = eb_throttle(ce);
+   intel_context_timeline_unlock(tl);
+
+   if (rq) {
+   long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT;
+
+   if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE,
+ timeout) < 0) {
+   i915_request_put(rq);
+
+   /*
+* Error path, cannot use intel_context_timeline_lock as
+* that is user interruptable and this clean up step
+* must be done.
+*/
+   mutex_lock(&ce->timeline->mutex);
+   intel_context_exit(ce);
+   mutex_unlock(&ce->timeline->mutex);
+
+   if (nonblock)
+   return -EWOULDBLOCK;
+   else
+   return -EINTR;
+   }
+   i915_request_put(rq);
+   }
+
+  

Re: [PATCH v4 17/17] drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode

2022-10-18 Thread Niranjana Vishwanathapura

On Tue, Oct 18, 2022 at 05:03:16PM +0100, Matthew Auld wrote:

On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:

Add getparam support for VM_BIND capability version.
Add VM creation time flag to enable vm_bind_mode for the VM.

v2: update kernel-doc
v3: create vm->root_obj only upon I915_VM_CREATE_FLAGS_USE_VM_BIND
v4: replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()

Reviewed-by: Matthew Auld 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 25 +++--
 drivers/gpu/drm/i915/gem/i915_gem_context.h |  3 +--
 drivers/gpu/drm/i915/gt/intel_gtt.c |  2 ++
 drivers/gpu/drm/i915/i915_drv.h |  2 ++
 drivers/gpu/drm/i915/i915_getparam.c|  3 +++
 include/uapi/drm/i915_drm.h | 22 +-
 6 files changed, 52 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 76c6419b7ae0..0376adbbeecc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1803,9 +1803,13 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, 
void *data,
if (!HAS_FULL_PPGTT(i915))
return -ENODEV;
-   if (args->flags)
+   if (args->flags & I915_VM_CREATE_FLAGS_UNKNOWN)
return -EINVAL;
+   if ((args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) &&
+   !HAS_VM_BIND(i915))
+   return -EOPNOTSUPP;
+
ppgtt = i915_ppgtt_create(to_gt(i915), 0);
if (IS_ERR(ppgtt))
return PTR_ERR(ppgtt);
@@ -1818,15 +1822,32 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, 
void *data,
goto err_put;
}
+   if (args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) {
+   struct drm_i915_gem_object *obj;
+
+   obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
+   if (IS_ERR(obj)) {
+   err = PTR_ERR(obj);
+   goto err_put;
+   }
+
+   ppgtt->vm.root_obj = obj;
+   }
+
err = xa_alloc(&file_priv->vm_xa, &id, &ppgtt->vm,
   xa_limit_32b, GFP_KERNEL);
if (err)
-   goto err_put;
+   goto err_root_obj_put;
GEM_BUG_ON(id == 0); /* reserved for invalid/unassigned ppgtt */
args->vm_id = id;
return 0;
+err_root_obj_put:
+   if (ppgtt->vm.root_obj) {
+   i915_gem_object_put(ppgtt->vm.root_obj);
+   ppgtt->vm.root_obj = NULL;
+   }
 err_put:
i915_vm_put(&ppgtt->vm);
return err;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index e8b41aa8f8c4..b53aef2853cb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -150,8 +150,7 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device 
*dev, void *data,
  */
 static inline bool i915_gem_vm_is_vm_bind_mode(struct i915_address_space *vm)
 {
-   /* No support to enable vm_bind mode yet */
-   return false;
+   return !!vm->root_obj;
 }
 struct i915_address_space *
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 50648ab9214a..ae66fdd4bce9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -178,6 +178,8 @@ int i915_vm_lock_objects(struct i915_address_space *vm,
 void i915_address_space_fini(struct i915_address_space *vm)
 {
drm_mm_takedown(&vm->mm);
+   if (vm->root_obj)
+   i915_gem_object_put(vm->root_obj);
GEM_BUG_ON(!RB_EMPTY_ROOT(&vm->va.rb_root));
mutex_destroy(&vm->vm_bind_lock);
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7c64f8a17493..f4e7f3d4aff9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -972,6 +972,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define HAS_BAR2_SMEM_STOLEN(i915) (!HAS_LMEM(i915) && \
GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+#define HAS_VM_BIND(i915) (GRAPHICS_VER(i915) >= 12)
+
 /* intel_device_info.c */
 static inline struct intel_device_info *
 mkwrite_device_info(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_getparam.c 
b/drivers/gpu/drm/i915/i915_getparam.c
index 342c8ca6414e..f45b3c684bcf 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -175,6 +175,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
case I915_PARAM_PERF_REVISION:
value = i915_perf_ioctl_version();
break;
+   case I915_PARAM_VM_BIND_VERS

Re: [PATCH v4 15/17] drm/i915/vm_bind: Handle persistent vmas in execbuf3

2022-10-18 Thread Niranjana Vishwanathapura

On Tue, Oct 18, 2022 at 07:01:57PM +0100, Matthew Auld wrote:

On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:

Handle persistent (VM_BIND) mappings during the request submission
in the execbuf3 path.

v2: Ensure requests wait for bindings to complete.
v3: Remove short term pinning with PIN_VALIDATE flag.
Individualize fences before adding to dma_resv obj.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 208 +-
 1 file changed, 207 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index a9b4cc44bf66..8120e4c6b7da 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -3,6 +3,7 @@
  * Copyright © 2022 Intel Corporation
  */
+#include 
 #include 
 #include 
@@ -19,6 +20,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
+#define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
@@ -42,7 +44,9 @@
  * execlist. Hence, no support for implicit sync.
  *
  * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
- * works with execbuf3 ioctl for submission.
+ * works with execbuf3 ioctl for submission. All BOs mapped on that VM (through
+ * VM_BIND call) at the time of execbuf3 call are deemed required for that
+ * submission.
  *
  * The execbuf3 ioctl directly specifies the batch addresses instead of as
  * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
@@ -58,6 +62,13 @@
  * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
  * vma lookup table, implicit sync, vma active reference tracking etc., are not
  * applicable for execbuf3 ioctl.
+ *
+ * During each execbuf submission, request fence is added to all VM_BIND mapped
+ * objects with DMA_RESV_USAGE_BOOKKEEP. The DMA_RESV_USAGE_BOOKKEEP usage will
+ * prevent over sync (See enum dma_resv_usage). Note that DRM_I915_GEM_WAIT and
+ * DRM_I915_GEM_BUSY ioctls do not check for DMA_RESV_USAGE_BOOKKEEP usage and
+ * hence should not be used for end of batch check. Instead, the execbuf3
+ * timeline out fence should be used for end of batch check.
  */
 /**
@@ -127,6 +138,23 @@ eb_find_vma(struct i915_address_space *vm, u64 addr)
return i915_gem_vm_bind_lookup_vma(vm, va);
 }
+static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
+{
+   struct i915_vma *vma, *vn;
+
+   /**
+* Move all unbound vmas back into vm_bind_list so that they are
+* revalidated.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   list_for_each_entry_safe(vma, vn, &vm->vm_rebind_list, vm_rebind_link) {
+   list_del_init(&vma->vm_rebind_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->vm_rebind_lock);
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
@@ -141,14 +169,108 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
++current_batch;
}
+   eb_scoop_unbound_vma_all(eb->context->vm);
+
+   return 0;
+}
+
+static int eb_lock_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma;
+   int err;
+
+   err = i915_gem_object_lock(eb->context->vm->root_obj, &eb->ww);
+   if (err)
+   return err;
+
+   list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+   non_priv_vm_bind_link) {
+   err = i915_gem_object_lock(vma->obj, &eb->ww);
+   if (err)
+   return err;
+   }
+
return 0;
 }
+static void eb_release_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma, *vn;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   if (!(eb->args->flags & __EXEC3_HAS_PIN))
+   return;
+
+   assert_object_held(vm->root_obj);
+
+   list_for_each_entry_safe(vma, vn, &vm->vm_bind_list, vm_bind_link)
+   if (i915_vma_verify_bind_complete(vma))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list);
+
+   eb->args->flags &= ~__EXEC3_HAS_PIN;
+}
+
 static void eb_release_vma_all(struct i915_execbuffer *eb)
 {
+   eb_release_persistent_vma_all(eb);
eb_unpin_engine(eb);
 }
+static int eb_reserve_fence_for_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space 

Re: [PATCH v4 09/17] drm/i915/vm_bind: Add out fence support

2022-10-18 Thread Niranjana Vishwanathapura

On Tue, Oct 18, 2022 at 04:28:07PM +0100, Matthew Auld wrote:

On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:

Add support for handling out fence for vm_bind call.

v2: Reset vma->vm_bind_fence.syncobj to NULL at the end
of vm_bind call.
v3: Remove vm_unbind out fence uapi which is not supported yet.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  4 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 82 +++
 drivers/gpu/drm/i915/i915_vma.c   |  7 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  7 ++
 include/uapi/drm/i915_drm.h   | 49 ++-
 5 files changed, 146 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
index 36262a6357b5..b70e900e35ab 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -8,6 +8,7 @@
 #include 
+struct dma_fence;
 struct drm_device;
 struct drm_file;
 struct i915_address_space;
@@ -23,4 +24,7 @@ int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void 
*data,
 void i915_gem_vm_unbind_all(struct i915_address_space *vm);
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence);
+
 #endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 3ea3cb3ed97e..63889ba00183 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -7,6 +7,8 @@
 #include 
+#include 
+
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_vm_bind.h"
@@ -100,6 +102,76 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, 
bool release_obj)
i915_gem_object_put(vma->obj);
 }
+static int i915_vm_bind_add_fence(struct drm_file *file, struct i915_vma *vma,
+ u32 handle, u64 point)
+{
+   struct drm_syncobj *syncobj;
+
+   syncobj = drm_syncobj_find(file, handle);
+   if (!syncobj) {
+   DRM_DEBUG("Invalid syncobj handle provided\n");
+   return -ENOENT;
+   }
+
+   /*
+* For timeline syncobjs we need to preallocate chains for
+* later signaling.
+*/
+   if (point) {
+   vma->vm_bind_fence.chain_fence = dma_fence_chain_alloc();
+   if (!vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_put(syncobj);
+   return -ENOMEM;
+   }
+   } else {
+   vma->vm_bind_fence.chain_fence = NULL;
+   }
+   vma->vm_bind_fence.syncobj = syncobj;
+   vma->vm_bind_fence.value = point;
+
+   return 0;
+}
+
+static void i915_vm_bind_put_fence(struct i915_vma *vma)
+{
+   if (!vma->vm_bind_fence.syncobj)
+   return;
+
+   drm_syncobj_put(vma->vm_bind_fence.syncobj);
+   dma_fence_chain_free(vma->vm_bind_fence.chain_fence);
+   vma->vm_bind_fence.syncobj = NULL;
+}
+
+/**
+ * i915_vm_bind_signal_fence() - Add fence to vm_bind syncobj
+ * @vma: vma mapping requiring signaling
+ * @fence: fence to be added
+ *
+ * Associate specified @fence with the @vma's syncobj to be
+ * signaled after the @fence work completes.
+ */
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence)
+{
+   struct drm_syncobj *syncobj = vma->vm_bind_fence.syncobj;
+
+   if (!syncobj)
+   return;
+
+   if (vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_add_point(syncobj,
+ vma->vm_bind_fence.chain_fence,
+ fence, vma->vm_bind_fence.value);
+   /*
+* The chain's ownership is transferred to the
+* timeline.
+*/
+   vma->vm_bind_fence.chain_fence = NULL;
+   } else {
+   drm_syncobj_replace_fence(syncobj, fence);
+   }
+}
+
 static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
  struct drm_i915_gem_vm_unbind *va)
 {
@@ -237,6 +309,13 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
goto unlock_vm;
}
+   if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL) {
+   ret = i915_vm_bind_add_fence(file, vma, va->fence.handle,
+va->fence.value);
+   if (ret)
+   goto put_vma;
+   }
+
pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER | PIN_VALIDATE;
for_i915_gem_ww(&ww, ret, true) {
@@ -258,6 +337,9 @@ static int i915_gem_vm_bind_obj(struct i

Re: [PATCH v4 12/17] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl

2022-10-18 Thread Niranjana Vishwanathapura

On Tue, Oct 18, 2022 at 06:30:58PM +0100, Matthew Auld wrote:

On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:

Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate
bind as all required objects binding would have been requested by the
userspace before submitting the execbuf3.

Legacy features like relocations etc are not supported by execbuf3.

v2: Add more input validity checks.
v3: batch_address is a VA (not an array) if num_batches=1,
minor cleanup
v4: replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 580 ++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
 drivers/gpu/drm/i915/i915_driver.c|   1 +
 include/uapi/drm/i915_drm.h   |  61 ++
 5 files changed, 645 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 8d76bb888dc3..6a801684d569 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -150,6 +150,7 @@ gem-y += \
gem/i915_gem_domain.o \
gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
+   gem/i915_gem_execbuffer3.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
gem/i915_gem_lmem.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
new file mode 100644
index ..a9b4cc44bf66
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -0,0 +1,580 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+
+#include "i915_drv.h"
+#include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
+#include "i915_gem_ioctls.h"
+#include "i915_gem_vm_bind.h"
+#include "i915_trace.h"
+
+#define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
+#define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
+
+/* Catch emission of unexpected errors for CI! */
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+#undef EINVAL
+#define EINVAL ({ \
+   DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
+   22; \
+})
+#endif
+
+/**
+ * DOC: User command execution with execbuf3 ioctl
+ *
+ * A VM in VM_BIND mode will not support older execbuf mode of binding.
+ * The execbuf ioctl handling in VM_BIND mode differs significantly from the
+ * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+ * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+ * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+ * execlist. Hence, no support for implicit sync.
+ *
+ * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+ * works with execbuf3 ioctl for submission.
+ *
+ * The execbuf3 ioctl directly specifies the batch addresses instead of as
+ * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
+ * support many of the older features like in/out/submit fences, fence array,
+ * default gem context etc. (See struct drm_i915_gem_execbuffer3).
+ *
+ * In VM_BIND mode, VA allocation is completely managed by the user instead of
+ * the i915 driver. Hence all VA assignment, eviction are not applicable in
+ * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+ * be using the i915_vma active reference tracking. It will instead check the
+ * dma-resv object's fence list for that.
+ *
+ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
+ * vma lookup table, implicit sync, vma active reference tracking etc., are not
+ * applicable for execbuf3 ioctl.
+ */
+
+/**
+ * struct i915_execbuffer - execbuf struct for execbuf3
+ * @i915: reference to the i915 instance we run on
+ * @file: drm file reference
+ * args: execbuf3 ioctl structure
+ * @gt: reference to the gt instance ioctl submitted for
+ * @context: logical state for the request
+ * @gem_context: callers context
+ * @requests: requests to be build
+ * @composite_fence: used for excl fence in dma_resv objects when > 1 BB 
submitted
+ * @ww: i915_gem_ww_ctx instance
+ * @num_batches: number of batches submitted
+ * @batch_addresses: addresses corresponds to the submitted batches
+ * @batches: references to the i915_vmas corresponding to the batches
+ */
+struct i915_execbuffer {
+   struct drm_i915_private *i915;
+   struct drm_file *file;
+   struct

Re: [Intel-gfx] [PATCH v4 15/17] drm/i915/vm_bind: Handle persistent vmas in execbuf3

2022-10-18 Thread Niranjana Vishwanathapura

On Tue, Oct 18, 2022 at 01:20:06PM -0700, Niranjana Vishwanathapura wrote:

On Tue, Oct 18, 2022 at 07:01:57PM +0100, Matthew Auld wrote:

On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:

Handle persistent (VM_BIND) mappings during the request submission
in the execbuf3 path.

v2: Ensure requests wait for bindings to complete.
v3: Remove short term pinning with PIN_VALIDATE flag.
   Individualize fences before adding to dma_resv obj.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
.../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 208 +-
1 file changed, 207 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index a9b4cc44bf66..8120e4c6b7da 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -3,6 +3,7 @@
 * Copyright © 2022 Intel Corporation
 */
+#include 
#include 
#include 
@@ -19,6 +20,7 @@
#include "i915_gem_vm_bind.h"
#include "i915_trace.h"
+#define __EXEC3_HAS_PINBIT_ULL(33)
#define __EXEC3_ENGINE_PINNED   BIT_ULL(32)
#define __EXEC3_INTERNAL_FLAGS  (~0ull << 32)
@@ -42,7 +44,9 @@
 * execlist. Hence, no support for implicit sync.
 *
 * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
- * works with execbuf3 ioctl for submission.
+ * works with execbuf3 ioctl for submission. All BOs mapped on that VM (through
+ * VM_BIND call) at the time of execbuf3 call are deemed required for that
+ * submission.
 *
 * The execbuf3 ioctl directly specifies the batch addresses instead of as
 * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
@@ -58,6 +62,13 @@
 * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
 * vma lookup table, implicit sync, vma active reference tracking etc., are not
 * applicable for execbuf3 ioctl.
+ *
+ * During each execbuf submission, request fence is added to all VM_BIND mapped
+ * objects with DMA_RESV_USAGE_BOOKKEEP. The DMA_RESV_USAGE_BOOKKEEP usage will
+ * prevent over sync (See enum dma_resv_usage). Note that DRM_I915_GEM_WAIT and
+ * DRM_I915_GEM_BUSY ioctls do not check for DMA_RESV_USAGE_BOOKKEEP usage and
+ * hence should not be used for end of batch check. Instead, the execbuf3
+ * timeline out fence should be used for end of batch check.
 */
/**
@@ -127,6 +138,23 @@ eb_find_vma(struct i915_address_space *vm, u64 addr)
return i915_gem_vm_bind_lookup_vma(vm, va);
}
+static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
+{
+   struct i915_vma *vma, *vn;
+
+   /**
+* Move all unbound vmas back into vm_bind_list so that they are
+* revalidated.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   list_for_each_entry_safe(vma, vn, &vm->vm_rebind_list, vm_rebind_link) {
+   list_del_init(&vma->vm_rebind_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->vm_rebind_lock);
+}
+
static int eb_lookup_vma_all(struct i915_execbuffer *eb)
{
unsigned int i, current_batch = 0;
@@ -141,14 +169,108 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
++current_batch;
}
+   eb_scoop_unbound_vma_all(eb->context->vm);
+
+   return 0;
+}
+
+static int eb_lock_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma;
+   int err;
+
+   err = i915_gem_object_lock(eb->context->vm->root_obj, &eb->ww);
+   if (err)
+   return err;
+
+   list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+   non_priv_vm_bind_link) {
+   err = i915_gem_object_lock(vma->obj, &eb->ww);
+   if (err)
+   return err;
+   }
+
return 0;
}
+static void eb_release_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma, *vn;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   if (!(eb->args->flags & __EXEC3_HAS_PIN))
+   return;
+
+   assert_object_held(vm->root_obj);
+
+   list_for_each_entry_safe(vma, vn, &vm->vm_bind_list, vm_bind_link)
+   if (i915_vma_verify_bind_complete(vma))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list);
+
+   eb->args->flags &= ~__EXEC3_HAS_PIN;
+}
+
static void eb_release_vma_all(struct i915_execbuffer *eb)
{
+   eb_release_persistent_vma_all(eb);
eb_unpin_engine(eb);
}
+static int eb_reserve_fence_for_persistent_vma_all(struct i915_execbuffer

Re: [PATCH v4 14/17] drm/i915/vm_bind: Expose i915_request_await_bind()

2022-10-19 Thread Niranjana Vishwanathapura

On Wed, Oct 19, 2022 at 05:09:32PM +0100, Matthew Auld wrote:

On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:

Rename __i915_request_await_bind() as i915_request_await_bind()
and make it non-static as it will be used in execbuf3 ioctl path.

Reviewed-by: Andi Shyti 
Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_vma.c | 8 +---
 drivers/gpu/drm/i915/i915_vma.h | 6 ++
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 4975fc662c86..ab89e907f2eb 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1884,18 +1884,12 @@ void i915_vma_revoke_mmap(struct i915_vma *vma)
list_del(&vma->obj->userfault_link);
 }
-static int
-__i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma)
-{
-   return __i915_request_await_exclusive(rq, &vma->active);
-}
-
 static int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request 
*rq)
 {
int err;
/* Wait for the vma to be bound before we start! */
-   err = __i915_request_await_bind(rq, vma);
+   err = i915_request_await_bind(rq, vma);
if (err)
return err;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 04770f8ba815..19e57e12b956 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -54,6 +54,12 @@ void i915_vma_unpin_and_release(struct i915_vma **p_vma, 
unsigned int flags);
 /* do not reserve memory to prevent deadlocks */
 #define __EXEC_OBJECT_NO_RESERVE BIT(31)
+static inline int
+i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma)


Some kernel doc might be good?

Reviewed-by: Matthew Auld 



Ok, will add.

Thanks,
Niranjana


+{
+   return __i915_request_await_exclusive(rq, &vma->active);
+}
+
 int __must_check _i915_vma_move_to_active(struct i915_vma *vma,
  struct i915_request *rq,
  struct dma_fence *fence,


Re: [PATCH v4 13/17] drm/i915/vm_bind: Update i915_vma_verify_bind_complete()

2022-10-19 Thread Niranjana Vishwanathapura

On Wed, Oct 19, 2022 at 05:07:31PM +0100, Matthew Auld wrote:

On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:

Ensure i915_vma_verify_bind_complete() handles case where bind
is not initiated. Also make it non static, add documentation
and move it out of CONFIG_DRM_I915_DEBUG_GEM.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/i915_vma.c | 16 +++-
 drivers/gpu/drm/i915/i915_vma.h |  1 +
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index eaa13e9ba966..4975fc662c86 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -439,12 +439,21 @@ int i915_vma_sync(struct i915_vma *vma)
return i915_vm_sync(vma->vm);
 }
-#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
-static int i915_vma_verify_bind_complete(struct i915_vma *vma)
+/**
+ * i915_vma_verify_bind_complete() - Check for the bind completion of the vma
+ * @vma: vma to check for bind completion


Maybe mention the locking since this is now more than just DEBUG_GEM 
stuff. I assume we need the object lock or otherwise some guarantee 
that the vma is pinned?


Reviewed-by: Matthew Auld 



I think they are not needed. The fence reference is obtained under rcu
lock anyhow (will add this to documentation). Only thing required is
that vma is not released, but that caller must ensure for all i915_vma
apis anyhow.

Thanks,
Niranjana


+ *
+ * Returns: 0 if the vma bind is completed. Error code otherwise.
+ */
+int i915_vma_verify_bind_complete(struct i915_vma *vma)
 {
struct dma_fence *fence = i915_active_fence_get(&vma->active.excl);
int err;
+   /* Ensure vma bind is initiated */
+   if (!i915_vma_is_bound(vma, I915_VMA_BIND_MASK))
+   return -EINVAL;
+
if (!fence)
return 0;
@@ -457,9 +466,6 @@ static int i915_vma_verify_bind_complete(struct i915_vma 
*vma)
return err;
 }
-#else
-#define i915_vma_verify_bind_complete(_vma) 0
-#endif
 I915_SELFTEST_EXPORT void
 i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 1cadbf8fdedf..04770f8ba815 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -440,6 +440,7 @@ void i915_vma_make_purgeable(struct i915_vma *vma);
 int i915_vma_wait_for_bind(struct i915_vma *vma);
 int i915_vma_sync(struct i915_vma *vma);
+int i915_vma_verify_bind_complete(struct i915_vma *vma);
 /**
  * i915_vma_get_current_resource - Get the current resource of the vma


Re: [PATCH v4 12/17] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl

2022-10-19 Thread Niranjana Vishwanathapura

On Wed, Oct 19, 2022 at 04:20:49PM +0100, Matthew Auld wrote:

On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:

Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate
bind as all required objects binding would have been requested by the
userspace before submitting the execbuf3.

Legacy features like relocations etc are not supported by execbuf3.

v2: Add more input validity checks.
v3: batch_address is a VA (not an array) if num_batches=1,
minor cleanup
v4: replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 580 ++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
 drivers/gpu/drm/i915/i915_driver.c|   1 +
 include/uapi/drm/i915_drm.h   |  61 ++
 5 files changed, 645 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 8d76bb888dc3..6a801684d569 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -150,6 +150,7 @@ gem-y += \
gem/i915_gem_domain.o \
gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
+   gem/i915_gem_execbuffer3.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
gem/i915_gem_lmem.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
new file mode 100644
index ..a9b4cc44bf66
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -0,0 +1,580 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+
+#include "i915_drv.h"
+#include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
+#include "i915_gem_ioctls.h"
+#include "i915_gem_vm_bind.h"
+#include "i915_trace.h"
+
+#define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
+#define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
+
+/* Catch emission of unexpected errors for CI! */
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+#undef EINVAL
+#define EINVAL ({ \
+   DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
+   22; \
+})
+#endif
+
+/**
+ * DOC: User command execution with execbuf3 ioctl
+ *
+ * A VM in VM_BIND mode will not support older execbuf mode of binding.
+ * The execbuf ioctl handling in VM_BIND mode differs significantly from the
+ * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+ * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+ * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+ * execlist. Hence, no support for implicit sync.
+ *
+ * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+ * works with execbuf3 ioctl for submission.
+ *
+ * The execbuf3 ioctl directly specifies the batch addresses instead of as
+ * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
+ * support many of the older features like in/out/submit fences, fence array,
+ * default gem context etc. (See struct drm_i915_gem_execbuffer3).
+ *
+ * In VM_BIND mode, VA allocation is completely managed by the user instead of
+ * the i915 driver. Hence all VA assignment, eviction are not applicable in
+ * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+ * be using the i915_vma active reference tracking. It will instead check the
+ * dma-resv object's fence list for that.
+ *
+ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
+ * vma lookup table, implicit sync, vma active reference tracking etc., are not
+ * applicable for execbuf3 ioctl.
+ */
+
+/**
+ * struct i915_execbuffer - execbuf struct for execbuf3
+ * @i915: reference to the i915 instance we run on
+ * @file: drm file reference
+ * args: execbuf3 ioctl structure
+ * @gt: reference to the gt instance ioctl submitted for
+ * @context: logical state for the request
+ * @gem_context: callers context
+ * @requests: requests to be build
+ * @composite_fence: used for excl fence in dma_resv objects when > 1 BB 
submitted
+ * @ww: i915_gem_ww_ctx instance
+ * @num_batches: number of batches submitted
+ * @batch_addresses: addresses corresponds to the submitted batches
+ * @batches: references to the i915_vmas corresponding to the batches
+ */


Are we building/including the docs for this somewhere? Looks like we 
are missing some stuff like @fences/@num_fenc

Re: [PATCH v4 16/17] drm/i915/vm_bind: userptr dma-resv changes

2022-10-20 Thread Niranjana Vishwanathapura

On Thu, Oct 20, 2022 at 05:29:45PM +0100, Matthew Auld wrote:

On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:

For persistent (vm_bind) vmas of userptr BOs, handle the user
page pinning by using the i915_gem_object_userptr_submit_init()
/done() functions

v2: Do not double add vma to vm->userptr_invalidated_list

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 80 +++
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   | 19 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 15 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 +
 drivers/gpu/drm/i915/i915_vma_types.h |  2 +
 6 files changed, 122 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index 8120e4c6b7da..3f1157dd7fc2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -20,6 +20,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
+#define __EXEC3_USERPTR_USED   BIT_ULL(34)
 #define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
@@ -142,6 +143,21 @@ static void eb_scoop_unbound_vma_all(struct 
i915_address_space *vm)
 {
struct i915_vma *vma, *vn;
+#ifdef CONFIG_MMU_NOTIFIER
+   /**


Not proper kernel-doc AFAIK.


Ok, will use single asterisk above.




+* Move all invalidated userptr vmas back into vm_bind_list so that
+* they are looked up and revalidated.
+*/
+   spin_lock(&vm->userptr_invalidated_lock);
+   list_for_each_entry_safe(vma, vn, &vm->userptr_invalidated_list,
+userptr_invalidated_link) {
+   list_del_init(&vma->userptr_invalidated_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->userptr_invalidated_lock);
+#endif
+
/**
 * Move all unbound vmas back into vm_bind_list so that they are
 * revalidated.
@@ -155,10 +171,47 @@ static void eb_scoop_unbound_vma_all(struct 
i915_address_space *vm)
spin_unlock(&vm->vm_rebind_lock);
 }
+static int eb_lookup_persistent_userptr_vmas(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *last_vma = NULL;
+   struct i915_vma *vma;
+   int err;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) {
+   if (!i915_gem_object_is_userptr(vma->obj))
+   continue;
+
+   err = i915_gem_object_userptr_submit_init(vma->obj);
+   if (err)
+   return err;
+
+   /**
+* The above submit_init() call does the object unbind and
+* hence adds vma into vm_rebind_list. Remove it from that
+* list as it is already scooped for revalidation.
+*/


Ditto.


ok




+   spin_lock(&vm->vm_rebind_lock);
+   if (!list_empty(&vma->vm_rebind_link))
+   list_del_init(&vma->vm_rebind_link);
+   spin_unlock(&vm->vm_rebind_lock);
+
+   last_vma = vma;
+   }
+
+   if (last_vma)
+   eb->args->flags |= __EXEC3_USERPTR_USED;
+
+   return 0;
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
struct i915_vma *vma;
+   int err = 0;
for (i = 0; i < eb->num_batches; i++) {
vma = eb_find_vma(eb->context->vm, eb->batch_addresses[i]);
@@ -171,6 +224,10 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
eb_scoop_unbound_vma_all(eb->context->vm);
+   err = eb_lookup_persistent_userptr_vmas(eb);
+   if (err)
+   return err;
+
return 0;
 }
@@ -343,6 +400,29 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
}
}
+#ifdef CONFIG_MMU_NOTIFIER
+   /* Check for further userptr invalidations */
+   spin_lock(&vm->userptr_invalidated_lock);
+   if (!list_empty(&vm->userptr_invalidated_list))
+   err = -EAGAIN;
+   spin_unlock(&vm->userptr_invalidated_lock);


After dropping the lock here, the invalidated_list might no longer be 
empty? Is that not possible, or somehow not a concern?




It should be fine as we have already added the fence to dma-resv object above.
Any subsequent mmu invalidations will end up waiting for request to fini

Re: [PATCH v4 13/17] drm/i915/vm_bind: Update i915_vma_verify_bind_complete()

2022-10-20 Thread Niranjana Vishwanathapura

On Thu, Oct 20, 2022 at 10:16:06AM +0100, Matthew Auld wrote:

On 19/10/2022 19:28, Niranjana Vishwanathapura wrote:

On Wed, Oct 19, 2022 at 05:07:31PM +0100, Matthew Auld wrote:

On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:

Ensure i915_vma_verify_bind_complete() handles case where bind
is not initiated. Also make it non static, add documentation
and move it out of CONFIG_DRM_I915_DEBUG_GEM.

Signed-off-by: Niranjana Vishwanathapura 


Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/i915_vma.c | 16 +++-
 drivers/gpu/drm/i915/i915_vma.h |  1 +
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c 
b/drivers/gpu/drm/i915/i915_vma.c

index eaa13e9ba966..4975fc662c86 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -439,12 +439,21 @@ int i915_vma_sync(struct i915_vma *vma)
 return i915_vm_sync(vma->vm);
 }
-#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
-static int i915_vma_verify_bind_complete(struct i915_vma *vma)
+/**
+ * i915_vma_verify_bind_complete() - Check for the bind 
completion of the vma

+ * @vma: vma to check for bind completion


Maybe mention the locking since this is now more than just 
DEBUG_GEM stuff. I assume we need the object lock or otherwise 
some guarantee that the vma is pinned?


Reviewed-by: Matthew Auld 



I think they are not needed. The fence reference is obtained under rcu
lock anyhow (will add this to documentation). Only thing required is
that vma is not released, but that caller must ensure for all i915_vma
apis anyhow.


I was thinking more about how this potentially behaves with concurrent 
bind/unbind, and what it might return in such cases. I'm assuming most 
normal users will want to have an active pin and/or be holding the 
object lock when calling this.




Unbind will anyway wait behind this fence. As it is just returning the
exclusive fence status, I think it should be fine even for concurrent
cases. Any specific sequence you are concerned with?
For vm_bind use case, we do have the object lock while calling it, but
I am not certain about the i915_vma_pin_iomap() use case. The permerge
CI result is clean though.



Thanks,
Niranjana


+ *
+ * Returns: 0 if the vma bind is completed. Error code otherwise.
+ */
+int i915_vma_verify_bind_complete(struct i915_vma *vma)
 {
 struct dma_fence *fence = i915_active_fence_get(&vma->active.excl);
 int err;
+    /* Ensure vma bind is initiated */
+    if (!i915_vma_is_bound(vma, I915_VMA_BIND_MASK))


Just realised that this leaks the fence. I guess we have yet to hit 
this in testing.


Ah...will move i915_active_fence_get() below.

Thanks,
Niranjana




+    return -EINVAL;
+
 if (!fence)
 return 0;
@@ -457,9 +466,6 @@ static int 
i915_vma_verify_bind_complete(struct i915_vma *vma)

 return err;
 }
-#else
-#define i915_vma_verify_bind_complete(_vma) 0
-#endif
 I915_SELFTEST_EXPORT void
 i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
diff --git a/drivers/gpu/drm/i915/i915_vma.h 
b/drivers/gpu/drm/i915/i915_vma.h

index 1cadbf8fdedf..04770f8ba815 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -440,6 +440,7 @@ void i915_vma_make_purgeable(struct i915_vma *vma);
 int i915_vma_wait_for_bind(struct i915_vma *vma);
 int i915_vma_sync(struct i915_vma *vma);
+int i915_vma_verify_bind_complete(struct i915_vma *vma);
 /**
  * i915_vma_get_current_resource - Get the current resource of the vma


Re: [PATCH v4 15/17] drm/i915/vm_bind: Handle persistent vmas in execbuf3

2022-10-20 Thread Niranjana Vishwanathapura

On Thu, Oct 20, 2022 at 05:39:46PM +0100, Matthew Auld wrote:

On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:

Handle persistent (VM_BIND) mappings during the request submission
in the execbuf3 path.

v2: Ensure requests wait for bindings to complete.
v3: Remove short term pinning with PIN_VALIDATE flag.
Individualize fences before adding to dma_resv obj.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 208 +-
 1 file changed, 207 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index a9b4cc44bf66..8120e4c6b7da 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -3,6 +3,7 @@
  * Copyright © 2022 Intel Corporation
  */
+#include 
 #include 
 #include 
@@ -19,6 +20,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
+#define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
@@ -42,7 +44,9 @@
  * execlist. Hence, no support for implicit sync.
  *
  * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
- * works with execbuf3 ioctl for submission.
+ * works with execbuf3 ioctl for submission. All BOs mapped on that VM (through
+ * VM_BIND call) at the time of execbuf3 call are deemed required for that
+ * submission.
  *
  * The execbuf3 ioctl directly specifies the batch addresses instead of as
  * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
@@ -58,6 +62,13 @@
  * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
  * vma lookup table, implicit sync, vma active reference tracking etc., are not
  * applicable for execbuf3 ioctl.
+ *
+ * During each execbuf submission, request fence is added to all VM_BIND mapped
+ * objects with DMA_RESV_USAGE_BOOKKEEP. The DMA_RESV_USAGE_BOOKKEEP usage will
+ * prevent over sync (See enum dma_resv_usage). Note that DRM_I915_GEM_WAIT and
+ * DRM_I915_GEM_BUSY ioctls do not check for DMA_RESV_USAGE_BOOKKEEP usage and
+ * hence should not be used for end of batch check. Instead, the execbuf3
+ * timeline out fence should be used for end of batch check.
  */
 /**
@@ -127,6 +138,23 @@ eb_find_vma(struct i915_address_space *vm, u64 addr)
return i915_gem_vm_bind_lookup_vma(vm, va);
 }
+static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
+{
+   struct i915_vma *vma, *vn;
+
+   /**
+* Move all unbound vmas back into vm_bind_list so that they are
+* revalidated.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   list_for_each_entry_safe(vma, vn, &vm->vm_rebind_list, vm_rebind_link) {
+   list_del_init(&vma->vm_rebind_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->vm_rebind_lock);
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
@@ -141,14 +169,108 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
++current_batch;
}
+   eb_scoop_unbound_vma_all(eb->context->vm);
+
+   return 0;
+}
+
+static int eb_lock_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma;
+   int err;
+
+   err = i915_gem_object_lock(eb->context->vm->root_obj, &eb->ww);
+   if (err)
+   return err;
+
+   list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+   non_priv_vm_bind_link) {
+   err = i915_gem_object_lock(vma->obj, &eb->ww);
+   if (err)
+   return err;
+   }
+
return 0;
 }
+static void eb_release_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma, *vn;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   if (!(eb->args->flags & __EXEC3_HAS_PIN))
+   return;
+
+   assert_object_held(vm->root_obj);
+
+   list_for_each_entry_safe(vma, vn, &vm->vm_bind_list, vm_bind_link)
+   if (i915_vma_verify_bind_complete(vma))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list);
+
+   eb->args->flags &= ~__EXEC3_HAS_PIN;
+}
+
 static void eb_release_vma_all(struct i915_execbuffer *eb)
 {
+   eb_release_persistent_vma_all(eb);
eb_unpin_engine(eb);
 }
+static int eb_reserve_fence_for_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space 

Re: [PATCH v4 16/17] drm/i915/vm_bind: userptr dma-resv changes

2022-10-20 Thread Niranjana Vishwanathapura

On Thu, Oct 20, 2022 at 04:27:38PM +0200, Andi Shyti wrote:

Hi Niranjana,

[...]


@@ -307,6 +307,8 @@ struct i915_vma {
struct list_head non_priv_vm_bind_link;
/* @vm_rebind_link: link to vm_rebind_list and protected by 
vm_rebind_lock */
struct list_head vm_rebind_link; /* Link in vm_rebind_list */
+   /*@userptr_invalidated_link: link to the vm->userptr_invalidated_list */


nit: add a space here



Andi, I am keeping all vm_bind related fields together here.
Just following the other examples in this file.

Niranjana


Andi


  1   2   3   4   5   >