Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Lionel Landwerlin

On 28/02/2020 13:46, Michel Dänzer wrote:

On 2020-02-28 12:02 p.m., Erik Faye-Lund wrote:

On Fri, 2020-02-28 at 10:43 +, Daniel Stone wrote:

On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
 wrote:

On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:

Yeah, changes on vulkan drivers or backend compilers should be
fairly
sandboxed.

We also have tools that only work for intel stuff, that should
never
trigger anything on other people's HW.

Could something be worked out using the tags?

I think so! We have the pre-defined environment variable
CI_MERGE_REQUEST_LABELS, and we can do variable conditions:

https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables

That sounds like a pretty neat middle-ground to me. I just hope
that
new pipelines are triggered if new labels are added, because not
everyone is allowed to set labels, and sometimes people forget...

There's also this which is somewhat more robust:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569

I'm not sure it's more robust, but yeah that a useful tool too.

The reason I'm skeptical about the robustness is that we'll miss
testing if this misses a path.

Surely missing a path will be less likely / often to happen compared to
an MR missing a label. (Users which aren't members of the project can't
even set labels for an MR)



Sounds like a good alternative to tags.


-Lionel

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Lionel Landwerlin

On 28/02/2020 11:28, Erik Faye-Lund wrote:

On Fri, 2020-02-28 at 13:37 +1000, Dave Airlie wrote:

On Fri, 28 Feb 2020 at 07:27, Daniel Vetter 
wrote:

Hi all,

You might have read the short take in the X.org board meeting
minutes
already, here's the long version.

The good news: gitlab.fd.o has become very popular with our
communities, and is used extensively. This especially includes all
the
CI integration. Modern development process and tooling, yay!

The bad news: The cost in growth has also been tremendous, and it's
breaking our bank account. With reasonable estimates for continued
growth we're expecting hosting expenses totalling 75k USD this
year,
and 90k USD next year. With the current sponsors we've set up we
can't
sustain that. We estimate that hosting expenses for gitlab.fd.o
without any of the CI features enabled would total 30k USD, which
is
within X.org's ability to support through various sponsorships,
mostly
through XDC.

Note that X.org does no longer sponsor any CI runners themselves,
we've stopped that. The huge additional expenses are all just in
storing and serving build artifacts and images to outside CI
runners
sponsored by various companies. A related topic is that with the
growth in fd.o it's becoming infeasible to maintain it all on
volunteer admin time. X.org is therefore also looking for admin
sponsorship, at least medium term.

Assuming that we want cash flow reserves for one year of
gitlab.fd.o
(without CI support) and a trimmed XDC and assuming no sponsor
payment
meanwhile, we'd have to cut CI services somewhere between May and
June
this year. The board is of course working on acquiring sponsors,
but
filling a shortfall of this magnitude is neither easy nor quick
work,
and we therefore decided to give an early warning as soon as
possible.
Any help in finding sponsors for fd.o is very much appreciated.

a) Ouch.

b) we probably need to take a large step back here.


I kinda agree, but maybe the step doesn't have to be *too* large?

I wonder if we could solve this by restructuring the project a bit. I'm
talking purely from a Mesa point of view here, so it might not solve
the full problem, but:

1. It feels silly that we need to test changes to e.g the i965 driver
on dragonboards. We only have a big "do not run CI at all" escape-
hatch.



Yeah, changes on vulkan drivers or backend compilers should be fairly 
sandboxed.


We also have tools that only work for intel stuff, that should never 
trigger anything on other people's HW.


Could something be worked out using the tags?


-Lionel

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/syncobj: fix leaking dma_fence in drm_syncobj_query_ioctl

2019-07-22 Thread Lionel Landwerlin

On 22/07/2019 16:21, Christian König wrote:

Am 22.07.19 um 15:16 schrieb Lionel Landwerlin:

On 22/07/2019 15:59, Christian König wrote:
We need to check the context number instead if the previous sequence 
to detect
an error and if an error is detected we need to drop the reference 
to the

current fence or otherwise would leak it.

Signed-off-by: Christian König 


Fixes: 27b575a9aa2f ("drm/syncobj: add timeline payload query ioctl v6")
Reviewed-by: Lionel Landwerlin 


CC stable? I'm not sure when this got upstream.

Christian.



I thought it would get picked up automatically for the relevant stable 
version (and none if it's not upstream yet).


We also have to be nice to people cherry picking stuff like ChromeOS.


-Lionel







---
  drivers/gpu/drm/drm_syncobj.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c 
b/drivers/gpu/drm/drm_syncobj.c

index 75cb4bb7619e..1438dcb3ebb1 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1298,14 +1298,14 @@ int drm_syncobj_query_ioctl(struct 
drm_device *dev, void *data,

  struct dma_fence *iter, *last_signaled = NULL;
    dma_fence_chain_for_each(iter, fence) {
-    if (!iter)
-    break;
-    dma_fence_put(last_signaled);
-    last_signaled = dma_fence_get(iter);
-    if (!to_dma_fence_chain(last_signaled)->prev_seqno)
+    if (iter->context != fence->context) {
+    dma_fence_put(iter);
  /* It is most likely that timeline has
   * unorder points. */
  break;
+    }
+    dma_fence_put(last_signaled);
+    last_signaled = dma_fence_get(iter);
  }
  point = dma_fence_is_signaled(last_signaled) ?
  last_signaled->seqno :








___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/syncobj: fix leaking dma_fence in drm_syncobj_query_ioctl

2019-07-22 Thread Lionel Landwerlin

On 22/07/2019 15:59, Christian König wrote:

We need to check the context number instead if the previous sequence to detect
an error and if an error is detected we need to drop the reference to the
current fence or otherwise would leak it.

Signed-off-by: Christian König 


Fixes: 27b575a9aa2f ("drm/syncobj: add timeline payload query ioctl v6")
Reviewed-by: Lionel Landwerlin 


---
  drivers/gpu/drm/drm_syncobj.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 75cb4bb7619e..1438dcb3ebb1 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1298,14 +1298,14 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, 
void *data,
struct dma_fence *iter, *last_signaled = NULL;
  
  			dma_fence_chain_for_each(iter, fence) {

-   if (!iter)
-   break;
-   dma_fence_put(last_signaled);
-   last_signaled = dma_fence_get(iter);
-   if 
(!to_dma_fence_chain(last_signaled)->prev_seqno)
+   if (iter->context != fence->context) {
+   dma_fence_put(iter);
/* It is most likely that timeline has
 * unorder points. */
break;
+   }
+   dma_fence_put(last_signaled);
+   last_signaled = dma_fence_get(iter);
}
point = dma_fence_is_signaled(last_signaled) ?
last_signaled->seqno :



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH libdrm 1/7] addr cs chunk for syncobj timeline

2019-05-14 Thread Lionel Landwerlin
With the small nits, patches 2 & 4 are : Reviewed-by: Lionel Landwerlin 

The other patches are a bit amdgpu specific so maybe you might want 
someone more familiar with amdgpu to review them.
Still I didn't see anything wrong with them so remaining patches are : 
Acked-by: Lionel Landwerlin 


I'll send the IGT stuff shortly.

Thanks,

-Lionel
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH libdrm 4/7] add timeline signal/transfer ioctls v2

2019-05-14 Thread Lionel Landwerlin

On 13/05/2019 10:53, Chunming Zhou wrote:

v2: use one transfer ioctl

Signed-off-by: Chunming Zhou 
---
  xf86drm.c | 33 +
  xf86drm.h |  6 ++
  2 files changed, 39 insertions(+)

diff --git a/xf86drm.c b/xf86drm.c
index 17e3d880..acd16fab 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -4257,6 +4257,21 @@ drm_public int drmSyncobjSignal(int fd, const uint32_t 
*handles,
  return ret;
  }
  
+drm_public int drmSyncobjTimelineSignal(int fd, const uint32_t *handles,

+   uint64_t *points, uint32_t handle_count)
+{
+struct drm_syncobj_timeline_array args;
+int ret;
+
+memclear(args);
+args.handles = (uintptr_t)handles;
+args.points = (uint64_t)(uintptr_t)points;



I think you can drop the uint64_t cast.



+args.count_handles = handle_count;
+
+ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, );
+return ret;
+}
+
  drm_public int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t 
*points,
  unsigned num_handles,
  int64_t timeout_nsec, unsigned flags,
@@ -4299,4 +4314,22 @@ drm_public int drmSyncobjQuery(int fd, uint32_t 
*handles, uint64_t *points,
  return 0;
  }
  
+drm_public int drmSyncobjTransfer(int fd,

+ uint32_t dst_handle, uint64_t dst_point,
+ uint32_t src_handle, uint64_t src_point,
+ uint32_t flags)
+{
+struct drm_syncobj_transfer args;
+int ret;
+
+memclear(args);
+args.src_handle = src_handle;
+args.dst_handle = dst_handle;
+args.src_point = src_point;
+args.dst_point = dst_point;
+args.flags = flags;
+
+ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TRANSFER, );
  
+return ret;

+}
diff --git a/xf86drm.h b/xf86drm.h
index 60c7a84f..3fb1d1ca 100644
--- a/xf86drm.h
+++ b/xf86drm.h
@@ -876,12 +876,18 @@ extern int drmSyncobjWait(int fd, uint32_t *handles, 
unsigned num_handles,
  uint32_t *first_signaled);
  extern int drmSyncobjReset(int fd, const uint32_t *handles, uint32_t 
handle_count);
  extern int drmSyncobjSignal(int fd, const uint32_t *handles, uint32_t 
handle_count);
+extern int drmSyncobjTimelineSignal(int fd, const uint32_t *handles,
+   uint64_t *points, uint32_t handle_count);
  extern int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t *points,
  unsigned num_handles,
  int64_t timeout_nsec, unsigned flags,
  uint32_t *first_signaled);
  extern int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points,
   uint32_t handle_count);
+extern int drmSyncobjTransfer(int fd,
+ uint32_t dst_handle, uint64_t dst_point,
+ uint32_t src_handle, uint64_t src_point,
+ uint32_t flags);
  
  #if defined(__cplusplus)

  }



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH libdrm 2/7] add timeline wait/query ioctl v2

2019-05-14 Thread Lionel Landwerlin

On 13/05/2019 10:53, Chunming Zhou wrote:

v2: drop export/import

Signed-off-by: Chunming Zhou 
---
  xf86drm.c | 44 
  xf86drm.h |  6 ++
  2 files changed, 50 insertions(+)

diff --git a/xf86drm.c b/xf86drm.c
index 2c19376b..17e3d880 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -4256,3 +4256,47 @@ drm_public int drmSyncobjSignal(int fd, const uint32_t 
*handles,
  ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_SIGNAL, );
  return ret;
  }
+
+drm_public int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t 
*points,
+ unsigned num_handles,
+ int64_t timeout_nsec, unsigned flags,
+ uint32_t *first_signaled)
+{
+struct drm_syncobj_timeline_wait args;
+int ret;
+
+memclear(args);
+args.handles = (uintptr_t)handles;
+args.points = (uint64_t)(uintptr_t)points;



I don't think you need those uintptr_t -> uint64_t casts.



+args.timeout_nsec = timeout_nsec;
+args.count_handles = num_handles;
+args.flags = flags;
+
+ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, );
+if (ret < 0)
+return -errno;
+
+if (first_signaled)
+*first_signaled = args.first_signaled;
+return ret;
+}
+
+
+drm_public int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points,
+  uint32_t handle_count)
+{
+struct drm_syncobj_timeline_array args;
+int ret;
+
+memclear(args);
+args.handles = (uintptr_t)handles;
+args.points = (uint64_t)(uintptr_t)points;



Same here.



+args.count_handles = handle_count;
+
+ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_QUERY, );
+if (ret)
+return ret;
+return 0;
+}
+
+
diff --git a/xf86drm.h b/xf86drm.h
index 887ecc76..60c7a84f 100644
--- a/xf86drm.h
+++ b/xf86drm.h
@@ -876,6 +876,12 @@ extern int drmSyncobjWait(int fd, uint32_t *handles, 
unsigned num_handles,
  uint32_t *first_signaled);
  extern int drmSyncobjReset(int fd, const uint32_t *handles, uint32_t 
handle_count);
  extern int drmSyncobjSignal(int fd, const uint32_t *handles, uint32_t 
handle_count);
+extern int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t *points,
+ unsigned num_handles,
+ int64_t timeout_nsec, unsigned flags,
+ uint32_t *first_signaled);
+extern int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points,
+  uint32_t handle_count);
  
  #if defined(__cplusplus)

  }



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH libdrm 1/7] addr cs chunk for syncobj timeline

2019-05-13 Thread Lionel Landwerlin

Sorry for the delay, I'll try to review this tomorrow.

-Lionel

On 13/05/2019 11:15, zhoucm1 wrote:

ping... for patch set.


On 2019年05月13日 17:52, Chunming Zhou wrote:

[CAUTION: External Email]

Signed-off-by: Chunming Zhou 
---
  include/drm/amdgpu_drm.h | 9 +
  1 file changed, 9 insertions(+)

diff --git a/include/drm/amdgpu_drm.h b/include/drm/amdgpu_drm.h
index d0701ffc..3d0318e6 100644
--- a/include/drm/amdgpu_drm.h
+++ b/include/drm/amdgpu_drm.h
@@ -528,6 +528,8 @@ struct drm_amdgpu_gem_va {
  #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT 0x05
  #define AMDGPU_CHUNK_ID_BO_HANDLES  0x06
  #define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES 0x07
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT    0x08
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x09

  struct drm_amdgpu_cs_chunk {
 __u32   chunk_id;
@@ -608,6 +610,13 @@ struct drm_amdgpu_cs_chunk_sem {
 __u32 handle;
  };

+struct drm_amdgpu_cs_chunk_syncobj {
+   __u32 handle;
+   __u32 flags;
+   __u64 point;
+};
+
+
  #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ 0
  #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ_FD  1
  #define AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD    2
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx





___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point interface v4

2019-04-01 Thread Lionel Landwerlin

On 01/04/2019 11:50, zhoucm1 wrote:



On 2019年04月01日 16:19, Lionel Landwerlin wrote:

On 01/04/2019 06:54, Zhou, David(ChunMing) wrote:



-Original Message-
From: Lionel Landwerlin 
Sent: Saturday, March 30, 2019 10:09 PM
To: Koenig, Christian ; Zhou, 
David(ChunMing)

; dri-de...@lists.freedesktop.org; amd-
g...@lists.freedesktop.org; ja...@jlekstrand.net; Hector, Tobias

Subject: Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point
interface v4

On 28/03/2019 15:18, Christian König wrote:

Am 28.03.19 um 14:50 schrieb Lionel Landwerlin:

On 25/03/2019 08:32, Chunming Zhou wrote:

From: Christian König 

Use the dma_fence_chain object to create a timeline of fence 
objects

instead of just replacing the existing fence.

v2: rebase and cleanup
v3: fix garbage collection parameters
v4: add unorder point check, print a warn calltrace

Signed-off-by: Christian König 
Cc: Lionel Landwerlin 
---
   drivers/gpu/drm/drm_syncobj.c | 39
+++
   include/drm/drm_syncobj.h |  5 +
   2 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/drm_syncobj.c
b/drivers/gpu/drm/drm_syncobj.c index 5329e66598c6..19a9ce638119
100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -122,6 +122,45 @@ static void drm_syncobj_remove_wait(struct
drm_syncobj *syncobj,
   spin_unlock(>lock);
   }
   +/**
+ * drm_syncobj_add_point - add new timeline point to the syncobj
+ * @syncobj: sync object to add timeline point do
+ * @chain: chain node to use to add the point
+ * @fence: fence to encapsulate in the chain node
+ * @point: sequence number to use for the point
+ *
+ * Add the chain node as new timeline point to the syncobj.
+ */
+void drm_syncobj_add_point(struct drm_syncobj *syncobj,
+   struct dma_fence_chain *chain,
+   struct dma_fence *fence,
+   uint64_t point)
+{
+    struct syncobj_wait_entry *cur, *tmp;
+    struct dma_fence *prev;
+
+    dma_fence_get(fence);
+
+    spin_lock(>lock);
+
+    prev = drm_syncobj_fence_get(syncobj);
+    /* You are adding an unorder point to timeline, which could
cause payload returned from query_ioctl is 0! */
+    WARN_ON_ONCE(prev && prev->seqno >= point);


I think the WARN/BUG macros should only fire when there is an issue
with programming from within the kernel.

But this particular warning can be triggered by an application.


Probably best to just remove it?

Yeah, that was also my argument against it.

Key point here is that we still want to note somehow that userspace
did something wrong and returning an error is not an option.

Maybe just use DRM_ERROR with a static variable to print the message
only once.

Christian.
I don't really see any point in printing an error once. If you run 
your
application twice you end up thinking there was an issue just on 
the first run

but it's actually always wrong.

Except this nitpick, is there any other concern to push whole patch 
set? Is that time to push whole patch set?


-David



Looks good to me.
Does that mean we can add your RB on patch set so that we can submit 
the patch set to drm-misc-next branch?



Yes :

Reviewed-by: Lionel Landwerlin 


Not too sure about the process for drm-misc-next. Sounds like Sumit 
Semwal  is the go to person according to 
MAINTAINERS.







I have an additional change to make drm_syncobj_find_fence() also 
return the drm_syncobj : 
https://github.com/djdeath/linux/commit/0b7732b267b931339d71fe6f493ea6fa4eab453e


This is needed in i915 to avoid looking up the drm_syncobj handle twice.

Our driver allows to wait on the syncobj's dma_fence that we're then 
going to replace so we need to get bot the fence & syncobj at the 
same time.


I guess it can go in a follow up series.

Yes, agree.

Thanks for your effort as well,
-David



Thanks to you!


-Lionel





-Lionel




Unless we're willing to take the syncobj lock for longer periods of 
time when
adding points, I guess we'll have to defer validation to validation 
layers.



-Lionel



-Lionel



+    dma_fence_chain_init(chain, prev, fence, point);
+    rcu_assign_pointer(syncobj->fence, >base);
+
+    list_for_each_entry_safe(cur, tmp, >cb_list, node) {
+    list_del_init(>node);
+    syncobj_wait_syncobj_func(syncobj, cur);
+    }
+    spin_unlock(>lock);
+
+    /* Walk the chain once to trigger garbage collection */
+    dma_fence_chain_for_each(fence, prev);
+    dma_fence_put(prev);
+}
+EXPORT_SYMBOL(drm_syncobj_add_point);
+
   /**
    * drm_syncobj_replace_fence - replace fence in a sync object.
    * @syncobj: Sync object to replace fence in diff --git
a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h index
0311c9fdbd2f..6cf7243a1dc5 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -27,6 +27,7 @@
   #define __DRM_SYNCOBJ_H__
     #include 
+#include 
     struct drm_file;
   @@ -112,6 +113,10 @@ drm_syncobj_fence_get(struct drm_sy

Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point interface v4

2019-04-01 Thread Lionel Landwerlin

On 01/04/2019 06:54, Zhou, David(ChunMing) wrote:



-Original Message-
From: Lionel Landwerlin 
Sent: Saturday, March 30, 2019 10:09 PM
To: Koenig, Christian ; Zhou, David(ChunMing)
; dri-de...@lists.freedesktop.org; amd-
g...@lists.freedesktop.org; ja...@jlekstrand.net; Hector, Tobias

Subject: Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point
interface v4

On 28/03/2019 15:18, Christian König wrote:

Am 28.03.19 um 14:50 schrieb Lionel Landwerlin:

On 25/03/2019 08:32, Chunming Zhou wrote:

From: Christian König 

Use the dma_fence_chain object to create a timeline of fence objects
instead of just replacing the existing fence.

v2: rebase and cleanup
v3: fix garbage collection parameters
v4: add unorder point check, print a warn calltrace

Signed-off-by: Christian König 
Cc: Lionel Landwerlin 
---
   drivers/gpu/drm/drm_syncobj.c | 39
+++
   include/drm/drm_syncobj.h |  5 +
   2 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/drm_syncobj.c
b/drivers/gpu/drm/drm_syncobj.c index 5329e66598c6..19a9ce638119
100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -122,6 +122,45 @@ static void drm_syncobj_remove_wait(struct
drm_syncobj *syncobj,
   spin_unlock(>lock);
   }
   +/**
+ * drm_syncobj_add_point - add new timeline point to the syncobj
+ * @syncobj: sync object to add timeline point do
+ * @chain: chain node to use to add the point
+ * @fence: fence to encapsulate in the chain node
+ * @point: sequence number to use for the point
+ *
+ * Add the chain node as new timeline point to the syncobj.
+ */
+void drm_syncobj_add_point(struct drm_syncobj *syncobj,
+   struct dma_fence_chain *chain,
+   struct dma_fence *fence,
+   uint64_t point)
+{
+    struct syncobj_wait_entry *cur, *tmp;
+    struct dma_fence *prev;
+
+    dma_fence_get(fence);
+
+    spin_lock(>lock);
+
+    prev = drm_syncobj_fence_get(syncobj);
+    /* You are adding an unorder point to timeline, which could
cause payload returned from query_ioctl is 0! */
+    WARN_ON_ONCE(prev && prev->seqno >= point);


I think the WARN/BUG macros should only fire when there is an issue
with programming from within the kernel.

But this particular warning can be triggered by an application.


Probably best to just remove it?

Yeah, that was also my argument against it.

Key point here is that we still want to note somehow that userspace
did something wrong and returning an error is not an option.

Maybe just use DRM_ERROR with a static variable to print the message
only once.

Christian.

I don't really see any point in printing an error once. If you run your
application twice you end up thinking there was an issue just on the first run
but it's actually always wrong.


Except this nitpick, is there any other concern to push whole patch set? Is 
that time to push whole patch set?

-David



Looks good to me.

I have an additional change to make drm_syncobj_find_fence() also return 
the drm_syncobj : 
https://github.com/djdeath/linux/commit/0b7732b267b931339d71fe6f493ea6fa4eab453e


This is needed in i915 to avoid looking up the drm_syncobj handle twice.

Our driver allows to wait on the syncobj's dma_fence that we're then 
going to replace so we need to get bot the fence & syncobj at the same time.


I guess it can go in a follow up series.


-Lionel





Unless we're willing to take the syncobj lock for longer periods of time when
adding points, I guess we'll have to defer validation to validation layers.


-Lionel



-Lionel



+    dma_fence_chain_init(chain, prev, fence, point);
+    rcu_assign_pointer(syncobj->fence, >base);
+
+    list_for_each_entry_safe(cur, tmp, >cb_list, node) {
+    list_del_init(>node);
+    syncobj_wait_syncobj_func(syncobj, cur);
+    }
+    spin_unlock(>lock);
+
+    /* Walk the chain once to trigger garbage collection */
+    dma_fence_chain_for_each(fence, prev);
+    dma_fence_put(prev);
+}
+EXPORT_SYMBOL(drm_syncobj_add_point);
+
   /**
    * drm_syncobj_replace_fence - replace fence in a sync object.
    * @syncobj: Sync object to replace fence in diff --git
a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h index
0311c9fdbd2f..6cf7243a1dc5 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -27,6 +27,7 @@
   #define __DRM_SYNCOBJ_H__
     #include 
+#include 
     struct drm_file;
   @@ -112,6 +113,10 @@ drm_syncobj_fence_get(struct drm_syncobj
*syncobj)
     struct drm_syncobj *drm_syncobj_find(struct drm_file
*file_private,
    u32 handle);
+void drm_syncobj_add_point(struct drm_syncobj *syncobj,
+   struct dma_fence_chain *chain,
+   struct dma_fence *fence,
+   uint64_t point);
   void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
  struct dma_fence *fence);
   int drm_syncobj_f

Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point interface v4

2019-03-30 Thread Lionel Landwerlin

On 28/03/2019 15:18, Christian König wrote:

Am 28.03.19 um 14:50 schrieb Lionel Landwerlin:

On 25/03/2019 08:32, Chunming Zhou wrote:

From: Christian König 

Use the dma_fence_chain object to create a timeline of fence objects
instead of just replacing the existing fence.

v2: rebase and cleanup
v3: fix garbage collection parameters
v4: add unorder point check, print a warn calltrace

Signed-off-by: Christian König 
Cc: Lionel Landwerlin 
---
  drivers/gpu/drm/drm_syncobj.c | 39 
+++

  include/drm/drm_syncobj.h |  5 +
  2 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/drm_syncobj.c 
b/drivers/gpu/drm/drm_syncobj.c

index 5329e66598c6..19a9ce638119 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -122,6 +122,45 @@ static void drm_syncobj_remove_wait(struct 
drm_syncobj *syncobj,

  spin_unlock(>lock);
  }
  +/**
+ * drm_syncobj_add_point - add new timeline point to the syncobj
+ * @syncobj: sync object to add timeline point do
+ * @chain: chain node to use to add the point
+ * @fence: fence to encapsulate in the chain node
+ * @point: sequence number to use for the point
+ *
+ * Add the chain node as new timeline point to the syncobj.
+ */
+void drm_syncobj_add_point(struct drm_syncobj *syncobj,
+   struct dma_fence_chain *chain,
+   struct dma_fence *fence,
+   uint64_t point)
+{
+    struct syncobj_wait_entry *cur, *tmp;
+    struct dma_fence *prev;
+
+    dma_fence_get(fence);
+
+    spin_lock(>lock);
+
+    prev = drm_syncobj_fence_get(syncobj);
+    /* You are adding an unorder point to timeline, which could 
cause payload returned from query_ioctl is 0! */

+    WARN_ON_ONCE(prev && prev->seqno >= point);



I think the WARN/BUG macros should only fire when there is an issue 
with programming from within the kernel.


But this particular warning can be triggered by an application.


Probably best to just remove it?


Yeah, that was also my argument against it.

Key point here is that we still want to note somehow that userspace 
did something wrong and returning an error is not an option.


Maybe just use DRM_ERROR with a static variable to print the message 
only once.


Christian.


I don't really see any point in printing an error once. If you run your 
application twice you end up thinking there was an issue just on the 
first run but it's actually always wrong.



Unless we're willing to take the syncobj lock for longer periods of time 
when adding points, I guess we'll have to defer validation to validation 
layers.



-Lionel






-Lionel



+    dma_fence_chain_init(chain, prev, fence, point);
+    rcu_assign_pointer(syncobj->fence, >base);
+
+    list_for_each_entry_safe(cur, tmp, >cb_list, node) {
+    list_del_init(>node);
+    syncobj_wait_syncobj_func(syncobj, cur);
+    }
+    spin_unlock(>lock);
+
+    /* Walk the chain once to trigger garbage collection */
+    dma_fence_chain_for_each(fence, prev);
+    dma_fence_put(prev);
+}
+EXPORT_SYMBOL(drm_syncobj_add_point);
+
  /**
   * drm_syncobj_replace_fence - replace fence in a sync object.
   * @syncobj: Sync object to replace fence in
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 0311c9fdbd2f..6cf7243a1dc5 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -27,6 +27,7 @@
  #define __DRM_SYNCOBJ_H__
    #include 
+#include 
    struct drm_file;
  @@ -112,6 +113,10 @@ drm_syncobj_fence_get(struct drm_syncobj 
*syncobj)

    struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
   u32 handle);
+void drm_syncobj_add_point(struct drm_syncobj *syncobj,
+   struct dma_fence_chain *chain,
+   struct dma_fence *fence,
+   uint64_t point);
  void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 struct dma_fence *fence);
  int drm_syncobj_find_fence(struct drm_file *file_private,








___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 8/9] drm/syncobj: add timeline signal ioctl for syncobj v5

2019-03-30 Thread Lionel Landwerlin

On 28/03/2019 13:33, Lionel Landwerlin wrote:

On 28/03/2019 13:08, Chunming Zhou wrote:

在 2019/3/28 20:53, Lionel Landwerlin 写道:

On 25/03/2019 08:32, Chunming Zhou wrote:

v2: individually allocate chain array, since chain node is free
independently.
v3: all existing points must be already signaled before cpu perform
signal operation,
  so add check condition for that.
v4: remove v3 change and add checking to prevent out-of-order
v5: unify binary and timeline

Signed-off-by: Chunming Zhou 
Cc: Tobias Hector 
Cc: Jason Ekstrand 
Cc: Dave Airlie 
Cc: Chris Wilson 
Cc: Lionel Landwerlin 
---
   drivers/gpu/drm/drm_internal.h |  2 +
   drivers/gpu/drm/drm_ioctl.c    |  2 +
   drivers/gpu/drm/drm_syncobj.c  | 73 
++

   include/uapi/drm/drm.h |  1 +
   4 files changed, 78 insertions(+)

diff --git a/drivers/gpu/drm/drm_internal.h
b/drivers/gpu/drm/drm_internal.h
index dd11ae5f1eef..d9a483a5fce0 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -190,6 +190,8 @@ int drm_syncobj_reset_ioctl(struct drm_device
*dev, void *data,
   struct drm_file *file_private);
   int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
    struct drm_file *file_private);
+int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void
*data,
+  struct drm_file *file_private);
   int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file_private);
   diff --git a/drivers/gpu/drm/drm_ioctl.c 
b/drivers/gpu/drm/drm_ioctl.c

index 92b3b7b2fd81..d337f161909c 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -696,6 +696,8 @@ static const struct drm_ioctl_desc drm_ioctls[] 
= {

 DRM_UNLOCKED|DRM_RENDER_ALLOW),
   DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, 
drm_syncobj_signal_ioctl,

 DRM_UNLOCKED|DRM_RENDER_ALLOW),
+    DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL,
drm_syncobj_timeline_signal_ioctl,
+  DRM_UNLOCKED|DRM_RENDER_ALLOW),
   DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl,
 DRM_UNLOCKED|DRM_RENDER_ALLOW),
   DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE,
drm_crtc_get_sequence_ioctl, DRM_UNLOCKED),
diff --git a/drivers/gpu/drm/drm_syncobj.c
b/drivers/gpu/drm/drm_syncobj.c
index ee2d66e047e7..099596190845 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1183,6 +1183,79 @@ drm_syncobj_signal_ioctl(struct drm_device
*dev, void *data,
   return ret;
   }
   +int
+drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file_private)
+{
+    struct drm_syncobj_timeline_array *args = data;
+    struct drm_syncobj **syncobjs;
+    struct dma_fence_chain **chains;
+    uint64_t *points;
+    uint32_t i, j;
+    int ret;
+
+    if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+    return -EOPNOTSUPP;
+
+    if (args->pad != 0)
+    return -EINVAL;
+
+    if (args->count_handles == 0)
+    return -EINVAL;
+
+    ret = drm_syncobj_array_find(file_private,
+ u64_to_user_ptr(args->handles),
+ args->count_handles,
+ );
+    if (ret < 0)
+    return ret;
+
+    points = kmalloc_array(args->count_handles, sizeof(*points),
+   GFP_KERNEL);
+    if (!points) {
+    ret = -ENOMEM;
+    goto out;
+    }
+    if (!u64_to_user_ptr(args->points)) {
+    memset(points, 0, args->count_handles * sizeof(uint64_t));
+    } else if (copy_from_user(points, u64_to_user_ptr(args->points),
+  sizeof(uint64_t) * args->count_handles)) {
+    ret = -EFAULT;
+    goto err_points;
+    }
+
+    chains = kmalloc_array(args->count_handles, sizeof(void *),
GFP_KERNEL);
+    if (!chains) {
+    ret = -ENOMEM;
+    goto err_points;
+    }
+    for (i = 0; i < args->count_handles; i++) {
+    chains[i] = kzalloc(sizeof(struct dma_fence_chain),
GFP_KERNEL);
+    if (!chains[i]) {
+    for (j = 0; j < i; j++)
+    kfree(chains[j]);
+    ret = -ENOMEM;
+    goto err_chains;
+    }
+    }
+
+    for (i = 0; i < args->count_handles; i++) {
+    struct dma_fence *fence = dma_fence_get_stub();
+
+    drm_syncobj_add_point(syncobjs[i], chains[i],
+  fence, points[i]);
+    dma_fence_put(fence);
+    }


I think I found an issue where the implementation doesn't match what
the spec requires on host signaling.

Consider the following :


Timeline semaphore has a value of 4.

A device submits a commands to signal point 5.

The host signals point 6.


The value of the timeline will remain 4 until the device commands
complete, at which point it will change to 6.

But the specification seems to say that the timeline value should be 6
once

Re: [PATCH] drm/gamma: Clarify gamma lut uapi

2019-03-29 Thread Lionel Landwerlin

On 29/03/2019 09:20, Daniel Vetter wrote:

Interpreting it as a 0.16 fixed point means we can't accurately
represent 1.0. Which is one of the values we really should be able to
represent.

Since most (all?) luts have lower precision this will only affect
rounding of 0x.

Cc: Uma Shankar 
Cc: Ville Syrjälä 
Cc: Shashank Sharma 
Cc: "Kumar, Kiran S" 
Cc: Kausal Malladi 
Cc: Lionel Landwerlin 
Cc: Matt Roper 
Cc: Rob Bradford 
Cc: Daniel Stone 
Cc: Stefan Schake 
Cc: Eric Anholt 
Cc: Maarten Lankhorst 
Cc: Harry Wentland 
Cc: Leo Li 
Cc: amd-gfx@lists.freedesktop.org
Cc: James (Qian) Wang 
Cc: Liviu Dudau 
Cc: Mali DP Maintainers 
Cc: CK Hu 
Cc: Philipp Zabel 
Cc: Yannick Fertre 
Cc: Philippe Cornu 
Cc: Benjamin Gaignard 
Cc: Vincent Abriou 
Cc: Tomi Valkeinen 
Cc: Boris Brezillon 
Signed-off-by: Daniel Vetter Signed-off-by: Daniel Vetter 

---
  include/uapi/drm/drm_mode.h | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h
index 09d72966899a..83cd1636b9be 100644
--- a/include/uapi/drm/drm_mode.h
+++ b/include/uapi/drm/drm_mode.h
@@ -621,7 +621,8 @@ struct drm_color_ctm {
  
  struct drm_color_lut {

/*
-* Data is U0.16 fixed point format.
+* Values are mapped linearly to 0.0 - 1.0 range, with 0x0 == 0.0 and
+* 0x == 1.0.
 */
__u16 red;
__u16 green;



Thanks, that was the intention when it was introduced.


Acked-by: Lionel Landwerlin 

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point interface v4

2019-03-28 Thread Lionel Landwerlin

On 25/03/2019 08:32, Chunming Zhou wrote:

From: Christian König 

Use the dma_fence_chain object to create a timeline of fence objects
instead of just replacing the existing fence.

v2: rebase and cleanup
v3: fix garbage collection parameters
v4: add unorder point check, print a warn calltrace

Signed-off-by: Christian König 
Cc: Lionel Landwerlin 
---
  drivers/gpu/drm/drm_syncobj.c | 39 +++
  include/drm/drm_syncobj.h |  5 +
  2 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 5329e66598c6..19a9ce638119 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -122,6 +122,45 @@ static void drm_syncobj_remove_wait(struct drm_syncobj 
*syncobj,
spin_unlock(>lock);
  }
  
+/**

+ * drm_syncobj_add_point - add new timeline point to the syncobj
+ * @syncobj: sync object to add timeline point do
+ * @chain: chain node to use to add the point
+ * @fence: fence to encapsulate in the chain node
+ * @point: sequence number to use for the point
+ *
+ * Add the chain node as new timeline point to the syncobj.
+ */
+void drm_syncobj_add_point(struct drm_syncobj *syncobj,
+  struct dma_fence_chain *chain,
+  struct dma_fence *fence,
+  uint64_t point)
+{
+   struct syncobj_wait_entry *cur, *tmp;
+   struct dma_fence *prev;
+
+   dma_fence_get(fence);
+
+   spin_lock(>lock);
+
+   prev = drm_syncobj_fence_get(syncobj);
+   /* You are adding an unorder point to timeline, which could cause 
payload returned from query_ioctl is 0! */
+   WARN_ON_ONCE(prev && prev->seqno >= point);



I think the WARN/BUG macros should only fire when there is an issue with 
programming from within the kernel.


But this particular warning can be triggered by an application.


Probably best to just remove it?


-Lionel



+   dma_fence_chain_init(chain, prev, fence, point);
+   rcu_assign_pointer(syncobj->fence, >base);
+
+   list_for_each_entry_safe(cur, tmp, >cb_list, node) {
+   list_del_init(>node);
+   syncobj_wait_syncobj_func(syncobj, cur);
+   }
+   spin_unlock(>lock);
+
+   /* Walk the chain once to trigger garbage collection */
+   dma_fence_chain_for_each(fence, prev);
+   dma_fence_put(prev);
+}
+EXPORT_SYMBOL(drm_syncobj_add_point);
+
  /**
   * drm_syncobj_replace_fence - replace fence in a sync object.
   * @syncobj: Sync object to replace fence in
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 0311c9fdbd2f..6cf7243a1dc5 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -27,6 +27,7 @@
  #define __DRM_SYNCOBJ_H__
  
  #include 

+#include 
  
  struct drm_file;
  
@@ -112,6 +113,10 @@ drm_syncobj_fence_get(struct drm_syncobj *syncobj)
  
  struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,

 u32 handle);
+void drm_syncobj_add_point(struct drm_syncobj *syncobj,
+  struct dma_fence_chain *chain,
+  struct dma_fence *fence,
+  uint64_t point);
  void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
   struct dma_fence *fence);
  int drm_syncobj_find_fence(struct drm_file *file_private,



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 8/9] drm/syncobj: add timeline signal ioctl for syncobj v5

2019-03-28 Thread Lionel Landwerlin

On 28/03/2019 13:08, Chunming Zhou wrote:

在 2019/3/28 20:53, Lionel Landwerlin 写道:

On 25/03/2019 08:32, Chunming Zhou wrote:

v2: individually allocate chain array, since chain node is free
independently.
v3: all existing points must be already signaled before cpu perform
signal operation,
  so add check condition for that.
v4: remove v3 change and add checking to prevent out-of-order
v5: unify binary and timeline

Signed-off-by: Chunming Zhou 
Cc: Tobias Hector 
Cc: Jason Ekstrand 
Cc: Dave Airlie 
Cc: Chris Wilson 
Cc: Lionel Landwerlin 
---
   drivers/gpu/drm/drm_internal.h |  2 +
   drivers/gpu/drm/drm_ioctl.c    |  2 +
   drivers/gpu/drm/drm_syncobj.c  | 73 ++
   include/uapi/drm/drm.h |  1 +
   4 files changed, 78 insertions(+)

diff --git a/drivers/gpu/drm/drm_internal.h
b/drivers/gpu/drm/drm_internal.h
index dd11ae5f1eef..d9a483a5fce0 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -190,6 +190,8 @@ int drm_syncobj_reset_ioctl(struct drm_device
*dev, void *data,
   struct drm_file *file_private);
   int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
    struct drm_file *file_private);
+int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void
*data,
+  struct drm_file *file_private);
   int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file_private);
   diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 92b3b7b2fd81..d337f161909c 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -696,6 +696,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
     DRM_UNLOCKED|DRM_RENDER_ALLOW),
   DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl,
     DRM_UNLOCKED|DRM_RENDER_ALLOW),
+    DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL,
drm_syncobj_timeline_signal_ioctl,
+  DRM_UNLOCKED|DRM_RENDER_ALLOW),
   DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl,
     DRM_UNLOCKED|DRM_RENDER_ALLOW),
   DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE,
drm_crtc_get_sequence_ioctl, DRM_UNLOCKED),
diff --git a/drivers/gpu/drm/drm_syncobj.c
b/drivers/gpu/drm/drm_syncobj.c
index ee2d66e047e7..099596190845 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1183,6 +1183,79 @@ drm_syncobj_signal_ioctl(struct drm_device
*dev, void *data,
   return ret;
   }
   +int
+drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file_private)
+{
+    struct drm_syncobj_timeline_array *args = data;
+    struct drm_syncobj **syncobjs;
+    struct dma_fence_chain **chains;
+    uint64_t *points;
+    uint32_t i, j;
+    int ret;
+
+    if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+    return -EOPNOTSUPP;
+
+    if (args->pad != 0)
+    return -EINVAL;
+
+    if (args->count_handles == 0)
+    return -EINVAL;
+
+    ret = drm_syncobj_array_find(file_private,
+ u64_to_user_ptr(args->handles),
+ args->count_handles,
+ );
+    if (ret < 0)
+    return ret;
+
+    points = kmalloc_array(args->count_handles, sizeof(*points),
+   GFP_KERNEL);
+    if (!points) {
+    ret = -ENOMEM;
+    goto out;
+    }
+    if (!u64_to_user_ptr(args->points)) {
+    memset(points, 0, args->count_handles * sizeof(uint64_t));
+    } else if (copy_from_user(points, u64_to_user_ptr(args->points),
+  sizeof(uint64_t) * args->count_handles)) {
+    ret = -EFAULT;
+    goto err_points;
+    }
+
+    chains = kmalloc_array(args->count_handles, sizeof(void *),
GFP_KERNEL);
+    if (!chains) {
+    ret = -ENOMEM;
+    goto err_points;
+    }
+    for (i = 0; i < args->count_handles; i++) {
+    chains[i] = kzalloc(sizeof(struct dma_fence_chain),
GFP_KERNEL);
+    if (!chains[i]) {
+    for (j = 0; j < i; j++)
+    kfree(chains[j]);
+    ret = -ENOMEM;
+    goto err_chains;
+    }
+    }
+
+    for (i = 0; i < args->count_handles; i++) {
+    struct dma_fence *fence = dma_fence_get_stub();
+
+    drm_syncobj_add_point(syncobjs[i], chains[i],
+  fence, points[i]);
+    dma_fence_put(fence);
+    }


I think I found an issue where the implementation doesn't match what
the spec requires on host signaling.

Consider the following :


Timeline semaphore has a value of 4.

A device submits a commands to signal point 5.

The host signals point 6.


The value of the timeline will remain 4 until the device commands
complete, at which point it will change to 6.

But the specification seems to say that the timeline value should be 6
once the host signaling completes.


The only o

Re: [PATCH 8/9] drm/syncobj: add timeline signal ioctl for syncobj v3

2019-03-20 Thread Lionel Landwerlin

On 20/03/2019 03:53, zhoucm1 wrote:



On 2019年03月19日 19:54, Lionel Landwerlin wrote:

On 15/03/2019 12:09, Chunming Zhou wrote:
v2: individually allocate chain array, since chain node is free 
independently.
v3: all existing points must be already signaled before cpu perform 
signal operation,

 so add check condition for that.

Signed-off-by: Chunming Zhou 
---
  drivers/gpu/drm/drm_internal.h |   2 +
  drivers/gpu/drm/drm_ioctl.c    |   2 +
  drivers/gpu/drm/drm_syncobj.c  | 103 
+

  include/uapi/drm/drm.h |   1 +
  4 files changed, 108 insertions(+)

diff --git a/drivers/gpu/drm/drm_internal.h 
b/drivers/gpu/drm/drm_internal.h

index dd11ae5f1eef..d9a483a5fce0 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -190,6 +190,8 @@ int drm_syncobj_reset_ioctl(struct drm_device 
*dev, void *data,

  struct drm_file *file_private);
  int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file_private);
+int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void 
*data,

+  struct drm_file *file_private);
  int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
  struct drm_file *file_private);
  diff --git a/drivers/gpu/drm/drm_ioctl.c 
b/drivers/gpu/drm/drm_ioctl.c

index 92b3b7b2fd81..d337f161909c 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -696,6 +696,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
    DRM_UNLOCKED|DRM_RENDER_ALLOW),
  DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl,
    DRM_UNLOCKED|DRM_RENDER_ALLOW),
+    DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, 
drm_syncobj_timeline_signal_ioctl,

+  DRM_UNLOCKED|DRM_RENDER_ALLOW),
  DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl,
    DRM_UNLOCKED|DRM_RENDER_ALLOW),
  DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, 
drm_crtc_get_sequence_ioctl, DRM_UNLOCKED),
diff --git a/drivers/gpu/drm/drm_syncobj.c 
b/drivers/gpu/drm/drm_syncobj.c

index 306c7b7e2770..eaeb038f97d7 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1183,6 +1183,109 @@ drm_syncobj_signal_ioctl(struct drm_device 
*dev, void *data,

  return ret;
  }
  +int
+drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file_private)
+{
+    struct drm_syncobj_timeline_array *args = data;
+    struct drm_syncobj **syncobjs;
+    struct dma_fence_chain **chains;
+    uint64_t *points;
+    uint32_t i, j, timeline_count = 0;
+    int ret;
+
+    if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+    return -EOPNOTSUPP;
+
+    if (args->pad != 0)
+    return -EINVAL;
+
+    if (args->count_handles == 0)
+    return -EINVAL;
+
+    ret = drm_syncobj_array_find(file_private,
+ u64_to_user_ptr(args->handles),
+ args->count_handles,
+ );
+    if (ret < 0)
+    return ret;
+
+    for (i = 0; i < args->count_handles; i++) {
+    struct dma_fence_chain *chain;
+    struct dma_fence *fence;
+
+    fence = drm_syncobj_fence_get(syncobjs[i]);
+    chain = to_dma_fence_chain(fence);
+    if (chain) {
+    struct dma_fence *iter;
+
+    dma_fence_chain_for_each(iter, fence) {
+    if (!iter)
+    break;
+    if (!dma_fence_is_signaled(iter)) {
+    dma_fence_put(iter);
+    DRM_ERROR("Client must guarantee all existing 
timeline points signaled before performing host signal operation!");

+    ret = -EPERM;
+    goto out;



Sorry if I'm failing to remember whether we discussed this before.


Signaling a point from the host should be fine even if the previous 
points in the timeline are not signaled.

ok, will remove that checking.



After all this can happen on the device side as well (out of order 
signaling).



I thought the thing we didn't want is out of order submission.

Just checking the last chain node seqno against the host signal point 
should be enough.



What about simply returning -EPERM, we can warn the application from 
userspace?
OK, will add that. 



Sorry for coming back to this again, but we probably ought to drop this 
check completely.


This is allowed on the device signaling side, so I'm not quite sure what 
that protects us from.



Also we do the check at this point in the signal_timeline_ioctl() 
function but there is nothing that says that when we later call the 
add_point() function this condition still holds.



-Lionel

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https:

Re: [PATCH 8/9] drm/syncobj: add timeline signal ioctl for syncobj v3

2019-03-20 Thread Lionel Landwerlin

On 20/03/2019 03:53, zhoucm1 wrote:



On 2019年03月19日 19:54, Lionel Landwerlin wrote:

On 15/03/2019 12:09, Chunming Zhou wrote:
v2: individually allocate chain array, since chain node is free 
independently.
v3: all existing points must be already signaled before cpu perform 
signal operation,

 so add check condition for that.

Signed-off-by: Chunming Zhou 
---
  drivers/gpu/drm/drm_internal.h |   2 +
  drivers/gpu/drm/drm_ioctl.c    |   2 +
  drivers/gpu/drm/drm_syncobj.c  | 103 
+

  include/uapi/drm/drm.h |   1 +
  4 files changed, 108 insertions(+)

diff --git a/drivers/gpu/drm/drm_internal.h 
b/drivers/gpu/drm/drm_internal.h

index dd11ae5f1eef..d9a483a5fce0 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -190,6 +190,8 @@ int drm_syncobj_reset_ioctl(struct drm_device 
*dev, void *data,

  struct drm_file *file_private);
  int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file_private);
+int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void 
*data,

+  struct drm_file *file_private);
  int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
  struct drm_file *file_private);
  diff --git a/drivers/gpu/drm/drm_ioctl.c 
b/drivers/gpu/drm/drm_ioctl.c

index 92b3b7b2fd81..d337f161909c 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -696,6 +696,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
    DRM_UNLOCKED|DRM_RENDER_ALLOW),
  DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl,
    DRM_UNLOCKED|DRM_RENDER_ALLOW),
+    DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, 
drm_syncobj_timeline_signal_ioctl,

+  DRM_UNLOCKED|DRM_RENDER_ALLOW),
  DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl,
    DRM_UNLOCKED|DRM_RENDER_ALLOW),
  DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, 
drm_crtc_get_sequence_ioctl, DRM_UNLOCKED),
diff --git a/drivers/gpu/drm/drm_syncobj.c 
b/drivers/gpu/drm/drm_syncobj.c

index 306c7b7e2770..eaeb038f97d7 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1183,6 +1183,109 @@ drm_syncobj_signal_ioctl(struct drm_device 
*dev, void *data,

  return ret;
  }
  +int
+drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file_private)
+{
+    struct drm_syncobj_timeline_array *args = data;
+    struct drm_syncobj **syncobjs;
+    struct dma_fence_chain **chains;
+    uint64_t *points;
+    uint32_t i, j, timeline_count = 0;
+    int ret;
+
+    if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+    return -EOPNOTSUPP;
+
+    if (args->pad != 0)
+    return -EINVAL;
+
+    if (args->count_handles == 0)
+    return -EINVAL;
+
+    ret = drm_syncobj_array_find(file_private,
+ u64_to_user_ptr(args->handles),
+ args->count_handles,
+ );
+    if (ret < 0)
+    return ret;
+
+    for (i = 0; i < args->count_handles; i++) {
+    struct dma_fence_chain *chain;
+    struct dma_fence *fence;
+
+    fence = drm_syncobj_fence_get(syncobjs[i]);
+    chain = to_dma_fence_chain(fence);
+    if (chain) {
+    struct dma_fence *iter;
+
+    dma_fence_chain_for_each(iter, fence) {
+    if (!iter)
+    break;
+    if (!dma_fence_is_signaled(iter)) {
+    dma_fence_put(iter);
+    DRM_ERROR("Client must guarantee all existing 
timeline points signaled before performing host signal operation!");

+    ret = -EPERM;
+    goto out;



Sorry if I'm failing to remember whether we discussed this before.


Signaling a point from the host should be fine even if the previous 
points in the timeline are not signaled.

ok, will remove that checking.



After all this can happen on the device side as well (out of order 
signaling).



I thought the thing we didn't want is out of order submission.

Just checking the last chain node seqno against the host signal point 
should be enough.



What about simply returning -EPERM, we can warn the application from 
userspace?

OK, will add that.





+    }
+    }
+    }
+    }
+
+    points = kmalloc_array(args->count_handles, sizeof(*points),
+   GFP_KERNEL);
+    if (!points) {
+    ret = -ENOMEM;
+    goto out;
+    }
+    if (!u64_to_user_ptr(args->points)) {
+    memset(points, 0, args->count_handles * sizeof(uint64_t));
+    } else if (copy_from_user(points, u64_to_user_ptr(args->points),
+  sizeof(uint64_t) * args->count_han

Re: [PATCH 1/9] dma-buf: add new dma_fence_chain container v5

2019-03-19 Thread Lionel Landwerlin

On 15/03/2019 12:09, Chunming Zhou wrote:

From: Christian König 

Lockless container implementation similar to a dma_fence_array, but with
only two elements per node and automatic garbage collection.

v2: properly document dma_fence_chain_for_each, add dma_fence_chain_find_seqno,
 drop prev reference during garbage collection if it's not a chain fence.
v3: use head and iterator for dma_fence_chain_for_each
v4: fix reference count in dma_fence_chain_enable_signaling
v5: fix iteration when walking each chain node

Signed-off-by: Christian König 
---
  drivers/dma-buf/Makefile  |   3 +-
  drivers/dma-buf/dma-fence-chain.c | 241 ++
  include/linux/dma-fence-chain.h   |  81 ++
  3 files changed, 324 insertions(+), 1 deletion(-)
  create mode 100644 drivers/dma-buf/dma-fence-chain.c
  create mode 100644 include/linux/dma-fence-chain.h

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 0913a6ccab5a..1f006e083eb9 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,4 +1,5 @@
-obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-fence.o
+obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
+reservation.o seqno-fence.o
  obj-$(CONFIG_SYNC_FILE)   += sync_file.o
  obj-$(CONFIG_SW_SYNC) += sw_sync.o sync_debug.o
  obj-$(CONFIG_UDMABUF) += udmabuf.o
diff --git a/drivers/dma-buf/dma-fence-chain.c 
b/drivers/dma-buf/dma-fence-chain.c
new file mode 100644
index ..0c5e3c902fa0
--- /dev/null
+++ b/drivers/dma-buf/dma-fence-chain.c
@@ -0,0 +1,241 @@
+/*
+ * fence-chain: chain fences together in a timeline
+ *
+ * Copyright (C) 2018 Advanced Micro Devices, Inc.
+ * Authors:
+ * Christian König 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include 
+
+static bool dma_fence_chain_enable_signaling(struct dma_fence *fence);
+
+/**
+ * dma_fence_chain_get_prev - use RCU to get a reference to the previous fence
+ * @chain: chain node to get the previous node from
+ *
+ * Use dma_fence_get_rcu_safe to get a reference to the previous fence of the
+ * chain node.
+ */
+static struct dma_fence *dma_fence_chain_get_prev(struct dma_fence_chain 
*chain)
+{
+   struct dma_fence *prev;
+
+   rcu_read_lock();
+   prev = dma_fence_get_rcu_safe(>prev);
+   rcu_read_unlock();
+   return prev;
+}
+
+/**
+ * dma_fence_chain_walk - chain walking function
+ * @fence: current chain node
+ *
+ * Walk the chain to the next node. Returns the next fence or NULL if we are at
+ * the end of the chain. Garbage collects chain nodes which are already
+ * signaled.
+ */
+struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence)
+{
+   struct dma_fence_chain *chain, *prev_chain;
+   struct dma_fence *prev, *replacement, *tmp;
+
+   chain = to_dma_fence_chain(fence);
+   if (!chain) {
+   dma_fence_put(fence);
+   return NULL;
+   }
+
+   while ((prev = dma_fence_chain_get_prev(chain))) {
+
+   prev_chain = to_dma_fence_chain(prev);
+   if (prev_chain) {
+   if (!dma_fence_is_signaled(prev_chain->fence))
+   break;
+
+   replacement = dma_fence_chain_get_prev(prev_chain);
+   } else {
+   if (!dma_fence_is_signaled(prev))
+   break;
+
+   replacement = NULL;
+   }
+
+   tmp = cmpxchg(>prev, prev, replacement);
+   if (tmp == prev)
+   dma_fence_put(tmp);
+   else
+   dma_fence_put(replacement);
+   dma_fence_put(prev);
+   }
+
+   dma_fence_put(fence);
+   return prev;
+}
+EXPORT_SYMBOL(dma_fence_chain_walk);
+
+/**
+ * dma_fence_chain_find_seqno - find fence chain node by seqno
+ * @pfence: pointer to the chain node where to start
+ * @seqno: the sequence number to search for
+ *
+ * Advance the fence pointer to the chain node which will signal this sequence
+ * number. If no sequence number is provided then this is a no-op.
+ *
+ * Returns EINVAL if the fence is not a chain node or the sequence number has
+ * not yet advanced far enough.
+ */
+int dma_fence_chain_find_seqno(struct dma_fence **pfence, uint64_t seqno)
+{
+   struct dma_fence_chain *chain;
+
+   if (!seqno)
+   return 0;
+
+   chain = to_dma_fence_chain(*pfence);
+   if (!chain || chain->base.seqno < seqno)
+   return -EINVAL;

Re: [PATCH 8/9] drm/syncobj: add timeline signal ioctl for syncobj v3

2019-03-19 Thread Lionel Landwerlin

On 15/03/2019 12:09, Chunming Zhou wrote:

v2: individually allocate chain array, since chain node is free independently.
v3: all existing points must be already signaled before cpu perform signal 
operation,
 so add check condition for that.

Signed-off-by: Chunming Zhou 
---
  drivers/gpu/drm/drm_internal.h |   2 +
  drivers/gpu/drm/drm_ioctl.c|   2 +
  drivers/gpu/drm/drm_syncobj.c  | 103 +
  include/uapi/drm/drm.h |   1 +
  4 files changed, 108 insertions(+)

diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index dd11ae5f1eef..d9a483a5fce0 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -190,6 +190,8 @@ int drm_syncobj_reset_ioctl(struct drm_device *dev, void 
*data,
struct drm_file *file_private);
  int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_private);
+int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file_private);
  int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_private);
  
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c

index 92b3b7b2fd81..d337f161909c 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -696,6 +696,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
  DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl,
  DRM_UNLOCKED|DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, 
drm_syncobj_timeline_signal_ioctl,
+ DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl,
  DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, 
DRM_UNLOCKED),
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 306c7b7e2770..eaeb038f97d7 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1183,6 +1183,109 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void 
*data,
return ret;
  }
  
+int

+drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file_private)
+{
+   struct drm_syncobj_timeline_array *args = data;
+   struct drm_syncobj **syncobjs;
+   struct dma_fence_chain **chains;
+   uint64_t *points;
+   uint32_t i, j, timeline_count = 0;
+   int ret;
+
+   if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+   return -EOPNOTSUPP;
+
+   if (args->pad != 0)
+   return -EINVAL;
+
+   if (args->count_handles == 0)
+   return -EINVAL;
+
+   ret = drm_syncobj_array_find(file_private,
+u64_to_user_ptr(args->handles),
+args->count_handles,
+);
+   if (ret < 0)
+   return ret;
+
+   for (i = 0; i < args->count_handles; i++) {
+   struct dma_fence_chain *chain;
+struct dma_fence *fence;
+
+fence = drm_syncobj_fence_get(syncobjs[i]);
+chain = to_dma_fence_chain(fence);
+if (chain) {
+struct dma_fence *iter;
+
+dma_fence_chain_for_each(iter, fence) {
+if (!iter)
+break;
+   if (!dma_fence_is_signaled(iter)) {
+   dma_fence_put(iter);
+   DRM_ERROR("Client must guarantee all 
existing timeline points signaled before performing host signal operation!");
+   ret = -EPERM;
+   goto out;



Sorry if I'm failing to remember whether we discussed this before.


Signaling a point from the host should be fine even if the previous 
points in the timeline are not signaled.


After all this can happen on the device side as well (out of order 
signaling).



I thought the thing we didn't want is out of order submission.

Just checking the last chain node seqno against the host signal point 
should be enough.



What about simply returning -EPERM, we can warn the application from 
userspace?




+   }
+}
+}
+}
+
+   points = kmalloc_array(args->count_handles, sizeof(*points),
+  GFP_KERNEL);
+   if (!points) {
+   ret = -ENOMEM;
+   goto out;
+   }
+   if (!u64_to_user_ptr(args->points)) {
+   memset(points, 0, 

Re: [PATCH 3/9] drm/syncobj: add support for timeline point wait v8

2019-03-18 Thread Lionel Landwerlin

On 18/03/2019 17:20, Koenig, Christian wrote:

   -    if (dma_fence_is_signaled(entries[i].fence)) {
+    if (fence)
+    entries[i].fence = fence;
+    else
+    entries[i].fence = dma_fence_get_stub();
+
+    if ((flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE) ||

Hey David,

Could you remind me what DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE is used
for?

I didn't have a use for it in userspace,

The flag is used to wait for fences for a sequence number to show up.

Christian.



Thanks Christian.

I guess I missed the additive nature of WAIT_FOR_SUBMIT.


It feels like WAIT_AVAILABLE almost deserves its own commit as it 
affects both timelines & binary syncobjs.



-Lionel

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 3/9] drm/syncobj: add support for timeline point wait v8

2019-03-18 Thread Lionel Landwerlin

On 15/03/2019 12:09, Chunming Zhou wrote:

points array is one-to-one match with syncobjs array.
v2:
add seperate ioctl for timeline point wait, otherwise break uapi.
v3:
userspace can specify two kinds waits::
a. Wait for time point to be completed.
b. and wait for time point to become available
v4:
rebase
v5:
add comment for xxx_WAIT_AVAILABLE
v6: rebase and rework on new container
v7: drop _WAIT_COMPLETED, it is the default anyway
v8: correctly handle garbage collected fences

Signed-off-by: Chunming Zhou 
Signed-off-by: Christian König 
Cc: Daniel Rakos 
Cc: Jason Ekstrand 
Cc: Bas Nieuwenhuizen 
Cc: Dave Airlie 
Cc: Chris Wilson 
---
  drivers/gpu/drm/drm_internal.h |   2 +
  drivers/gpu/drm/drm_ioctl.c|   2 +
  drivers/gpu/drm/drm_syncobj.c  | 153 ++---
  include/uapi/drm/drm.h |  15 
  4 files changed, 143 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 251d67e04c2d..331ac6225b58 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -182,6 +182,8 @@ int drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, 
void *data,
   struct drm_file *file_private);
  int drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file_private);
+int drm_syncobj_timeline_wait_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file_private);
  int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_private);
  int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 687943df58e1..c984654646fa 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -688,6 +688,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
  DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl,
  DRM_UNLOCKED|DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, 
drm_syncobj_timeline_wait_ioctl,
+ DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_RESET, drm_syncobj_reset_ioctl,
  DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl,
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 19a9ce638119..eae51978cda4 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -61,6 +61,7 @@ struct syncobj_wait_entry {
struct task_struct *task;
struct dma_fence *fence;
struct dma_fence_cb fence_cb;
+   u64point;
  };
  
  static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,

@@ -95,6 +96,8 @@ EXPORT_SYMBOL(drm_syncobj_find);
  static void drm_syncobj_fence_add_wait(struct drm_syncobj *syncobj,
   struct syncobj_wait_entry *wait)
  {
+   struct dma_fence *fence;
+
if (wait->fence)
return;
  
@@ -103,11 +106,15 @@ static void drm_syncobj_fence_add_wait(struct drm_syncobj *syncobj,

 * have the lock, try one more time just to be sure we don't add a
 * callback when a fence has already been set.
 */
-   if (syncobj->fence)
-   wait->fence = dma_fence_get(
-   rcu_dereference_protected(syncobj->fence, 1));
-   else
+   fence = dma_fence_get(rcu_dereference_protected(syncobj->fence, 1));
+   if (!fence || dma_fence_chain_find_seqno(, wait->point)) {
+   dma_fence_put(fence);
list_add_tail(>node, >cb_list);
+   } else if (!fence) {
+   wait->fence = dma_fence_get_stub();
+   } else {
+   wait->fence = fence;
+   }
spin_unlock(>lock);
  }
  
@@ -149,10 +156,8 @@ void drm_syncobj_add_point(struct drm_syncobj *syncobj,

dma_fence_chain_init(chain, prev, fence, point);
rcu_assign_pointer(syncobj->fence, >base);
  
-	list_for_each_entry_safe(cur, tmp, >cb_list, node) {

-   list_del_init(>node);
+   list_for_each_entry_safe(cur, tmp, >cb_list, node)
syncobj_wait_syncobj_func(syncobj, cur);
-   }
spin_unlock(>lock);
  
  	/* Walk the chain once to trigger garbage collection */

@@ -184,10 +189,8 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
rcu_assign_pointer(syncobj->fence, fence);
  
  	if (fence != old_fence) {

-   list_for_each_entry_safe(cur, tmp, >cb_list, node) {
-   list_del_init(>node);
+   list_for_each_entry_safe(cur, tmp, >cb_list, node)
syncobj_wait_syncobj_func(syncobj, cur);
-   }
}
  
  	spin_unlock(>lock);

@@ -644,13 +647,27 @@ 

Re: [PATCH 09/11] drm/syncobj: add transition iotcls between binary and timeline

2019-02-19 Thread Lionel Landwerlin

Thanks David,

Will give this a go!

-Lionel

On 19/02/2019 10:46, zhoucm1 wrote:


Hi Lionel,

the attached should fix your problem and also messed signal order.

Hi Christian,

Could you have a look if it's reasonable?


btw: I pushed to change to 
https://github.com/amingriyue/timeline-syncobj-kernel, which is 
already rebased to latest drm-misc(kernel 5.0). You can directly use 
that branch.



-David


On 2019年02月19日 01:01, Koenig, Christian wrote:

Am 18.02.19 um 13:07 schrieb Lionel Landwerlin:

Thanks guys :)

You mentioned that signaling out of order is illegal.
Is this illegal with regard to the vulkan spec or to the syncobj 
implementation?


David is the expert on that, but as far as I know that is forbidden 
by the vulkan spec.


I'm not finding anything in the vulkan spec that makes out of order 
signaling illegal.
That's why I came up with this test, just verifying that the 
timeline does not go backward in term of its payload.


Well we need to handle this case gracefully in the kernel, so it is 
still a good testcase.


Christian.



-Lionel

On 18/02/2019 11:01, Koenig, Christian wrote:

Hi David,

well I think Lionel is testing the invalid signal order on purpose :)

Anyway we really need to handle invalid order graceful here. E.g. 
either the same way as during CS or we abort and return an error 
message.


I think just using the same approach as during CS ist the best we 
can do.


Regards,
Christian


Am 18.02.2019 11:35 schrieb "Zhou, David(ChunMing)" 
:


Hi Lionel,

I checked your igt test case,

uint64_t points[5] = { 1, 5, 3, 7, 6 };

which is illegal signal order.

I must admit we should handle it gracefully if signal isn't 
in-order, and we shouldn't lead to deadlock.


Hi Christian,

Can we just ignore when signal point X <= timeline Y? Or just give 
a warning?


Otherwise like Lionel's unexpected use cases, which easily leads to 
deadlock.



-David


On 2019年02月15日 22:28, Lionel Landwerlin wrote:

Hi David,

Thanks a lot for point me to the tests you've added in IGT.
While adding a test with that signals fences imported into a timeline
syncobj out of order, I ran into a deadlock.
Here is the test :
https://github.com/djdeath/intel-gpu-tools/commit/1e46cf7e7bff09b78a24367ddc2314f97eb0a1b9

Trying to kill the deadlocked process I got this backtrace :


[   33.969136] [IGT] syncobj_timeline: starting subtest signal-order
[   60.452823] watchdog: BUG: soft lockup - CPU#6 stuck for 23s!
[syncobj_timelin:2021]
[   60.452826] Modules linked in: rfcomm cmac bnep binfmt_misc
nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek
snd_hda_codec_generic ledtrig_audio sch_fq_codel ib_iser snd_hda_intel
rdma_cm iw_cm snd_hda_codec ib_cm snd_hda_core snd_hwdep intel_rapl
snd_pcm ib_core x86_pkg_temp_thermal intel_powerclamp configf
s coretemp iscsi_tcp snd_seq_midi libiscsi_tcp snd_seq_midi_event
libiscsi kvm_intel scsi_transport_iscsi kvm btusb snd_rawmidi irqbypass
btrtl intel_cstate intel_rapl_perf btbcm btintel bluetooth snd_seq
snd_seq_device snd_timer input_leds ecdh_generic snd soundcore mei_me
mei intel_pch_thermal mac_hid acpi_pad parp
ort_pc ppdev lp parport ip_tables x_tables autofs4 btrfs zstd_decompress
zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq
async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear
hid_generic usbhid hid i915 crct10dif_pclmul crc32_pclmul i2c_algo_bit
ghash_clmulni_intel prime_numbers
drm_kms_helper aesni_intel syscopyarea sysfillrect
[   60.452876]  sysimgblt fb_sys_fops aes_x86_64 crypto_simd sdhci_pci
cryptd drm e1000e glue_helper cqhci sdhci wmi video
[   60.452881] CPU: 6 PID: 2021 Comm: syncobj_timelin Tainted: G
U    5.0.0-rc5+ #337
[   60.452882] Hardware name:  /NUC6i7KYB, BIOS
KYSKLi70.86A.0042.2016.0929.1933 09/29/2016
[   60.452886] RIP: 0010:dma_fence_chain_walk+0x22c/0x260
[   60.452888] Code: ff e9 93 fe ff ff 48 8b 45 08 48 8b 40 18 48 85 c0
74 0c 48 89 ef e8 33 0f 58 00 84 c0 75 23 f0 41 ff 4d 00 0f 88 99 87 2f
00 <0f> 85 05 fe ff ff 4c 89 ef e8 56 ea ff ff 48 89 d8 5b 5d 41 5c 41
[   60.452888] RSP: 0018:9a5804653ca8 EFLAGS: 00010296 ORIG_RAX:
ff13
[   60.452889] RAX:  RBX: 8f5690fb2480 RCX:
8f5690fb2f00
[   60.452890] RDX: 003e3730 RSI:  RDI:
8f5690fb2180
[   60.452891] RBP: 8f5690fb2180 R08:  R09:
8f5690fb2eb0
[   60.452891] R10:  R11: 8f5660469860 R12:
8f5690fb2f68
[   60.452892] R13: 8f5690fb2f00 R14: 0003 R15:
8f5655a45fc0
[   60.452913] FS:  7fdc5c459980() GS:8f569eb8()
knlGS:
[   60.452913] CS:  0010 DS:  ES:  CR0: 80050033
[   60.452914] CR2: 7f9d74336dd8 CR3: 00084a67e004 CR4:
003606e0
[   60.452915] DR0:  DR1:  DR2:

[   60.452915] DR3:  DR6: fffe0ff0 DR7:
0400
[   60

Re: [PATCH 09/11] drm/syncobj: add transition iotcls between binary and timeline

2019-02-18 Thread Lionel Landwerlin

Thanks guys :)

You mentioned that signaling out of order is illegal.
Is this illegal with regard to the vulkan spec or to the syncobj 
implementation?


I'm not finding anything in the vulkan spec that makes out of order 
signaling illegal.
That's why I came up with this test, just verifying that the timeline 
does not go backward in term of its payload.


-Lionel

On 18/02/2019 11:01, Koenig, Christian wrote:

Hi David,

well I think Lionel is testing the invalid signal order on purpose :)

Anyway we really need to handle invalid order graceful here. E.g. 
either the same way as during CS or we abort and return an error message.


I think just using the same approach as during CS ist the best we can do.

Regards,
Christian


Am 18.02.2019 11:35 schrieb "Zhou, David(ChunMing)" :

Hi Lionel,

I checked your igt test case,

uint64_t points[5] = { 1, 5, 3, 7, 6 };

which is illegal signal order.

I must admit we should handle it gracefully if signal isn't in-order, 
and we shouldn't lead to deadlock.


Hi Christian,

Can we just ignore when signal point X <= timeline Y? Or just give a 
warning?


Otherwise like Lionel's unexpected use cases, which easily leads to 
deadlock.



-David


On 2019年02月15日 22:28, Lionel Landwerlin wrote:

Hi David,

Thanks a lot for point me to the tests you've added in IGT.
While adding a test with that signals fences imported into a timeline
syncobj out of order, I ran into a deadlock.
Here is the test :
https://github.com/djdeath/intel-gpu-tools/commit/1e46cf7e7bff09b78a24367ddc2314f97eb0a1b9

Trying to kill the deadlocked process I got this backtrace :


[   33.969136] [IGT] syncobj_timeline: starting subtest signal-order
[   60.452823] watchdog: BUG: soft lockup - CPU#6 stuck for 23s!
[syncobj_timelin:2021]
[   60.452826] Modules linked in: rfcomm cmac bnep binfmt_misc
nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek
snd_hda_codec_generic ledtrig_audio sch_fq_codel ib_iser snd_hda_intel
rdma_cm iw_cm snd_hda_codec ib_cm snd_hda_core snd_hwdep intel_rapl
snd_pcm ib_core x86_pkg_temp_thermal intel_powerclamp configf
s coretemp iscsi_tcp snd_seq_midi libiscsi_tcp snd_seq_midi_event
libiscsi kvm_intel scsi_transport_iscsi kvm btusb snd_rawmidi irqbypass
btrtl intel_cstate intel_rapl_perf btbcm btintel bluetooth snd_seq
snd_seq_device snd_timer input_leds ecdh_generic snd soundcore mei_me
mei intel_pch_thermal mac_hid acpi_pad parp
ort_pc ppdev lp parport ip_tables x_tables autofs4 btrfs zstd_decompress
zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq
async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear
hid_generic usbhid hid i915 crct10dif_pclmul crc32_pclmul i2c_algo_bit
ghash_clmulni_intel prime_numbers
drm_kms_helper aesni_intel syscopyarea sysfillrect
[   60.452876]  sysimgblt fb_sys_fops aes_x86_64 crypto_simd sdhci_pci
cryptd drm e1000e glue_helper cqhci sdhci wmi video
[   60.452881] CPU: 6 PID: 2021 Comm: syncobj_timelin Tainted: G
U    5.0.0-rc5+ #337
[   60.452882] Hardware name:  /NUC6i7KYB, BIOS
KYSKLi70.86A.0042.2016.0929.1933 09/29/2016
[   60.452886] RIP: 0010:dma_fence_chain_walk+0x22c/0x260
[   60.452888] Code: ff e9 93 fe ff ff 48 8b 45 08 48 8b 40 18 48 85 c0
74 0c 48 89 ef e8 33 0f 58 00 84 c0 75 23 f0 41 ff 4d 00 0f 88 99 87 2f
00 <0f> 85 05 fe ff ff 4c 89 ef e8 56 ea ff ff 48 89 d8 5b 5d 41 5c 41
[   60.452888] RSP: 0018:9a5804653ca8 EFLAGS: 00010296 ORIG_RAX:
ff13
[   60.452889] RAX:  RBX: 8f5690fb2480 RCX:
8f5690fb2f00
[   60.452890] RDX: 003e3730 RSI:  RDI:
8f5690fb2180
[   60.452891] RBP: 8f5690fb2180 R08:  R09:
8f5690fb2eb0
[   60.452891] R10:  R11: 8f5660469860 R12:
8f5690fb2f68
[   60.452892] R13: 8f5690fb2f00 R14: 0003 R15:
8f5655a45fc0
[   60.452913] FS:  7fdc5c459980() GS:8f569eb8()
knlGS:
[   60.452913] CS:  0010 DS:  ES:  CR0: 80050033
[   60.452914] CR2: 7f9d74336dd8 CR3: 00084a67e004 CR4:
003606e0
[   60.452915] DR0:  DR1:  DR2:

[   60.452915] DR3:  DR6: fffe0ff0 DR7:
0400
[   60.452916] Call Trace:
[   60.452958]  drm_syncobj_add_point+0x102/0x160 [drm]
[   60.452965]  ? drm_syncobj_fd_to_handle_ioctl+0x1b0/0x1b0 [drm]
[   60.452971]  drm_syncobj_transfer_ioctl+0x10f/0x180 [drm]
[   60.452978]  drm_ioctl_kernel+0xac/0xf0 [drm]
[   60.452984]  drm_ioctl+0x2eb/0x3b0 [drm]
[   60.452990]  ? drm_syncobj_fd_to_handle_ioctl+0x1b0/0x1b0 [drm]
[   60.452992]  ? sw_sync_ioctl+0x347/0x370
[   60.452994]  do_vfs_ioctl+0xa4/0x640
[   60.452995]  ? __fput+0x134/0x220
[   60.452997]  ? do_fcntl+0x1a5/0x650
[   60.452998]  ksys_ioctl+0x70/0x80
[   60.452999]  __x64_sys_ioctl+0x16/0x20
[   60.453002]  do_syscall_64+0x55/0x110
[   60.453004]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   60.453005] RIP: 

Re: [PATCH 06/11] drm/syncobj: add timeline payload query ioctl v4

2019-02-18 Thread Lionel Landwerlin

On 18/02/2019 07:28, Koenig, Christian wrote:

Am 18.02.19 um 04:10 schrieb zhoucm1:


On 2019年02月17日 03:22, Christian König wrote:

Am 15.02.19 um 20:31 schrieb Lionel Landwerlin via amd-gfx:

On 07/12/2018 09:55, Chunming Zhou wrote:

user mode can query timeline payload.
v2: check return value of copy_to_user
v3: handle querying entry by entry
v4: rebase on new chain container, simplify interface

Signed-off-by: Chunming Zhou 
Cc: Daniel Rakos 
Cc: Jason Ekstrand 
Cc: Bas Nieuwenhuizen 
Cc: Dave Airlie 
Cc: Christian König 
Cc: Chris Wilson 
---
   drivers/gpu/drm/drm_internal.h |  2 ++
   drivers/gpu/drm/drm_ioctl.c    |  2 ++
   drivers/gpu/drm/drm_syncobj.c  | 43
++
   include/uapi/drm/drm.h | 10 
   4 files changed, 57 insertions(+)

diff --git a/drivers/gpu/drm/drm_internal.h
b/drivers/gpu/drm/drm_internal.h
index 18b41e10195c..dab4d5936441 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -184,6 +184,8 @@ int drm_syncobj_reset_ioctl(struct drm_device
*dev, void *data,
   struct drm_file *file_private);
   int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
    struct drm_file *file_private);
+int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
+    struct drm_file *file_private);
     /* drm_framebuffer.c */
   void drm_framebuffer_print_info(struct drm_printer *p, unsigned
int indent,
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index a9a17ed35cc4..7578ef6dc1d1 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -681,6 +681,8 @@ static const struct drm_ioctl_desc drm_ioctls[]
= {
     DRM_UNLOCKED|DRM_RENDER_ALLOW),
   DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL,
drm_syncobj_signal_ioctl,
     DRM_UNLOCKED|DRM_RENDER_ALLOW),
+    DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl,
+  DRM_UNLOCKED|DRM_RENDER_ALLOW),
   DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE,
drm_crtc_get_sequence_ioctl, DRM_UNLOCKED),
   DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE,
drm_crtc_queue_sequence_ioctl, DRM_UNLOCKED),
   DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE,
drm_mode_create_lease_ioctl, DRM_MASTER|DRM_UNLOCKED),
diff --git a/drivers/gpu/drm/drm_syncobj.c
b/drivers/gpu/drm/drm_syncobj.c
index 348079bb0965..f97fa00ca1d0 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1061,3 +1061,46 @@ drm_syncobj_signal_ioctl(struct drm_device
*dev, void *data,
     return ret;
   }
+
+int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
+    struct drm_file *file_private)
+{
+    struct drm_syncobj_timeline_array *args = data;
+    struct drm_syncobj **syncobjs;
+    uint64_t __user *points = u64_to_user_ptr(args->points);
+    uint32_t i;
+    int ret;
+
+    if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+    return -ENODEV;
+
+    if (args->pad != 0)
+    return -EINVAL;
+
+    if (args->count_handles == 0)
+    return -EINVAL;
+
+    ret = drm_syncobj_array_find(file_private,
+ u64_to_user_ptr(args->handles),
+ args->count_handles,
+ );
+    if (ret < 0)
+    return ret;
+
+    for (i = 0; i < args->count_handles; i++) {
+    struct dma_fence_chain *chain;
+    struct dma_fence *fence;
+    uint64_t point;
+
+    fence = drm_syncobj_fence_get(syncobjs[i]);
+    chain = to_dma_fence_chain(fence);
+    point = chain ? fence->seqno : 0;


Sorry, I don' t want to sound annoying, but this looks like this
could report values going backward.

Well please be annoying as much as you can :) But yeah all that stuff
has been discussed before as well.


Anything add a point X to a timeline that has reached value Y with X
< Y would trigger that.

Yes, that can indeed happen.

trigger what? when adding x (x < y), then return 0 when query?
Why would this happen?
No, syncobj->fence should always be there and the last chain node, if
it is ever added.

Well maybe Lionel should clarify a bit what he means?

I thought he is concerned that the call could return values where X < Y,
but that doesn't seem to be the case.

Christian.



I meant something like this :


t = create_timeline_syncobj()

signal(t, 2)

query(t) => 2

signal(t, 1)

query(t) => 1


-Lionel






-David

But adding a timeline point X which is before the already added point
Y is illegal in the first place :)

So when the application does something stupid and breaks it can just
keep the pieces.

In the kernel we still do the most defensive thing and sync to
everything in this case.

I'm just not sure if we should print an error into syslog or just
continue silently.

Regards,
Christian.


Either through the submission or userspace signaling or importing
another syncpoint's fence.


-Lionel



+    ret = copy_t

Re: [PATCH 06/11] drm/syncobj: add timeline payload query ioctl v4

2019-02-15 Thread Lionel Landwerlin via amd-gfx

On 07/12/2018 09:55, Chunming Zhou wrote:

user mode can query timeline payload.
v2: check return value of copy_to_user
v3: handle querying entry by entry
v4: rebase on new chain container, simplify interface

Signed-off-by: Chunming Zhou 
Cc: Daniel Rakos 
Cc: Jason Ekstrand 
Cc: Bas Nieuwenhuizen 
Cc: Dave Airlie 
Cc: Christian König 
Cc: Chris Wilson 
---
  drivers/gpu/drm/drm_internal.h |  2 ++
  drivers/gpu/drm/drm_ioctl.c|  2 ++
  drivers/gpu/drm/drm_syncobj.c  | 43 ++
  include/uapi/drm/drm.h | 10 
  4 files changed, 57 insertions(+)

diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 18b41e10195c..dab4d5936441 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -184,6 +184,8 @@ int drm_syncobj_reset_ioctl(struct drm_device *dev, void 
*data,
struct drm_file *file_private);
  int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_private);
+int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file_private);
  
  /* drm_framebuffer.c */

  void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent,
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index a9a17ed35cc4..7578ef6dc1d1 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -681,6 +681,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
  DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl,
  DRM_UNLOCKED|DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl,
+ DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, 
DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, 
drm_crtc_queue_sequence_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, 
DRM_MASTER|DRM_UNLOCKED),
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 348079bb0965..f97fa00ca1d0 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1061,3 +1061,46 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void 
*data,
  
  	return ret;

  }
+
+int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file_private)
+{
+   struct drm_syncobj_timeline_array *args = data;
+   struct drm_syncobj **syncobjs;
+   uint64_t __user *points = u64_to_user_ptr(args->points);
+   uint32_t i;
+   int ret;
+
+   if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+   return -ENODEV;
+
+   if (args->pad != 0)
+   return -EINVAL;
+
+   if (args->count_handles == 0)
+   return -EINVAL;
+
+   ret = drm_syncobj_array_find(file_private,
+u64_to_user_ptr(args->handles),
+args->count_handles,
+);
+   if (ret < 0)
+   return ret;
+
+   for (i = 0; i < args->count_handles; i++) {
+   struct dma_fence_chain *chain;
+   struct dma_fence *fence;
+   uint64_t point;
+
+   fence = drm_syncobj_fence_get(syncobjs[i]);
+   chain = to_dma_fence_chain(fence);
+   point = chain ? fence->seqno : 0;



Sorry, I don' t want to sound annoying, but this looks like this could 
report values going backward.


Anything add a point X to a timeline that has reached value Y with X < Y 
would trigger that.


Either through the submission or userspace signaling or importing 
another syncpoint's fence.



-Lionel



+   ret = copy_to_user([i], , sizeof(uint64_t));
+   ret = ret ? -EFAULT : 0;
+   if (ret)
+   break;
+   }
+   drm_syncobj_array_free(syncobjs, args->count_handles);
+
+   return ret;
+}
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 0092111d002c..b2c36f2b2599 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -767,6 +767,14 @@ struct drm_syncobj_array {
__u32 pad;
  };
  
+struct drm_syncobj_timeline_array {

+   __u64 handles;
+   __u64 points;
+   __u32 count_handles;
+   __u32 pad;
+};
+
+
  /* Query current scanout sequence number */
  struct drm_crtc_get_sequence {
__u32 crtc_id;  /* requested crtc_id */
@@ -924,6 +932,8 @@ extern "C" {
  #define DRM_IOCTL_MODE_REVOKE_LEASE   DRM_IOWR(0xC9, struct 
drm_mode_revoke_lease)
  
  #define DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT	DRM_IOWR(0xCA, struct drm_syncobj_timeline_wait)

+#define DRM_IOCTL_SYNCOBJ_QUERYDRM_IOWR(0xCB, struct 

Re: [PATCH 02/11] dma-buf: add new dma_fence_chain container v4

2019-02-15 Thread Lionel Landwerlin via amd-gfx

On 15/02/2019 14:32, Koenig, Christian wrote:

Am 15.02.19 um 15:23 schrieb Lionel Landwerlin:

Hi Christian, David,

For timeline semaphore we need points to signaled in order.
I'm struggling to understand how this fence-chain implementation
preserves ordering of the seqnos.

One of the scenario I can see an issue happening is when you have a
timeline with points 1 & 2 and userspace submits for 2 different
engines :
     - first with let's say a blitter style engine on point 2
     - then a 3d style engine on point 1

Yeah, and where exactly is the problem?

Seqno 1 will signal when the 3d style engine finishes work.

And seqno 2 will signal when both seqno 1 is signaled and the blitter
style engine has finished its work.


That's not really how I understood the spec, but I might be wrong.

What makes me thing 1 should be signaled as soon as 2 is signaled
(regardless of whether the fence attached on point 1 is been signaled),
is that the spec defines wait & signal operations in term of the value
of the timeline.


-Lionel




Another scenario would be signaling a timeline with points 1 & 2 with
those points in reverse order in the submission array.

That is actually illegal in the spec, but actually handled gracefully as
well.

E.g. when you add seqno 1 to the syncobj container it will only signal
when 2 is signaled as well.







Regards,
Christian.


-Lionel

On 07/12/2018 09:55, Chunming Zhou wrote:

From: Christian König 

Lockless container implementation similar to a dma_fence_array, but with
only two elements per node and automatic garbage collection.

v2: properly document dma_fence_chain_for_each, add
dma_fence_chain_find_seqno,
  drop prev reference during garbage collection if it's not a
chain fence.
v3: use head and iterator for dma_fence_chain_for_each
v4: fix reference count in dma_fence_chain_enable_signaling

Signed-off-by: Christian König 
---
   drivers/dma-buf/Makefile  |   3 +-
   drivers/dma-buf/dma-fence-chain.c | 241 ++
   include/linux/dma-fence-chain.h   |  81 ++
   3 files changed, 324 insertions(+), 1 deletion(-)
   create mode 100644 drivers/dma-buf/dma-fence-chain.c
   create mode 100644 include/linux/dma-fence-chain.h

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 0913a6ccab5a..1f006e083eb9 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,4 +1,5 @@
-obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o
seqno-fence.o
+obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
+ reservation.o seqno-fence.o
   obj-$(CONFIG_SYNC_FILE)    += sync_file.o
   obj-$(CONFIG_SW_SYNC)    += sw_sync.o sync_debug.o
   obj-$(CONFIG_UDMABUF)    += udmabuf.o
diff --git a/drivers/dma-buf/dma-fence-chain.c
b/drivers/dma-buf/dma-fence-chain.c
new file mode 100644
index ..0c5e3c902fa0
--- /dev/null
+++ b/drivers/dma-buf/dma-fence-chain.c
@@ -0,0 +1,241 @@
+/*
+ * fence-chain: chain fences together in a timeline
+ *
+ * Copyright (C) 2018 Advanced Micro Devices, Inc.
+ * Authors:
+ *    Christian König 
+ *
+ * This program is free software; you can redistribute it and/or
modify it
+ * under the terms of the GNU General Public License version 2 as
published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
License for
+ * more details.
+ */
+
+#include 
+
+static bool dma_fence_chain_enable_signaling(struct dma_fence *fence);
+
+/**
+ * dma_fence_chain_get_prev - use RCU to get a reference to the
previous fence
+ * @chain: chain node to get the previous node from
+ *
+ * Use dma_fence_get_rcu_safe to get a reference to the previous
fence of the
+ * chain node.
+ */
+static struct dma_fence *dma_fence_chain_get_prev(struct
dma_fence_chain *chain)
+{
+    struct dma_fence *prev;
+
+    rcu_read_lock();
+    prev = dma_fence_get_rcu_safe(>prev);
+    rcu_read_unlock();
+    return prev;
+}
+
+/**
+ * dma_fence_chain_walk - chain walking function
+ * @fence: current chain node
+ *
+ * Walk the chain to the next node. Returns the next fence or NULL
if we are at
+ * the end of the chain. Garbage collects chain nodes which are already
+ * signaled.
+ */
+struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence)
+{
+    struct dma_fence_chain *chain, *prev_chain;
+    struct dma_fence *prev, *replacement, *tmp;
+
+    chain = to_dma_fence_chain(fence);
+    if (!chain) {
+    dma_fence_put(fence);
+    return NULL;
+    }
+
+    while ((prev = dma_fence_chain_get_prev(chain))) {
+
+    prev_chain = to_dma_fence_chain(prev);
+    if (prev_chain) {
+    if (!dma_fence_is_signaled(prev_chain->fence))
+    break;
+
+    replacement = dma_fence_chain_get_prev(prev_chain

Re: [PATCH 02/11] dma-buf: add new dma_fence_chain container v4

2019-02-15 Thread Lionel Landwerlin via amd-gfx

Hi Christian, David,

For timeline semaphore we need points to signaled in order.
I'm struggling to understand how this fence-chain implementation 
preserves ordering of the seqnos.


One of the scenario I can see an issue happening is when you have a 
timeline with points 1 & 2 and userspace submits for 2 different engines :

    - first with let's say a blitter style engine on point 2
    - then a 3d style engine on point 1

Another scenario would be signaling a timeline with points 1 & 2 with 
those points in reverse order in the submission array.


-Lionel

On 07/12/2018 09:55, Chunming Zhou wrote:

From: Christian König 

Lockless container implementation similar to a dma_fence_array, but with
only two elements per node and automatic garbage collection.

v2: properly document dma_fence_chain_for_each, add dma_fence_chain_find_seqno,
 drop prev reference during garbage collection if it's not a chain fence.
v3: use head and iterator for dma_fence_chain_for_each
v4: fix reference count in dma_fence_chain_enable_signaling

Signed-off-by: Christian König 
---
  drivers/dma-buf/Makefile  |   3 +-
  drivers/dma-buf/dma-fence-chain.c | 241 ++
  include/linux/dma-fence-chain.h   |  81 ++
  3 files changed, 324 insertions(+), 1 deletion(-)
  create mode 100644 drivers/dma-buf/dma-fence-chain.c
  create mode 100644 include/linux/dma-fence-chain.h

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 0913a6ccab5a..1f006e083eb9 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,4 +1,5 @@
-obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-fence.o
+obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
+reservation.o seqno-fence.o
  obj-$(CONFIG_SYNC_FILE)   += sync_file.o
  obj-$(CONFIG_SW_SYNC) += sw_sync.o sync_debug.o
  obj-$(CONFIG_UDMABUF) += udmabuf.o
diff --git a/drivers/dma-buf/dma-fence-chain.c 
b/drivers/dma-buf/dma-fence-chain.c
new file mode 100644
index ..0c5e3c902fa0
--- /dev/null
+++ b/drivers/dma-buf/dma-fence-chain.c
@@ -0,0 +1,241 @@
+/*
+ * fence-chain: chain fences together in a timeline
+ *
+ * Copyright (C) 2018 Advanced Micro Devices, Inc.
+ * Authors:
+ * Christian König 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include 
+
+static bool dma_fence_chain_enable_signaling(struct dma_fence *fence);
+
+/**
+ * dma_fence_chain_get_prev - use RCU to get a reference to the previous fence
+ * @chain: chain node to get the previous node from
+ *
+ * Use dma_fence_get_rcu_safe to get a reference to the previous fence of the
+ * chain node.
+ */
+static struct dma_fence *dma_fence_chain_get_prev(struct dma_fence_chain 
*chain)
+{
+   struct dma_fence *prev;
+
+   rcu_read_lock();
+   prev = dma_fence_get_rcu_safe(>prev);
+   rcu_read_unlock();
+   return prev;
+}
+
+/**
+ * dma_fence_chain_walk - chain walking function
+ * @fence: current chain node
+ *
+ * Walk the chain to the next node. Returns the next fence or NULL if we are at
+ * the end of the chain. Garbage collects chain nodes which are already
+ * signaled.
+ */
+struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence)
+{
+   struct dma_fence_chain *chain, *prev_chain;
+   struct dma_fence *prev, *replacement, *tmp;
+
+   chain = to_dma_fence_chain(fence);
+   if (!chain) {
+   dma_fence_put(fence);
+   return NULL;
+   }
+
+   while ((prev = dma_fence_chain_get_prev(chain))) {
+
+   prev_chain = to_dma_fence_chain(prev);
+   if (prev_chain) {
+   if (!dma_fence_is_signaled(prev_chain->fence))
+   break;
+
+   replacement = dma_fence_chain_get_prev(prev_chain);
+   } else {
+   if (!dma_fence_is_signaled(prev))
+   break;
+
+   replacement = NULL;
+   }
+
+   tmp = cmpxchg(>prev, prev, replacement);
+   if (tmp == prev)
+   dma_fence_put(tmp);
+   else
+   dma_fence_put(replacement);
+   dma_fence_put(prev);
+   }
+
+   dma_fence_put(fence);
+   return prev;
+}
+EXPORT_SYMBOL(dma_fence_chain_walk);
+
+/**
+ * dma_fence_chain_find_seqno - find fence chain node by seqno
+ * @pfence: pointer to the chain node where to start
+ * @seqno: the sequence number to search for
+ *
+ * Advance the fence pointer to the 

Re: [PATCH 09/11] drm/syncobj: add transition iotcls between binary and timeline

2019-02-15 Thread Lionel Landwerlin via amd-gfx

Hi David,

Thanks a lot for point me to the tests you've added in IGT.
While adding a test with that signals fences imported into a timeline 
syncobj out of order, I ran into a deadlock.
Here is the test : 
https://github.com/djdeath/intel-gpu-tools/commit/1e46cf7e7bff09b78a24367ddc2314f97eb0a1b9


Trying to kill the deadlocked process I got this backtrace :


[   33.969136] [IGT] syncobj_timeline: starting subtest signal-order
[   60.452823] watchdog: BUG: soft lockup - CPU#6 stuck for 23s! 
[syncobj_timelin:2021]
[   60.452826] Modules linked in: rfcomm cmac bnep binfmt_misc 
nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek 
snd_hda_codec_generic ledtrig_audio sch_fq_codel ib_iser snd_hda_intel 
rdma_cm iw_cm snd_hda_codec ib_cm snd_hda_core snd_hwdep intel_rapl 
snd_pcm ib_core x86_pkg_temp_thermal intel_powerclamp configf
s coretemp iscsi_tcp snd_seq_midi libiscsi_tcp snd_seq_midi_event 
libiscsi kvm_intel scsi_transport_iscsi kvm btusb snd_rawmidi irqbypass 
btrtl intel_cstate intel_rapl_perf btbcm btintel bluetooth snd_seq 
snd_seq_device snd_timer input_leds ecdh_generic snd soundcore mei_me 
mei intel_pch_thermal mac_hid acpi_pad parp
ort_pc ppdev lp parport ip_tables x_tables autofs4 btrfs zstd_decompress 
zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq 
async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 
hid_generic usbhid hid i915 crct10dif_pclmul crc32_pclmul i2c_algo_bit 
ghash_clmulni_intel prime_numbers

drm_kms_helper aesni_intel syscopyarea sysfillrect
[   60.452876]  sysimgblt fb_sys_fops aes_x86_64 crypto_simd sdhci_pci 
cryptd drm e1000e glue_helper cqhci sdhci wmi video
[   60.452881] CPU: 6 PID: 2021 Comm: syncobj_timelin Tainted: G 
U    5.0.0-rc5+ #337
[   60.452882] Hardware name:  /NUC6i7KYB, BIOS 
KYSKLi70.86A.0042.2016.0929.1933 09/29/2016

[   60.452886] RIP: 0010:dma_fence_chain_walk+0x22c/0x260
[   60.452888] Code: ff e9 93 fe ff ff 48 8b 45 08 48 8b 40 18 48 85 c0 
74 0c 48 89 ef e8 33 0f 58 00 84 c0 75 23 f0 41 ff 4d 00 0f 88 99 87 2f 
00 <0f> 85 05 fe ff ff 4c 89 ef e8 56 ea ff ff 48 89 d8 5b 5d 41 5c 41
[   60.452888] RSP: 0018:9a5804653ca8 EFLAGS: 00010296 ORIG_RAX: 
ff13
[   60.452889] RAX:  RBX: 8f5690fb2480 RCX: 
8f5690fb2f00
[   60.452890] RDX: 003e3730 RSI:  RDI: 
8f5690fb2180
[   60.452891] RBP: 8f5690fb2180 R08:  R09: 
8f5690fb2eb0
[   60.452891] R10:  R11: 8f5660469860 R12: 
8f5690fb2f68
[   60.452892] R13: 8f5690fb2f00 R14: 0003 R15: 
8f5655a45fc0
[   60.452913] FS:  7fdc5c459980() GS:8f569eb8() 
knlGS:

[   60.452913] CS:  0010 DS:  ES:  CR0: 80050033
[   60.452914] CR2: 7f9d74336dd8 CR3: 00084a67e004 CR4: 
003606e0
[   60.452915] DR0:  DR1:  DR2: 

[   60.452915] DR3:  DR6: fffe0ff0 DR7: 
0400

[   60.452916] Call Trace:
[   60.452958]  drm_syncobj_add_point+0x102/0x160 [drm]
[   60.452965]  ? drm_syncobj_fd_to_handle_ioctl+0x1b0/0x1b0 [drm]
[   60.452971]  drm_syncobj_transfer_ioctl+0x10f/0x180 [drm]
[   60.452978]  drm_ioctl_kernel+0xac/0xf0 [drm]
[   60.452984]  drm_ioctl+0x2eb/0x3b0 [drm]
[   60.452990]  ? drm_syncobj_fd_to_handle_ioctl+0x1b0/0x1b0 [drm]
[   60.452992]  ? sw_sync_ioctl+0x347/0x370
[   60.452994]  do_vfs_ioctl+0xa4/0x640
[   60.452995]  ? __fput+0x134/0x220
[   60.452997]  ? do_fcntl+0x1a5/0x650
[   60.452998]  ksys_ioctl+0x70/0x80
[   60.452999]  __x64_sys_ioctl+0x16/0x20
[   60.453002]  do_syscall_64+0x55/0x110
[   60.453004]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   60.453005] RIP: 0033:0x7fdc5b6e45d7
[   60.453006] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 
48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
[   60.453007] RSP: 002b:7fff25c4d198 EFLAGS: 0206 ORIG_RAX: 
0010
[   60.453008] RAX: ffda RBX:  RCX: 
7fdc5b6e45d7
[   60.453008] RDX: 7fff25c4d200 RSI: c02064cc RDI: 
0003
[   60.453009] RBP: 7fff25c4d1d0 R08:  R09: 
001e
[   60.453010] R10:  R11: 0206 R12: 
563d3959e4d0
[   60.453010] R13: 7fff25c4d620 R14:  R15: 

[   88.447359] watchdog: BUG: soft lockup - CPU#6 stuck for 22s! 
[syncobj_timelin:2021]



-Lionel


On 07/12/2018 09:55, Chunming Zhou wrote:

we need to import/export timeline point

Signed-off-by: Chunming Zhou 
---
  drivers/gpu/drm/drm_internal.h |  4 +++
  drivers/gpu/drm/drm_ioctl.c|  6 
  drivers/gpu/drm/drm_syncobj.c  | 66 ++
  include/uapi/drm/drm.h | 10 ++
  4 files changed, 86 insertions(+)

diff --git