Re: [PATCH] drm/gem: Expose the buffer object handle to userspace last

2023-02-23 Thread Christian König via Virtualization

Am 20.02.23 um 11:23 schrieb Tvrtko Ursulin:


On 20/02/2023 10:01, Christian König wrote:

Am 20.02.23 um 10:55 schrieb Tvrtko Ursulin:


Hi,

On 14/02/2023 13:59, Christian König wrote:

Am 14.02.23 um 13:50 schrieb Tvrtko Ursulin:

From: Tvrtko Ursulin 

Currently drm_gem_handle_create_tail exposes the handle to userspace
before the buffer object constructions is complete. This allowing
of working against a partially constructed object, which may also 
be in

the process of having its creation fail, can have a range of negative
outcomes.

A lot of those will depend on what the individual drivers are 
doing in
their obj->funcs->open() callbacks, and also with a common failure 
mode

being -ENOMEM from drm_vma_node_allow.

We can make sure none of this can happen by allocating a handle last,
although with a downside that more of the function now runs under the
dev->object_name_lock.

Looking into the individual drivers open() hooks, we have
amdgpu_gem_object_open which seems like it could have a potential 
security

issue without this change.

A couple drivers like qxl_gem_object_open and vmw_gem_object_open
implement no-op hooks so no impact for them.

A bunch of other require a deeper look by individual owners to 
asses for

impact. Those are lima_gem_object_open, nouveau_gem_object_open,
panfrost_gem_open, radeon_gem_object_open and 
virtio_gpu_gem_object_open.


Putting aside the risk assesment of the above, some common 
scenarios to

think about are along these lines:

1)
Userspace closes a handle by speculatively "guessing" it from a 
second

thread.

This results in an unreachable buffer object so, a memory leak.

2)
Same as 1), but object is in the process of getting closed (failed
creation).

The second thread is then able to re-cycle the handle and 
idr_remove would
in the first thread would then remove the handle it does not own 
from the

idr.

3)
Going back to the earlier per driver problem space - individual 
impact
assesment of allowing a second thread to access and operate on a 
partially

constructed handle / object. (Can something crash? Leak information?)

In terms of identifying when the problem started I will tag some 
patches

as references, but not all, if even any, of them actually point to a
broken state. I am just identifying points at which more 
opportunity for

issues to arise was added.


Yes I've looked into this once as well, but couldn't completely 
solve it for some reason.


Give me a day or two to get this tested and all the logic swapped 
back into my head again.


Managed to recollect what the problem with earlier attempts was?


Nope, that's way to long ago. I can only assume that I ran into 
problems with the object_name_lock.


Probably best to double check if that doesn't result in a lock 
inversion when somebody grabs the reservation lock in their ->load() 
callback.


Hmm I don't immediately follow the connection. But I have only found 
radeon_driver_load_kms as using the load callback. Is there any 
lockdep enabled CI for that driver which could tell us if there is a 
problem there?


We don't have CI for radeon and most likely never will, that hw is just 
to old. But we also have amdgpu_gem_object_open() and that looks 
suspiciously like trouble.


The function makes sure that every BO is registered in the VM house 
keeping functions of the drm_file and while doing so grabs a few locks. 
I'm not sure what the locking order of those are.


Could be that this will work, could be that it breaks. I will ping 
internally today if somebody from my team can take a look at this patch.


Regards,
Christian.



Regards,

Tvrtko



Regards,
Christian.



Regards,

Tvrtko


Christian.



References: 304eda32920b ("drm/gem: add hooks to notify driver 
when object handle is created/destroyed")

References: ca481c9b2a3a ("drm/gem: implement vma access management")
References: b39b5394fabc ("drm/gem: Add drm_gem_object_funcs")
Cc: dri-de...@lists.freedesktop.org
Cc: Rob Clark 
Cc: Ben Skeggs 
Cc: David Herrmann 
Cc: Noralf Trønnes 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: amd-...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: nouv...@lists.freedesktop.org
Cc: Steven Price 
Cc: virtualization@lists.linux-foundation.org
Cc: spice-de...@lists.freedesktop.org
Cc: Zack Rusin 
---
  drivers/gpu/drm/drm_gem.c | 48 
+++

  1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index aa15c52ae182..e3d897bca0f2 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -356,52 +356,52 @@ drm_gem_handle_create_tail(struct drm_file 
*file_priv,

 u32 *handlep)
  {
  struct drm_device *dev = obj->dev;
-    u32 handle;
  int ret;
WARN_ON(!mutex_is_locked(>object_name_lock));
  if (obj->handle_count++ == 0)
  drm_gem_object_get(obj);
+    ret = drm_vma_node_allow(>vma_node, file_priv);
+    if (ret)
+    goto err_put;
+
+    if 

Re: [PATCH] drm/gem: Expose the buffer object handle to userspace last

2023-02-20 Thread Christian König

Am 20.02.23 um 10:55 schrieb Tvrtko Ursulin:


Hi,

On 14/02/2023 13:59, Christian König wrote:

Am 14.02.23 um 13:50 schrieb Tvrtko Ursulin:

From: Tvrtko Ursulin 

Currently drm_gem_handle_create_tail exposes the handle to userspace
before the buffer object constructions is complete. This allowing
of working against a partially constructed object, which may also be in
the process of having its creation fail, can have a range of negative
outcomes.

A lot of those will depend on what the individual drivers are doing in
their obj->funcs->open() callbacks, and also with a common failure mode
being -ENOMEM from drm_vma_node_allow.

We can make sure none of this can happen by allocating a handle last,
although with a downside that more of the function now runs under the
dev->object_name_lock.

Looking into the individual drivers open() hooks, we have
amdgpu_gem_object_open which seems like it could have a potential 
security

issue without this change.

A couple drivers like qxl_gem_object_open and vmw_gem_object_open
implement no-op hooks so no impact for them.

A bunch of other require a deeper look by individual owners to asses 
for

impact. Those are lima_gem_object_open, nouveau_gem_object_open,
panfrost_gem_open, radeon_gem_object_open and 
virtio_gpu_gem_object_open.


Putting aside the risk assesment of the above, some common scenarios to
think about are along these lines:

1)
Userspace closes a handle by speculatively "guessing" it from a second
thread.

This results in an unreachable buffer object so, a memory leak.

2)
Same as 1), but object is in the process of getting closed (failed
creation).

The second thread is then able to re-cycle the handle and idr_remove 
would
in the first thread would then remove the handle it does not own 
from the

idr.

3)
Going back to the earlier per driver problem space - individual impact
assesment of allowing a second thread to access and operate on a 
partially

constructed handle / object. (Can something crash? Leak information?)

In terms of identifying when the problem started I will tag some 
patches

as references, but not all, if even any, of them actually point to a
broken state. I am just identifying points at which more opportunity 
for

issues to arise was added.


Yes I've looked into this once as well, but couldn't completely solve 
it for some reason.


Give me a day or two to get this tested and all the logic swapped 
back into my head again.


Managed to recollect what the problem with earlier attempts was?


Nope, that's way to long ago. I can only assume that I ran into problems 
with the object_name_lock.


Probably best to double check if that doesn't result in a lock inversion 
when somebody grabs the reservation lock in their ->load() callback.


Regards,
Christian.



Regards,

Tvrtko


Christian.



References: 304eda32920b ("drm/gem: add hooks to notify driver when 
object handle is created/destroyed")

References: ca481c9b2a3a ("drm/gem: implement vma access management")
References: b39b5394fabc ("drm/gem: Add drm_gem_object_funcs")
Cc: dri-de...@lists.freedesktop.org
Cc: Rob Clark 
Cc: Ben Skeggs 
Cc: David Herrmann 
Cc: Noralf Trønnes 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: amd-...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: nouv...@lists.freedesktop.org
Cc: Steven Price 
Cc: virtualization@lists.linux-foundation.org
Cc: spice-de...@lists.freedesktop.org
Cc: Zack Rusin 
---
  drivers/gpu/drm/drm_gem.c | 48 
+++

  1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index aa15c52ae182..e3d897bca0f2 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -356,52 +356,52 @@ drm_gem_handle_create_tail(struct drm_file 
*file_priv,

 u32 *handlep)
  {
  struct drm_device *dev = obj->dev;
-    u32 handle;
  int ret;
WARN_ON(!mutex_is_locked(>object_name_lock));
  if (obj->handle_count++ == 0)
  drm_gem_object_get(obj);
+    ret = drm_vma_node_allow(>vma_node, file_priv);
+    if (ret)
+    goto err_put;
+
+    if (obj->funcs->open) {
+    ret = obj->funcs->open(obj, file_priv);
+    if (ret)
+    goto err_revoke;
+    }
+
  /*
- * Get the user-visible handle using idr.  Preload and perform
- * allocation under our spinlock.
+ * Get the user-visible handle using idr as the _last_ step.
+ * Preload and perform allocation under our spinlock.
   */
  idr_preload(GFP_KERNEL);
  spin_lock(_priv->table_lock);
-
  ret = idr_alloc(_priv->object_idr, obj, 1, 0, GFP_NOWAIT);
-
  spin_unlock(_priv->table_lock);
  idr_preload_end();
-    mutex_unlock(>object_name_lock);
  if (ret < 0)
-    goto err_unref;
-
-    handle = ret;
+    goto err_close;
-    ret = drm_vma_node_allow(>vma_node, file_priv);
-    if (ret)
-    goto err_remove;
+    mutex_unlock(>object_name_lock);
-    if (obj->funcs->open) {
-   

Re: [PATCH] drm/gem: Expose the buffer object handle to userspace last

2023-02-14 Thread Christian König via Virtualization

Am 14.02.23 um 13:50 schrieb Tvrtko Ursulin:

From: Tvrtko Ursulin 

Currently drm_gem_handle_create_tail exposes the handle to userspace
before the buffer object constructions is complete. This allowing
of working against a partially constructed object, which may also be in
the process of having its creation fail, can have a range of negative
outcomes.

A lot of those will depend on what the individual drivers are doing in
their obj->funcs->open() callbacks, and also with a common failure mode
being -ENOMEM from drm_vma_node_allow.

We can make sure none of this can happen by allocating a handle last,
although with a downside that more of the function now runs under the
dev->object_name_lock.

Looking into the individual drivers open() hooks, we have
amdgpu_gem_object_open which seems like it could have a potential security
issue without this change.

A couple drivers like qxl_gem_object_open and vmw_gem_object_open
implement no-op hooks so no impact for them.

A bunch of other require a deeper look by individual owners to asses for
impact. Those are lima_gem_object_open, nouveau_gem_object_open,
panfrost_gem_open, radeon_gem_object_open and virtio_gpu_gem_object_open.

Putting aside the risk assesment of the above, some common scenarios to
think about are along these lines:

1)
Userspace closes a handle by speculatively "guessing" it from a second
thread.

This results in an unreachable buffer object so, a memory leak.

2)
Same as 1), but object is in the process of getting closed (failed
creation).

The second thread is then able to re-cycle the handle and idr_remove would
in the first thread would then remove the handle it does not own from the
idr.

3)
Going back to the earlier per driver problem space - individual impact
assesment of allowing a second thread to access and operate on a partially
constructed handle / object. (Can something crash? Leak information?)

In terms of identifying when the problem started I will tag some patches
as references, but not all, if even any, of them actually point to a
broken state. I am just identifying points at which more opportunity for
issues to arise was added.


Yes I've looked into this once as well, but couldn't completely solve it 
for some reason.


Give me a day or two to get this tested and all the logic swapped back 
into my head again.


Christian.



References: 304eda32920b ("drm/gem: add hooks to notify driver when object handle is 
created/destroyed")
References: ca481c9b2a3a ("drm/gem: implement vma access management")
References: b39b5394fabc ("drm/gem: Add drm_gem_object_funcs")
Cc: dri-de...@lists.freedesktop.org
Cc: Rob Clark 
Cc: Ben Skeggs 
Cc: David Herrmann 
Cc: Noralf Trønnes 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: amd-...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: nouv...@lists.freedesktop.org
Cc: Steven Price 
Cc: virtualization@lists.linux-foundation.org
Cc: spice-de...@lists.freedesktop.org
Cc: Zack Rusin 
---
  drivers/gpu/drm/drm_gem.c | 48 +++
  1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index aa15c52ae182..e3d897bca0f2 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -356,52 +356,52 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
   u32 *handlep)
  {
struct drm_device *dev = obj->dev;
-   u32 handle;
int ret;
  
  	WARN_ON(!mutex_is_locked(>object_name_lock));

if (obj->handle_count++ == 0)
drm_gem_object_get(obj);
  
+	ret = drm_vma_node_allow(>vma_node, file_priv);

+   if (ret)
+   goto err_put;
+
+   if (obj->funcs->open) {
+   ret = obj->funcs->open(obj, file_priv);
+   if (ret)
+   goto err_revoke;
+   }
+
/*
-* Get the user-visible handle using idr.  Preload and perform
-* allocation under our spinlock.
+* Get the user-visible handle using idr as the _last_ step.
+* Preload and perform allocation under our spinlock.
 */
idr_preload(GFP_KERNEL);
spin_lock(_priv->table_lock);
-
ret = idr_alloc(_priv->object_idr, obj, 1, 0, GFP_NOWAIT);
-
spin_unlock(_priv->table_lock);
idr_preload_end();
  
-	mutex_unlock(>object_name_lock);

if (ret < 0)
-   goto err_unref;
-
-   handle = ret;
+   goto err_close;
  
-	ret = drm_vma_node_allow(>vma_node, file_priv);

-   if (ret)
-   goto err_remove;
+   mutex_unlock(>object_name_lock);
  
-	if (obj->funcs->open) {

-   ret = obj->funcs->open(obj, file_priv);
-   if (ret)
-   goto err_revoke;
-   }
+   *handlep = ret;
  
-	*handlep = handle;

return 0;
  
+err_close:

+   if (obj->funcs->close)
+   obj->funcs->close(obj, file_priv);
  err_revoke: