Re: [PATCH] drm/scheduler: Fix UAF in drm_sched_fence_get_timeline_name

2023-04-06 Thread Asahi Lina

On 06/04/2023 18.48, Daniel Vetter wrote:

On Thu, Apr 06, 2023 at 06:27:27PM +0900, Asahi Lina wrote:

On 06/04/2023 18.15, Asahi Lina wrote:

On 06/04/2023 18.06, Christian König wrote:

Am 06.04.23 um 10:49 schrieb Asahi Lina:

On 06/04/2023 17.29, Christian König wrote:

Am 05.04.23 um 18:34 schrieb Asahi Lina:

A signaled scheduler fence can outlive its scheduler, since fences are
independently reference counted.


Well that is actually not correct. Schedulers are supposed to stay
around until the hw they have been driving is no longer present.


But the fences can outlive that. You can GPU render into an imported
buffer, which attaches a fence to it. Then the GPU goes away but the
fence is still attached to the buffer. Then you oops when you cat that
debugfs file...


No, exactly that's the point, you wouldn't oops.



My use case does this way more often (since schedulers are tied to
UAPI objects), which is how I found this, but as far as I can tell
this is already broken for all drivers on unplug/unbind/anything else
that would destroy the schedulers with fences potentially referenced
on separate scanout devices or at any other DMA-BUF consumer.


Even if a GPU is hot plugged the data structures for it should only go
away with the last reference, since the scheduler fence is referencing
the hw fence and the hw fence in turn is referencing the driver this
shouldn't happen.


So those fences can potentially keep half the driver data structures
alive just for existing in a signaled state? That doesn't seem sensible
(and it definitely doesn't work for our use case where schedulers can be
created and destroyed freely, it could lead to way too much junk
sticking around in memory).


This is why the split between the hw fence and the public sched fence exists.
Because once the hw fence stuff is sorted it should all be able to go
away, without the public fence keeping everything alive.


That doesn't seem to be how it works though... as far as I can tell the 
public finished fence keeps the public scheduled fence alive, which 
keeps the hw fence alive.


If that is how it is supposed to work, then we're back at this being a 
simple UAF which is fixed by this patch (and then the fence code also 
needs to be fixed to actually drop the parent fence reference when it 
signals).
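
A stand-alone toy model of that chain (a made-up toy_fence type, not the
real drm_sched_fence code) shows why one long-lived finished fence can pin
everything below it:

#include <stdio.h>
#include <stdlib.h>

struct toy_fence {
        int refcount;
        const char *name;
        struct toy_fence *keeps_alive;  /* the one reference this fence holds */
};

static struct toy_fence *toy_new(const char *name, struct toy_fence *dep)
{
        struct toy_fence *f = calloc(1, sizeof(*f));

        f->refcount = 1;
        f->name = name;
        f->keeps_alive = dep;           /* takes over one reference on dep */
        return f;
}

static void toy_put(struct toy_fence *f)
{
        if (!f || --f->refcount)
                return;
        printf("releasing %s\n", f->name);
        toy_put(f->keeps_alive);        /* release cascades down the chain */
        free(f);
}

int main(void)
{
        struct toy_fence *hw = toy_new("hw fence", NULL);
        struct toy_fence *scheduled = toy_new("scheduled fence", hw);
        struct toy_fence *finished = toy_new("finished fence", scheduled);

        /* An importer holding only the finished fence still keeps the
         * scheduled fence and the hw fence allocated until this last put. */
        toy_put(finished);
        return 0;
}

In this model, the fix described above corresponds to dropping and clearing
the scheduled fence's keeps_alive reference as soon as the hw fence signals,
instead of waiting for the final release.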



Also, requiring that the hw_fence keep the scheduler alive (which is
documented nowhere) is a completely ridiculous lifetime requirement and
layering violation across multiple subsystems. I have no idea how I'd make
Rust abstractions uphold that requirement safely without doing something
silly like having abstraction-specific hw_fence wrappers, and then you run
into deadlocks due to the scheduler potentially being dropped from the
job_done callback when the fence reference is dropped and just... no,
please. This is terrible. If you don't want me to fix it we'll have to find
another way because I can't work with this.


So generally the hw fence keeps the underlying gpu ctx alive, and the gpu
context keeps the gpu vm alive. Pretty much has to, or your gpu starts
executing stuff that's freed.


I do that using a different mechanism, since the way this GPU signals 
events doesn't map directly to fences, nor to the UAPI queue's (and 
therefore the DRM scheduler's hw fence's) idea of what job completion 
is. There's sort of a timeline mechanism instead, but the timelines are 
24 bits and wrap, there's a global set of 128 of them, and multiple 
can be required for a single drm_scheduler job. Keeping queues alive 
(which then keep active jobs alive, which keep whatever resources they 
require alive) happens at the level of those 128 event slots, as I call 
them, which hold a reference to whatever queue is assigned to them. Once 
those signal, the queues walk through pending jobs up to the signaled 
value, and that's where the hw fence gets signaled. After that, the jobs 
that were just completed are freed (this is a sticky point right now due 
to possible deadlocks if done on the event thread; what I'm doing right 
now is a bit of a hack that works well enough in practice, but I want to 
refactor it to be cleaner once we have more Rust abstractions to work 
with...)
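
A minimal stand-alone sketch of that event-slot scheme, with all names,
sizes and the wrap-aware compare invented for illustration rather than
taken from the driver, might look like this:

#include <stdint.h>
#include <stdio.h>

#define NUM_EVENT_SLOTS 128
#define TIMELINE_BITS   24
#define TIMELINE_MASK   ((1u << TIMELINE_BITS) - 1)

struct queue {
        int refcount;                   /* slots and userspace hold references */
        uint32_t last_completed;        /* jobs up to this value are done */
};

struct event_slot {
        struct queue *owner;            /* counted reference, keeps the queue alive */
        uint32_t signaled_value;        /* last value written by the hw, 24-bit */
};

static struct event_slot event_slots[NUM_EVENT_SLOTS];

/* Wrap-aware "a has reached b" compare for 24-bit timeline values. */
static int timeline_reached(uint32_t a, uint32_t b)
{
        return ((a - b) & TIMELINE_MASK) < (1u << (TIMELINE_BITS - 1));
}

/* Called from the event thread when the hw advances a slot: complete the
 * owning queue's pending jobs up to the signaled value. */
static void event_slot_signal(unsigned int idx, uint32_t value)
{
        struct event_slot *slot = &event_slots[idx];

        slot->signaled_value = value & TIMELINE_MASK;
        if (slot->owner && timeline_reached(slot->signaled_value,
                                            slot->owner->last_completed))
                slot->owner->last_completed = slot->signaled_value;
}

int main(void)
{
        struct queue q = { .refcount = 1 };

        event_slots[3].owner = &q;      /* slot 3 assigned to this queue */
        event_slot_signal(3, 42);       /* hw reports progress up to value 42 */
        printf("queue completed up to %u\n", (unsigned)q.last_completed);
        return 0;
}

The only point of the sketch is that the slot, not any fence, is what holds
the queue reference, so fence lifetimes stay decoupled from queue lifetimes.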


It might actually make sense to start moving some lifetimes into the 
hw_fence if we find we need to more tightly tie firmware lifetimes 
together (this is TBD, I'm still figuring out all the corner case 
rules), but that's only practical if the drm_sched fence doesn't hold up 
the entire hw_fence. For that to make sense the hw_fence needs to be 
able to go away soon after it signals.


This whole thing is pretty messy and I'm open to refactoring ideas, I 
just don't want to have to keep drm_sched instances lying around just 
because someone has a buffer reference with one of its fences 
installed... those potentially long-lived fences shouldn't keep anything 
more than small bits of driver-global state alive (ideally just the 
module itself).



Now for fw scheduler gpu ctx isn't just 

Re: [PATCH] drm/scheduler: Fix UAF in drm_sched_fence_get_timeline_name

2023-04-06 Thread Daniel Vetter
On Thu, Apr 06, 2023 at 06:27:27PM +0900, Asahi Lina wrote:
> On 06/04/2023 18.15, Asahi Lina wrote:
> > On 06/04/2023 18.06, Christian König wrote:
> > > Am 06.04.23 um 10:49 schrieb Asahi Lina:
> > > > On 06/04/2023 17.29, Christian König wrote:
> > > > > Am 05.04.23 um 18:34 schrieb Asahi Lina:
> > > > > > A signaled scheduler fence can outlive its scheduler, since fences 
> > > > > > are
> > > > > > independently reference counted.
> > > > > 
> > > > > Well that is actually not correct. Schedulers are supposed to stay
> > > > > around until the hw they have been driving is no longer present.
> > > > 
> > > > But the fences can outlive that. You can GPU render into an imported
> > > > buffer, which attaches a fence to it. Then the GPU goes away but the
> > > > fence is still attached to the buffer. Then you oops when you cat that
> > > > debugfs file...
> > > 
> > > No, exactly that's the point, you wouldn't oops.
> > > 
> > > > 
> > > > My use case does this way more often (since schedulers are tied to
> > > > UAPI objects), which is how I found this, but as far as I can tell
> > > > this is already broken for all drivers on unplug/unbind/anything else
> > > > that would destroy the schedulers with fences potentially referenced
> > > > on separate scanout devices or at any other DMA-BUF consumer.
> > > 
> > > Even if a GPU is hot plugged the data structures for it should only go
> > > away with the last reference, since the scheduler fence is referencing
> > > the hw fence and the hw fence in turn is referencing the driver this
> > > shouldn't happen.
> > 
> > So those fences can potentially keep half the driver data structures
> > alive just for existing in a signaled state? That doesn't seem sensible
> > (and it definitely doesn't work for our use case where schedulers can be
> > created and destroyed freely, it could lead to way too much junk
> > sticking around in memory).

This is why the split between the hw fence and the public sched fence exists.
Because once the hw fence stuff is sorted it should all be able to go
away, without the public fence keeping everything alive.

> > > > > E.g. the reference was scheduler_fence->hw_fence->driver->scheduler.
> > > > 
> > > > It's up to drivers not to mess that up, since the HW fence has the
> > > > same requirements that it can outlive other driver objects, just like
> > > > any other fence. That's not something the scheduler has to be
> > > > concerned with, it's a driver correctness issue.
> > > > 
> > > > Of course, in C you have to get it right yourself, while correct
> > > > Rust abstractions will cause your code to fail to compile if you do it
> > > > wrong ^^
> > > > 
> > > > In my particular case, the hw_fence is a very dumb object that has no
> > > > references to anything, only an ID and a pending op count. Jobs hold
> > > > references to it and decrement it until it signals, not the other way
> > > > around. So that object can live forever regardless of whether the rest
> > > > of the device is gone.
> > > 
> > > That is then certainly a bug. This won't work that way, and the timeline
> > > name is just the tip of the iceberg here.
> > > 
> > > The fence reference count needs to keep both the scheduler and driver
> > > alive. Otherwise you could for example unload the module and immediately
> > > oops because your fence_ops go away.
> > 
> > Yes, I realized the module issue after writing the email. But that's the
> > *only* thing it needs to hold a reference to! Which is much more
> > sensible than keeping a huge chunk of UAPI-adjacent data structures
> > alive for a signaled fence that for all we know might stick around
> > forever attached to some buffer.
> > 
> > > > > Your use case is now completely different to that and this won't work
> > > > > any more.
> > > > > 
> > > > > This here might just be the first case where that breaks.
> > > > 
> > > > This bug already exists, it's just a lot rarer for existing use
> > > > cases... but either way Xe is doing the same thing I am, so I'm not
> > > > the only one here either.
> > > 
> > > No it doesn't. You just have implemented the references differently than
> > > they are supposed to be.
> > > 
> > > Fixing this one occasion here would mitigate that immediate oops, but
> > > doesn't fix the fundamental problem.
> > 
> > Honestly, at this point I'm starting to doubt whether we want to use
> > drm_scheduler at all, because it clearly wasn't designed for our use
> > case and every time I try to fix something your answer has been "you're
> > using it wrong", and at the same time the practically nonexistent
> > documentation makes it impossible to know how it was actually designed
> > to be used.
> 
> Also, requiring that the hw_fence keep the scheduler alive (which is
> documented nowhere) is a completely ridiculous lifetime requirement and
> layering violation across multiple subsystems. I have no idea how I'd make
> Rust abstractions uphold that requirement safely without doing something
> silly 

Re: [PATCH] drm/scheduler: Fix UAF in drm_sched_fence_get_timeline_name

2023-04-06 Thread Asahi Lina

On 06/04/2023 18.15, Asahi Lina wrote:

On 06/04/2023 18.06, Christian König wrote:

Am 06.04.23 um 10:49 schrieb Asahi Lina:

On 06/04/2023 17.29, Christian König wrote:

Am 05.04.23 um 18:34 schrieb Asahi Lina:

A signaled scheduler fence can outlive its scheduler, since fences are
independently reference counted.


Well that is actually not correct. Schedulers are supposed to stay
around until the hw they have been driving is no longer present.


But the fences can outlive that. You can GPU render into an imported
buffer, which attaches a fence to it. Then the GPU goes away but the
fence is still attached to the buffer. Then you oops when you cat that
debugfs file...


No, exactly that's the point, you wouldn't oops.



My use case does this way more often (since schedulers are tied to
UAPI objects), which is how I found this, but as far as I can tell
this is already broken for all drivers on unplug/unbind/anything else
that would destroy the schedulers with fences potentially referenced
on separate scanout devices or at any other DMA-BUF consumer.


Even if a GPU is hot plugged the data structures for it should only go
away with the last reference, since the scheduler fence is referencing
the hw fence and the hw fence in turn is referencing the driver this
shouldn't happen.


So those fences can potentially keep half the driver data structures
alive just for existing in a signaled state? That doesn't seem sensible
(and it definitely doesn't work for our use case where schedulers can be
created and destroyed freely, it could lead to way too much junk
sticking around in memory).




E.g. the reference was scheduler_fence->hw_fence->driver->scheduler.


It's up to drivers not to mess that up, since the HW fence has the
same requirements that it can outlive other driver objects, just like
any other fence. That's not something the scheduler has to be
concerned with, it's a driver correctness issue.

Of course, in C you have to get it right yourself, while correct
Rust abstractions will cause your code to fail to compile if you do it
wrong ^^

In my particular case, the hw_fence is a very dumb object that has no
references to anything, only an ID and a pending op count. Jobs hold
references to it and decrement it until it signals, not the other way
around. So that object can live forever regardless of whether the rest
of the device is gone.


That is then certainly a bug. This won't work that way, and the timeline
name is just the tip of the iceberg here.

The fence reference count needs to keep both the scheduler and driver
alive. Otherwise you could for example unload the module and immediately
oops because your fence_ops go away.


Yes, I realized the module issue after writing the email. But that's the
*only* thing it needs to hold a reference to! Which is much more
sensible than keeping a huge chunk of UAPI-adjacent data structures
alive for a signaled fence that for all we know might stick around
forever attached to some buffer.


Your use case is now completely different to that and this won't work
any more.

This here might just be the first case where that breaks.


This bug already exists, it's just a lot rarer for existing use
cases... but either way Xe is doing the same thing I am, so I'm not
the only one here either.


No it doesn't. You just have implemented the references differently than
they are supposed to be.

Fixing this one occasion here would mitigate that immediate oops, but
doesn't fix the fundamental problem.


Honestly, at this point I'm starting to doubt whether we want to use
drm_scheduler at all, because it clearly wasn't designed for our use
case and every time I try to fix something your answer has been "you're
using it wrong", and at the same time the practically nonexistent
documentation makes it impossible to know how it was actually designed
to be used.


Also, requiring that the hw_fence keep the scheduler alive (which is 
documented nowhere) is a completely ridiculous lifetime requirement and 
layering violation across multiple subsystems. I have no idea how I'd 
make Rust abstractions uphold that requirement safely without doing 
something silly like having abstraction-specific hw_fence wrappers, and 
then you run into deadlocks due to the scheduler potentially being 
dropped from the job_done callback when the fence reference is dropped 
and just... no, please. This is terrible. If you don't want me to fix it 
we'll have to find another way because I can't work with this.


~~ Lina



Re: [PATCH] drm/scheduler: Fix UAF in drm_sched_fence_get_timeline_name

2023-04-06 Thread Asahi Lina

On 06/04/2023 18.06, Christian König wrote:

Am 06.04.23 um 10:49 schrieb Asahi Lina:

On 06/04/2023 17.29, Christian König wrote:

Am 05.04.23 um 18:34 schrieb Asahi Lina:

A signaled scheduler fence can outlive its scheduler, since fences are
independently reference counted.


Well that is actually not correct. Schedulers are supposed to stay
around until the hw they have been driving is no longer present.


But the fences can outlive that. You can GPU render into an imported
buffer, which attaches a fence to it. Then the GPU goes away but the
fence is still attached to the buffer. Then you oops when you cat that
debugfs file...


No, exactly that's the point, you wouldn't oops.



My use case does this way more often (since schedulers are tied to
UAPI objects), which is how I found this, but as far as I can tell
this is already broken for all drivers on unplug/unbind/anything else
that would destroy the schedulers with fences potentially referenced
on separate scanout devices or at any other DMA-BUF consumer.


Even if a GPU is hot plugged the data structures for it should only go
away with the last reference, since the scheduler fence is referencing
the hw fence and the hw fence in turn is referencing the driver this
shouldn't happen.


So those fences can potentially keep half the driver data structures 
alive just for existing in a signaled state? That doesn't seem sensible 
(and it definitely doesn't work for our use case where schedulers can be 
created and destroyed freely, it could lead to way too much junk 
sticking around in memory).





E.g. the reference was scheduler_fence->hw_fence->driver->scheduler.


It's up to drivers not to mess that up, since the HW fence has the
same requirements that it can outlive other driver objects, just like
any other fence. That's not something the scheduler has to be
concerned with, it's a driver correctness issue.

Of course, in C you have to get it right yourself, while correct
Rust abstractions will cause your code to fail to compile if you do it
wrong ^^

In my particular case, the hw_fence is a very dumb object that has no
references to anything, only an ID and a pending op count. Jobs hold
references to it and decrement it until it signals, not the other way
around. So that object can live forever regardless of whether the rest
of the device is gone.


That is then certainly a bug. This won't work that way, and the timeline
name is just the tip of the iceberg here.

The fence reference count needs to keep both the scheduler and driver
alive. Otherwise you could for example unload the module and immediately
oops because your fence_ops go away.


Yes, I realized the module issue after writing the email. But that's the 
*only* thing it needs to hold a reference to! Which is much more 
sensible than keeping a huge chunk of UAPI-adjacent data structures 
alive for a signaled fence that for all we know might stick around 
forever attached to some buffer.



Your use case is now completely different to that and this won't work
any more.

This here might just be the first case where that breaks.


This bug already exists, it's just a lot rarer for existing use
cases... but either way Xe is doing the same thing I am, so I'm not
the only one here either.


No it doesn't. You just have implemented the references differently than
they are supposed to be.

Fixing this one occasion here would mitigate that immediate oops, but
doesn't fix the fundamental problem.


Honestly, at this point I'm starting to doubt whether we want to use 
drm_scheduler at all, because it clearly wasn't designed for our use 
case and every time I try to fix something your answer has been "you're 
using it wrong", and at the same time the practically nonexistent 
documentation makes it impossible to know how it was actually designed 
to be used.


~~ Lina



Re: [PATCH] drm/scheduler: Fix UAF in drm_sched_fence_get_timeline_name

2023-04-06 Thread Christian König

Am 06.04.23 um 10:49 schrieb Asahi Lina:

On 06/04/2023 17.29, Christian König wrote:

Am 05.04.23 um 18:34 schrieb Asahi Lina:

A signaled scheduler fence can outlive its scheduler, since fences are
independently reference counted.


Well that is actually not correct. Schedulers are supposed to stay
around until the hw they have been driving is no longer present.


But the fences can outlive that. You can GPU render into an imported 
buffer, which attaches a fence to it. Then the GPU goes away but the 
fence is still attached to the buffer. Then you oops when you cat that 
debugfs file...


No, exactly that's the point, you wouldn't oops.



My use case does this way more often (since schedulers are tied to 
UAPI objects), which is how I found this, but as far as I can tell 
this is already broken for all drivers on unplug/unbind/anything else 
that would destroy the schedulers with fences potentially referenced 
on separate scanout devices or at any other DMA-BUF consumer.


Even if a GPU is hot plugged the data structures for it should only go 
away with the last reference, since the scheduler fence is referencing 
the hw fence and the hw fence in turn is referencing the driver this 
shouldn't happen.





E.g. the reference was scheduler_fence->hw_fence->driver->scheduler.


It's up to drivers not to mess that up, since the HW fence has the 
same requirements that it can outlive other driver objects, just like 
any other fence. That's not something the scheduler has to be 
concerned with, it's a driver correctness issue.


Of course, in C you have to get it right yourself, while correct 
Rust abstractions will cause your code to fail to compile if you do it 
wrong ^^


In my particular case, the hw_fence is a very dumb object that has no 
references to anything, only an ID and a pending op count. Jobs hold 
references to it and decrement it until it signals, not the other way 
around. So that object can live forever regardless of whether the rest 
of the device is gone.


That is then certainly a bug. This won't work that way, and the timeline 
name is just the tip of the iceberg here.


The fence reference count needs to keep both the scheduler and driver 
alive. Otherwise you could for example unload the module and immediately 
oops because your fence_ops go away.





Your use case is now completely different to that and this won't work
any more.

This here might just be the first case where that breaks.


This bug already exists, it's just a lot rarer for existing use 
cases... but either way Xe is doing the same thing I am, so I'm not 
the only one here either.


No it doesn't. You just have implemented the references differently than 
they are supposed to be.


Fixing this one occasion here would mitigate that immediate oops, but 
doesn't fix the fundamental problem.


Regards,
Christian.



~~ Lina





Re: [PATCH] drm/scheduler: Fix UAF in drm_sched_fence_get_timeline_name

2023-04-06 Thread Asahi Lina

On 06/04/2023 17.29, Christian König wrote:

Am 05.04.23 um 18:34 schrieb Asahi Lina:

A signaled scheduler fence can outlive its scheduler, since fences are
independently reference counted.


Well that is actually not correct. Schedulers are supposed to stay
around until the hw they have been driving is no longer present.


But the fences can outlive that. You can GPU render into an imported 
buffer, which attaches a fence to it. Then the GPU goes away but the 
fence is still attached to the buffer. Then you oops when you cat that 
debugfs file...


My use case does this way more often (since schedulers are tied to UAPI 
objects), which is how I found this, but as far as I can tell this is 
already broken for all drivers on unplug/unbind/anything else that would 
destroy the schedulers with fences potentially referenced on separate 
scanout devices or at any other DMA-BUF consumer.



E.g. the reference was scheduler_fence->hw_fence->driver->scheduler.


It's up to drivers not to mess that up, since the HW fence has the same 
requirements that it can outlive other driver objects, just like any 
other fence. That's not something the scheduler has to be concerned 
with, it's a driver correctness issue.


Of course, in C you have to get it right yourself, while correct 
Rust abstractions will cause your code to fail to compile if you do it 
wrong ^^


In my particular case, the hw_fence is a very dumb object that has no 
references to anything, only an ID and a pending op count. Jobs hold 
references to it and decrement it until it signals, not the other way 
around. So that object can live forever regardless of whether the rest 
of the device is gone.
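
A hypothetical sketch of that scheme (invented names, not the actual driver
code), just to make the shape concrete:

#include <stdbool.h>
#include <stdio.h>

struct dumb_hw_fence {
        unsigned int id;        /* firmware-visible identifier */
        int pending_ops;        /* hw operations still outstanding */
        bool signaled;
};

/* Called once per completed hw operation; the fence signals when the
 * last outstanding op is accounted for. */
static void hw_fence_op_complete(struct dumb_hw_fence *f)
{
        if (--f->pending_ops == 0) {
                f->signaled = true;     /* would signal the dma_fence here */
                printf("fence %u signaled\n", f->id);
        }
}

int main(void)
{
        struct dumb_hw_fence f = { .id = 7, .pending_ops = 3 };

        /* three hw operations back this one job */
        hw_fence_op_complete(&f);
        hw_fence_op_complete(&f);
        hw_fence_op_complete(&f);
        return 0;
}

Because the object carries nothing but an ID and a count, nothing in it
points back at the device, which is why it can outlive it.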



Your use case is now completely different to that and this won't work
any more.

This here might just be the first case where that breaks.


This bug already exists, it's just a lot rarer for existing use cases... 
but either way Xe is doing the same thing I am, so I'm not the only one 
here either.


~~ Lina



Re: [PATCH] drm/scheduler: Fix UAF in drm_sched_fence_get_timeline_name

2023-04-06 Thread Daniel Vetter
On Thu, Apr 06, 2023 at 10:29:57AM +0200, Christian König wrote:
> Am 05.04.23 um 18:34 schrieb Asahi Lina:
> > A signaled scheduler fence can outlive its scheduler, since fences are
> > independently reference counted.
> 
> Well that is actually not correct. Schedulers are supposed to stay around
> until the hw they have been driving is no longer present.
> 
> E.g. the reference was scheduler_fence->hw_fence->driver->scheduler.
> 
> Your use case is now completely different to that and this won't work any
> more.

This is why I'm a bit of a broken record suggesting that for the fw scheduler
case, where we have drm_sched_entity:drm_scheduler 1:1 and created at
runtime, we really should rework the interface exposed to drivers:

- drm_scheduler stays the thing that's per-engine and stays around for as
  long as the driver

- We split out a drm_sched_internal, which is either tied to drm_scheduler
  (ringbuffer scheduler mode) or drm_sched_entity (fw ctx scheduling
  mode).

- drm/sched internals are updated to dtrt in all these cases. And there's
  a lot, stuff like drm_sched_job is quite tricky if each driver needs to
  protect against concurrent ctx/entity creation/destruction, and I really
  don't like the idea that drivers hand-roll this kind of tricky state
  transition code that's used in the exceptional tdr/gpu/fw-death
  situation all themselves.
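
To make the shape of that proposal concrete, a purely hypothetical sketch
follows; none of these types exist in the kernel, and the fields are only
guesses about what would move where:

/* Hypothetical sketch only: drm_sched_internal and the *_v2 names are
 * invented for the split described above. */
struct drm_sched_internal {
        struct list_head pending_jobs;  /* jobs handed to the hw/fw */
        spinlock_t job_lock;            /* protects pending_jobs and tdr state */
        /* ... tdr timer, fence bookkeeping, etc. ... */
};

/* Per-engine, lives as long as the driver; ringbuffer scheduling mode
 * keeps the internals here. */
struct drm_gpu_scheduler_v2 {
        const struct drm_sched_backend_ops *ops;
        struct drm_sched_internal internal;
};

/* Per-context, created and destroyed at runtime; fw ctx scheduling mode
 * owns its own internals instead. */
struct drm_sched_entity_v2 {
        struct drm_gpu_scheduler_v2 *sched;
        struct drm_sched_internal *internal;   /* &sched->internal in
                                                * ringbuffer mode, a private
                                                * allocation in fw ctx mode */
};

Either way the tricky job/tdr state transitions would live behind
drm_sched_internal rather than being hand-rolled per driver, which is the
concern raised in the last point above.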
 
> This here might just be the first case where that breaks.

Yeah I expect there's going to be a solid stream of these, and we're just
going to random-walk in circles if this effort doesn't come with at least
some amount of design.

Thus far no one has really commented on the above plan though, so I'm not
sure what the consensus plan is among all the various fw-scheduling driver
efforts ...
-Daniel

> 
> Regards,
> Christian.
> 
> >   Therefore, we can't reference the
> > scheduler in the get_timeline_name() implementation.
> > 
> > Fixes oopses on `cat /sys/kernel/debug/dma_buf/bufinfo` when shared
> > dma-bufs reference fences from GPU schedulers that no longer exist.
> > 
> > Signed-off-by: Asahi Lina 
> > ---
> >   drivers/gpu/drm/scheduler/sched_entity.c | 7 ++++++-
> >   drivers/gpu/drm/scheduler/sched_fence.c  | 4 +++-
> >   include/drm/gpu_scheduler.h              | 5 +++++
> >   3 files changed, 14 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index 15d04a0ec623..8b3b949b2ce8 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -368,7 +368,12 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
> > /*
> >  * Fence is from the same scheduler, only need to wait for
> > -* it to be scheduled
> > +* it to be scheduled.
> > +*
> > +* Note: s_fence->sched could have been freed and reallocated
> > +* as another scheduler. This false positive case is okay, as if
> > +* the old scheduler was freed all of its jobs must have
> > +* signaled their completion fences.
> >  */
> > fence = dma_fence_get(&s_fence->scheduled);
> > dma_fence_put(entity->dependency);
> > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> > index 7fd869520ef2..33b145dfa38c 100644
> > --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > @@ -66,7 +66,7 @@ static const char *drm_sched_fence_get_driver_name(struct dma_fence *fence)
> >   static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
> >   {
> > struct drm_sched_fence *fence = to_drm_sched_fence(f);
> > -   return (const char *)fence->sched->name;
> > +   return (const char *)fence->sched_name;
> >   }
> >   static void drm_sched_fence_free_rcu(struct rcu_head *rcu)
> > @@ -168,6 +168,8 @@ void drm_sched_fence_init(struct drm_sched_fence *fence,
> > unsigned seq;
> > fence->sched = entity->rq->sched;
> > +   strlcpy(fence->sched_name, entity->rq->sched->name,
> > +   sizeof(fence->sched_name));
> > seq = atomic_inc_return(&entity->fence_seq);
> > dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
> >                &fence->lock, entity->fence_context, seq);
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 9db9e5e504ee..49f019731891 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -295,6 +295,11 @@ struct drm_sched_fence {
> >* @lock: the lock used by the scheduled and the finished fences.
> >*/
> > spinlock_t  lock;
> > +/**
> > + * @sched_name: the name of the scheduler that owns this fence. We
> > + * keep a copy here since fences can outlive their scheduler.
> > + */
> > +   char sched_name[16];
> >   

Re: [PATCH] drm/scheduler: Fix UAF in drm_sched_fence_get_timeline_name

2023-04-06 Thread Christian König

Am 05.04.23 um 18:34 schrieb Asahi Lina:

A signaled scheduler fence can outlive its scheduler, since fences are
independently reference counted.


Well that is actually not correct. Schedulers are supposed to stay 
around until the hw they have been driving is no longer present.


E.g. the reference was scheduler_fence->hw_fence->driver->scheduler.

Your use case is now completely different to that and this won't work 
any more.


This here might just be the first case where that breaks.

Regards,
Christian.


  Therefore, we can't reference the
scheduler in the get_timeline_name() implementation.

Fixes oopses on `cat /sys/kernel/debug/dma_buf/bufinfo` when shared
dma-bufs reference fences from GPU schedulers that no longer exist.

Signed-off-by: Asahi Lina 
---
  drivers/gpu/drm/scheduler/sched_entity.c | 7 ++++++-
  drivers/gpu/drm/scheduler/sched_fence.c  | 4 +++-
  include/drm/gpu_scheduler.h              | 5 +++++
  3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 15d04a0ec623..8b3b949b2ce8 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -368,7 +368,12 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
  
  		/*

 * Fence is from the same scheduler, only need to wait for
-* it to be scheduled
+* it to be scheduled.
+*
+* Note: s_fence->sched could have been freed and reallocated
+* as another scheduler. This false positive case is okay, as if
+* the old scheduler was freed all of its jobs must have
+* signaled their completion fences.
 */
fence = dma_fence_get(&s_fence->scheduled);
dma_fence_put(entity->dependency);
diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
index 7fd869520ef2..33b145dfa38c 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -66,7 +66,7 @@ static const char *drm_sched_fence_get_driver_name(struct dma_fence *fence)
  static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
  {
struct drm_sched_fence *fence = to_drm_sched_fence(f);
-   return (const char *)fence->sched->name;
+   return (const char *)fence->sched_name;
  }
  
  static void drm_sched_fence_free_rcu(struct rcu_head *rcu)

@@ -168,6 +168,8 @@ void drm_sched_fence_init(struct drm_sched_fence *fence,
unsigned seq;
  
  	fence->sched = entity->rq->sched;

+   strlcpy(fence->sched_name, entity->rq->sched->name,
+   sizeof(fence->sched_name));
seq = atomic_inc_return(&entity->fence_seq);
dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
               &fence->lock, entity->fence_context, seq);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 9db9e5e504ee..49f019731891 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -295,6 +295,11 @@ struct drm_sched_fence {
   * @lock: the lock used by the scheduled and the finished fences.
   */
spinlock_t  lock;
+/**
+ * @sched_name: the name of the scheduler that owns this fence. We
+ * keep a copy here since fences can outlive their scheduler.
+ */
+   char sched_name[16];
  /**
   * @owner: job owner for debugging
   */

---
base-commit: fe15c26ee26efa11741a7b632e9f23b01aca4cc6
change-id: 20230406-scheduler-uaf-1-994ec34cac93

Thank you,
~~ Lina





Re: [PATCH] drm/scheduler: Fix UAF in drm_sched_fence_get_timeline_name

2023-04-05 Thread Luben Tuikov
On 2023-04-05 12:34, Asahi Lina wrote:
> A signaled scheduler fence can outlive its scheduler, since fences are
> independently reference counted. Therefore, we can't reference the
> scheduler in the get_timeline_name() implementation.
> 
> Fixes oopses on `cat /sys/kernel/debug/dma_buf/bufinfo` when shared
> dma-bufs reference fences from GPU schedulers that no longer exist.
> 
> Signed-off-by: Asahi Lina 
> ---

Thanks for fixing this. If there's no objections, I'll add my tag
and I'll push this tomorrow. I think this should go in through
drm-misc-fixes and drm-misc-next.

Reviewed-by: Luben Tuikov 

Regards,
Luben

>  drivers/gpu/drm/scheduler/sched_entity.c | 7 ++++++-
>  drivers/gpu/drm/scheduler/sched_fence.c  | 4 +++-
>  include/drm/gpu_scheduler.h              | 5 +++++
>  3 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 15d04a0ec623..8b3b949b2ce8 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -368,7 +368,12 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
>  
>   /*
>* Fence is from the same scheduler, only need to wait for
> -  * it to be scheduled
> +  * it to be scheduled.
> +  *
> +  * Note: s_fence->sched could have been freed and reallocated
> +  * as another scheduler. This false positive case is okay, as if
> +  * the old scheduler was freed all of its jobs must have
> +  * signaled their completion fences.
>*/
>   fence = dma_fence_get(&s_fence->scheduled);
>   dma_fence_put(entity->dependency);
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> index 7fd869520ef2..33b145dfa38c 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -66,7 +66,7 @@ static const char *drm_sched_fence_get_driver_name(struct dma_fence *fence)
>  static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
>  {
>   struct drm_sched_fence *fence = to_drm_sched_fence(f);
> - return (const char *)fence->sched->name;
> + return (const char *)fence->sched_name;
>  }
>  
>  static void drm_sched_fence_free_rcu(struct rcu_head *rcu)
> @@ -168,6 +168,8 @@ void drm_sched_fence_init(struct drm_sched_fence *fence,
>   unsigned seq;
>  
>   fence->sched = entity->rq->sched;
> + strlcpy(fence->sched_name, entity->rq->sched->name,
> + sizeof(fence->sched_name));
>   seq = atomic_inc_return(&entity->fence_seq);
>   dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
>                  &fence->lock, entity->fence_context, seq);
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 9db9e5e504ee..49f019731891 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -295,6 +295,11 @@ struct drm_sched_fence {
>   * @lock: the lock used by the scheduled and the finished fences.
>   */
>   spinlock_t  lock;
> +/**
> + * @sched_name: the name of the scheduler that owns this fence. We
> + * keep a copy here since fences can outlive their scheduler.
> + */
> + char sched_name[16];
>  /**
>   * @owner: job owner for debugging
>   */
> 
> ---
> base-commit: fe15c26ee26efa11741a7b632e9f23b01aca4cc6
> change-id: 20230406-scheduler-uaf-1-994ec34cac93
> 
> Thank you,
> ~~ Lina
>