Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-08 Thread Christian König

Am 08.05.24 um 13:51 schrieb David Laight:

From: Christian König

Sent: 07 May 2024 15:05

...

I actually have been telling people to (ab)use the epoll behavior to
check if two file descriptors point to the same underlying file when
KCMP isn't available.

In what way?


Something like this:

fd_e = epoll_create1(EPOLL_CLOEXEC);

tmp = dup(fd_A)
epoll_ctl(fd_e, EPOLL_CTL_ADD, tmp, );
dup2(fd_B, tmp);

/* If this return -EEXISTS then the fd_A and fd_B are pointing to the 
same struct file */

epoll_ctl(fd_e, EPOLL_CTL_ADD, tmp, );

close (tmp);
close (fd_e



You can add both fd to the same epoll fd.
Relying on the implicit EPOLL_CTL_DEL not happening until both fd are
closed is a recipe for disaster.
(And I can't see an obvious way of testing it.)

Q6/A6 on epoll(7) should always have had a caveat that it is an
'implementation detail' and shouldn't be relied on.
(it is written as a 'beware of' ...)

The other point is that there are two ways to get multiple fd that
reference the same underlying file.
dup() fork() etc share the file offset, but open("/dev/fd/n") adds
a reference count later and has a separate file offset.


No it doesn't.

Accessing /dev/fd/n or /proc/*/fd/n ideally accesses the same inode, but 
gives you a new struct file.


dup(), fork() etc.. make you actually reference the same struct file 
inside the kernel.


That turned out to be a rather important distinction when working with 
device drivers and DMA-buf.


Regards,
Christian.



I don't know which structure epoll is using, but I suspect it is
the former.
So it may not tell you what you want to know.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-08 Thread David Laight
From: Christian König
> Sent: 07 May 2024 15:05
...
> I actually have been telling people to (ab)use the epoll behavior to
> check if two file descriptors point to the same underlying file when
> KCMP isn't available.

In what way?
You can add both fd to the same epoll fd.
Relying on the implicit EPOLL_CTL_DEL not happening until both fd are
closed is a recipe for disaster.
(And I can't see an obvious way of testing it.)

Q6/A6 on epoll(7) should always have had a caveat that it is an
'implementation detail' and shouldn't be relied on.
(it is written as a 'beware of' ...)

The other point is that there are two ways to get multiple fd that
reference the same underlying file.
dup() fork() etc share the file offset, but open("/dev/fd/n") adds
a reference count later and has a separate file offset.

I don't know which structure epoll is using, but I suspect it is
the former.
So it may not tell you what you want to know.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-07 Thread Rob Clark
On Tue, May 7, 2024 at 11:17 AM T.J. Mercier  wrote:
>
> On Tue, May 7, 2024 at 7:04 AM Christian König  
> wrote:
> >
> > Am 07.05.24 um 15:39 schrieb Daniel Vetter:
> > > On Tue, May 07, 2024 at 12:10:07PM +0200, Christian König wrote:
> > >> Am 06.05.24 um 21:04 schrieb T.J. Mercier:
> > >>> On Mon, May 6, 2024 at 2:30 AM Charan Teja Kalla
> > >>>  wrote:
> >  Hi TJ,
> > 
> >  Seems I have got answers from [1], where it is agreed upon epoll() is
> >  the source of issue.
> > 
> >  Thanks a lot for the discussion.
> > 
> >  [1] 
> >  https://lore.kernel.org/lkml/2d631f0615918...@google.com/
> > 
> >  Thanks
> >  Charan
> > >>> Oh man, quite a set of threads on this over the weekend. Thanks for the 
> > >>> link.
> > >> Yeah and it also has some interesting side conclusion: We should probably
> > >> tell people to stop using DMA-buf with epoll.
> > >>
> > >> The background is that the mutex approach epoll uses to make files 
> > >> disappear
> > >> from the interest list on close results in the fact that each file can 
> > >> only
> > >> be part of a single epoll at a time.
> > >>
> > >> Now since DMA-buf is build around the idea that we share the buffer
> > >> representation as file between processes it means that only one process 
> > >> at a
> > >> time can use epoll with each DMA-buf.
> > >>
> > >> So for example if a window manager uses epoll everything is fine. If a
> > >> client is using epoll everything is fine as well. But if *both* use 
> > >> epoll at
> > >> the same time it won't work.
> > >>
> > >> This can lead to rather funny and hard to debug combinations of failures 
> > >> and
> > >> I think we need to document this limitation and explicitly point it out.
> > > Ok, I tested this with a small C program, and you're mixing things up.
> > > Here's what I got
> > >
> > > - You cannot add a file twice to the same epoll file/fd. So that part is
> > >correct, and also my understanding from reading the kernel code.
> > >
> > > - You can add the same file to two different epoll file instaces. Which
> > >means it's totally fine to use epoll on a dma_buf in different 
> > > processes
> > >like both in the compositor and in clients.
> >
> > Ah! Than I misunderstood that comment in the discussion. Thanks for
> > clarifying that.
> >
> > >
> > > - Substantially more entertaining, you can nest epoll instances, and e.g.
> > >add a 2nd epoll file as an event to the first one. That way you can add
> > >the same file to both epoll fds, and so end up with the same file
> > >essentially being added twice to the top-level epoll file. So even
> > >within one application there's no real issue when e.g. different
> > >userspace drivers all want to use epoll on the same fd, because you can
> > >just throw in another level of epoll and it's fine again and you won't
> > >get an EEXISTS on EPOLL_CTL_ADD.
> > >
> > >But I also don't think we have this issue right now anywhere, since 
> > > it's
> > >kinda a general epoll issue that happens with any duplicated file.
> >
> > I actually have been telling people to (ab)use the epoll behavior to
> > check if two file descriptors point to the same underlying file when
> > KCMP isn't available.
> >
> > Some environments (Android?) disable KCMP because they see it as
> > security problem.
> >
> Didn't know about this so I checked. Our kernel has CONFIG_KCMP, but
> seccomp does look like it's blocking kcmp for apps.
> https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/SYSCALLS.TXT

Getting a bit off the original topic, but fwiw this is what CrOS did
for CONFIG_KCMP:

https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/3501193

Ie. allow the kcmp syscall, but block type == KCMP_SYSVSEM, which was
the more specific thing that android wanted to block.

BR,
-R


Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-07 Thread T.J. Mercier
On Tue, May 7, 2024 at 7:04 AM Christian König  wrote:
>
> Am 07.05.24 um 15:39 schrieb Daniel Vetter:
> > On Tue, May 07, 2024 at 12:10:07PM +0200, Christian König wrote:
> >> Am 06.05.24 um 21:04 schrieb T.J. Mercier:
> >>> On Mon, May 6, 2024 at 2:30 AM Charan Teja Kalla
> >>>  wrote:
>  Hi TJ,
> 
>  Seems I have got answers from [1], where it is agreed upon epoll() is
>  the source of issue.
> 
>  Thanks a lot for the discussion.
> 
>  [1] https://lore.kernel.org/lkml/2d631f0615918...@google.com/
> 
>  Thanks
>  Charan
> >>> Oh man, quite a set of threads on this over the weekend. Thanks for the 
> >>> link.
> >> Yeah and it also has some interesting side conclusion: We should probably
> >> tell people to stop using DMA-buf with epoll.
> >>
> >> The background is that the mutex approach epoll uses to make files 
> >> disappear
> >> from the interest list on close results in the fact that each file can only
> >> be part of a single epoll at a time.
> >>
> >> Now since DMA-buf is build around the idea that we share the buffer
> >> representation as file between processes it means that only one process at 
> >> a
> >> time can use epoll with each DMA-buf.
> >>
> >> So for example if a window manager uses epoll everything is fine. If a
> >> client is using epoll everything is fine as well. But if *both* use epoll 
> >> at
> >> the same time it won't work.
> >>
> >> This can lead to rather funny and hard to debug combinations of failures 
> >> and
> >> I think we need to document this limitation and explicitly point it out.
> > Ok, I tested this with a small C program, and you're mixing things up.
> > Here's what I got
> >
> > - You cannot add a file twice to the same epoll file/fd. So that part is
> >correct, and also my understanding from reading the kernel code.
> >
> > - You can add the same file to two different epoll file instaces. Which
> >means it's totally fine to use epoll on a dma_buf in different processes
> >like both in the compositor and in clients.
>
> Ah! Than I misunderstood that comment in the discussion. Thanks for
> clarifying that.
>
> >
> > - Substantially more entertaining, you can nest epoll instances, and e.g.
> >add a 2nd epoll file as an event to the first one. That way you can add
> >the same file to both epoll fds, and so end up with the same file
> >essentially being added twice to the top-level epoll file. So even
> >within one application there's no real issue when e.g. different
> >userspace drivers all want to use epoll on the same fd, because you can
> >just throw in another level of epoll and it's fine again and you won't
> >get an EEXISTS on EPOLL_CTL_ADD.
> >
> >But I also don't think we have this issue right now anywhere, since it's
> >kinda a general epoll issue that happens with any duplicated file.
>
> I actually have been telling people to (ab)use the epoll behavior to
> check if two file descriptors point to the same underlying file when
> KCMP isn't available.
>
> Some environments (Android?) disable KCMP because they see it as
> security problem.
>
Didn't know about this so I checked. Our kernel has CONFIG_KCMP, but
seccomp does look like it's blocking kcmp for apps.
https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/SYSCALLS.TXT


Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-07 Thread Christian König

Am 07.05.24 um 15:39 schrieb Daniel Vetter:

On Tue, May 07, 2024 at 12:10:07PM +0200, Christian König wrote:

Am 06.05.24 um 21:04 schrieb T.J. Mercier:

On Mon, May 6, 2024 at 2:30 AM Charan Teja Kalla
 wrote:

Hi TJ,

Seems I have got answers from [1], where it is agreed upon epoll() is
the source of issue.

Thanks a lot for the discussion.

[1] https://lore.kernel.org/lkml/2d631f0615918...@google.com/

Thanks
Charan

Oh man, quite a set of threads on this over the weekend. Thanks for the link.

Yeah and it also has some interesting side conclusion: We should probably
tell people to stop using DMA-buf with epoll.

The background is that the mutex approach epoll uses to make files disappear
from the interest list on close results in the fact that each file can only
be part of a single epoll at a time.

Now since DMA-buf is build around the idea that we share the buffer
representation as file between processes it means that only one process at a
time can use epoll with each DMA-buf.

So for example if a window manager uses epoll everything is fine. If a
client is using epoll everything is fine as well. But if *both* use epoll at
the same time it won't work.

This can lead to rather funny and hard to debug combinations of failures and
I think we need to document this limitation and explicitly point it out.

Ok, I tested this with a small C program, and you're mixing things up.
Here's what I got

- You cannot add a file twice to the same epoll file/fd. So that part is
   correct, and also my understanding from reading the kernel code.

- You can add the same file to two different epoll file instaces. Which
   means it's totally fine to use epoll on a dma_buf in different processes
   like both in the compositor and in clients.


Ah! Than I misunderstood that comment in the discussion. Thanks for 
clarifying that.




- Substantially more entertaining, you can nest epoll instances, and e.g.
   add a 2nd epoll file as an event to the first one. That way you can add
   the same file to both epoll fds, and so end up with the same file
   essentially being added twice to the top-level epoll file. So even
   within one application there's no real issue when e.g. different
   userspace drivers all want to use epoll on the same fd, because you can
   just throw in another level of epoll and it's fine again and you won't
   get an EEXISTS on EPOLL_CTL_ADD.

   But I also don't think we have this issue right now anywhere, since it's
   kinda a general epoll issue that happens with any duplicated file.


I actually have been telling people to (ab)use the epoll behavior to 
check if two file descriptors point to the same underlying file when 
KCMP isn't available.


Some environments (Android?) disable KCMP because they see it as 
security problem.



So I don't think there's any reasons to recommend against using epoll on
dma-buf fd (or sync_file or drm_syncobj or any of the sharing primitives
we have really).


No, that indeed seems to be fine then.

Thanks,
Christian.



Cheers, Sima




Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-07 Thread Daniel Vetter
On Tue, May 07, 2024 at 12:10:07PM +0200, Christian König wrote:
> Am 06.05.24 um 21:04 schrieb T.J. Mercier:
> > On Mon, May 6, 2024 at 2:30 AM Charan Teja Kalla
> >  wrote:
> > > Hi TJ,
> > > 
> > > Seems I have got answers from [1], where it is agreed upon epoll() is
> > > the source of issue.
> > > 
> > > Thanks a lot for the discussion.
> > > 
> > > [1] https://lore.kernel.org/lkml/2d631f0615918...@google.com/
> > > 
> > > Thanks
> > > Charan
> > Oh man, quite a set of threads on this over the weekend. Thanks for the 
> > link.
> 
> Yeah and it also has some interesting side conclusion: We should probably
> tell people to stop using DMA-buf with epoll.
> 
> The background is that the mutex approach epoll uses to make files disappear
> from the interest list on close results in the fact that each file can only
> be part of a single epoll at a time.
> 
> Now since DMA-buf is build around the idea that we share the buffer
> representation as file between processes it means that only one process at a
> time can use epoll with each DMA-buf.
> 
> So for example if a window manager uses epoll everything is fine. If a
> client is using epoll everything is fine as well. But if *both* use epoll at
> the same time it won't work.
> 
> This can lead to rather funny and hard to debug combinations of failures and
> I think we need to document this limitation and explicitly point it out.

Ok, I tested this with a small C program, and you're mixing things up.
Here's what I got

- You cannot add a file twice to the same epoll file/fd. So that part is
  correct, and also my understanding from reading the kernel code.

- You can add the same file to two different epoll file instaces. Which
  means it's totally fine to use epoll on a dma_buf in different processes
  like both in the compositor and in clients.

- Substantially more entertaining, you can nest epoll instances, and e.g.
  add a 2nd epoll file as an event to the first one. That way you can add
  the same file to both epoll fds, and so end up with the same file
  essentially being added twice to the top-level epoll file. So even
  within one application there's no real issue when e.g. different
  userspace drivers all want to use epoll on the same fd, because you can
  just throw in another level of epoll and it's fine again and you won't
  get an EEXISTS on EPOLL_CTL_ADD.

  But I also don't think we have this issue right now anywhere, since it's
  kinda a general epoll issue that happens with any duplicated file.

So I don't think there's any reasons to recommend against using epoll on
dma-buf fd (or sync_file or drm_syncobj or any of the sharing primitives
we have really).

Cheers, Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-07 Thread Christian König

Am 06.05.24 um 21:04 schrieb T.J. Mercier:

On Mon, May 6, 2024 at 2:30 AM Charan Teja Kalla
 wrote:

Hi TJ,

Seems I have got answers from [1], where it is agreed upon epoll() is
the source of issue.

Thanks a lot for the discussion.

[1] https://lore.kernel.org/lkml/2d631f0615918...@google.com/

Thanks
Charan

Oh man, quite a set of threads on this over the weekend. Thanks for the link.


Yeah and it also has some interesting side conclusion: We should 
probably tell people to stop using DMA-buf with epoll.


The background is that the mutex approach epoll uses to make files 
disappear from the interest list on close results in the fact that each 
file can only be part of a single epoll at a time.


Now since DMA-buf is build around the idea that we share the buffer 
representation as file between processes it means that only one process 
at a time can use epoll with each DMA-buf.


So for example if a window manager uses epoll everything is fine. If a 
client is using epoll everything is fine as well. But if *both* use 
epoll at the same time it won't work.


This can lead to rather funny and hard to debug combinations of failures 
and I think we need to document this limitation and explicitly point it out.


Regards,
Christian.


Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-06 Thread T.J. Mercier
On Mon, May 6, 2024 at 2:30 AM Charan Teja Kalla
 wrote:
>
> Hi TJ,
>
> Seems I have got answers from [1], where it is agreed upon epoll() is
> the source of issue.
>
> Thanks a lot for the discussion.
>
> [1] https://lore.kernel.org/lkml/2d631f0615918...@google.com/
>
> Thanks
> Charan

Oh man, quite a set of threads on this over the weekend. Thanks for the link.


Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-06 Thread Charan Teja Kalla
Hi TJ,

Seems I have got answers from [1], where it is agreed upon epoll() is
the source of issue.

Thanks a lot for the discussion.

[1] https://lore.kernel.org/lkml/2d631f0615918...@google.com/

Thanks
Charan

On 5/5/2024 9:50 PM, Charan Teja Kalla wrote:
> Thanks T.J for the reply!!
> 
> On 5/4/2024 4:43 AM, T.J. Mercier wrote:
>> It looks like a similar conclusion about epoll was reached at:
>> https://lore.kernel.org/all/a87d7ef8-2c59-4dc5-ba0a-b821d1eff...@amd.com/
>>
> I am unaware of this discussion. Thanks...
> 
>> I agree with Christian that it should not be possible for the file to
>> be freed while inside dma_buf_poll. Aside from causing problems in
>> dma_buf_poll, ep_item_poll itself would have issues dereferencing the
>> freed file pointer.
>>
> Not sure about my understanding: ep_item_poll() always call the ->poll()
> interface with a stable 'struct file' because of ep->mtx. This lock
> ensures that:
>a) If eventpoll_release_file() get the ep->mtx first, ->poll()
> corresponds to the epitem(target file) will never be called, because it
> is removed from the rdlist.
> 
>b) If ep_send_events() get the ep->mtx() first, ->poll() will get
> called with a stable 'struct file', __but the refcount(->f_count) of a
> file can be zero__. I am saying that this is stable because the 'struct
> file' contents are still valid till we are in ->poll().
> 
> Can you/Christian help me with what I am missing here to say that
> ->poll() is receiving stale 'struct file*', please?
> 
> And, If you are convinced with above, I think, It should have been the
> responsibility of ->poll() implementation to have taken refcount on a
> file that is going to be still valid even after ->poll() exits. Incase
> of dma_buf_poll() implementation, it took the refcount on a file that is
> not going to be valid once the dma_buf_poll() exits(because of mentioned
> race with the freeing of the 'struct file*').
> 
> So, in dma_buf_poll(), Should we be using atomic_long_inc_not_zero()
> based implementation to take the refcount on a file?
> 
> Thanks,
> Charan


Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-05 Thread Charan Teja Kalla
Thanks T.J for the reply!!

On 5/4/2024 4:43 AM, T.J. Mercier wrote:
> It looks like a similar conclusion about epoll was reached at:
> https://lore.kernel.org/all/a87d7ef8-2c59-4dc5-ba0a-b821d1eff...@amd.com/
> 
I am unaware of this discussion. Thanks...

> I agree with Christian that it should not be possible for the file to
> be freed while inside dma_buf_poll. Aside from causing problems in
> dma_buf_poll, ep_item_poll itself would have issues dereferencing the
> freed file pointer.
> 
Not sure about my understanding: ep_item_poll() always call the ->poll()
interface with a stable 'struct file' because of ep->mtx. This lock
ensures that:
   a) If eventpoll_release_file() get the ep->mtx first, ->poll()
corresponds to the epitem(target file) will never be called, because it
is removed from the rdlist.

   b) If ep_send_events() get the ep->mtx() first, ->poll() will get
called with a stable 'struct file', __but the refcount(->f_count) of a
file can be zero__. I am saying that this is stable because the 'struct
file' contents are still valid till we are in ->poll().

Can you/Christian help me with what I am missing here to say that
->poll() is receiving stale 'struct file*', please?

And, If you are convinced with above, I think, It should have been the
responsibility of ->poll() implementation to have taken refcount on a
file that is going to be still valid even after ->poll() exits. Incase
of dma_buf_poll() implementation, it took the refcount on a file that is
not going to be valid once the dma_buf_poll() exits(because of mentioned
race with the freeing of the 'struct file*').

So, in dma_buf_poll(), Should we be using atomic_long_inc_not_zero()
based implementation to take the refcount on a file?

Thanks,
Charan


Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-03 Thread T.J. Mercier
On Fri, May 3, 2024 at 6:40 AM Charan Teja Kalla
 wrote:
>
> Thanks Christian/TJ for the inputs!!
>
> On 4/18/2024 12:16 PM, Christian König wrote:
> > As far as I can see the EPOLL holds a reference to the files it
> > contains. So it is perfectly valid to add the file descriptor to EPOLL
> > and then close it.
> >
> > In this case the file won't be closed, but be kept alive by it's
> > reference in the EPOLL file descriptor.
>
> I am not seeing that adding a 'fd' into the epoll monitoring list will
> increase its refcount.  Confirmed by writing a testcase that just do
> open + EPOLL_CTL_ADD and see the /proc/../fdinfo of epoll fd(Added
> file_count() info to the output)
> # cat /proc/136/fdinfo/3
> pos:0
> flags:  02
> mnt_id: 14
> ino:1041
> tfd:4 events:   19 data:4  pos:0 ino:81 sdev:5
> refcount: 1<-- The fd added to the epoll monitoring is still 1(same as
> open file refcount)
>
> From the code too, I don't see a file added in the epoll monitoring list
> will have an extra refcount but momentarily (where it increases the
> refcount of target file, add it to the rbtree of the epoll context and
> then decreasing the file refcount):
> do_epoll_ctl():
> f = fdget(epfd);
> tf = fdget(fd);
> epoll_mutex_lock(&ep->mtx)
> EPOLL_CTL_ADD:
> ep_insert(ep, epds, tf.file, fd, full_check); // Added to the 
> epoll
> monitoring rb tree list as epitem.
> mutex_unlock(&ep->mtx);
> fdput(tf);
> fdput(f);
>
>
> Not sure If i am completely mistaken what you're saying here.
>
> > The fs layer which calls dma_buf_poll() should make sure that the file
> > can't go away until the function returns.
> >
> > Then inside dma_buf_poll() we add another reference to the file while
> > installing the callback:
> >
> > /* Paired with fput in dma_buf_poll_cb */
> > get_file(dmabuf->file); No, exactly that can't
> > happen either.
> >
>
> I am not quite comfortable with epoll internals but I think below race
> is possible where "The 'file' passed to dma_buf_poll() is proper but
> ->f_count == 0, which is indicating that a parallel freeing is
> happening". So, we should check the file->f_count value before taking
> the refcount.
>
> (A 'fd' registered for the epoll monitoring list is maintained as
> 'epitem(epi)' in the rbtree of 'eventpoll(ep, called as epoll context).
>
> epoll_wait()__fput()(as file->f_count=0)
> 
> a) ep_poll_callback():
>  Is the waitqueue function
>called when signaled on the
>wait_queue_head_t of the registered
>poll() function.
>
>1) It links the 'epi' to the ready list
>   of 'ep':
>if (!ep_is_linked(epi))
>  list_add_tail_lockless(&epi->rdllink,
> &ep->rdllist)
>
>2) wake_up(&ep->wq);
> Wake up the process waiting
> on epoll_wait() that endup
> in ep_poll.
>
>
> b) ep_poll():
> 1) while (1) {
> eavail = ep_events_available(ep);
> (checks ep->rdlist)
> ep_send_events(ep);
> (notify the events to user)
> }
> (epoll_wait() calling process gets
>  woken up from a.2 and process the
>  events raised added to rdlist in a.1)
>
>2) ep_send_events():
> mutex_lock(&ep->mtx);
> (** The sync lock is taken **)
> list_for_each_entry_safe(epi, tmp,
> &txlist, rdllink) {
> list_del_init(&epi->rdllink);
> revents = ep_item_poll(epi, &pt, 1)
> (vfs_poll()-->...f_op->poll(=dma_buf_poll)
> }
> mutex_unlock(&ep->mtx)
> (**release the lock.**)
>
> (As part of procession of events,
>  each epitem is removed from rdlist
>  and call the f_op->poll() of a file
>  which will endup in dma_buf_poll())
>
>3) dma_buf_poll():
> get_file(dmabuf->file);
> (** f_count changed from 0->1 **)
> dma_buf_poll_add_cb(resv, true, dcb):
> (registers dma_buf_poll_cb() against fence)
>
>
> c) eventpoll_release_file():
>mutex_lock(&ep->mtx);
>__ep_remove(ep, epi, true):
>mutex_unlock(&ep->mtx);
>   (__ep_remove() will remove the
>'epi' from rbtree and if present
>from rdlist as well)
>
> d) file_free(file), free the 'file'.
>
> e) dma_buf_poll_cb:
>  /* Paired with get_file in dma_buf_poll */
>  fput(dmabuf->file);
>  (f_count changed from 1->0, where
>   we try to free the 'file' again
>   which is UAF/double free).
>
>
>
> In the above race, If c) gets called first, then the 'epi' is removed
> from both rbtree and 'r

Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-03 Thread Charan Teja Kalla
Thanks Christian/TJ for the inputs!!

On 4/18/2024 12:16 PM, Christian König wrote:
> As far as I can see the EPOLL holds a reference to the files it
> contains. So it is perfectly valid to add the file descriptor to EPOLL
> and then close it.
>
> In this case the file won't be closed, but be kept alive by it's
> reference in the EPOLL file descriptor.

I am not seeing that adding a 'fd' into the epoll monitoring list will
increase its refcount.  Confirmed by writing a testcase that just do
open + EPOLL_CTL_ADD and see the /proc/../fdinfo of epoll fd(Added
file_count() info to the output)
# cat /proc/136/fdinfo/3
pos:0
flags:  02
mnt_id: 14
ino:1041
tfd:4 events:   19 data:4  pos:0 ino:81 sdev:5
refcount: 1<-- The fd added to the epoll monitoring is still 1(same as
open file refcount)

>From the code too, I don't see a file added in the epoll monitoring list
will have an extra refcount but momentarily (where it increases the
refcount of target file, add it to the rbtree of the epoll context and
then decreasing the file refcount):
do_epoll_ctl():
f = fdget(epfd);
tf = fdget(fd);
epoll_mutex_lock(&ep->mtx)
EPOLL_CTL_ADD:
ep_insert(ep, epds, tf.file, fd, full_check); // Added to the 
epoll
monitoring rb tree list as epitem.
mutex_unlock(&ep->mtx);
fdput(tf);
fdput(f);


Not sure If i am completely mistaken what you're saying here.

> The fs layer which calls dma_buf_poll() should make sure that the file
> can't go away until the function returns.
> 
> Then inside dma_buf_poll() we add another reference to the file while
> installing the callback:
> 
>     /* Paired with fput in dma_buf_poll_cb */
>     get_file(dmabuf->file); No, exactly that can't
> happen either.
> 

I am not quite comfortable with epoll internals but I think below race
is possible where "The 'file' passed to dma_buf_poll() is proper but
->f_count == 0, which is indicating that a parallel freeing is
happening". So, we should check the file->f_count value before taking
the refcount.

(A 'fd' registered for the epoll monitoring list is maintained as
'epitem(epi)' in the rbtree of 'eventpoll(ep, called as epoll context).

epoll_wait()__fput()(as file->f_count=0)

a) ep_poll_callback():
 Is the waitqueue function
   called when signaled on the
   wait_queue_head_t of the registered
   poll() function.

   1) It links the 'epi' to the ready list
  of 'ep':
   if (!ep_is_linked(epi))
 list_add_tail_lockless(&epi->rdllink,
&ep->rdllist)

   2) wake_up(&ep->wq);
Wake up the process waiting
on epoll_wait() that endup
in ep_poll.


b) ep_poll():
1) while (1) {
eavail = ep_events_available(ep);
(checks ep->rdlist)
ep_send_events(ep);
(notify the events to user)
}
(epoll_wait() calling process gets
 woken up from a.2 and process the
 events raised added to rdlist in a.1)

   2) ep_send_events():
mutex_lock(&ep->mtx);
(** The sync lock is taken **)
list_for_each_entry_safe(epi, tmp,
&txlist, rdllink) {
list_del_init(&epi->rdllink);
revents = ep_item_poll(epi, &pt, 1)
(vfs_poll()-->...f_op->poll(=dma_buf_poll)
}
mutex_unlock(&ep->mtx)
(**release the lock.**)

(As part of procession of events,
 each epitem is removed from rdlist
 and call the f_op->poll() of a file
 which will endup in dma_buf_poll())

   3) dma_buf_poll():
get_file(dmabuf->file);
(** f_count changed from 0->1 **)
dma_buf_poll_add_cb(resv, true, dcb):
(registers dma_buf_poll_cb() against fence)


c) eventpoll_release_file():
   mutex_lock(&ep->mtx);
   __ep_remove(ep, epi, true):
   mutex_unlock(&ep->mtx);
  (__ep_remove() will remove the
   'epi' from rbtree and if present
   from rdlist as well)

d) file_free(file), free the 'file'.

e) dma_buf_poll_cb:
 /* Paired with get_file in dma_buf_poll */
 fput(dmabuf->file);
 (f_count changed from 1->0, where
  we try to free the 'file' again
  which is UAF/double free).



In the above race, If c) gets called first, then the 'epi' is removed
from both rbtree and 'rdlink' under ep->mtx lock thus b.2 don't end up
in calling the ->poll() as it don't see this event in the rdlist.

Race only exist If b.2 executes first, where it will call dma_buf_poll
with __valid 'struct file' under ep->mtx but its refcount is already
could have been zero__. Later When e)

Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-04-17 Thread Christian König

Am 18.04.24 um 03:33 schrieb zhiguojiang:

在 2024/4/15 19:57, Christian König 写道:
[Some people who received this message don't often get email from 
christian.koe...@amd.com. Learn why this is important at 
https://aka.ms/LearnAboutSenderIdentification ]


Am 15.04.24 um 12:35 schrieb zhiguojiang:

在 2024/4/12 14:39, Christian König 写道:

[Some people who received this message don't often get email from
christian.koe...@amd.com. Learn why this is important at
https://aka.ms/LearnAboutSenderIdentification ]

Am 12.04.24 um 08:19 schrieb zhiguojiang:

[SNIP]
-> Here task 2220 do epoll again where internally it will get/put 
then

start to free twice and lead to final crash.

Here is the basic flow:

1. Thread A install the dma_buf_fd via dma_buf_export, the fd 
refcount

is 1

2. Thread A add the fd to epoll list via epoll_ctl(EPOLL_CTL_ADD)

3. After use the dma buf, Thread A close the fd, then here fd 
refcount

is 0,
  and it will run __fput finally to release the file


Stop, that isn't correct.

The fs layer which calls dma_buf_poll() should make sure that the file
can't go away until the function returns.

Then inside dma_buf_poll() we add another reference to the file while
installing the callback:

    /* Paired with fput in dma_buf_poll_cb */
    get_file(dmabuf->file);

Hi,

The problem may just occurred here.

Is it possible file reference count already decreased to 0 and fput
already being in progressing just before calling
get_file(dmabuf->file) in dma_buf_poll()?


No, exactly that isn't possible.

If a function gets a dma_buf pointer or even more general any reference
counted pointer which has already decreased to 0 then that is a major
bug in the caller of that function.

BTW: It's completely illegal to work around such issues by using
file_count() or RCU functions. So when you suggest stuff like that it
will immediately face rejection.

Regards,
Christian.

Hi,

Thanks for your kind comment.

'If a function gets a dma_buf pointer or even more general any reference

counted pointer which has already decreased to 0 then that is a major

bug in the caller of that function.'

According to the issue log, we can see the dma buf file close and 
epoll operation running in parallel.


Apparently at the moment caller calls epoll, although another task 
caller already called close on the same fd, but this fd was still in 
progress to close, so fd is still valid, thus no EBADF returned to 
caller.


No, exactly that can't happen either.

As far as I can see the EPOLL holds a reference to the files it 
contains. So it is perfectly valid to add the file descriptor to EPOLL 
and then close it.


In this case the file won't be closed, but be kept alive by it's 
reference in the EPOLL file descriptor.




Then the two task contexts operate on same dma buf fd(one is close, 
another is epoll) in kernel space.


Please kindly help to comment whether above scenario is possible.

If there is some bug in the caller logic, Can u help to point it out? :)


Well what could be is that the EPOLL implementation is broken somehow, 
but I have really doubts on that.


As I said before the most likely explanation is that you have a broken 
device driver which messes up the DMA-buf file reference count somehow.


Regards,
Christian.



Regards,
Zhiguo






This reference is only dropped after the callback is completed in
dma_buf_poll_cb():

    /* Paired with get_file in dma_buf_poll */
    fput(dmabuf->file);

So your explanation for the issue just seems to be incorrect.



4. Here Thread A not do epoll_ctl(EPOLL_CTL_DEL) manunally, so it
still resides in one epoll_list.
  Although __fput will call eventpoll_release to remove the file from
binded epoll list,
  but it has small time window where Thread B jumps in.


Well if that is really the case then that would then be a bug in
epoll_ctl().



5. During the small window, Thread B do the poll action for
dma_buf_fd, where it will fget/fput for the file,
  this means the fd refcount will be 0 -> 1 -> 0, and it will call
__fput again.
  This will lead to __fput twice for the same file.

6. So the potenial fix is use get_file_rcu which to check if file
refcount already zero which means under free.
  If so, we just return and no need to do the dma_buf_poll.


Well to say it bluntly as far as I can see this suggestion is 
completely

nonsense.

When the reference to the file goes away while dma_buf_poll() is
executed then that's a massive bug in the caller of that function.

Regards,
Christian.



Here is the race condition:

Thread A Thread B
dma_buf_export
fd_refcount is 1
epoll_ctl(EPOLL_ADD)
add dma_buf_fd to epoll list
close(dma_buf_fd)
fd_refcount is 0
__fput
dma_buf_poll
fget
fput
fd_refcount is zero again

Thanks













Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-04-17 Thread zhiguojiang




在 2024/4/15 19:57, Christian König 写道:
[Some people who received this message don't often get email from 
christian.koe...@amd.com. Learn why this is important at 
https://aka.ms/LearnAboutSenderIdentification ]


Am 15.04.24 um 12:35 schrieb zhiguojiang:

在 2024/4/12 14:39, Christian König 写道:

[Some people who received this message don't often get email from
christian.koe...@amd.com. Learn why this is important at
https://aka.ms/LearnAboutSenderIdentification ]

Am 12.04.24 um 08:19 schrieb zhiguojiang:

[SNIP]
-> Here task 2220 do epoll again where internally it will get/put then
start to free twice and lead to final crash.

Here is the basic flow:

1. Thread A install the dma_buf_fd via dma_buf_export, the fd refcount
is 1

2. Thread A add the fd to epoll list via epoll_ctl(EPOLL_CTL_ADD)

3. After use the dma buf, Thread A close the fd, then here fd refcount
is 0,
  and it will run __fput finally to release the file


Stop, that isn't correct.

The fs layer which calls dma_buf_poll() should make sure that the file
can't go away until the function returns.

Then inside dma_buf_poll() we add another reference to the file while
installing the callback:

    /* Paired with fput in dma_buf_poll_cb */
    get_file(dmabuf->file);

Hi,

The problem may just occurred here.

Is it possible file reference count already decreased to 0 and fput
already being in progressing just before calling
get_file(dmabuf->file) in dma_buf_poll()?


No, exactly that isn't possible.

If a function gets a dma_buf pointer or even more general any reference
counted pointer which has already decreased to 0 then that is a major
bug in the caller of that function.

BTW: It's completely illegal to work around such issues by using
file_count() or RCU functions. So when you suggest stuff like that it
will immediately face rejection.

Regards,
Christian.

Hi,

Thanks for your kind comment.

'If a function gets a dma_buf pointer or even more general any reference

counted pointer which has already decreased to 0 then that is a major

bug in the caller of that function.'

According to the issue log, we can see the dma buf file close and epoll 
operation running in parallel.


Apparently at the moment caller calls epoll, although another task 
caller already called close on the same fd, but this fd was still in 
progress to close, so fd is still valid, thus no EBADF returned to caller.


Then the two task contexts operate on same dma buf fd(one is close, 
another is epoll) in kernel space.


Please kindly help to comment whether above scenario is possible.

If there is some bug in the caller logic, Can u help to point it out? :)

Regards,
Zhiguo






This reference is only dropped after the callback is completed in
dma_buf_poll_cb():

    /* Paired with get_file in dma_buf_poll */
    fput(dmabuf->file);

So your explanation for the issue just seems to be incorrect.



4. Here Thread A not do epoll_ctl(EPOLL_CTL_DEL) manunally, so it
still resides in one epoll_list.
  Although __fput will call eventpoll_release to remove the file from
binded epoll list,
  but it has small time window where Thread B jumps in.


Well if that is really the case then that would then be a bug in
epoll_ctl().



5. During the small window, Thread B do the poll action for
dma_buf_fd, where it will fget/fput for the file,
  this means the fd refcount will be 0 -> 1 -> 0, and it will call
__fput again.
  This will lead to __fput twice for the same file.

6. So the potenial fix is use get_file_rcu which to check if file
refcount already zero which means under free.
  If so, we just return and no need to do the dma_buf_poll.


Well to say it bluntly as far as I can see this suggestion is 
completely

nonsense.

When the reference to the file goes away while dma_buf_poll() is
executed then that's a massive bug in the caller of that function.

Regards,
Christian.



Here is the race condition:

Thread A Thread B
dma_buf_export
fd_refcount is 1
epoll_ctl(EPOLL_ADD)
add dma_buf_fd to epoll list
close(dma_buf_fd)
fd_refcount is 0
__fput
dma_buf_poll
fget
fput
fd_refcount is zero again

Thanks











Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-04-15 Thread Christian König

Am 15.04.24 um 12:35 schrieb zhiguojiang:

在 2024/4/12 14:39, Christian König 写道:
[Some people who received this message don't often get email from 
christian.koe...@amd.com. Learn why this is important at 
https://aka.ms/LearnAboutSenderIdentification ]


Am 12.04.24 um 08:19 schrieb zhiguojiang:

[SNIP]
-> Here task 2220 do epoll again where internally it will get/put then
start to free twice and lead to final crash.

Here is the basic flow:

1. Thread A install the dma_buf_fd via dma_buf_export, the fd refcount
is 1

2. Thread A add the fd to epoll list via epoll_ctl(EPOLL_CTL_ADD)

3. After use the dma buf, Thread A close the fd, then here fd refcount
is 0,
  and it will run __fput finally to release the file


Stop, that isn't correct.

The fs layer which calls dma_buf_poll() should make sure that the file
can't go away until the function returns.

Then inside dma_buf_poll() we add another reference to the file while
installing the callback:

    /* Paired with fput in dma_buf_poll_cb */
    get_file(dmabuf->file);

Hi,

The problem may just occurred here.

Is it possible file reference count already decreased to 0 and fput 
already being in progressing just before calling 
get_file(dmabuf->file) in dma_buf_poll()?


No, exactly that isn't possible.

If a function gets a dma_buf pointer or even more general any reference 
counted pointer which has already decreased to 0 then that is a major 
bug in the caller of that function.


BTW: It's completely illegal to work around such issues by using 
file_count() or RCU functions. So when you suggest stuff like that it 
will immediately face rejection.


Regards,
Christian.





This reference is only dropped after the callback is completed in
dma_buf_poll_cb():

    /* Paired with get_file in dma_buf_poll */
    fput(dmabuf->file);

So your explanation for the issue just seems to be incorrect.



4. Here Thread A not do epoll_ctl(EPOLL_CTL_DEL) manunally, so it
still resides in one epoll_list.
  Although __fput will call eventpoll_release to remove the file from
binded epoll list,
  but it has small time window where Thread B jumps in.


Well if that is really the case then that would then be a bug in
epoll_ctl().



5. During the small window, Thread B do the poll action for
dma_buf_fd, where it will fget/fput for the file,
  this means the fd refcount will be 0 -> 1 -> 0, and it will call
__fput again.
  This will lead to __fput twice for the same file.

6. So the potenial fix is use get_file_rcu which to check if file
refcount already zero which means under free.
  If so, we just return and no need to do the dma_buf_poll.


Well to say it bluntly as far as I can see this suggestion is completely
nonsense.

When the reference to the file goes away while dma_buf_poll() is
executed then that's a massive bug in the caller of that function.

Regards,
Christian.



Here is the race condition:

Thread A Thread B
dma_buf_export
fd_refcount is 1
epoll_ctl(EPOLL_ADD)
add dma_buf_fd to epoll list
close(dma_buf_fd)
fd_refcount is 0
__fput
dma_buf_poll
fget
fput
fd_refcount is zero again

Thanks









Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-04-15 Thread zhiguojiang




在 2024/4/12 14:39, Christian König 写道:
[Some people who received this message don't often get email from 
christian.koe...@amd.com. Learn why this is important at 
https://aka.ms/LearnAboutSenderIdentification ]


Am 12.04.24 um 08:19 schrieb zhiguojiang:

[SNIP]
-> Here task 2220 do epoll again where internally it will get/put then
start to free twice and lead to final crash.

Here is the basic flow:

1. Thread A install the dma_buf_fd via dma_buf_export, the fd refcount
is 1

2. Thread A add the fd to epoll list via epoll_ctl(EPOLL_CTL_ADD)

3. After use the dma buf, Thread A close the fd, then here fd refcount
is 0,
  and it will run __fput finally to release the file


Stop, that isn't correct.

The fs layer which calls dma_buf_poll() should make sure that the file
can't go away until the function returns.

Then inside dma_buf_poll() we add another reference to the file while
installing the callback:

    /* Paired with fput in dma_buf_poll_cb */
    get_file(dmabuf->file);

Hi,

The problem may just occurred here.

Is it possible file reference count already decreased to 0 and fput 
already being in progressing just before calling get_file(dmabuf->file) 
in dma_buf_poll()?




This reference is only dropped after the callback is completed in
dma_buf_poll_cb():

    /* Paired with get_file in dma_buf_poll */
    fput(dmabuf->file);

So your explanation for the issue just seems to be incorrect.



4. Here Thread A not do epoll_ctl(EPOLL_CTL_DEL) manunally, so it
still resides in one epoll_list.
  Although __fput will call eventpoll_release to remove the file from
binded epoll list,
  but it has small time window where Thread B jumps in.


Well if that is really the case then that would then be a bug in
epoll_ctl().



5. During the small window, Thread B do the poll action for
dma_buf_fd, where it will fget/fput for the file,
  this means the fd refcount will be 0 -> 1 -> 0, and it will call
__fput again.
  This will lead to __fput twice for the same file.

6. So the potenial fix is use get_file_rcu which to check if file
refcount already zero which means under free.
  If so, we just return and no need to do the dma_buf_poll.


Well to say it bluntly as far as I can see this suggestion is completely
nonsense.

When the reference to the file goes away while dma_buf_poll() is
executed then that's a massive bug in the caller of that function.

Regards,
Christian.



Here is the race condition:

Thread A Thread B
dma_buf_export
fd_refcount is 1
epoll_ctl(EPOLL_ADD)
add dma_buf_fd to epoll list
close(dma_buf_fd)
fd_refcount is 0
__fput
dma_buf_poll
fget
fput
fd_refcount is zero again

Thanks







Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-04-11 Thread Christian König

Am 12.04.24 um 08:19 schrieb zhiguojiang:

[SNIP]
-> Here task 2220 do epoll again where internally it will get/put then 
start to free twice and lead to final crash.


Here is the basic flow:

1. Thread A install the dma_buf_fd via dma_buf_export, the fd refcount 
is 1


2. Thread A add the fd to epoll list via epoll_ctl(EPOLL_CTL_ADD)

3. After use the dma buf, Thread A close the fd, then here fd refcount 
is 0,

  and it will run __fput finally to release the file


Stop, that isn't correct.

The fs layer which calls dma_buf_poll() should make sure that the file 
can't go away until the function returns.


Then inside dma_buf_poll() we add another reference to the file while 
installing the callback:


    /* Paired with fput in dma_buf_poll_cb */
    get_file(dmabuf->file);

This reference is only dropped after the callback is completed in 
dma_buf_poll_cb():


    /* Paired with get_file in dma_buf_poll */
    fput(dmabuf->file);

So your explanation for the issue just seems to be incorrect.



4. Here Thread A not do epoll_ctl(EPOLL_CTL_DEL) manunally, so it 
still resides in one epoll_list.
  Although __fput will call eventpoll_release to remove the file from 
binded epoll list,

  but it has small time window where Thread B jumps in.


Well if that is really the case then that would then be a bug in 
epoll_ctl().




5. During the small window, Thread B do the poll action for 
dma_buf_fd, where it will fget/fput for the file,
  this means the fd refcount will be 0 -> 1 -> 0, and it will call 
__fput again.

  This will lead to __fput twice for the same file.

6. So the potenial fix is use get_file_rcu which to check if file 
refcount already zero which means under free.

  If so, we just return and no need to do the dma_buf_poll.


Well to say it bluntly as far as I can see this suggestion is completely 
nonsense.


When the reference to the file goes away while dma_buf_poll() is 
executed then that's a massive bug in the caller of that function.


Regards,
Christian.



Here is the race condition:

Thread A Thread B
dma_buf_export
fd_refcount is 1
epoll_ctl(EPOLL_ADD)
add dma_buf_fd to epoll list
close(dma_buf_fd)
fd_refcount is 0
__fput
dma_buf_poll
fget
fput
fd_refcount is zero again

Thanks





Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-04-11 Thread zhiguojiang




在 2024/4/3 2:22, T.J. Mercier 写道:

[你通常不会收到来自 tjmerc...@google.com 的电子邮件。请访问 
https://aka.ms/LearnAboutSenderIdentification,以了解这一点为什么很重要]

On Tue, Apr 2, 2024 at 1:08 AM Christian König  wrote:

Am 02.04.24 um 08:49 schrieb zhiguojiang:

As far as I can see that's not because of the DMA-buf code, but
because you are somehow using this interface incorrectly.

When dma_buf_poll() is called it is mandatory for the caller to hold
a reference to the file descriptor on which the poll operation is
executed.

So adding code like "if (!file_count(file))" in the beginning of
dma_buf_poll() is certainly broken.

My best guess is that you have some unbalanced
dma_buf_get()/dma_buf_put() somewhere instead.



Hi Christian,

The kernel dma_buf_poll() code shound not cause system crashes due to
the user mode usage logical issues ?

What user mode logical issues are you talking about? Closing a file
while polling on it is perfectly valid.

dma_buf_poll() is called by the filesystem layer and it's the filesystem
layer which should make sure that a file can't be closed while polling
for an event.

If that doesn't work then you have stumbled over a massive bug in the fs
layer. And I have some doubts that this is actually the case.

What is more likely is that some driver messes up the reference count
and because of this you see an UAF.

Anyway as far as I can see the DMA-buf code is correct regarding this.

Regards,
Christian.

I tried to reproduce this problem but I couldn't. What I see is a ref
get taken when poll is first called. So subsequently closing the fd in
userspace while it's being polled doesn't take the refcount all the
way to 0. That happens when dma_buf_poll_cb fires, either due to an
event or when the fd is closed upon timeout.

I don't really see how this could be triggered from userspace so I am
also suspicious of dma_buf_get/put.

Hi,

Panic signature:

> list_del corruption, ff8a6f143a90->next is LIST_POISON1
> (dead0100)
> [ cut here ]
> kernel BUG at lib/list_debug.c:55!
> Internal error: Oops - BUG: f2000800 [#1] PREEMPT SMP
> pc : __list_del_entry_valid+0x98/0xd4
> lr : __list_del_entry_valid+0x98/0xd4
> sp : ffc01d413d00
> x29: ffc01d413d00 x28: 00c0 x27: 0020
> x26:  x25:  x24: 00080007
> x23: ff8b22e5dcc0 x22: ff88a6be12d0 x21: ff8b22e572b0
> x20: ff80254ed0a0 x19: ff8a6f143a00 x18: ffda5efed3c0
> x17: 6165642820314e4f x16: 53494f505f545349 x15: 4c20736920747865
> x14: 6e3e2d3039613334 x13: 2930303130303030 x12: 0018
> x11: ff8b6c188000 x10:  x9 : 6c8413a194897b00
> x8 : 6c8413a194897b00 x7 : 74707572726f6320 x6 : 6c65645f7473696c
> x5 : ff8b6c3b2a3e x4 : ff8b6c3b2a40 x3 : 10301005
> x2 : 0001 x1 : 00c0 x0 : 004e
> Call trace:
>  __list_del_entry_valid+0x98/0xd4
>  dma_buf_file_release+0x48/0x90
>  __fput+0xf4/0x280
>  fput+0x10/0x20
>  task_work_run+0xcc/0xf4
>  do_notify_resume+0x2a0/0x33c
>  el0_svc+0x5c/0xa4
>  el0t_64_sync_handler+0x68/0xb4
>  el0t_64_sync+0x1a0/0x1a4

It is caused that the same dma buf file released twice in short time, so 
it should be race issue of dma buf release.


Below ftrace log shows the procedure how dma buf release twice:

Line 715473:    android.display-2220  [006] 22160.660738: bprint:    
__fget_files __fget: [file: 0xff8ab9e57b80, dmabuf: 
0xff8b1baa6a00, count: 0x3] call:(('do_epoll_ctl', 
356)ffd4ad46411c<-('__arm64_sys_epoll_ctl', 
112)ffd4ad463dd8<-('invoke_syscall', 
96)ffd4acebe00c<-('el0_svc_common', 
140)ffd4acebdf40<-('do_el0_svc', 40)ffd4acebde44<-('el0_svc', 
36)ffd4ae57ffcc<-('el0t_64_sync_handler', 136)ffd4ae57ff44)


Line 715475:    android.display-2220  [006] 22160.660739: bprint:    
get_file get_file[file: 0xff8ab9e57b80, dmabuf: 0xff8b1baa6a00, 
f_count: 0x4] call:(('dma_buf_poll', 760)ffd4adb685c8<-('ep_insert', 
1120)ffd4ad464bcc<-('do_epoll_ctl', 
1196)ffd4ad464464<-('__arm64_sys_epoll_ctl', 
112)ffd4ad463dd8<-('invoke_syscall', 
96)ffd4acebe00c<-('el0_svc_common', 
140)ffd4acebdf40<-('do_el0_svc', 40)ffd4acebde44)


Line 715477:    android.display-2220  [006] 22160.660740: bprint:    
fput_many fput for dma buf file[file: 0xff8ab9e57b80, dmabuf: 
0xff8b1baa6a00, count: 0x4] call:(('dma_buf_poll', 
1104)ffd4adb68720<-('ep_insert', 
1120)ffd4ad464bcc<-('do_epoll_ctl', 
1196)ffd4ad464464<-('__arm64_sys_epoll_ctl', 
112)ffd4ad463dd8<-('invoke_syscall', 
96)ffd4acebe00c<-('el0_svc_common', 
140)ffd4acebdf40<-('do_el0_svc', 40)ffd4acebde44)


Line 715479:    android.display-2220  [006] 22160.660741: bprint:    
fput_many fput for dma buf file[file: 0xff8ab9e57b80, dmabuf: 
0xff8b1baa6a00, count: 0x3] call:(('do_epoll_ctl', 
652)ffd4ad464244<-('__arm64_sys_epoll_ctl', 
112)ffd4ad463

Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-04-02 Thread T.J. Mercier
On Tue, Apr 2, 2024 at 1:08 AM Christian König  wrote:
>
> Am 02.04.24 um 08:49 schrieb zhiguojiang:
> >> As far as I can see that's not because of the DMA-buf code, but
> >> because you are somehow using this interface incorrectly.
> >>
> >> When dma_buf_poll() is called it is mandatory for the caller to hold
> >> a reference to the file descriptor on which the poll operation is
> >> executed.
> >>
> >> So adding code like "if (!file_count(file))" in the beginning of
> >> dma_buf_poll() is certainly broken.
> >>
> >> My best guess is that you have some unbalanced
> >> dma_buf_get()/dma_buf_put() somewhere instead.
> >>
> >>
> > Hi Christian,
> >
> > The kernel dma_buf_poll() code shound not cause system crashes due to
> > the user mode usage logical issues ?
>
> What user mode logical issues are you talking about? Closing a file
> while polling on it is perfectly valid.
>
> dma_buf_poll() is called by the filesystem layer and it's the filesystem
> layer which should make sure that a file can't be closed while polling
> for an event.
>
> If that doesn't work then you have stumbled over a massive bug in the fs
> layer. And I have some doubts that this is actually the case.
>
> What is more likely is that some driver messes up the reference count
> and because of this you see an UAF.
>
> Anyway as far as I can see the DMA-buf code is correct regarding this.
>
> Regards,
> Christian.

I tried to reproduce this problem but I couldn't. What I see is a ref
get taken when poll is first called. So subsequently closing the fd in
userspace while it's being polled doesn't take the refcount all the
way to 0. That happens when dma_buf_poll_cb fires, either due to an
event or when the fd is closed upon timeout.

I don't really see how this could be triggered from userspace so I am
also suspicious of dma_buf_get/put.


Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-04-02 Thread Christian König

Am 02.04.24 um 08:49 schrieb zhiguojiang:
As far as I can see that's not because of the DMA-buf code, but 
because you are somehow using this interface incorrectly.


When dma_buf_poll() is called it is mandatory for the caller to hold 
a reference to the file descriptor on which the poll operation is 
executed.


So adding code like "if (!file_count(file))" in the beginning of 
dma_buf_poll() is certainly broken.


My best guess is that you have some unbalanced 
dma_buf_get()/dma_buf_put() somewhere instead.




Hi Christian,

The kernel dma_buf_poll() code shound not cause system crashes due to 
the user mode usage logical issues ?


What user mode logical issues are you talking about? Closing a file 
while polling on it is perfectly valid.


dma_buf_poll() is called by the filesystem layer and it's the filesystem 
layer which should make sure that a file can't be closed while polling 
for an event.


If that doesn't work then you have stumbled over a massive bug in the fs 
layer. And I have some doubts that this is actually the case.


What is more likely is that some driver messes up the reference count 
and because of this you see an UAF.


Anyway as far as I can see the DMA-buf code is correct regarding this.

Regards,
Christian.



Thanks


在 2024/4/1 20:22, Christian König 写道:

Am 27.03.24 um 03:29 schrieb Zhiguo Jiang:

The issue is a UAF issue of dmabuf file fd. Throght debugging, we found
that the dmabuf file fd is added to the epoll event listener list, and
when it is released, it is not removed from the epoll list, which leads
to the UAF(Use-After-Free) issue.


As far as I can see that's not because of the DMA-buf code, but 
because you are somehow using this interface incorrectly.


When dma_buf_poll() is called it is mandatory for the caller to hold 
a reference to the file descriptor on which the poll operation is 
executed.


So adding code like "if (!file_count(file))" in the beginning of 
dma_buf_poll() is certainly broken.


My best guess is that you have some unbalanced 
dma_buf_get()/dma_buf_put() somewhere instead.


Regards,
Christian.



The UAF issue can be solved by checking dmabuf file->f_count value and
skipping the poll operation for the closed dmabuf file in the
dma_buf_poll(). We have tested this solved patch multiple times and
have not reproduced the uaf issue.

crash dump:
list_del corruption, ff8a6f143a90->next is LIST_POISON1
(dead0100)
[ cut here ]
kernel BUG at lib/list_debug.c:55!
Internal error: Oops - BUG: f2000800 [#1] PREEMPT SMP
pc : __list_del_entry_valid+0x98/0xd4
lr : __list_del_entry_valid+0x98/0xd4
sp : ffc01d413d00
x29: ffc01d413d00 x28: 00c0 x27: 0020
x26:  x25:  x24: 00080007
x23: ff8b22e5dcc0 x22: ff88a6be12d0 x21: ff8b22e572b0
x20: ff80254ed0a0 x19: ff8a6f143a00 x18: ffda5efed3c0
x17: 6165642820314e4f x16: 53494f505f545349 x15: 4c20736920747865
x14: 6e3e2d3039613334 x13: 2930303130303030 x12: 0018
x11: ff8b6c188000 x10:  x9 : 6c8413a194897b00
x8 : 6c8413a194897b00 x7 : 74707572726f6320 x6 : 6c65645f7473696c
x5 : ff8b6c3b2a3e x4 : ff8b6c3b2a40 x3 : 10301005
x2 : 0001 x1 : 00c0 x0 : 004e
Call trace:
  __list_del_entry_valid+0x98/0xd4
  dma_buf_file_release+0x48/0x90
  __fput+0xf4/0x280
  fput+0x10/0x20
  task_work_run+0xcc/0xf4
  do_notify_resume+0x2a0/0x33c
  el0_svc+0x5c/0xa4
  el0t_64_sync_handler+0x68/0xb4
  el0t_64_sync+0x1a0/0x1a4
Code: d0006fe0 912c5000 f2fbd5a2 94231a01 (d421)
---[ end trace  ]---
Kernel panic - not syncing: Oops - BUG: Fatal exception
SMP: stopping secondary CPUs

Signed-off-by: Zhiguo Jiang 
---
  drivers/dma-buf/dma-buf.c | 28 
  1 file changed, 24 insertions(+), 4 deletions(-)
  mode change 100644 => 100755 drivers/dma-buf/dma-buf.c

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 8fe5aa67b167..e469dd9288cc
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -240,6 +240,10 @@ static __poll_t dma_buf_poll(struct file *file, 
poll_table *poll)

  struct dma_resv *resv;
  __poll_t events;
  +    /* Check if the file exists */
+    if (!file_count(file))
+    return EPOLLERR;
+
  dmabuf = file->private_data;
  if (!dmabuf || !dmabuf->resv)
  return EPOLLERR;
@@ -266,8 +270,15 @@ static __poll_t dma_buf_poll(struct file *file, 
poll_table *poll)

  spin_unlock_irq(&dmabuf->poll.lock);
    if (events & EPOLLOUT) {
-    /* Paired with fput in dma_buf_poll_cb */
-    get_file(dmabuf->file);
+    /*
+ * Paired with fput in dma_buf_poll_cb,
+ * Skip poll for the closed file.
+ */
+    if (!get_file_rcu(&dmabuf->file)) {
+    events &= ~EPOLLOUT;
+    dcb->active = 0;
+    goto cl

Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-04-01 Thread zhiguojiang
As far as I can see that's not because of the DMA-buf code, but 
because you are somehow using this interface incorrectly.


When dma_buf_poll() is called it is mandatory for the caller to hold a 
reference to the file descriptor on which the poll operation is executed.


So adding code like "if (!file_count(file))" in the beginning of 
dma_buf_poll() is certainly broken.


My best guess is that you have some unbalanced 
dma_buf_get()/dma_buf_put() somewhere instead.




Hi Christian,

The kernel dma_buf_poll() code shound not cause system crashes due to 
the user mode usage logical issues ?


Thanks


在 2024/4/1 20:22, Christian König 写道:

Am 27.03.24 um 03:29 schrieb Zhiguo Jiang:

The issue is a UAF issue of dmabuf file fd. Throght debugging, we found
that the dmabuf file fd is added to the epoll event listener list, and
when it is released, it is not removed from the epoll list, which leads
to the UAF(Use-After-Free) issue.


As far as I can see that's not because of the DMA-buf code, but 
because you are somehow using this interface incorrectly.


When dma_buf_poll() is called it is mandatory for the caller to hold a 
reference to the file descriptor on which the poll operation is executed.


So adding code like "if (!file_count(file))" in the beginning of 
dma_buf_poll() is certainly broken.


My best guess is that you have some unbalanced 
dma_buf_get()/dma_buf_put() somewhere instead.


Regards,
Christian.



The UAF issue can be solved by checking dmabuf file->f_count value and
skipping the poll operation for the closed dmabuf file in the
dma_buf_poll(). We have tested this solved patch multiple times and
have not reproduced the uaf issue.

crash dump:
list_del corruption, ff8a6f143a90->next is LIST_POISON1
(dead0100)
[ cut here ]
kernel BUG at lib/list_debug.c:55!
Internal error: Oops - BUG: f2000800 [#1] PREEMPT SMP
pc : __list_del_entry_valid+0x98/0xd4
lr : __list_del_entry_valid+0x98/0xd4
sp : ffc01d413d00
x29: ffc01d413d00 x28: 00c0 x27: 0020
x26:  x25:  x24: 00080007
x23: ff8b22e5dcc0 x22: ff88a6be12d0 x21: ff8b22e572b0
x20: ff80254ed0a0 x19: ff8a6f143a00 x18: ffda5efed3c0
x17: 6165642820314e4f x16: 53494f505f545349 x15: 4c20736920747865
x14: 6e3e2d3039613334 x13: 2930303130303030 x12: 0018
x11: ff8b6c188000 x10:  x9 : 6c8413a194897b00
x8 : 6c8413a194897b00 x7 : 74707572726f6320 x6 : 6c65645f7473696c
x5 : ff8b6c3b2a3e x4 : ff8b6c3b2a40 x3 : 10301005
x2 : 0001 x1 : 00c0 x0 : 004e
Call trace:
  __list_del_entry_valid+0x98/0xd4
  dma_buf_file_release+0x48/0x90
  __fput+0xf4/0x280
  fput+0x10/0x20
  task_work_run+0xcc/0xf4
  do_notify_resume+0x2a0/0x33c
  el0_svc+0x5c/0xa4
  el0t_64_sync_handler+0x68/0xb4
  el0t_64_sync+0x1a0/0x1a4
Code: d0006fe0 912c5000 f2fbd5a2 94231a01 (d421)
---[ end trace  ]---
Kernel panic - not syncing: Oops - BUG: Fatal exception
SMP: stopping secondary CPUs

Signed-off-by: Zhiguo Jiang 
---
  drivers/dma-buf/dma-buf.c | 28 
  1 file changed, 24 insertions(+), 4 deletions(-)
  mode change 100644 => 100755 drivers/dma-buf/dma-buf.c

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 8fe5aa67b167..e469dd9288cc
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -240,6 +240,10 @@ static __poll_t dma_buf_poll(struct file *file, 
poll_table *poll)

  struct dma_resv *resv;
  __poll_t events;
  +    /* Check if the file exists */
+    if (!file_count(file))
+    return EPOLLERR;
+
  dmabuf = file->private_data;
  if (!dmabuf || !dmabuf->resv)
  return EPOLLERR;
@@ -266,8 +270,15 @@ static __poll_t dma_buf_poll(struct file *file, 
poll_table *poll)

  spin_unlock_irq(&dmabuf->poll.lock);
    if (events & EPOLLOUT) {
-    /* Paired with fput in dma_buf_poll_cb */
-    get_file(dmabuf->file);
+    /*
+ * Paired with fput in dma_buf_poll_cb,
+ * Skip poll for the closed file.
+ */
+    if (!get_file_rcu(&dmabuf->file)) {
+    events &= ~EPOLLOUT;
+    dcb->active = 0;
+    goto clear_out_event;
+    }
    if (!dma_buf_poll_add_cb(resv, true, dcb))
  /* No callback queued, wake up any other waiters */
@@ -277,6 +288,7 @@ static __poll_t dma_buf_poll(struct file *file, 
poll_table *poll)

  }
  }
  +clear_out_event:
  if (events & EPOLLIN) {
  struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_in;
  @@ -289,8 +301,15 @@ static __poll_t dma_buf_poll(struct file 
*file, poll_table *poll)

  spin_unlock_irq(&dmabuf->poll.lock);
    if (events & EPOLLIN) {
-    /* Paired with fput in dma_buf_poll_cb */
-    get_file(dmabuf->file);
+    /*
+  

Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-04-01 Thread zhiguojiang

Hi T.J.,


What is the most recent kernel version you've seen the bug on?

The latest kernel version of the issue we discovered is kernel-6.1.25, and
kernel-5.15 also reported this issue.

You are closing the dmabuf fd from another thread while it is still
part of the epoll interest list?

Yes, we found that threads related to android.display modules performed
an operation to close this dmabuf fd while it was still part of the 
epoll interest
list,  but the specific threads were random. So I think this is also 
issue with

the logic of kernel dmabuf code.

Thanks,
Zhiguo


在 2024/3/30 7:36, T.J. Mercier 写道:

[你通常不会收到来自 tjmerc...@google.com 的电子邮件。请访问 
https://aka.ms/LearnAboutSenderIdentification,以了解这一点为什么很重要]

On Tue, Mar 26, 2024 at 7:29 PM Zhiguo Jiang  wrote:

The issue is a UAF issue of dmabuf file fd. Throght debugging, we found
that the dmabuf file fd is added to the epoll event listener list, and
when it is released, it is not removed from the epoll list, which leads
to the UAF(Use-After-Free) issue.

The UAF issue can be solved by checking dmabuf file->f_count value and
skipping the poll operation for the closed dmabuf file in the
dma_buf_poll(). We have tested this solved patch multiple times and
have not reproduced the uaf issue.


Hi Zhiguo,

What is the most recent kernel version you've seen the bug on?

You are closing the dmabuf fd from another thread while it is still
part of the epoll interest list?

Thanks,
T.J.




Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-04-01 Thread Christian König

Am 27.03.24 um 03:29 schrieb Zhiguo Jiang:

The issue is a UAF issue of dmabuf file fd. Throght debugging, we found
that the dmabuf file fd is added to the epoll event listener list, and
when it is released, it is not removed from the epoll list, which leads
to the UAF(Use-After-Free) issue.


As far as I can see that's not because of the DMA-buf code, but because 
you are somehow using this interface incorrectly.


When dma_buf_poll() is called it is mandatory for the caller to hold a 
reference to the file descriptor on which the poll operation is executed.


So adding code like "if (!file_count(file))" in the beginning of 
dma_buf_poll() is certainly broken.


My best guess is that you have some unbalanced 
dma_buf_get()/dma_buf_put() somewhere instead.


Regards,
Christian.



The UAF issue can be solved by checking dmabuf file->f_count value and
skipping the poll operation for the closed dmabuf file in the
dma_buf_poll(). We have tested this solved patch multiple times and
have not reproduced the uaf issue.

crash dump:
list_del corruption, ff8a6f143a90->next is LIST_POISON1
(dead0100)
[ cut here ]
kernel BUG at lib/list_debug.c:55!
Internal error: Oops - BUG: f2000800 [#1] PREEMPT SMP
pc : __list_del_entry_valid+0x98/0xd4
lr : __list_del_entry_valid+0x98/0xd4
sp : ffc01d413d00
x29: ffc01d413d00 x28: 00c0 x27: 0020
x26:  x25:  x24: 00080007
x23: ff8b22e5dcc0 x22: ff88a6be12d0 x21: ff8b22e572b0
x20: ff80254ed0a0 x19: ff8a6f143a00 x18: ffda5efed3c0
x17: 6165642820314e4f x16: 53494f505f545349 x15: 4c20736920747865
x14: 6e3e2d3039613334 x13: 2930303130303030 x12: 0018
x11: ff8b6c188000 x10:  x9 : 6c8413a194897b00
x8 : 6c8413a194897b00 x7 : 74707572726f6320 x6 : 6c65645f7473696c
x5 : ff8b6c3b2a3e x4 : ff8b6c3b2a40 x3 : 10301005
x2 : 0001 x1 : 00c0 x0 : 004e
Call trace:
  __list_del_entry_valid+0x98/0xd4
  dma_buf_file_release+0x48/0x90
  __fput+0xf4/0x280
  fput+0x10/0x20
  task_work_run+0xcc/0xf4
  do_notify_resume+0x2a0/0x33c
  el0_svc+0x5c/0xa4
  el0t_64_sync_handler+0x68/0xb4
  el0t_64_sync+0x1a0/0x1a4
Code: d0006fe0 912c5000 f2fbd5a2 94231a01 (d421)
---[ end trace  ]---
Kernel panic - not syncing: Oops - BUG: Fatal exception
SMP: stopping secondary CPUs

Signed-off-by: Zhiguo Jiang 
---
  drivers/dma-buf/dma-buf.c | 28 
  1 file changed, 24 insertions(+), 4 deletions(-)
  mode change 100644 => 100755 drivers/dma-buf/dma-buf.c

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 8fe5aa67b167..e469dd9288cc
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -240,6 +240,10 @@ static __poll_t dma_buf_poll(struct file *file, poll_table 
*poll)
struct dma_resv *resv;
__poll_t events;
  
+	/* Check if the file exists */

+   if (!file_count(file))
+   return EPOLLERR;
+
dmabuf = file->private_data;
if (!dmabuf || !dmabuf->resv)
return EPOLLERR;
@@ -266,8 +270,15 @@ static __poll_t dma_buf_poll(struct file *file, poll_table 
*poll)
spin_unlock_irq(&dmabuf->poll.lock);
  
  		if (events & EPOLLOUT) {

-   /* Paired with fput in dma_buf_poll_cb */
-   get_file(dmabuf->file);
+   /*
+* Paired with fput in dma_buf_poll_cb,
+* Skip poll for the closed file.
+*/
+   if (!get_file_rcu(&dmabuf->file)) {
+   events &= ~EPOLLOUT;
+   dcb->active = 0;
+   goto clear_out_event;
+   }
  
  			if (!dma_buf_poll_add_cb(resv, true, dcb))

/* No callback queued, wake up any other 
waiters */
@@ -277,6 +288,7 @@ static __poll_t dma_buf_poll(struct file *file, poll_table 
*poll)
}
}
  
+clear_out_event:

if (events & EPOLLIN) {
struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_in;
  
@@ -289,8 +301,15 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)

spin_unlock_irq(&dmabuf->poll.lock);
  
  		if (events & EPOLLIN) {

-   /* Paired with fput in dma_buf_poll_cb */
-   get_file(dmabuf->file);
+   /*
+* Paired with fput in dma_buf_poll_cb,
+* Skip poll for the closed file.
+*/
+   if (!get_file_rcu(&dmabuf->file)) {
+   events &= ~EPOLLIN;
+   dcb->active = 0;
+   goto clear_in_event;
+   }
  
  			if (!dma_buf_poll_add_cb(resv, false,

Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-03-29 Thread T.J. Mercier
On Tue, Mar 26, 2024 at 7:29 PM Zhiguo Jiang  wrote:
>
> The issue is a UAF issue of dmabuf file fd. Throght debugging, we found
> that the dmabuf file fd is added to the epoll event listener list, and
> when it is released, it is not removed from the epoll list, which leads
> to the UAF(Use-After-Free) issue.
>
> The UAF issue can be solved by checking dmabuf file->f_count value and
> skipping the poll operation for the closed dmabuf file in the
> dma_buf_poll(). We have tested this solved patch multiple times and
> have not reproduced the uaf issue.
>

Hi Zhiguo,

What is the most recent kernel version you've seen the bug on?

You are closing the dmabuf fd from another thread while it is still
part of the epoll interest list?

Thanks,
T.J.