Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue
Hi TJ, Seems I have got answers from [1], where it is agreed upon epoll() is the source of issue. Thanks a lot for the discussion. [1] https://lore.kernel.org/lkml/2d631f0615918...@google.com/ Thanks Charan On 5/5/2024 9:50 PM, Charan Teja Kalla wrote: > Thanks T.J for the reply!! > > On 5/4/2024 4:43 AM, T.J. Mercier wrote: >> It looks like a similar conclusion about epoll was reached at: >> https://lore.kernel.org/all/a87d7ef8-2c59-4dc5-ba0a-b821d1eff...@amd.com/ >> > I am unaware of this discussion. Thanks... > >> I agree with Christian that it should not be possible for the file to >> be freed while inside dma_buf_poll. Aside from causing problems in >> dma_buf_poll, ep_item_poll itself would have issues dereferencing the >> freed file pointer. >> > Not sure about my understanding: ep_item_poll() always call the ->poll() > interface with a stable 'struct file' because of ep->mtx. This lock > ensures that: >a) If eventpoll_release_file() get the ep->mtx first, ->poll() > corresponds to the epitem(target file) will never be called, because it > is removed from the rdlist. > >b) If ep_send_events() get the ep->mtx() first, ->poll() will get > called with a stable 'struct file', __but the refcount(->f_count) of a > file can be zero__. I am saying that this is stable because the 'struct > file' contents are still valid till we are in ->poll(). > > Can you/Christian help me with what I am missing here to say that > ->poll() is receiving stale 'struct file*', please? > > And, If you are convinced with above, I think, It should have been the > responsibility of ->poll() implementation to have taken refcount on a > file that is going to be still valid even after ->poll() exits. Incase > of dma_buf_poll() implementation, it took the refcount on a file that is > not going to be valid once the dma_buf_poll() exits(because of mentioned > race with the freeing of the 'struct file*'). > > So, in dma_buf_poll(), Should we be using atomic_long_inc_not_zero() > based implementation to take the refcount on a file? > > Thanks, > Charan
Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue
Thanks T.J for the reply!! On 5/4/2024 4:43 AM, T.J. Mercier wrote: > It looks like a similar conclusion about epoll was reached at: > https://lore.kernel.org/all/a87d7ef8-2c59-4dc5-ba0a-b821d1eff...@amd.com/ > I am unaware of this discussion. Thanks... > I agree with Christian that it should not be possible for the file to > be freed while inside dma_buf_poll. Aside from causing problems in > dma_buf_poll, ep_item_poll itself would have issues dereferencing the > freed file pointer. > Not sure about my understanding: ep_item_poll() always call the ->poll() interface with a stable 'struct file' because of ep->mtx. This lock ensures that: a) If eventpoll_release_file() get the ep->mtx first, ->poll() corresponds to the epitem(target file) will never be called, because it is removed from the rdlist. b) If ep_send_events() get the ep->mtx() first, ->poll() will get called with a stable 'struct file', __but the refcount(->f_count) of a file can be zero__. I am saying that this is stable because the 'struct file' contents are still valid till we are in ->poll(). Can you/Christian help me with what I am missing here to say that ->poll() is receiving stale 'struct file*', please? And, If you are convinced with above, I think, It should have been the responsibility of ->poll() implementation to have taken refcount on a file that is going to be still valid even after ->poll() exits. Incase of dma_buf_poll() implementation, it took the refcount on a file that is not going to be valid once the dma_buf_poll() exits(because of mentioned race with the freeing of the 'struct file*'). So, in dma_buf_poll(), Should we be using atomic_long_inc_not_zero() based implementation to take the refcount on a file? Thanks, Charan
Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue
Thanks Christian/TJ for the inputs!! On 4/18/2024 12:16 PM, Christian König wrote: > As far as I can see the EPOLL holds a reference to the files it > contains. So it is perfectly valid to add the file descriptor to EPOLL > and then close it. > > In this case the file won't be closed, but be kept alive by it's > reference in the EPOLL file descriptor. I am not seeing that adding a 'fd' into the epoll monitoring list will increase its refcount. Confirmed by writing a testcase that just do open + EPOLL_CTL_ADD and see the /proc/../fdinfo of epoll fd(Added file_count() info to the output) # cat /proc/136/fdinfo/3 pos:0 flags: 02 mnt_id: 14 ino:1041 tfd:4 events: 19 data:4 pos:0 ino:81 sdev:5 refcount: 1<-- The fd added to the epoll monitoring is still 1(same as open file refcount) >From the code too, I don't see a file added in the epoll monitoring list will have an extra refcount but momentarily (where it increases the refcount of target file, add it to the rbtree of the epoll context and then decreasing the file refcount): do_epoll_ctl(): f = fdget(epfd); tf = fdget(fd); epoll_mutex_lock(&ep->mtx) EPOLL_CTL_ADD: ep_insert(ep, epds, tf.file, fd, full_check); // Added to the epoll monitoring rb tree list as epitem. mutex_unlock(&ep->mtx); fdput(tf); fdput(f); Not sure If i am completely mistaken what you're saying here. > The fs layer which calls dma_buf_poll() should make sure that the file > can't go away until the function returns. > > Then inside dma_buf_poll() we add another reference to the file while > installing the callback: > > /* Paired with fput in dma_buf_poll_cb */ > get_file(dmabuf->file); No, exactly that can't > happen either. > I am not quite comfortable with epoll internals but I think below race is possible where "The 'file' passed to dma_buf_poll() is proper but ->f_count == 0, which is indicating that a parallel freeing is happening". So, we should check the file->f_count value before taking the refcount. (A 'fd' registered for the epoll monitoring list is maintained as 'epitem(epi)' in the rbtree of 'eventpoll(ep, called as epoll context). epoll_wait()__fput()(as file->f_count=0) a) ep_poll_callback(): Is the waitqueue function called when signaled on the wait_queue_head_t of the registered poll() function. 1) It links the 'epi' to the ready list of 'ep': if (!ep_is_linked(epi)) list_add_tail_lockless(&epi->rdllink, &ep->rdllist) 2) wake_up(&ep->wq); Wake up the process waiting on epoll_wait() that endup in ep_poll. b) ep_poll(): 1) while (1) { eavail = ep_events_available(ep); (checks ep->rdlist) ep_send_events(ep); (notify the events to user) } (epoll_wait() calling process gets woken up from a.2 and process the events raised added to rdlist in a.1) 2) ep_send_events(): mutex_lock(&ep->mtx); (** The sync lock is taken **) list_for_each_entry_safe(epi, tmp, &txlist, rdllink) { list_del_init(&epi->rdllink); revents = ep_item_poll(epi, &pt, 1) (vfs_poll()-->...f_op->poll(=dma_buf_poll) } mutex_unlock(&ep->mtx) (**release the lock.**) (As part of procession of events, each epitem is removed from rdlist and call the f_op->poll() of a file which will endup in dma_buf_poll()) 3) dma_buf_poll(): get_file(dmabuf->file); (** f_count changed from 0->1 **) dma_buf_poll_add_cb(resv, true, dcb): (registers dma_buf_poll_cb() against fence) c) eventpoll_release_file(): mutex_lock(&ep->mtx); __ep_remove(ep, epi, true): mutex_unlock(&ep->mtx); (__ep_remove() will remove the 'epi' from rbtree and if present from rdlist as well) d) file_free(file), free the 'file'. e) dma_buf_poll_cb: /* Paired with get_file in dma_buf_poll */ fput(dmabuf->file); (f_count changed from 1->0, where we try to free the 'file' again which is UAF/double free). In the above race, If c) gets called first, then the 'epi' is removed from both rbtree and 'rdlink' under ep->mtx lock thus b.2 don't end up in calling the ->poll() as it don't see this event in the rdlist. Race only exist If b.2 executes first, where it will call dma_buf_poll with __valid 'struct file' under ep->mtx but its refcount is already could have been zero__. Later When e)
Re: [PATCH] dma-buf: fix dma_buf_export init order
Thanks Christian for this nice cleanup!! On 12/6/2022 8:42 PM, Christian König wrote: > @@ -638,10 +630,21 @@ struct dma_buf *dma_buf_export(const struct > dma_buf_export_info *exp_info) > if (!try_module_get(exp_info->owner)) > return ERR_PTR(-ENOENT); > > + file = dma_buf_getfile(exp_info->size, exp_info->flags); > + if (IS_ERR(file)) { > + ret = PTR_ERR(file); > + goto err_module; > + } > + > + if (!exp_info->resv) > + alloc_size += sizeof(struct dma_resv); > + else > + /* prevent &dma_buf[1] == dma_buf->resv */ > + alloc_size += 1; > dmabuf = kzalloc(alloc_size, GFP_KERNEL); > if (!dmabuf) { > ret = -ENOMEM; > - goto err_module; > + goto err_file; > } > > dmabuf->priv = exp_info->priv; > @@ -653,44 +656,36 @@ struct dma_buf *dma_buf_export(const struct > dma_buf_export_info *exp_info) > init_waitqueue_head(&dmabuf->poll); > dmabuf->cb_in.poll = dmabuf->cb_out.poll = &dmabuf->poll; > dmabuf->cb_in.active = dmabuf->cb_out.active = 0; > + mutex_init(&dmabuf->lock); > + INIT_LIST_HEAD(&dmabuf->attachments); > > if (!resv) { > - resv = (struct dma_resv *)&dmabuf[1]; > - dma_resv_init(resv); > + dmabuf->resv = (struct dma_resv *)&dmabuf[1]; > + dma_resv_init(dmabuf->resv); > + } else { > + dmabuf->resv = resv; > } > - dmabuf->resv = resv; > > - file = dma_buf_getfile(dmabuf, exp_info->flags); > - if (IS_ERR(file)) { > - ret = PTR_ERR(file); > + ret = dma_buf_stats_setup(dmabuf, file); > + if (ret) > goto err_dmabuf; > - } > > + file->private_data = dmabuf; > + file->f_path.dentry->d_fsdata = dmabuf; > dmabuf->file = file; > > - mutex_init(&dmabuf->lock); > - INIT_LIST_HEAD(&dmabuf->attachments); > - > mutex_lock(&db_list.lock); > list_add(&dmabuf->list_node, &db_list.head); > mutex_unlock(&db_list.lock); > > - ret = dma_buf_stats_setup(dmabuf); > - if (ret) > - goto err_sysfs; > - > return dmabuf; > > -err_sysfs: > - /* > - * Set file->f_path.dentry->d_fsdata to NULL so that when > - * dma_buf_release() gets invoked by dentry_ops, it exits > - * early before calling the release() dma_buf op. > - */ > - file->f_path.dentry->d_fsdata = NULL; > - fput(file); > err_dmabuf: > + if (!resv) > + dma_resv_fini(dmabuf->resv); > kfree(dmabuf); > +err_file: > + fput(file); This fput() still endup in calling the dma_buf_file_release() where there are no checks before accessing the file->private_data(=dmabuf) which can result in null pointer access. Am I missing something trivial? > err_module: > module_put(exp_info->owner); > return ERR_PTR(ret); > -- 2.34.1
Re: [Linaro-mm-sig] Re: [PATCH] dma-buf: Fix possible UAF in dma_buf_export
Thanks Christian/TJ for all your inputs!! On 11/24/2022 6:25 PM, Christian König wrote: >>> I was already wondering why the order is this way. >>> >>> Why is dma_buf_stats_setup() needing the file in the first place? >> >> dmabuf->file will be used in dma_buf_stats_setup(), the >> dma_buf_stats_setup() as follows: >> >>> 171 int dma_buf_stats_setup(struct dma_buf *dmabuf) >>> 172 { >>> 173 struct dma_buf_sysfs_entry *sysfs_entry; >>> 174 int ret; >>> 175 >>> 176 if (!dmabuf || !dmabuf->file) >>> 177 return -EINVAL; >>> 178 >>> 179 if (!dmabuf->exp_name) { >>> 180 pr_err("exporter name must not be empty if stats >>> needed\n"); >>> 181 return -EINVAL; >>> 182 } >>> 183 >>> 184 sysfs_entry = kzalloc(sizeof(struct dma_buf_sysfs_entry), >>> GFP_KERNEL); >>> 185 if (!sysfs_entry) >>> 186 return -ENOMEM; >>> 187 >>> 188 sysfs_entry->kobj.kset = dma_buf_per_buffer_stats_kset; >>> 189 sysfs_entry->dmabuf = dmabuf; >>> 190 >>> 191 dmabuf->sysfs_entry = sysfs_entry; >>> 192 >>> 193 /* create the directory for buffer stats */ >>> 194 ret = kobject_init_and_add(&sysfs_entry->kobj, >>> &dma_buf_ktype, NULL, >>> 195 "%lu", >>> file_inode(dmabuf->file)->i_ino); > > Ah, so it uses the i_ino of the file for the sysfs unique name. > > I'm going to take another look how to properly clean this up. > How about deleting the dmabuf from the db_list directly in the error path (which is usually done by the fput()) and then continue with the normal fput() here. Just compile tested the below code and If the logic make sense for you, will send the final tested patch. --><- diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index e6f36c0..10a1727 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -87,19 +87,28 @@ static void dma_buf_release(struct dentry *dentry) kfree(dmabuf); } -static int dma_buf_file_release(struct inode *inode, struct file *file) +static void dma_buf_db_list_remove(struct file *file) { struct dma_buf *dmabuf; - if (!is_dma_buf_file(file)) - return -EINVAL; - dmabuf = file->private_data; + if (!dmabuf) + return; mutex_lock(&db_list.lock); list_del(&dmabuf->list_node); mutex_unlock(&db_list.lock); + file->private_data = NULL; +} + +static int dma_buf_file_release(struct inode *inode, struct file *file) +{ + if (!is_dma_buf_file(file)) + return -EINVAL; + + dma_buf_db_list_remove(file); + return 0; } @@ -688,6 +697,8 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) * early before calling the release() dma_buf op. */ file->f_path.dentry->d_fsdata = NULL; + + dma_buf_db_list_remove(file); fput(file); err_dmabuf: kfree(dmabuf); ><- > Thanks for pointing this out, > Christian.
Re: [PATCH] dma-buf: Fix possible UAF in dma_buf_export
Thanks T.J and Christian for the inputs. On 11/19/2022 7:00 PM, Christian König wrote: >> >> Yes, exactly that's the idea. >> >> The only alternatives I can see would be to either move allocating >> the >> file and so completing the dma_buf initialization last again or just >> ignore errors from sysfs. >> >> > If we still want to avoid calling dmabuf->ops->release(dmabuf) in >> > dma_buf_release like the comment says I guess we could use >> sysfs_entry >> > and ERR_PTR to flag that, otherwise it looks like we'd need a bit >> > somewhere. >> >> No, this should be dropped as far as I can see. The sysfs cleanup >> code >> looks like it can handle not initialized kobj pointers. >> >> >> Yeah there is also the null check in dma_buf_stats_teardown() that >> would prevent it from running, but I understood the comment to be >> referring to the release() dma_buf_ops call into the exporter which >> comes right after the teardown call. That looks like it's preventing >> the fput task work calling back into the exporter after the exporter >> already got an error from dma_buf_export(). Otherwise the exporter >> sees a release() for a buffer that it doesn't know about / thinks >> shouldn't exist. So I could imagine an exporter trying to double free: >> once for the failed dma_buf_export() call, and again when the >> release() op is called later. > > > Oh, very good point as well. Yeah, then creating the file should > probably come last. > @Gaosheng: Could you please make these changes or you let me to do? > Regards, > Christian.
Re: [PATCH] dma-buf: Fix possible UAF in dma_buf_export
Sometime back Dan also reported the same issue[1] where I do mentioned that fput()-->dma_buf_file_release() will remove it from the list. But it seems that I failed to notice fput() here calls the dma_buf_file_release() asynchronously i.e. dmabuf that is accessed in the close path is already freed. Am I wrong here? Should we have the __fput_sync(file) here instead of just fput(file) which can solve this problem? [1]https://lore.kernel.org/all/20220516084704.GG29930@kadam/ Thanks, Charan On 11/17/2022 11:51 AM, Gaosheng Cui wrote: > Smatch report warning as follows: > > drivers/dma-buf/dma-buf.c:681 dma_buf_export() warn: > '&dmabuf->list_node' not removed from list > > If dma_buf_stats_setup() fails in dma_buf_export(), goto err_sysfs > and dmabuf will be freed, but dmabuf->list_node will not be removed > from db_list.head, then list traversal may cause UAF. > > Fix by removeing it from db_list.head before free(). > > Fixes: ef3a6b70507a ("dma-buf: call dma_buf_stats_setup after dmabuf is in > valid list") > Signed-off-by: Gaosheng Cui > --- > drivers/dma-buf/dma-buf.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > index b809513b03fe..6848f50226d5 100644 > --- a/drivers/dma-buf/dma-buf.c > +++ b/drivers/dma-buf/dma-buf.c > @@ -675,6 +675,9 @@ struct dma_buf *dma_buf_export(const struct > dma_buf_export_info *exp_info) > return dmabuf; > > err_sysfs: > + mutex_lock(&db_list.lock); > + list_del(&dmabuf->list_node); > + mutex_unlock(&db_list.lock); > /* >* Set file->f_path.dentry->d_fsdata to NULL so that when >* dma_buf_release() gets invoked by dentry_ops, it exits
Re: [bug report] dma-buf: call dma_buf_stats_setup after dmabuf is in valid list
++ Adding Christian On 5/16/2022 12:19 PM, Dan Carpenter wrote: > Hello Charan Teja Reddy, > > The patch ef3a6b70507a: "dma-buf: call dma_buf_stats_setup after > dmabuf is in valid list" from May 10, 2022, leads to the following > Smatch static checker warning: > > drivers/dma-buf/dma-buf.c:569 dma_buf_export() > warn: '&dmabuf->list_node' not removed from list > > drivers/dma-buf/dma-buf.c >538 file = dma_buf_getfile(dmabuf, exp_info->flags); >539 if (IS_ERR(file)) { >540 ret = PTR_ERR(file); >541 goto err_dmabuf; >542 } >543 >544 file->f_mode |= FMODE_LSEEK; >545 dmabuf->file = file; >546 >547 mutex_init(&dmabuf->lock); >548 INIT_LIST_HEAD(&dmabuf->attachments); >549 >550 mutex_lock(&db_list.lock); >551 list_add(&dmabuf->list_node, &db_list.head); > > Added to the list > >552 mutex_unlock(&db_list.lock); >553 >554 ret = dma_buf_stats_setup(dmabuf); >555 if (ret) >556 goto err_sysfs; > > Goto > >557 >558 return dmabuf; >559 >560 err_sysfs: >561 /* >562 * Set file->f_path.dentry->d_fsdata to NULL so that when >563 * dma_buf_release() gets invoked by dentry_ops, it exits >564 * early before calling the release() dma_buf op. >565 */ >566 file->f_path.dentry->d_fsdata = NULL; >567 fput(file); >568 err_dmabuf: >569 kfree(dmabuf); > > dmabuf is freed, but it's still on the list so it leads to a use after > free. This seems to be a false positive. On closing the file @line no:567, it ends up calling dma_buf_file_release() which does remove dmabuf from its list. static int dma_buf_file_release(struct inode *inode, struct file *file) { .. mutex_lock(&db_list.lock); list_del(&dmabuf->list_node); mutex_unlock(&db_list.lock); .. } > >570 err_module: >571 module_put(exp_info->owner); >572 return ERR_PTR(ret); >573 } > > regards, > dan carpenter
[PATCH V3 RESEND] dma-buf: ensure unique directory name for dmabuf stats
The dmabuf file uses get_next_ino()(through dma_buf_getfile() -> alloc_anon_inode()) to get an inode number and uses the same as a directory name under /sys/kernel/dmabuf/buffers/. This directory is used to collect the dmabuf stats and it is created through dma_buf_stats_setup(). At current, failure to create this directory entry can make the dma_buf_export() to fail. Now, as the get_next_ino() can definitely give a repetitive inode no causing the directory entry creation to fail with -EEXIST. This is a problem on the systems where dmabuf stats functionality is enabled on the production builds can make the dma_buf_export(), though the dmabuf memory is allocated successfully, to fail just because it couldn't create stats entry. This issue we are able to see on the snapdragon system within 13 days where there already exists a directory with inode no "122602" so dma_buf_stats_setup() failed with -EEXIST as it is trying to create the same directory entry. To make the dentry name as unique, use the dmabuf fs specific inode which is based on the simple atomic variable increment. There is tmpfs subsystem too which relies on its own inode generation rather than relying on the get_next_ino() for the same reason of avoiding the duplicate inodes[1]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/patch/?id=e809d5f0b5c912fe981dce738f3283b2010665f0 Signed-off-by: Charan Teja Kalla Cc: # 5.15.x+ Reviewed-by: Greg Kroah-Hartman --- Changes in V3-resend: -- Collect all the tags and apply stable tag. Changes in V3: -- Used the atomic64 variable to have dmabuf files its own inodes. -- Ensured no UAPI breakage as suggested by Christian. Changes in V2: -- Used the atomic64_t variable to generate a unique_id to be appended to inode to have an unique directory with name -- Suggested by christian -- Updated the ABI documentation -- Identified by Greg. -- Massaged the commit log. -- https://lore.kernel.org/all/1652191562-18700-1-git-send-email-quic_chara...@quicinc.com/ Changes in V1: -- Used the inode->i_ctime->tv_secs as an id appended to inode to create the unique directory with name . -- https://lore.kernel.org/all/1652178212-22383-1-git-send-email-quic_chara...@quicinc.com/ drivers/dma-buf/dma-buf.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index a6fc96e..0ad5039 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -407,6 +407,7 @@ static inline int is_dma_buf_file(struct file *file) static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags) { + static atomic64_t dmabuf_inode = ATOMIC64_INIT(0); struct file *file; struct inode *inode = alloc_anon_inode(dma_buf_mnt->mnt_sb); @@ -416,6 +417,13 @@ static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags) inode->i_size = dmabuf->size; inode_set_bytes(inode, dmabuf->size); + /* +* The ->i_ino acquired from get_next_ino() is not unique thus +* not suitable for using it as dentry name by dmabuf stats. +* Override ->i_ino with the unique and dmabuffs specific +* value. +*/ + inode->i_ino = atomic64_add_return(1, &dmabuf_inode); file = alloc_file_pseudo(inode, dma_buf_mnt, "dmabuf", flags, &dma_buf_fops); if (IS_ERR(file)) -- 2.7.4
Re: [PATCH V3] dma-buf: ensure unique directory name for dmabuf stats
On 5/13/2022 3:59 PM, Christian König wrote: > Am 13.05.22 um 12:18 schrieb Charan Teja Kalla: >> On 5/13/2022 3:41 PM, Greg KH wrote: >>>> Reported-by: kernel test robot >>> The trest robot did not say that the dmabuf stat name was being >>> duplicated, did it? >>> >> It reported a printk warning on V2[1]. Should we remove this on V3? > > We only add the kernel test robot is report when it found the underlying > problem and not just noted some warning on an intermediate patch version. Noted. Thanks!! > >> @Christian: Could you please drop this tag while merging? > > Sure, I don't have much on my plate at the moment. But don't let it > become a habit. > Sure. I am also thinking If it is worth to add stable tag? Though it is not crashing the kernel but definitely making the dma_buf_export to fail for no reason. If yes, I can resend the patch with all these tags. > Going to push it upstream through drm-misc-fixes now.
Re: [PATCH V3] dma-buf: ensure unique directory name for dmabuf stats
On 5/13/2022 3:41 PM, Greg KH wrote: >> Reported-by: kernel test robot > The trest robot did not say that the dmabuf stat name was being > duplicated, did it? > It reported a printk warning on V2[1]. Should we remove this on V3? @Christian: Could you please drop this tag while merging? [1] https://lore.kernel.org/all/202205110511.e0d8txxc-...@intel.com/ >> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >> index a6fc96e..0ad5039 100644 >> --- a/drivers/dma-buf/dma-buf.c >> +++ b/drivers/dma-buf/dma-buf.c >> @@ -407,6 +407,7 @@ static inline int is_dma_buf_file(struct file *file) >> >> static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags) >> { >> +static atomic64_t dmabuf_inode = ATOMIC64_INIT(0); >> struct file *file; >> struct inode *inode = alloc_anon_inode(dma_buf_mnt->mnt_sb); >> >> @@ -416,6 +417,13 @@ static struct file *dma_buf_getfile(struct dma_buf >> *dmabuf, int flags) >> inode->i_size = dmabuf->size; >> inode_set_bytes(inode, dmabuf->size); >> >> +/* >> + * The ->i_ino acquired from get_next_ino() is not unique thus >> + * not suitable for using it as dentry name by dmabuf stats. >> + * Override ->i_ino with the unique and dmabuffs specific >> + * value. >> + */ >> +inode->i_ino = atomic64_add_return(1, &dmabuf_inode); >> file = alloc_file_pseudo(inode, dma_buf_mnt, "dmabuf", >> flags, &dma_buf_fops); >> if (IS_ERR(file)) >> -- >> 2.7.4 >> > Reviewed-by: Greg Kroah-Hartman Thanks for the ACK. --Charan
[PATCH V3] dma-buf: ensure unique directory name for dmabuf stats
The dmabuf file uses get_next_ino()(through dma_buf_getfile() -> alloc_anon_inode()) to get an inode number and uses the same as a directory name under /sys/kernel/dmabuf/buffers/. This directory is used to collect the dmabuf stats and it is created through dma_buf_stats_setup(). At current, failure to create this directory entry can make the dma_buf_export() to fail. Now, as the get_next_ino() can definitely give a repetitive inode no causing the directory entry creation to fail with -EEXIST. This is a problem on the systems where dmabuf stats functionality is enabled on the production builds can make the dma_buf_export(), though the dmabuf memory is allocated successfully, to fail just because it couldn't create stats entry. This issue we are able to see on the snapdragon system within 13 days where there already exists a directory with inode no "122602" so dma_buf_stats_setup() failed with -EEXIST as it is trying to create the same directory entry. To make the dentry name as unique, use the dmabuf fs specific inode which is based on the simple atomic variable increment. There is tmpfs subsystem too which relies on its own inode generation rather than relying on the get_next_ino() for the same reason of avoiding the duplicate inodes[1]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/patch/?id=e809d5f0b5c912fe981dce738f3283b2010665f0 Reported-by: kernel test robot Signed-off-by: Charan Teja Kalla --- Changes in V3: -- Used the atomic64 variable to have dmabuf files its own inodes. -- Ensured no UAPI breakage as suggested by Christian. Changes in V2: -- Used the atomic64_t variable to generate a unique_id to be appended to inode to have an unique directory with name -- Suggested by christian -- Updated the ABI documentation -- Identified by Greg. -- Massaged the commit log. -- https://lore.kernel.org/all/1652191562-18700-1-git-send-email-quic_chara...@quicinc.com/ Changes in V1: -- Used the inode->i_ctime->tv_secs as an id appended to inode to create the unique directory with name . -- https://lore.kernel.org/all/1652178212-22383-1-git-send-email-quic_chara...@quicinc.com/ drivers/dma-buf/dma-buf.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index a6fc96e..0ad5039 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -407,6 +407,7 @@ static inline int is_dma_buf_file(struct file *file) static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags) { + static atomic64_t dmabuf_inode = ATOMIC64_INIT(0); struct file *file; struct inode *inode = alloc_anon_inode(dma_buf_mnt->mnt_sb); @@ -416,6 +417,13 @@ static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags) inode->i_size = dmabuf->size; inode_set_bytes(inode, dmabuf->size); + /* +* The ->i_ino acquired from get_next_ino() is not unique thus +* not suitable for using it as dentry name by dmabuf stats. +* Override ->i_ino with the unique and dmabuffs specific +* value. +*/ + inode->i_ino = atomic64_add_return(1, &dmabuf_inode); file = alloc_file_pseudo(inode, dma_buf_mnt, "dmabuf", flags, &dma_buf_fops); if (IS_ERR(file)) -- 2.7.4
Re: [PATCH V2] dmabuf: ensure unique directory name for dmabuf stats
Thanks Christian for the comments!! On 5/11/2022 12:33 PM, Christian König wrote: > >> The single number approach, generated by atomic, wouldn't break the >> uapi, but that number won't give any meaningful information especially >> when this is targeted just for debug purpose. And just 'inode' is not >> usable for already stated reasons. > > Well, why do you want to use the ino in the first place? This is an > anonymous inode not associated with any filesystem, so that number is > meaningless anyway. > It is just for ease of debugging. Nothing more. I can quickly traverse the /sys/kernel/dmabuf/buffers/* and get complete information about the dmabuf buffers while relating to which process this buffer is allocated by, using this inode as the 'unique' reference. https://cs.android.com/android/platform/superproject/+/master:system/memory/libmeminfo/libdmabufinfo/tools/dmabuf_dump.cpp >> How about using the atomic number generated it self used as inode >> number? I see tmpfs also maintains its own inode numbers for the same >> overflow reasons[2]. > > Yeah, that could potentially work as well. > Thanks. Will work on the next version of this patch. > Regards, > Christian.
Re: [PATCH V2] dmabuf: ensure unique directory name for dmabuf stats
Thanks Christian for the inputs!! On 5/10/2022 10:52 PM, Christian König wrote: if (!dmabuf || !dmabuf->file) return -EINVAL; @@ -192,7 +193,8 @@ int dma_buf_stats_setup(struct dma_buf *dmabuf) /* create the directory for buffer stats */ ret = kobject_init_and_add(&sysfs_entry->kobj, &dma_buf_ktype, NULL, - "%lu", file_inode(dmabuf->file)->i_ino); + "%lu-%lu", file_inode(dmabuf->file)->i_ino, >>> Why not just use the unique value here? Or is the inode number necessary >>> for something? >> This will ease the debugging a lot. Given the dump, I can easily map >> which dmabuf buffer to the process. On the crashutilty I just have to >> search for this inode in the files output, just one example. > > T.J. Mercier just confirmed my suspicion that this would break the UAPI. > So that won't work. > > This needs to be a single number, preferable documented as such. Usually, What are the chances that a patch breaking UAPI will get accepted. IMO, If there are few users, I had learnt that it is allowed to merge. (Eg: In [1] where Andrew, -mm maintainer, mentioned that: "I think we should just absorb any transitory damage which this causes people." for the patch posted breaking the UAPI). Even the patch c715def51591 ("dma-buf: Delete the DMA-BUF attachment sysfs statistics") deleted the sysfs entries which also comes under the UAPI breakage but still allowed to merge. On those lines, Is it fair to say If few users are there, uapi breakage changes are allowed to merge on the assumption that userspace code needs to be aligned with the new uapi changes? To my knowledge, Android is the only user which is just getting the dmabuf stats as part of the debug code. The single number approach, generated by atomic, wouldn't break the uapi, but that number won't give any meaningful information especially when this is targeted just for debug purpose. And just 'inode' is not usable for already stated reasons. How about using the atomic number generated it self used as inode number? I see tmpfs also maintains its own inode numbers for the same overflow reasons[2]. The code will be like below(untested): diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index a6fc96e..eeed770 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -408,11 +408,17 @@ static inline int is_dma_buf_file(struct file *file) static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags) { struct file *file; + static atomic64_t unique_id = ATOMIC64_INIT(0); struct inode *inode = alloc_anon_inode(dma_buf_mnt->mnt_sb); if (IS_ERR(inode)) return ERR_CAST(inode); + /* +* Override the inode->i_no number with the unique +* dmabuf specific value +*/ + inode->i_no = atomic64_add_return(1, &unique_id); inode->i_size = dmabuf->size; inode_set_bytes(inode, dmabuf->size); [1] https://patchwork.kernel.org/project/linux-mm/patch/4f091776142f2ebf7b94018146de72318474e686.1647008754.git.quic_chara...@quicinc.com/#24780139 [2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/patch/?id=e809d5f0b5c912fe981dce738f3283b2010665f0 Thanks, Charan
Re: [PATCH V2] dmabuf: ensure unique directory name for dmabuf stats
On 5/10/2022 8:42 PM, Christian König wrote: >> * The information in the interface can also be used to derive >> per-exporter >> * statistics. The data from the interface can be gathered on error >> conditions >> @@ -172,6 +172,7 @@ int dma_buf_stats_setup(struct dma_buf *dmabuf) >> { >> struct dma_buf_sysfs_entry *sysfs_entry; >> int ret; >> + static atomic64_t unique_id = ATOMIC_INIT(0); > > Please move that to the beginning of the declarations. > Done. Any scripts I can run at my end to catch these type of trivial changes? checkpatch.pl didn't report this coding style. >> if (!dmabuf || !dmabuf->file) >> return -EINVAL; >> @@ -192,7 +193,8 @@ int dma_buf_stats_setup(struct dma_buf *dmabuf) >> /* create the directory for buffer stats */ >> ret = kobject_init_and_add(&sysfs_entry->kobj, &dma_buf_ktype, >> NULL, >> - "%lu", file_inode(dmabuf->file)->i_ino); >> + "%lu-%lu", file_inode(dmabuf->file)->i_ino, > > Why not just use the unique value here? Or is the inode number necessary > for something? This will ease the debugging a lot. Given the dump, I can easily map which dmabuf buffer to the process. On the crashutilty I just have to search for this inode in the files output, just one example.
[PATCH V2] dmabuf: ensure unique directory name for dmabuf stats
The dmabuf file uses get_next_ino()(through dma_buf_getfile() -> alloc_anon_inode()) to get an inode number and uses the same as a directory name under /sys/kernel/dmabuf/buffers/. This directory is used to collect the dmabuf stats and it is created through dma_buf_stats_setup(). At current, failure to create this directory entry can make the dma_buf_export() to fail. Now, as the get_next_ino() can definitely give a repetitive inode no causing the directory entry creation to fail with -EEXIST. This is a problem on the systems where dmabuf stats functionality is enabled on the production builds can make the dma_buf_export(), though the dmabuf memory is allocated successfully, to fail just because it couldn't create stats entry. This issue we are able to see on the snapdragon system within 13 days where there already exists a directory with inode no "122602" so dma_buf_stats_setup() failed with -EEXIST as it is trying to create the same directory entry. To make the directory entry as unique, append the unique_id for every inode. With this change the stats directory entries will be in the format of: /sys/kernel/dmabuf/buffers/. Signed-off-by: Charan Teja Kalla --- Changes in V2: -- Used the atomic64_t variable to generate a unique_id to be appended to inode to have an unique directory with name -- Suggested by christian -- Updated the ABI documentation -- Identified by Greg. -- Massaged the commit log. Changes in V1: -- Used the inode->i_ctime->tv_secs as an id appended to inode to create the unique directory with name . -- https://lore.kernel.org/all/1652178212-22383-1-git-send-email-quic_chara...@quicinc.com/ Documentation/ABI/testing/sysfs-kernel-dmabuf-buffers | 10 +- drivers/dma-buf/Kconfig | 6 +++--- drivers/dma-buf/dma-buf-sysfs-stats.c | 8 +--- 3 files changed, 13 insertions(+), 11 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-kernel-dmabuf-buffers b/Documentation/ABI/testing/sysfs-kernel-dmabuf-buffers index 5d3bc99..9fffbd3 100644 --- a/Documentation/ABI/testing/sysfs-kernel-dmabuf-buffers +++ b/Documentation/ABI/testing/sysfs-kernel-dmabuf-buffers @@ -4,19 +4,19 @@ KernelVersion:v5.13 Contact: Hridya Valsaraju Description: The /sys/kernel/dmabuf/buffers directory contains a snapshot of the internal state of every DMA-BUF. - /sys/kernel/dmabuf/buffers/ will contain the - statistics for the DMA-BUF with the unique inode number - + /sys/kernel/dmabuf/buffers/ will + contain the statistics for the DMA-BUF with the unique + pair Users: kernel memory tuning/debugging tools -What: /sys/kernel/dmabuf/buffers//exporter_name +What: /sys/kernel/dmabuf/buffers//exporter_name Date: May 2021 KernelVersion: v5.13 Contact: Hridya Valsaraju Description: This file is read-only and contains the name of the exporter of the DMA-BUF. -What: /sys/kernel/dmabuf/buffers//size +What: /sys/kernel/dmabuf/buffers//size Date: May 2021 KernelVersion: v5.13 Contact: Hridya Valsaraju diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig index 541efe0..5bcbdb1 100644 --- a/drivers/dma-buf/Kconfig +++ b/drivers/dma-buf/Kconfig @@ -81,9 +81,9 @@ menuconfig DMABUF_SYSFS_STATS Choose this option to enable DMA-BUF sysfs statistics in location /sys/kernel/dmabuf/buffers. - /sys/kernel/dmabuf/buffers/ will contain - statistics for the DMA-BUF with the unique inode number - . + /sys/kernel/dmabuf/buffers/ will contain + statistics for the DMA-BUF with the unique pair + . source "drivers/dma-buf/heaps/Kconfig" diff --git a/drivers/dma-buf/dma-buf-sysfs-stats.c b/drivers/dma-buf/dma-buf-sysfs-stats.c index 2bba0ba..29e9e23 100644 --- a/drivers/dma-buf/dma-buf-sysfs-stats.c +++ b/drivers/dma-buf/dma-buf-sysfs-stats.c @@ -38,8 +38,8 @@ * * The following stats are exposed by the interface: * - * * ``/sys/kernel/dmabuf/buffers//exporter_name`` - * * ``/sys/kernel/dmabuf/buffers//size`` + * * ``/sys/kernel/dmabuf/buffers//exporter_name`` + * * ``/sys/kernel/dmabuf/buffers//size`` * * The information in the interface can also be used to derive per-exporter * statistics. The data from the interface can be gathered on error conditions @@ -172,6 +172,7 @@ int dma_buf_stats_setup(struct dma_buf *dmabuf) { struct dma_buf_sysfs_entry *sysfs_entry; int ret; + static atomic64_t unique_id = ATOMIC_INIT(0); if (!dmabuf || !dmabuf->file) return -EINVAL; @@ -192,7 +193,8 @@ int dma_buf_stats_setup(struct dma_buf *dmabuf) /* create the directory for buffer stats */ ret = kobject_init_and_add(&am
Re: [PATCH] dmabuf: ensure unique directory name for dmabuf stats
Thanks Christian for the inputs!! On 5/10/2022 5:05 PM, Christian König wrote: > >> And what's to keep the seconds field from also being the same? > > Well exporting two DMA-bufs with the same ino in the same nanosecond > should be basically impossible, but I would rather opt for using a 64bit > atomic in that function. > > This should be 100% UAPI compatible and even if we manage to create one > buffer ever ns we need ~500years to wrap around. I see that the inode->i_ctime->tv_sec is already defined as 64bit(time64_t tv_sec), hence used it. This way we don't need extra static atomic_t variable just to get a unique name. Just pasting excerpt from the reply posted to Greg about why this secs will always be a unique: with secs field added, to get the same inode-secs string, the uint should overflow in the same second which is impossible. Let me know If you still opt for atomic variable only.
Re: [PATCH] dmabuf: ensure unique directory name for dmabuf stats
Thanks Greg for the inputs!! On 5/10/2022 4:30 PM, Greg KH wrote: >> The dmabuf file uses get_next_ino()(through dma_buf_getfile() -> >> alloc_anon_inode()) to get an inode number and uses the same as a >> directory name under /sys/kernel/dmabuf/buffers/. This directory is >> used to collect the dmabuf stats and it is created through >> dma_buf_stats_setup(). At current, failure to create this directory >> entry can make the dma_buf_export() to fail. >> >> Now, as the get_next_ino() can definitely give a repetitive inode no >> causing the directory entry creation to fail with -EEXIST. This is a >> problem on the systems where dmabuf stats functionality is enabled on >> the production builds can make the dma_buf_export(), though the dmabuf >> memory is allocated successfully, to fail just because it couldn't >> create stats entry. > Then maybe we should not fail the creation path of the kobject fails to > be created? It's just for debugging, it should be fine if the creation > of it isn't there. Not creating the debug node under some special cases can make this interface not reliable if one wants to know info about the created dmabuf buffers. Please help in correcting me If my perspective is wrong here. IIUC, except this -EEXIST condition, under the other conditions (-EINVAL and -ENOMEM) failure is fine. Since, we are going to fix the -EEXIST error in this patch, my opinion is failure in the kobject creation path is acceptable for the reasons: a) The user is expected to pass the valid dmabuf to create the stats node, b) The user can undefine the CONFIG_DMABUF_SYSFS_STATS if he don't want this stats. > >> This issue we are able to see on the snapdragon system within 13 days >> where there already exists a directory with inode no "122602" so >> dma_buf_stats_setup() failed with -EEXIST as it is trying to create >> the same directory entry. >> >> To make the directory entry as unique, append the inode creation time to >> the inode. With this change the stats directory entries will be in the >> format of: /sys/kernel/dmabuf/buffers/-> secs>. > As you are changing the format here, shouldn't the Documentation/ABI/ > entry for this also be changed? > > And what's to keep the seconds field from also being the same? get_next_ino() just increases the inode number monotonically and return to the caller and it is 'unsigned int' data type. Thus 2 successive calls always generate the different inode_number but can be the same secs value. With inode-secs format, this will be still be a unique string. Say it will be like ino1-sec1 and ino2-sec1. Now after the inode number overflow and wraps, we may get the ino1 again from the get_next_ino() but then secs will be different i.e. say it may be like ino1-secn and ion2-secn. So, it always be a unique string. IOW, with secs field added, to get the same inode-secs string, the uint should overflow in the same second which is impossible. Thanks for pointing out the changes to be done in ABI document. Will do it in the next spin.
[PATCH] dmabuf: ensure unique directory name for dmabuf stats
The dmabuf file uses get_next_ino()(through dma_buf_getfile() -> alloc_anon_inode()) to get an inode number and uses the same as a directory name under /sys/kernel/dmabuf/buffers/. This directory is used to collect the dmabuf stats and it is created through dma_buf_stats_setup(). At current, failure to create this directory entry can make the dma_buf_export() to fail. Now, as the get_next_ino() can definitely give a repetitive inode no causing the directory entry creation to fail with -EEXIST. This is a problem on the systems where dmabuf stats functionality is enabled on the production builds can make the dma_buf_export(), though the dmabuf memory is allocated successfully, to fail just because it couldn't create stats entry. This issue we are able to see on the snapdragon system within 13 days where there already exists a directory with inode no "122602" so dma_buf_stats_setup() failed with -EEXIST as it is trying to create the same directory entry. To make the directory entry as unique, append the inode creation time to the inode. With this change the stats directory entries will be in the format of: /sys/kernel/dmabuf/buffers/-. Signed-off-by: Charan Teja Kalla --- drivers/dma-buf/dma-buf-sysfs-stats.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/dma-buf/dma-buf-sysfs-stats.c b/drivers/dma-buf/dma-buf-sysfs-stats.c index 2bba0ba..292cb31 100644 --- a/drivers/dma-buf/dma-buf-sysfs-stats.c +++ b/drivers/dma-buf/dma-buf-sysfs-stats.c @@ -192,7 +192,8 @@ int dma_buf_stats_setup(struct dma_buf *dmabuf) /* create the directory for buffer stats */ ret = kobject_init_and_add(&sysfs_entry->kobj, &dma_buf_ktype, NULL, - "%lu", file_inode(dmabuf->file)->i_ino); + "%lu-%lu", file_inode(dmabuf->file)->i_ino, + file_inode(dmabuf->file)->i_ctime.tv_sec); if (ret) goto err_sysfs_dmabuf; -- 2.7.4
Re: [PATCH] dma-buf: call dma_buf_stats_setup after dmabuf is in valid list
Hello Mercier, On 5/10/2022 3:19 AM, T.J. Mercier wrote: > On Mon, May 9, 2022 at 12:50 PM Charan Teja Kalla > wrote: >> From: Charan Teja Reddy >> >> When dma_buf_stats_setup() fails, it closes the dmabuf file which >> results into the calling of dma_buf_file_release() where it does >> list_del(&dmabuf->list_node) with out first adding it to the proper >> list. This is resulting into panic in the below path: >> __list_del_entry_valid+0x38/0xac >> dma_buf_file_release+0x74/0x158 >> __fput+0xf4/0x428 >> fput+0x14/0x24 >> task_work_run+0x178/0x24c >> do_notify_resume+0x194/0x264 >> work_pending+0xc/0x5f0 >> >> Fix it by moving the dma_buf_stats_setup() after dmabuf is added to the >> list. >> >> Fixes: bdb8d06dfefd ("dmabuf: Add the capability to expose DMA-BUF stats in >> sysfs") >> Signed-off-by: Charan Teja Reddy > Tested-by: T.J. Mercier > Acked-by: T.J. Mercier > Thanks for the Ack. Also Realized that it should have: Cc: # 5.15.x+
[PATCH] dma-buf: call dma_buf_stats_setup after dmabuf is in valid list
From: Charan Teja Reddy When dma_buf_stats_setup() fails, it closes the dmabuf file which results into the calling of dma_buf_file_release() where it does list_del(&dmabuf->list_node) with out first adding it to the proper list. This is resulting into panic in the below path: __list_del_entry_valid+0x38/0xac dma_buf_file_release+0x74/0x158 __fput+0xf4/0x428 fput+0x14/0x24 task_work_run+0x178/0x24c do_notify_resume+0x194/0x264 work_pending+0xc/0x5f0 Fix it by moving the dma_buf_stats_setup() after dmabuf is added to the list. Fixes: bdb8d06dfefd ("dmabuf: Add the capability to expose DMA-BUF stats in sysfs") Signed-off-by: Charan Teja Reddy --- drivers/dma-buf/dma-buf.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 602b12d..a6fc96e 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -543,10 +543,6 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) file->f_mode |= FMODE_LSEEK; dmabuf->file = file; - ret = dma_buf_stats_setup(dmabuf); - if (ret) - goto err_sysfs; - mutex_init(&dmabuf->lock); INIT_LIST_HEAD(&dmabuf->attachments); @@ -554,6 +550,10 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) list_add(&dmabuf->list_node, &db_list.head); mutex_unlock(&db_list.lock); + ret = dma_buf_stats_setup(dmabuf); + if (ret) + goto err_sysfs; + return dmabuf; err_sysfs: -- 2.7.4
Re: [PATCH v2] dmabuf: use spinlock to access dmabuf->name
Thanks Mike for the inputs. On 6/22/2020 5:10 PM, Ruhl, Michael J wrote: >> -Original Message- >> From: charante=codeaurora@mg.codeaurora.org >> On Behalf Of Charan Teja >> Kalla >> Sent: Monday, June 22, 2020 5:26 AM >> To: Ruhl, Michael J ; Sumit Semwal >> ; david.lai...@aculab.com; open list:DMA >> BUFFER SHARING FRAMEWORK ; DRI mailing >> list >> Cc: Linaro MM SIG ; LKML > ker...@vger.kernel.org> >> Subject: Re: [PATCH v2] dmabuf: use spinlock to access dmabuf->name >> >> Hello Mike, >> >> On 6/19/2020 7:11 PM, Ruhl, Michael J wrote: >>>> -----Original Message- >>>> From: charante=codeaurora@mg.codeaurora.org >>>> On Behalf Of Charan >> Teja >>>> Kalla >>>> Sent: Friday, June 19, 2020 7:57 AM >>>> To: Sumit Semwal ; Ruhl, Michael J >>>> ; david.lai...@aculab.com; open list:DMA >>>> BUFFER SHARING FRAMEWORK ; DRI >> mailing >>>> list >>>> Cc: Linaro MM SIG ; LKML >>> ker...@vger.kernel.org> >>>> Subject: [PATCH v2] dmabuf: use spinlock to access dmabuf->name >>>> >>>> There exists a sleep-while-atomic bug while accessing the dmabuf->name >>>> under mutex in the dmabuffs_dname(). This is caused from the SELinux >>>> permissions checks on a process where it tries to validate the inherited >>>> files from fork() by traversing them through iterate_fd() (which >>>> traverse files under spin_lock) and call >>>> match_file(security/selinux/hooks.c) where the permission checks >> happen. >>>> This audit information is logged using dump_common_audit_data() where >> it >>>> calls d_path() to get the file path name. If the file check happen on >>>> the dmabuf's fd, then it ends up in ->dmabuffs_dname() and use mutex to >>>> access dmabuf->name. The flow will be like below: >>>> flush_unauthorized_files() >>>> iterate_fd() >>>>spin_lock() --> Start of the atomic section. >>>> match_file() >>>>file_has_perm() >>>> avc_has_perm() >>>>avc_audit() >>>> slow_avc_audit() >>>>common_lsm_audit() >>>> dump_common_audit_data() >>>>audit_log_d_path() >>>> d_path() >>>>dmabuffs_dname() >>>> mutex_lock()--> Sleep while atomic. >>>> >>>> Call trace captured (on 4.19 kernels) is below: >>>> ___might_sleep+0x204/0x208 >>>> __might_sleep+0x50/0x88 >>>> __mutex_lock_common+0x5c/0x1068 >>>> __mutex_lock_common+0x5c/0x1068 >>>> mutex_lock_nested+0x40/0x50 >>>> dmabuffs_dname+0xa0/0x170 >>>> d_path+0x84/0x290 >>>> audit_log_d_path+0x74/0x130 >>>> common_lsm_audit+0x334/0x6e8 >>>> slow_avc_audit+0xb8/0xf8 >>>> avc_has_perm+0x154/0x218 >>>> file_has_perm+0x70/0x180 >>>> match_file+0x60/0x78 >>>> iterate_fd+0x128/0x168 >>>> selinux_bprm_committing_creds+0x178/0x248 >>>> security_bprm_committing_creds+0x30/0x48 >>>> install_exec_creds+0x1c/0x68 >>>> load_elf_binary+0x3a4/0x14e0 >>>> search_binary_handler+0xb0/0x1e0 >>>> >>>> So, use spinlock to access dmabuf->name to avoid sleep-while-atomic. >>>> >>>> Cc: [5.3+] >>>> Signed-off-by: Charan Teja Reddy >>>> --- >>>> >>>> Changes in V2: Addressed review comments from Ruhl, Michael J >>>> >>>> Changes in V1: https://lore.kernel.org/patchwork/patch/1255055/ >>>> >>>> drivers/dma-buf/dma-buf.c | 11 +++ >>>> include/linux/dma-buf.h | 1 + >>>> 2 files changed, 8 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >>>> index 01ce125..d81d298 100644 >>>> --- a/drivers/dma-buf/dma-buf.c >>>> +++ b/drivers/dma-buf/dma-buf.c >>>> @@ -45,10 +45,10 @@ static char *dmabuffs_dname(struct dentry >> *dentry, >>>> char *buffer, int buflen) >>>>size_t ret = 0; >>>> >>>>dmabuf = dentry->d_fsdata; >>>> - dma_resv_lock(dmabuf->resv, NULL); >>>> + spin_lock(&dmabuf->name_lock); >>>>if
Re: [PATCH v2] dmabuf: use spinlock to access dmabuf->name
Hello Mike, On 6/19/2020 7:11 PM, Ruhl, Michael J wrote: >> -Original Message- >> From: charante=codeaurora@mg.codeaurora.org >> On Behalf Of Charan Teja >> Kalla >> Sent: Friday, June 19, 2020 7:57 AM >> To: Sumit Semwal ; Ruhl, Michael J >> ; david.lai...@aculab.com; open list:DMA >> BUFFER SHARING FRAMEWORK ; DRI mailing >> list >> Cc: Linaro MM SIG ; LKML > ker...@vger.kernel.org> >> Subject: [PATCH v2] dmabuf: use spinlock to access dmabuf->name >> >> There exists a sleep-while-atomic bug while accessing the dmabuf->name >> under mutex in the dmabuffs_dname(). This is caused from the SELinux >> permissions checks on a process where it tries to validate the inherited >> files from fork() by traversing them through iterate_fd() (which >> traverse files under spin_lock) and call >> match_file(security/selinux/hooks.c) where the permission checks happen. >> This audit information is logged using dump_common_audit_data() where it >> calls d_path() to get the file path name. If the file check happen on >> the dmabuf's fd, then it ends up in ->dmabuffs_dname() and use mutex to >> access dmabuf->name. The flow will be like below: >> flush_unauthorized_files() >> iterate_fd() >>spin_lock() --> Start of the atomic section. >> match_file() >>file_has_perm() >> avc_has_perm() >>avc_audit() >> slow_avc_audit() >> common_lsm_audit() >>dump_common_audit_data() >> audit_log_d_path() >>d_path() >>dmabuffs_dname() >> mutex_lock()--> Sleep while atomic. >> >> Call trace captured (on 4.19 kernels) is below: >> ___might_sleep+0x204/0x208 >> __might_sleep+0x50/0x88 >> __mutex_lock_common+0x5c/0x1068 >> __mutex_lock_common+0x5c/0x1068 >> mutex_lock_nested+0x40/0x50 >> dmabuffs_dname+0xa0/0x170 >> d_path+0x84/0x290 >> audit_log_d_path+0x74/0x130 >> common_lsm_audit+0x334/0x6e8 >> slow_avc_audit+0xb8/0xf8 >> avc_has_perm+0x154/0x218 >> file_has_perm+0x70/0x180 >> match_file+0x60/0x78 >> iterate_fd+0x128/0x168 >> selinux_bprm_committing_creds+0x178/0x248 >> security_bprm_committing_creds+0x30/0x48 >> install_exec_creds+0x1c/0x68 >> load_elf_binary+0x3a4/0x14e0 >> search_binary_handler+0xb0/0x1e0 >> >> So, use spinlock to access dmabuf->name to avoid sleep-while-atomic. >> >> Cc: [5.3+] >> Signed-off-by: Charan Teja Reddy >> --- >> >> Changes in V2: Addressed review comments from Ruhl, Michael J >> >> Changes in V1: https://lore.kernel.org/patchwork/patch/1255055/ >> >> drivers/dma-buf/dma-buf.c | 11 +++ >> include/linux/dma-buf.h | 1 + >> 2 files changed, 8 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >> index 01ce125..d81d298 100644 >> --- a/drivers/dma-buf/dma-buf.c >> +++ b/drivers/dma-buf/dma-buf.c >> @@ -45,10 +45,10 @@ static char *dmabuffs_dname(struct dentry *dentry, >> char *buffer, int buflen) >> size_t ret = 0; >> >> dmabuf = dentry->d_fsdata; >> -dma_resv_lock(dmabuf->resv, NULL); >> +spin_lock(&dmabuf->name_lock); >> if (dmabuf->name) >> ret = strlcpy(name, dmabuf->name, DMA_BUF_NAME_LEN); >> -dma_resv_unlock(dmabuf->resv); >> +spin_unlock(&dmabuf->name_lock); >> >> return dynamic_dname(dentry, buffer, buflen, "/%s:%s", >> dentry->d_name.name, ret > 0 ? name : ""); >> @@ -341,8 +341,10 @@ static long dma_buf_set_name(struct dma_buf >> *dmabuf, const char __user *buf) >> kfree(name); >> goto out_unlock; >> } >> +spin_lock(&dmabuf->name_lock); >> kfree(dmabuf->name); >> dmabuf->name = name; >> +spin_unlock(&dmabuf->name_lock); > > While this code path is ok, I would have separated the protection of the > attachment list and the name manipulation. > > dma_resv_lock(resv) > if (!list_empty(attachment) > ret = -EBUSY > dma_resv_unlock(resv) > > if (ret) { > kfree(name) > return ret; > } Is it that the name should be visible before importer attaches to the dmabuf,(using dma_buf_attach()), thus _buf_set_name() is under the _resv_lock() as well? > > spinlock(nam_lock) > ... > > N
[PATCH v2] dmabuf: use spinlock to access dmabuf->name
There exists a sleep-while-atomic bug while accessing the dmabuf->name under mutex in the dmabuffs_dname(). This is caused from the SELinux permissions checks on a process where it tries to validate the inherited files from fork() by traversing them through iterate_fd() (which traverse files under spin_lock) and call match_file(security/selinux/hooks.c) where the permission checks happen. This audit information is logged using dump_common_audit_data() where it calls d_path() to get the file path name. If the file check happen on the dmabuf's fd, then it ends up in ->dmabuffs_dname() and use mutex to access dmabuf->name. The flow will be like below: flush_unauthorized_files() iterate_fd() spin_lock() --> Start of the atomic section. match_file() file_has_perm() avc_has_perm() avc_audit() slow_avc_audit() common_lsm_audit() dump_common_audit_data() audit_log_d_path() d_path() dmabuffs_dname() mutex_lock()--> Sleep while atomic. Call trace captured (on 4.19 kernels) is below: ___might_sleep+0x204/0x208 __might_sleep+0x50/0x88 __mutex_lock_common+0x5c/0x1068 __mutex_lock_common+0x5c/0x1068 mutex_lock_nested+0x40/0x50 dmabuffs_dname+0xa0/0x170 d_path+0x84/0x290 audit_log_d_path+0x74/0x130 common_lsm_audit+0x334/0x6e8 slow_avc_audit+0xb8/0xf8 avc_has_perm+0x154/0x218 file_has_perm+0x70/0x180 match_file+0x60/0x78 iterate_fd+0x128/0x168 selinux_bprm_committing_creds+0x178/0x248 security_bprm_committing_creds+0x30/0x48 install_exec_creds+0x1c/0x68 load_elf_binary+0x3a4/0x14e0 search_binary_handler+0xb0/0x1e0 So, use spinlock to access dmabuf->name to avoid sleep-while-atomic. Cc: [5.3+] Signed-off-by: Charan Teja Reddy --- Changes in V2: Addressed review comments from Ruhl, Michael J Changes in V1: https://lore.kernel.org/patchwork/patch/1255055/ drivers/dma-buf/dma-buf.c | 11 +++ include/linux/dma-buf.h | 1 + 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 01ce125..d81d298 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -45,10 +45,10 @@ static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int buflen) size_t ret = 0; dmabuf = dentry->d_fsdata; - dma_resv_lock(dmabuf->resv, NULL); + spin_lock(&dmabuf->name_lock); if (dmabuf->name) ret = strlcpy(name, dmabuf->name, DMA_BUF_NAME_LEN); - dma_resv_unlock(dmabuf->resv); + spin_unlock(&dmabuf->name_lock); return dynamic_dname(dentry, buffer, buflen, "/%s:%s", dentry->d_name.name, ret > 0 ? name : ""); @@ -341,8 +341,10 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf) kfree(name); goto out_unlock; } + spin_lock(&dmabuf->name_lock); kfree(dmabuf->name); dmabuf->name = name; + spin_unlock(&dmabuf->name_lock); out_unlock: dma_resv_unlock(dmabuf->resv); @@ -405,10 +407,10 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file) /* Don't count the temporary reference taken inside procfs seq_show */ seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1); seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name); - dma_resv_lock(dmabuf->resv, NULL); + spin_lock(&dmabuf->name_lock); if (dmabuf->name) seq_printf(m, "name:\t%s\n", dmabuf->name); - dma_resv_unlock(dmabuf->resv); + spin_unlock(&dmabuf->name_lock); } static const struct file_operations dma_buf_fops = { @@ -546,6 +548,7 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) dmabuf->size = exp_info->size; dmabuf->exp_name = exp_info->exp_name; dmabuf->owner = exp_info->owner; + spin_lock_init(&dmabuf->name_lock); init_waitqueue_head(&dmabuf->poll); dmabuf->cb_excl.poll = dmabuf->cb_shared.poll = &dmabuf->poll; dmabuf->cb_excl.active = dmabuf->cb_shared.active = 0; diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index ab0c156..93108fd 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -311,6 +311,7 @@ struct dma_buf { void *vmap_ptr; const char *exp_name; const char *name; + spinlock_t name_lock; struct module *owner; struct list_head list_node; void *priv; -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] dmabuf: use spinlock to access dmabuf->name
On 6/17/2020 11:13 PM, Ruhl, Michael J wrote: >> -Original Message- >> From: charante=codeaurora@mg.codeaurora.org >> On Behalf Of Charan Teja >> Kalla >> Sent: Wednesday, June 17, 2020 2:29 AM >> To: Ruhl, Michael J ; Sumit Semwal >> ; open list:DMA BUFFER SHARING FRAMEWORK >> ; DRI mailing list > de...@lists.freedesktop.org> >> Cc: Linaro MM SIG ; >> vinme...@codeaurora.org; LKML ; >> sta...@vger.kernel.org >> Subject: Re: [PATCH] dmabuf: use spinlock to access dmabuf->name >> >> Thanks Michael for the comments.. >> >> On 6/16/2020 7:29 PM, Ruhl, Michael J wrote: >>>> -Original Message----- >>>> From: dri-devel On Behalf Of >>>> Ruhl, Michael J >>>> Sent: Tuesday, June 16, 2020 9:51 AM >>>> To: Charan Teja Kalla ; Sumit Semwal >>>> ; open list:DMA BUFFER SHARING >> FRAMEWORK >>>> ; DRI mailing list >>> de...@lists.freedesktop.org> >>>> Cc: Linaro MM SIG ; >>>> vinme...@codeaurora.org; LKML ; >>>> sta...@vger.kernel.org >>>> Subject: RE: [PATCH] dmabuf: use spinlock to access dmabuf->name >>>> >>>>> -Original Message- >>>>> From: dri-devel On Behalf Of >>>>> Charan Teja Kalla >>>>> Sent: Thursday, June 11, 2020 9:40 AM >>>>> To: Sumit Semwal ; open list:DMA BUFFER >>>>> SHARING FRAMEWORK ; DRI mailing list >> >>>> de...@lists.freedesktop.org> >>>>> Cc: Linaro MM SIG ; >>>>> vinme...@codeaurora.org; LKML ; >>>>> sta...@vger.kernel.org >>>>> Subject: [PATCH] dmabuf: use spinlock to access dmabuf->name >>>>> >>>>> There exists a sleep-while-atomic bug while accessing the dmabuf->name >>>>> under mutex in the dmabuffs_dname(). This is caused from the SELinux >>>>> permissions checks on a process where it tries to validate the inherited >>>>> files from fork() by traversing them through iterate_fd() (which >>>>> traverse files under spin_lock) and call >>>>> match_file(security/selinux/hooks.c) where the permission checks >> happen. >>>>> This audit information is logged using dump_common_audit_data() >> where it >>>>> calls d_path() to get the file path name. If the file check happen on >>>>> the dmabuf's fd, then it ends up in ->dmabuffs_dname() and use mutex >> to >>>>> access dmabuf->name. The flow will be like below: >>>>> flush_unauthorized_files() >>>>> iterate_fd() >>>>>spin_lock() --> Start of the atomic section. >>>>> match_file() >>>>>file_has_perm() >>>>> avc_has_perm() >>>>>avc_audit() >>>>> slow_avc_audit() >>>>> common_lsm_audit() >>>>> dump_common_audit_data() >>>>> audit_log_d_path() >>>>> d_path() >>>>>dmabuffs_dname() >>>>> mutex_lock()--> Sleep while atomic. >>>>> >>>>> Call trace captured (on 4.19 kernels) is below: >>>>> ___might_sleep+0x204/0x208 >>>>> __might_sleep+0x50/0x88 >>>>> __mutex_lock_common+0x5c/0x1068 >>>>> __mutex_lock_common+0x5c/0x1068 >>>>> mutex_lock_nested+0x40/0x50 >>>>> dmabuffs_dname+0xa0/0x170 >>>>> d_path+0x84/0x290 >>>>> audit_log_d_path+0x74/0x130 >>>>> common_lsm_audit+0x334/0x6e8 >>>>> slow_avc_audit+0xb8/0xf8 >>>>> avc_has_perm+0x154/0x218 >>>>> file_has_perm+0x70/0x180 >>>>> match_file+0x60/0x78 >>>>> iterate_fd+0x128/0x168 >>>>> selinux_bprm_committing_creds+0x178/0x248 >>>>> security_bprm_committing_creds+0x30/0x48 >>>>> install_exec_creds+0x1c/0x68 >>>>> load_elf_binary+0x3a4/0x14e0 >>>>> search_binary_handler+0xb0/0x1e0 >>>>> >>>>> So, use spinlock to access dmabuf->name to avoid sleep-while-atomic. >>>>> >>>>> Cc: [5.3+] >>>>> Signed-off-by: Charan Teja Reddy >>>>> --- >>>>> drivers/dma-buf/dma-buf.c | 13 +++-- >>>>> include/linux/dma-buf.h | 1 + >>>>> 2 files changed, 8 ins
Re: [PATCH] dmabuf: use spinlock to access dmabuf->name
On 6/17/2020 1:51 PM, David Laight wrote: > From: Charan Teja Kalla >> Sent: 17 June 2020 07:29 > ... >>>> If name is freed you will copy garbage, but the only way >>>> for that to happen is that _set_name or _release have to be called >>>> at just the right time. >>>> >>>> And the above would probably only be an issue if the set_name >>>> was called, so you will get NULL or a real name. >> >> And there exists a use-after-free to avoid which requires the lock. Say >> that memcpy() in dmabuffs_dname is in progress and in parallel _set_name >> will free the same buffer that memcpy is operating on. > > If the name is being looked at while the item is being freed > you almost certainly have much bigger problems that just > the name being a 'junk' pointer. True, thus needs the lock. > > David. > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 > 1PT, UK > Registration No: 1397386 (Wales) > -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] dmabuf: use spinlock to access dmabuf->name
Thanks Michael for the comments.. On 6/16/2020 7:29 PM, Ruhl, Michael J wrote: >> -Original Message- >> From: dri-devel On Behalf Of >> Ruhl, Michael J >> Sent: Tuesday, June 16, 2020 9:51 AM >> To: Charan Teja Kalla ; Sumit Semwal >> ; open list:DMA BUFFER SHARING FRAMEWORK >> ; DRI mailing list > de...@lists.freedesktop.org> >> Cc: Linaro MM SIG ; >> vinme...@codeaurora.org; LKML ; >> sta...@vger.kernel.org >> Subject: RE: [PATCH] dmabuf: use spinlock to access dmabuf->name >> >>> -Original Message- >>> From: dri-devel On Behalf Of >>> Charan Teja Kalla >>> Sent: Thursday, June 11, 2020 9:40 AM >>> To: Sumit Semwal ; open list:DMA BUFFER >>> SHARING FRAMEWORK ; DRI mailing list >> de...@lists.freedesktop.org> >>> Cc: Linaro MM SIG ; >>> vinme...@codeaurora.org; LKML ; >>> sta...@vger.kernel.org >>> Subject: [PATCH] dmabuf: use spinlock to access dmabuf->name >>> >>> There exists a sleep-while-atomic bug while accessing the dmabuf->name >>> under mutex in the dmabuffs_dname(). This is caused from the SELinux >>> permissions checks on a process where it tries to validate the inherited >>> files from fork() by traversing them through iterate_fd() (which >>> traverse files under spin_lock) and call >>> match_file(security/selinux/hooks.c) where the permission checks happen. >>> This audit information is logged using dump_common_audit_data() where it >>> calls d_path() to get the file path name. If the file check happen on >>> the dmabuf's fd, then it ends up in ->dmabuffs_dname() and use mutex to >>> access dmabuf->name. The flow will be like below: >>> flush_unauthorized_files() >>> iterate_fd() >>>spin_lock() --> Start of the atomic section. >>> match_file() >>>file_has_perm() >>> avc_has_perm() >>>avc_audit() >>> slow_avc_audit() >>> common_lsm_audit() >>> dump_common_audit_data() >>> audit_log_d_path() >>> d_path() >>>dmabuffs_dname() >>> mutex_lock()--> Sleep while atomic. >>> >>> Call trace captured (on 4.19 kernels) is below: >>> ___might_sleep+0x204/0x208 >>> __might_sleep+0x50/0x88 >>> __mutex_lock_common+0x5c/0x1068 >>> __mutex_lock_common+0x5c/0x1068 >>> mutex_lock_nested+0x40/0x50 >>> dmabuffs_dname+0xa0/0x170 >>> d_path+0x84/0x290 >>> audit_log_d_path+0x74/0x130 >>> common_lsm_audit+0x334/0x6e8 >>> slow_avc_audit+0xb8/0xf8 >>> avc_has_perm+0x154/0x218 >>> file_has_perm+0x70/0x180 >>> match_file+0x60/0x78 >>> iterate_fd+0x128/0x168 >>> selinux_bprm_committing_creds+0x178/0x248 >>> security_bprm_committing_creds+0x30/0x48 >>> install_exec_creds+0x1c/0x68 >>> load_elf_binary+0x3a4/0x14e0 >>> search_binary_handler+0xb0/0x1e0 >>> >>> So, use spinlock to access dmabuf->name to avoid sleep-while-atomic. >>> >>> Cc: [5.3+] >>> Signed-off-by: Charan Teja Reddy >>> --- >>> drivers/dma-buf/dma-buf.c | 13 +++-- >>> include/linux/dma-buf.h | 1 + >>> 2 files changed, 8 insertions(+), 6 deletions(-) >>> >>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >>> index 01ce125..2e0456c 100644 >>> --- a/drivers/dma-buf/dma-buf.c >>> +++ b/drivers/dma-buf/dma-buf.c >>> @@ -45,10 +45,10 @@ static char *dmabuffs_dname(struct dentry *dentry, >>> char *buffer, int buflen) >>> size_t ret = 0; >>> >>> dmabuf = dentry->d_fsdata; >>> - dma_resv_lock(dmabuf->resv, NULL); >>> + spin_lock(&dmabuf->name_lock); >>> if (dmabuf->name) >>> ret = strlcpy(name, dmabuf->name, DMA_BUF_NAME_LEN); >>> - dma_resv_unlock(dmabuf->resv); >>> + spin_unlock(&dmabuf->name_lock); >> >> I am not really clear on why you need this lock. >> >> If name == NULL you have no issues. >> If name is real, you have no issues. Yeah, ideal cases... >> >> If name is freed you will copy garbage, but the only way >> for that to happen is that _set_name or _release have to be called >> at just the right time. >> >> And the above would probably only be an issue if the set_name >&
Re: [PATCH v2] dma-buf: Move dma_buf_release() from fops to dentry_ops
Thanks Sumit for the fix. On 6/11/2020 5:14 PM, Sumit Semwal wrote: > Charan Teja reported a 'use-after-free' in dmabuffs_dname [1], which > happens if the dma_buf_release() is called while the userspace is > accessing the dma_buf pseudo fs's dmabuffs_dname() in another process, > and dma_buf_release() releases the dmabuf object when the last reference > to the struct file goes away. > > I discussed with Arnd Bergmann, and he suggested that rather than tying > the dma_buf_release() to the file_operations' release(), we can tie it to > the dentry_operations' d_release(), which will be called when the last ref > to the dentry is removed. > > The path exercised by __fput() calls f_op->release() first, and then calls > dput, which eventually calls d_op->d_release(). > > In the 'normal' case, when no userspace access is happening via dma_buf > pseudo fs, there should be exactly one fd, file, dentry and inode, so > closing the fd will kill of everything right away. > > In the presented case, the dentry's d_release() will be called only when > the dentry's last ref is released. > > Therefore, lets move dma_buf_release() from fops->release() to > d_ops->d_release() > > Many thanks to Arnd for his FS insights :) > > [1]: https://lore.kernel.org/patchwork/patch/1238278/ > > Fixes: bb2bb9030425 ("dma-buf: add DMA_BUF_SET_NAME ioctls") > Reported-by: syzbot+3643a18836bce555b...@syzkaller.appspotmail.com > Cc: [5.3+] > Cc: Arnd Bergmann > Reported-by: Charan Teja Reddy > Reviewed-by: Arnd Bergmann > Signed-off-by: Sumit Semwal > Tested this patch for Android running on Snapdragon hardware and see no issues. Tested-by: Charan Teja Reddy > --- > v2: per Arnd: Moved dma_buf_release() above to avoid forward declaration; > removed dentry_ops check. > --- > drivers/dma-buf/dma-buf.c | 54 ++- > 1 file changed, 25 insertions(+), 29 deletions(-) > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > index 01ce125f8e8d..412629601ad3 100644 > --- a/drivers/dma-buf/dma-buf.c > +++ b/drivers/dma-buf/dma-buf.c > @@ -54,37 +54,11 @@ static char *dmabuffs_dname(struct dentry *dentry, char > *buffer, int buflen) >dentry->d_name.name, ret > 0 ? name : ""); > } > > -static const struct dentry_operations dma_buf_dentry_ops = { > - .d_dname = dmabuffs_dname, > -}; > - > -static struct vfsmount *dma_buf_mnt; > - > -static int dma_buf_fs_init_context(struct fs_context *fc) > -{ > - struct pseudo_fs_context *ctx; > - > - ctx = init_pseudo(fc, DMA_BUF_MAGIC); > - if (!ctx) > - return -ENOMEM; > - ctx->dops = &dma_buf_dentry_ops; > - return 0; > -} > - > -static struct file_system_type dma_buf_fs_type = { > - .name = "dmabuf", > - .init_fs_context = dma_buf_fs_init_context, > - .kill_sb = kill_anon_super, > -}; > - > -static int dma_buf_release(struct inode *inode, struct file *file) > +static void dma_buf_release(struct dentry *dentry) > { > struct dma_buf *dmabuf; > > - if (!is_dma_buf_file(file)) > - return -EINVAL; > - > - dmabuf = file->private_data; > + dmabuf = dentry->d_fsdata; > > BUG_ON(dmabuf->vmapping_counter); > > @@ -110,9 +84,32 @@ static int dma_buf_release(struct inode *inode, struct > file *file) > module_put(dmabuf->owner); > kfree(dmabuf->name); > kfree(dmabuf); > +} > + > +static const struct dentry_operations dma_buf_dentry_ops = { > + .d_dname = dmabuffs_dname, > + .d_release = dma_buf_release, > +}; > + > +static struct vfsmount *dma_buf_mnt; > + > +static int dma_buf_fs_init_context(struct fs_context *fc) > +{ > + struct pseudo_fs_context *ctx; > + > + ctx = init_pseudo(fc, DMA_BUF_MAGIC); > + if (!ctx) > + return -ENOMEM; > + ctx->dops = &dma_buf_dentry_ops; > return 0; > } > > +static struct file_system_type dma_buf_fs_type = { > + .name = "dmabuf", > + .init_fs_context = dma_buf_fs_init_context, > + .kill_sb = kill_anon_super, > +}; > + > static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct > *vma) > { > struct dma_buf *dmabuf; > @@ -412,7 +409,6 @@ static void dma_buf_show_fdinfo(struct seq_file *m, > struct file *file) > } > > static const struct file_operations dma_buf_fops = { > - .release= dma_buf_release, > .mmap = dma_buf_mmap_internal, > .llseek = dma_buf_llseek, > .poll = dma_buf_poll, > -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] dmabuf: use spinlock to access dmabuf->name
There exists a sleep-while-atomic bug while accessing the dmabuf->name under mutex in the dmabuffs_dname(). This is caused from the SELinux permissions checks on a process where it tries to validate the inherited files from fork() by traversing them through iterate_fd() (which traverse files under spin_lock) and call match_file(security/selinux/hooks.c) where the permission checks happen. This audit information is logged using dump_common_audit_data() where it calls d_path() to get the file path name. If the file check happen on the dmabuf's fd, then it ends up in ->dmabuffs_dname() and use mutex to access dmabuf->name. The flow will be like below: flush_unauthorized_files() iterate_fd() spin_lock() --> Start of the atomic section. match_file() file_has_perm() avc_has_perm() avc_audit() slow_avc_audit() common_lsm_audit() dump_common_audit_data() audit_log_d_path() d_path() dmabuffs_dname() mutex_lock()--> Sleep while atomic. Call trace captured (on 4.19 kernels) is below: ___might_sleep+0x204/0x208 __might_sleep+0x50/0x88 __mutex_lock_common+0x5c/0x1068 __mutex_lock_common+0x5c/0x1068 mutex_lock_nested+0x40/0x50 dmabuffs_dname+0xa0/0x170 d_path+0x84/0x290 audit_log_d_path+0x74/0x130 common_lsm_audit+0x334/0x6e8 slow_avc_audit+0xb8/0xf8 avc_has_perm+0x154/0x218 file_has_perm+0x70/0x180 match_file+0x60/0x78 iterate_fd+0x128/0x168 selinux_bprm_committing_creds+0x178/0x248 security_bprm_committing_creds+0x30/0x48 install_exec_creds+0x1c/0x68 load_elf_binary+0x3a4/0x14e0 search_binary_handler+0xb0/0x1e0 So, use spinlock to access dmabuf->name to avoid sleep-while-atomic. Cc: [5.3+] Signed-off-by: Charan Teja Reddy --- drivers/dma-buf/dma-buf.c | 13 +++-- include/linux/dma-buf.h | 1 + 2 files changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 01ce125..2e0456c 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -45,10 +45,10 @@ static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int buflen) size_t ret = 0; dmabuf = dentry->d_fsdata; - dma_resv_lock(dmabuf->resv, NULL); + spin_lock(&dmabuf->name_lock); if (dmabuf->name) ret = strlcpy(name, dmabuf->name, DMA_BUF_NAME_LEN); - dma_resv_unlock(dmabuf->resv); + spin_unlock(&dmabuf->name_lock); return dynamic_dname(dentry, buffer, buflen, "/%s:%s", dentry->d_name.name, ret > 0 ? name : ""); @@ -335,7 +335,7 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf) if (IS_ERR(name)) return PTR_ERR(name); - dma_resv_lock(dmabuf->resv, NULL); + spin_lock(&dmabuf->name_lock); if (!list_empty(&dmabuf->attachments)) { ret = -EBUSY; kfree(name); @@ -345,7 +345,7 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf) dmabuf->name = name; out_unlock: - dma_resv_unlock(dmabuf->resv); + spin_unlock(&dmabuf->name_lock); return ret; } @@ -405,10 +405,10 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file) /* Don't count the temporary reference taken inside procfs seq_show */ seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1); seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name); - dma_resv_lock(dmabuf->resv, NULL); + spin_lock(&dmabuf->name_lock); if (dmabuf->name) seq_printf(m, "name:\t%s\n", dmabuf->name); - dma_resv_unlock(dmabuf->resv); + spin_unlock(&dmabuf->name_lock); } static const struct file_operations dma_buf_fops = { @@ -546,6 +546,7 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) dmabuf->size = exp_info->size; dmabuf->exp_name = exp_info->exp_name; dmabuf->owner = exp_info->owner; + spin_lock_init(&dmabuf->name_lock); init_waitqueue_head(&dmabuf->poll); dmabuf->cb_excl.poll = dmabuf->cb_shared.poll = &dmabuf->poll; dmabuf->cb_excl.active = dmabuf->cb_shared.active = 0; diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index ab0c156..93108fd 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -311,6 +311,7 @@ struct dma_buf { void *vmap_ptr; const char *exp_name; const char *name; + spinlock_t name_lock; struct module *owner; struct list_head list_node; void *priv; -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2] dma-buf: fix use-after-free in dmabuffs_dname
Thank you for the reply. On 5/13/2020 9:33 PM, Sumit Semwal wrote: > On Wed, 13 May 2020 at 21:16, Daniel Vetter wrote: >> >> On Wed, May 13, 2020 at 02:51:12PM +0200, Greg KH wrote: >>> On Wed, May 13, 2020 at 05:40:26PM +0530, Charan Teja Kalla wrote: >>>> >>>> Thank you Greg for the comments. >>>> On 5/12/2020 2:22 PM, Greg KH wrote: >>>>> On Fri, May 08, 2020 at 12:11:03PM +0530, Charan Teja Reddy wrote: >>>>>> The following race occurs while accessing the dmabuf object exported as >>>>>> file: >>>>>> P1 P2 >>>>>> dma_buf_release() dmabuffs_dname() >>>>>> [say lsof reading /proc//fd/] >>>>>> >>>>>> read dmabuf stored in dentry->d_fsdata >>>>>> Free the dmabuf object >>>>>> Start accessing the dmabuf structure >>>>>> >>>>>> In the above description, the dmabuf object freed in P1 is being >>>>>> accessed from P2 which is resulting into the use-after-free. Below is >>>>>> the dump stack reported. >>>>>> >>>>>> We are reading the dmabuf object stored in the dentry->d_fsdata but >>>>>> there is no binding between the dentry and the dmabuf which means that >>>>>> the dmabuf can be freed while it is being read from ->d_fsdata and >>>>>> inuse. Reviews on the patch V1 says that protecting the dmabuf inuse >>>>>> with an extra refcount is not a viable solution as the exported dmabuf >>>>>> is already under file's refcount and keeping the multiple refcounts on >>>>>> the same object coordinated is not possible. >>>>>> >>>>>> As we are reading the dmabuf in ->d_fsdata just to get the user passed >>>>>> name, we can directly store the name in d_fsdata thus can avoid the >>>>>> reading of dmabuf altogether. >>>>>> >>>>>> Call Trace: >>>>>> kasan_report+0x12/0x20 >>>>>> __asan_report_load8_noabort+0x14/0x20 >>>>>> dmabuffs_dname+0x4f4/0x560 >>>>>> tomoyo_realpath_from_path+0x165/0x660 >>>>>> tomoyo_get_realpath >>>>>> tomoyo_check_open_permission+0x2a3/0x3e0 >>>>>> tomoyo_file_open >>>>>> tomoyo_file_open+0xa9/0xd0 >>>>>> security_file_open+0x71/0x300 >>>>>> do_dentry_open+0x37a/0x1380 >>>>>> vfs_open+0xa0/0xd0 >>>>>> path_openat+0x12ee/0x3490 >>>>>> do_filp_open+0x192/0x260 >>>>>> do_sys_openat2+0x5eb/0x7e0 >>>>>> do_sys_open+0xf2/0x180 >>>>>> >>>>>> Fixes: bb2bb9030425 ("dma-buf: add DMA_BUF_SET_NAME ioctls") >>>>>> Reported-by: syzbot+3643a18836bce555b...@syzkaller.appspotmail.com >>>>>> Cc: [5.3+] >>>>>> Signed-off-by: Charan Teja Reddy >>>>>> --- >>>>>> >>>>>> Changes in v2: >>>>>> >>>>>> - Pass the user passed name in ->d_fsdata instead of dmabuf >>>>>> - Improve the commit message >>>>>> >>>>>> Changes in v1: (https://patchwork.kernel.org/patch/11514063/) >>>>>> >>>>>> drivers/dma-buf/dma-buf.c | 17 ++--- >>>>>> 1 file changed, 10 insertions(+), 7 deletions(-) >>>>>> >>>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >>>>>> index 01ce125..0071f7d 100644 >>>>>> --- a/drivers/dma-buf/dma-buf.c >>>>>> +++ b/drivers/dma-buf/dma-buf.c >>>>>> @@ -25,6 +25,7 @@ >>>>>> #include >>>>>> #include >>>>>> #include >>>>>> +#include >>>>>> >>>>>> #include >>>>>> #include >>>>>> @@ -40,15 +41,13 @@ struct dma_buf_list { >>>>>> >>>>>> static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int >>>>>> buflen) >>>>>> { >>>>>> -struct dma_buf *dmabuf; >>>>>> char name[DMA_BUF_NAME_LEN]; >>>>>> size_t ret = 0; >>>&
Re: [PATCH v2] dma-buf: fix use-after-free in dmabuffs_dname
Thank you Greg for the comments. On 5/12/2020 2:22 PM, Greg KH wrote: > On Fri, May 08, 2020 at 12:11:03PM +0530, Charan Teja Reddy wrote: >> The following race occurs while accessing the dmabuf object exported as >> file: >> P1 P2 >> dma_buf_release() dmabuffs_dname() >> [say lsof reading /proc//fd/] >> >> read dmabuf stored in dentry->d_fsdata >> Free the dmabuf object >> Start accessing the dmabuf structure >> >> In the above description, the dmabuf object freed in P1 is being >> accessed from P2 which is resulting into the use-after-free. Below is >> the dump stack reported. >> >> We are reading the dmabuf object stored in the dentry->d_fsdata but >> there is no binding between the dentry and the dmabuf which means that >> the dmabuf can be freed while it is being read from ->d_fsdata and >> inuse. Reviews on the patch V1 says that protecting the dmabuf inuse >> with an extra refcount is not a viable solution as the exported dmabuf >> is already under file's refcount and keeping the multiple refcounts on >> the same object coordinated is not possible. >> >> As we are reading the dmabuf in ->d_fsdata just to get the user passed >> name, we can directly store the name in d_fsdata thus can avoid the >> reading of dmabuf altogether. >> >> Call Trace: >> kasan_report+0x12/0x20 >> __asan_report_load8_noabort+0x14/0x20 >> dmabuffs_dname+0x4f4/0x560 >> tomoyo_realpath_from_path+0x165/0x660 >> tomoyo_get_realpath >> tomoyo_check_open_permission+0x2a3/0x3e0 >> tomoyo_file_open >> tomoyo_file_open+0xa9/0xd0 >> security_file_open+0x71/0x300 >> do_dentry_open+0x37a/0x1380 >> vfs_open+0xa0/0xd0 >> path_openat+0x12ee/0x3490 >> do_filp_open+0x192/0x260 >> do_sys_openat2+0x5eb/0x7e0 >> do_sys_open+0xf2/0x180 >> >> Fixes: bb2bb9030425 ("dma-buf: add DMA_BUF_SET_NAME ioctls") >> Reported-by: syzbot+3643a18836bce555b...@syzkaller.appspotmail.com >> Cc: [5.3+] >> Signed-off-by: Charan Teja Reddy >> --- >> >> Changes in v2: >> >> - Pass the user passed name in ->d_fsdata instead of dmabuf >> - Improve the commit message >> >> Changes in v1: (https://patchwork.kernel.org/patch/11514063/) >> >> drivers/dma-buf/dma-buf.c | 17 ++--- >> 1 file changed, 10 insertions(+), 7 deletions(-) >> >> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >> index 01ce125..0071f7d 100644 >> --- a/drivers/dma-buf/dma-buf.c >> +++ b/drivers/dma-buf/dma-buf.c >> @@ -25,6 +25,7 @@ >> #include >> #include >> #include >> +#include >> >> #include >> #include >> @@ -40,15 +41,13 @@ struct dma_buf_list { >> >> static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int buflen) >> { >> -struct dma_buf *dmabuf; >> char name[DMA_BUF_NAME_LEN]; >> size_t ret = 0; >> >> -dmabuf = dentry->d_fsdata; >> -dma_resv_lock(dmabuf->resv, NULL); >> -if (dmabuf->name) >> -ret = strlcpy(name, dmabuf->name, DMA_BUF_NAME_LEN); >> -dma_resv_unlock(dmabuf->resv); >> +spin_lock(&dentry->d_lock); > > Are you sure this lock always protects d_fsdata? I think yes. In the dma-buf.c, I have to make sure that d_fsdata should always be under d_lock thus it will be protected. (In this posted patch there is one place(in dma_buf_set_name) that is missed, will update this in V3). > >> +if (dentry->d_fsdata) >> +ret = strlcpy(name, dentry->d_fsdata, DMA_BUF_NAME_LEN); >> +spin_unlock(&dentry->d_lock); >> >> return dynamic_dname(dentry, buffer, buflen, "/%s:%s", >> dentry->d_name.name, ret > 0 ? name : ""); > > If the above check fails the name will be what? How could d_name.name > be valid but d_fsdata not be valid? In case of check fails, empty string "" is appended to the name by the code, ret > 0 ? name : "", ret is initialized to zero. Thus the name string will be like "/dmabuf:". Regarding the validity of d_fsdata, we are setting the dmabuf's dentry->d_fsdata to NULL in the dma_buf_release() thus can go invalid if that dmabuf is in the free path. > > >> @@ -80,12 +79,16 @@ static int dma_buf_fs_init_context(struct fs_context *fc) >> static int dma_buf_release(struct inode *inode, struct file *file) >> { >> struct dma_buf *dmabuf; >> +struct dentry *dentry = file->f_path.dentry; >> >> if (!is_dma_buf_file(file)) >> return -EINVAL; >> >> dmabuf = file->private_data; >> >> +spin_lock(&dentry->d_lock); >> +dentry->d_fsdata = NULL; >> +spin_unlock(&dentry->d_lock); >> BUG_ON(dmabuf->vmapping_counter); >> >> /* >> @@ -343,6 +346,7 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, >> const char __user *buf) >> } >> kfree(dmabuf->name); >> dmabuf->name = name; >> +dmabuf->file->f_path.dentry->d_fsdata = name; > > You are just changing the use of d_fsdata from being a pointer to the > dmabuf to being a pointer
Re: [PATCH] dma-buf: fix use-after-free in dmabuffs_dname
Thank you Greg for the comments. On 5/6/2020 2:30 PM, Greg KH wrote: On Wed, May 06, 2020 at 02:00:10PM +0530, Charan Teja Kalla wrote: Thank you Greg for the reply. On 5/5/2020 3:38 PM, Greg KH wrote: On Tue, Apr 28, 2020 at 01:24:02PM +0530, Charan Teja Reddy wrote: The following race occurs while accessing the dmabuf object exported as file: P1 P2 dma_buf_release() dmabuffs_dname() [say lsof reading /proc//fd/] read dmabuf stored in dentry->fsdata Free the dmabuf object Start accessing the dmabuf structure In the above description, the dmabuf object freed in P1 is being accessed from P2 which is resulting into the use-after-free. Below is the dump stack reported. Call Trace: kasan_report+0x12/0x20 __asan_report_load8_noabort+0x14/0x20 dmabuffs_dname+0x4f4/0x560 tomoyo_realpath_from_path+0x165/0x660 tomoyo_get_realpath tomoyo_check_open_permission+0x2a3/0x3e0 tomoyo_file_open tomoyo_file_open+0xa9/0xd0 security_file_open+0x71/0x300 do_dentry_open+0x37a/0x1380 vfs_open+0xa0/0xd0 path_openat+0x12ee/0x3490 do_filp_open+0x192/0x260 do_sys_openat2+0x5eb/0x7e0 do_sys_open+0xf2/0x180 Fixes: bb2bb90 ("dma-buf: add DMA_BUF_SET_NAME ioctls") Nit, please read the documentation for how to do a Fixes: line properly, you need more digits: Fixes: bb2bb9030425 ("dma-buf: add DMA_BUF_SET_NAME ioctls") Will update the patch Reported-by:syzbot+3643a18836bce555b...@syzkaller.appspotmail.com Signed-off-by: Charan Teja Reddy Also, any reason you didn't include the other mailing lists that get_maintainer.pl said to? Really sorry for not sending to complete list. Added now. And finally, no cc: stable in the s-o-b area for an issue that needs to be backported to older kernels? Will update the patch. --- drivers/dma-buf/dma-buf.c | 25 +++-- include/linux/dma-buf.h | 1 + 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 570c923..069d8f78 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include @@ -38,18 +39,34 @@ struct dma_buf_list { static struct dma_buf_list db_list; +static void dmabuf_dent_put(struct dma_buf *dmabuf) +{ + if (atomic_dec_and_test(&dmabuf->dent_count)) { + kfree(dmabuf->name); + kfree(dmabuf); + } Why not just use a kref instead of an open-coded atomic value? Kref approach looks cleaner. will update the patch accordingly. +} + static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int buflen) { struct dma_buf *dmabuf; char name[DMA_BUF_NAME_LEN]; size_t ret = 0; + spin_lock(&dentry->d_lock); dmabuf = dentry->d_fsdata; + if (!dmabuf || !atomic_add_unless(&dmabuf->dent_count, 1, 0)) { + spin_unlock(&dentry->d_lock); + goto out; How can dmabuf not be valid here? And isn't there already a usecount for the buffer? dmabuf exported as file simply relies on that file's refcount, thus fput() releases the dmabuf. We are storing the dmabuf in the dentry->d_fsdata but there is no binding between the dentry and the dmabuf. So, flow will be like 1) P1 calls fput(dmabuf_fd) 2) P2 trying to access the file information of P1. Eg: lsof command trying to list out the dmabuf_fd information using /proc//fd/dmabuf_fd 3) P1 calls the file->f_op->release(dmabuf_fd_file)(ends up in calling dma_buf_release()), thus frees up the dmabuf buffer. 4) P2 access the dmabuf stored in the dentry->d_fsdata which was freed in step 3. So we need to have some refcount mechanism to avoid the use-after-free in step 4. Ok, but watch out, now you have 2 different reference counts for the same structure. Keeping them coordinated is almost always an impossible task so you need to only rely on one. If you can't use the file api, just drop all of the reference counting logic in there and only use the kref one. I feel that changing the refcount logic now to dma-buf objects involve changes in the core dma-buf framework. NO? Instead, how about passing the user passed name directly in the ->d_fsdata inplace of dmabuf object? Because we just need user passed name in the dmabuffs_dname(). With this we can avoid the need for extra refcount on dmabuf. Posted patch-V2: https://lkml.org/lkml/2020/5/8/158 good luck! greg k-h -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] dma-buf: fix use-after-free in dmabuffs_dname
Thank you Greg for the reply. On 5/5/2020 3:38 PM, Greg KH wrote: On Tue, Apr 28, 2020 at 01:24:02PM +0530, Charan Teja Reddy wrote: The following race occurs while accessing the dmabuf object exported as file: P1 P2 dma_buf_release() dmabuffs_dname() [say lsof reading /proc//fd/] read dmabuf stored in dentry->fsdata Free the dmabuf object Start accessing the dmabuf structure In the above description, the dmabuf object freed in P1 is being accessed from P2 which is resulting into the use-after-free. Below is the dump stack reported. Call Trace: kasan_report+0x12/0x20 __asan_report_load8_noabort+0x14/0x20 dmabuffs_dname+0x4f4/0x560 tomoyo_realpath_from_path+0x165/0x660 tomoyo_get_realpath tomoyo_check_open_permission+0x2a3/0x3e0 tomoyo_file_open tomoyo_file_open+0xa9/0xd0 security_file_open+0x71/0x300 do_dentry_open+0x37a/0x1380 vfs_open+0xa0/0xd0 path_openat+0x12ee/0x3490 do_filp_open+0x192/0x260 do_sys_openat2+0x5eb/0x7e0 do_sys_open+0xf2/0x180 Fixes: bb2bb90 ("dma-buf: add DMA_BUF_SET_NAME ioctls") Nit, please read the documentation for how to do a Fixes: line properly, you need more digits: Fixes: bb2bb9030425 ("dma-buf: add DMA_BUF_SET_NAME ioctls") Will update the patch Reported-by:syzbot+3643a18836bce555b...@syzkaller.appspotmail.com Signed-off-by: Charan Teja Reddy Also, any reason you didn't include the other mailing lists that get_maintainer.pl said to? Really sorry for not sending to complete list. Added now. And finally, no cc: stable in the s-o-b area for an issue that needs to be backported to older kernels? Will update the patch. --- drivers/dma-buf/dma-buf.c | 25 +++-- include/linux/dma-buf.h | 1 + 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 570c923..069d8f78 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include @@ -38,18 +39,34 @@ struct dma_buf_list { static struct dma_buf_list db_list; +static void dmabuf_dent_put(struct dma_buf *dmabuf) +{ + if (atomic_dec_and_test(&dmabuf->dent_count)) { + kfree(dmabuf->name); + kfree(dmabuf); + } Why not just use a kref instead of an open-coded atomic value? Kref approach looks cleaner. will update the patch accordingly. +} + static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int buflen) { struct dma_buf *dmabuf; char name[DMA_BUF_NAME_LEN]; size_t ret = 0; + spin_lock(&dentry->d_lock); dmabuf = dentry->d_fsdata; + if (!dmabuf || !atomic_add_unless(&dmabuf->dent_count, 1, 0)) { + spin_unlock(&dentry->d_lock); + goto out; How can dmabuf not be valid here? And isn't there already a usecount for the buffer? dmabuf exported as file simply relies on that file's refcount, thus fput() releases the dmabuf. We are storing the dmabuf in the dentry->d_fsdata but there is no binding between the dentry and the dmabuf. So, flow will be like 1) P1 calls fput(dmabuf_fd) 2) P2 trying to access the file information of P1. Eg: lsof command trying to list out the dmabuf_fd information using /proc//fd/dmabuf_fd 3) P1 calls the file->f_op->release(dmabuf_fd_file)(ends up in calling dma_buf_release()), thus frees up the dmabuf buffer. 4) P2 access the dmabuf stored in the dentry->d_fsdata which was freed in step 3. So we need to have some refcount mechanism to avoid the use-after-free in step 4. + } + spin_unlock(&dentry->d_lock); dma_resv_lock(dmabuf->resv, NULL); if (dmabuf->name) ret = strlcpy(name, dmabuf->name, DMA_BUF_NAME_LEN); dma_resv_unlock(dmabuf->resv); + dmabuf_dent_put(dmabuf); +out: return dynamic_dname(dentry, buffer, buflen, "/%s:%s", dentry->d_name.name, ret > 0 ? name : ""); } @@ -80,12 +97,16 @@ static int dma_buf_fs_init_context(struct fs_context *fc) static int dma_buf_release(struct inode *inode, struct file *file) { struct dma_buf *dmabuf; + struct dentry *dentry = file->f_path.dentry; if (!is_dma_buf_file(file)) return -EINVAL; dmabuf = file->private_data; + spin_lock(&dentry->d_lock); + dentry->d_fsdata = NULL; + spin_unlock(&dentry->d_lock); BUG_ON(dmabuf->vmapping_counter); /* @@ -108,8 +129,7 @@ static int dma_buf_release(struct inode *inode, struct file *file) dma_resv_fini(dmabuf->resv); module_put(dmabuf->owner); - kfree(dmabuf->name); - kfree(dmabuf); + dmabuf_dent_put(dmabuf); return 0; } @@ -548,6 +568,7 @@ struct dma_buf *dma_buf_export(con