Hi Herton,

On 08/07/2015 07:09 PM, Herton R. Krzesinski wrote:
The current semaphore code allows a potential use after free: in exit_sem we may
free the task's sem_undo_list while there is still another task looping through
the same semaphore set and cleaning the sem_undo list at freeary function (the
task called IPC_RMID for the same semaphore set).
Correct, good catch!

semid == -1 can happen for two reasons:
a) end of sem_undo_list (i.e. last undo structure in the CLONE_SYSVSEM group)
b) parallel IPC_RMID

If semid == -1 is caused by a parallel IPC_RMID, then exit_sem() stops early and
does not free all sem_undo structures that belong to the current CLONE_SYSVSEM
group. But it does free the sem_undo_list structure.

Since:
- struct sem_undo contains a link to struct sem_undo_list.
- struct sem_undo_list is kfreed immediately at the end of exit_sem()
- the parallel IPC_RMID will find the sem_undo structure, then follow
  the link to sem_undo_list to unlink it
  -> use after free, spinlock debug errors because the spinlock
     was already overwritten by slab debug.

What makes it worse: un->semid is read twice, without synchronization.
It should be read once, with synchronization.
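
To make the chain above concrete, here is a minimal user-space sketch that replays the problematic ordering in a single thread. It is only an illustration: the struct layouts and all model_* names are hypothetical, "freeing" is modeled by a flag so that nothing is actually accessed after free, and the real code paths in ipc/sem.c do considerably more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real layouts. */
struct ulist_model {
	bool freed;			/* set where the kernel would kfree() the list */
};

struct undo_model {
	int semid;			/* -1 once IPC_RMID has claimed the entry */
	struct ulist_model *ulp;	/* back link, like sem_undo -> sem_undo_list */
};

/* exit_sem() before the fix: semid == -1 ends the loop, then the list is freed. */
static void model_exit_sem_old(struct undo_model *un)
{
	if (un->semid == -1) {
		/* End of list and parallel IPC_RMID look the same here. */
		un->ulp->freed = true;	/* models kfree(ulp) */
	}
}

/*
 * freeary() finishing its work: it follows un->ulp to unlink the entry.
 * Returns true if the list it reaches was already freed.
 */
static bool model_freeary_unlink(struct undo_model *un)
{
	assert(un->ulp != NULL);
	return un->ulp->freed;	/* kernel: locking and unlinking on freed memory */
}

/* Replays the interleaving in a fixed order. */
static bool model_race_hits_freed_list(void)
{
	struct ulist_model ulp = { .freed = false };
	struct undo_model un = { .semid = 42, .ulp = &ulp };

	un.semid = -1;				/* IPC_RMID marks the entry */
	model_exit_sem_old(&un);		/* exit_sem() sees -1, frees ulp */
	return model_freeary_unlink(&un);	/* IPC_RMID follows the stale link */
}
```

Calling model_race_hits_freed_list() reports that the IPC_RMID side reaches a list that the exit side has already released, which is the situation described above.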

Signed-off-by: Herton R. Krzesinski<her...@redhat.com>
I would add: Cc: <sta...@vger.kernel.org>
---
  ipc/sem.c | 24 ++++++++++++++++--------
  1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index bc3d530..35ccddd 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2074,17 +2074,24 @@ void exit_sem(struct task_struct *tsk)
                rcu_read_lock();
                un = list_entry_rcu(ulp->list_proc.next,
                                    struct sem_undo, list_proc);
-               if (&un->list_proc == &ulp->list_proc)
-                       semid = -1;
-                else
-                       semid = un->semid;
+               if (&un->list_proc == &ulp->list_proc) {
+                       rcu_read_unlock();
+                       /* Make sure we wait for any place still referencing
+                        * the current ulp to finish */
+                       synchronize_rcu();
Sorry, no. synchronize_rcu() is a high-latency operation.
We can't call it within exit_sem(). We could use kfree_rcu(), but I don't see that we need it:

Which race do you imagine?
ulp is accessed by:
- freeary(). Race impossible due to explicit locking.
- exit_sem(). Race impossible due to ulp->refcount
- find_alloc_undo(). Race impossible, because it operates on current->sysvsem.undo_list.
"current" is in do_exit, thus can't be inside semtimedop().
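
As an illustration of the refcount argument for exit_sem(), here is a small user-space sketch using C11 atomics. The struct and the model_* helpers are hypothetical; the only assumption about the kernel is that the list is freed by whichever task drops the last reference, in the style of atomic_dec_and_test().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for sem_undo_list; only the counter is modeled. */
struct refcounted_list {
	atomic_int refcnt;
};

/*
 * Returns true only for the caller that drops the last reference,
 * analogous to the kernel's atomic_dec_and_test().
 */
static bool model_put_undo_list(struct refcounted_list *ulp)
{
	int old = atomic_fetch_sub(&ulp->refcnt, 1);

	assert(old > 0);	/* dropping a reference nobody holds is a bug */
	return old == 1;
}

/* Two tasks share one list; returns how many of them would free it. */
static int model_count_frees(void)
{
	struct refcounted_list ulp;
	int frees = 0;

	atomic_init(&ulp.refcnt, 2);
	if (model_put_undo_list(&ulp))
		frees++;
	if (model_put_undo_list(&ulp))
		frees++;
	return frees;
}
```

In this model exactly one of the two exiting tasks gets to free the list, so two exit_sem() callers cannot race on the free itself.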

+                       break;
+               }
+               spin_lock(&ulp->lock);
+               semid = un->semid;
+               spin_unlock(&ulp->lock);
Ok/good.
Note (I've tried it first):
Just "READ_ONCE(un->semid)" would be insufficient, because this can happen:
 A: thread 1, within freeary:
 A: spin_lock(&ulp->lock);
 A: un->semid = -1;
 B:    thread 2, within exit_sem():
 B:    if (un->semid == -1) exit;
 B:    kfree(ulp);
 A: spin_unlock(&ulp->lock); <<<< use-after-free, bug
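
The same interleaving can be replayed in a single-threaded user-space model, sketched below. All names are hypothetical, the lock is just a flag, and the free is just a flag as well, so this only shows the ordering argument and is not how the kernel code looks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model; the lock is just a flag. */
struct ulp_model {
	bool lock_held;		/* models ulp->lock */
	bool freed;		/* set where the kernel would kfree(ulp) */
};

enum read_mode { READ_UNLOCKED, READ_LOCKED };

/*
 * Replays the A/B interleaving in a fixed order.
 * Returns true if B frees ulp while A still holds ulp->lock.
 */
static bool model_free_while_locked(enum read_mode mode)
{
	struct ulp_model ulp = { .lock_held = false, .freed = false };
	int un_semid = 7;	/* models un->semid */
	bool bug = false;

	/* A: freeary() takes the lock and marks the entry. */
	ulp.lock_held = true;
	un_semid = -1;

	/*
	 * B: exit_sem() runs before A unlocks. A bare read sees -1 at once
	 * and frees; with spin_lock() B would spin here instead.
	 */
	if (mode == READ_UNLOCKED && un_semid == -1) {
		ulp.freed = true;
		bug = ulp.lock_held;
	}

	/* A: drops the lock; in the buggy case this touches freed memory. */
	ulp.lock_held = false;

	/* B: the locked read can only complete now. */
	if (mode == READ_LOCKED) {
		int semid;

		assert(!ulp.lock_held);	/* the lock was obtained, so A is done */
		ulp.lock_held = true;
		semid = un_semid;
		ulp.lock_held = false;
		if (semid == -1) {
			ulp.freed = true;
			bug = ulp.lock_held;
		}
	}
	return bug;
}
```

In the model, READ_UNLOCKED frees while A is still inside its critical section, and READ_LOCKED cannot, because B only learns about semid == -1 after A has released the lock.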

+               /* exit_sem raced with IPC_RMID, nothing to do */
                if (semid == -1) {
                        rcu_read_unlock();
-                       break;
+                       continue;
                }
-               sma = sem_obtain_object_check(tsk->nsproxy->ipc_ns, un->semid);
+               sma = sem_obtain_object_check(tsk->nsproxy->ipc_ns, semid);
                /* exit_sem raced with IPC_RMID, nothing to do */
                if (IS_ERR(sma)) {
                        rcu_read_unlock();
Ok.
@@ -2112,9 +2119,10 @@ void exit_sem(struct task_struct *tsk)
                ipc_assert_locked_object(&sma->sem_perm);
                list_del(&un->list_id);
-               spin_lock(&ulp->lock);
+               /* we should be the last process using this ulp, so no need
+                * to acquire ulp->lock here; we are also protected against
+                * IPC_RMID as we hold sma->sem_perm.lock */
                list_del_rcu(&un->list_proc);
-               spin_unlock(&ulp->lock);
                /* perform adjustments registered in un */
                for (i = 0; i < sma->sem_nsems; i++) {
a) "we should be the last" or "we are the last"?
b) The bug that you have found is probably old, thus it must go into the stable kernels as well.
I would not make this change in the same patch as the bugfix.

Perhaps make two patches? One cc stable, the other one without cc stable.
--
    Manfred