On Mon, 23 Sep 2019, Peter Zijlstra wrote: > On Mon, Sep 23, 2019 at 11:18:20AM +0800, Yunfeng Cui wrote: > > I use model checker find a issue of robust and pi futex. On below > > situation, the owner can't find something in pi_state_list, while > > the requester will be blocked, never be awaked. > > > > CPU0 CPU1 > > futex_lock_pi > > /*some cs code*/ > > futex_lock_pi > > futex_lock_pi_atomic > > ... > > newval = uval | FUTEX_WAITERS; > > ret = lock_pi_update_atomic(uaddr, uval, newval); > > ... > > attach_to_pi_owner > > .... > > p = find_get_task_by_vpid(pid); > > if (!p) > > return handle_exit_race(uaddr, uval, NULL); > > .... > > raw_spin_lock_irq(&p->pi_lock); > > .... > > pi_state = alloc_pi_state(); > > .... > > do_exit->mm_release > > if (unlikely(tsk->robust_list)) { > > exit_robust_list(tsk); > > tsk->robust_list = NULL; > > } > > if (unlikely(!list_empty(&tsk->pi_state_list))) > > exit_pi_state_list(tsk); /*WILL MISS*/ > > list_add(&pi_state->list, &p->pi_state_list); > > WILL BLOCKED, NEVER WAKEUP! > > Did you forget/overlook the pi_lock fiddling in do_exit() ? I'm thinking > that would make the above impossible.
Right. I was trying to construct a case which allows the above, but failed to do so. Let's look at the exiting task: exit() exit_signals() tsk->flags |= PF_EXITING; smp_mb(); raw_spin_lock_irq(&tsk->pi_lock); (1) raw_spin_unlock_irq(&tsk->pi_lock); exit_mm() mm_release() exit_robust_list() if (!list_empty(&tsk->pi_state_list))) exit_pi_state_list(tsk); And now at the attaching task: attach_to_pi_owner() raw_spin_lock_irq(tsk->pi_lock); if (tsk->flags & PF_EXITING) return; pi_state = alloc_pi_state() list_add(pi_state, tsk->pi_state_list); See (1) above. That's the crucial point. Once the exiting task has set PF_EXITING and acquired tsk->pi_lock, it is impossible for the attaching task to queue itself as it _must_ observe PF_EXITING after it acquired tsk->pi_lock. If it manages to acquire tsk->pi_lock _before_ the existing task does that, then it either observes PF_EXITING or not. If it does, it goes out. If it does not, it queues itself on tsk->pi_state_list and will be cleaned up by the exiting task. Simplified concurrency picture: Case 1: Attacher does not see PF_EXITING CPU 0 CPU 1 lock(&tsk->pi_lock); tsk->flags |= PF_EXITING; if (!(tsk->flags & PF_EXITING)) queue(pi_state, tsk); smp_mb(); unlock(&tsk->pi_lock); lock(&tsk->pi_lock); (1) unlock(&tsk->pi_lock); if (!list_empty(&tsk->pi_state_list))) exit_pi_state_list(tsk); Case 2: Attacher does see PF_EXITING before (1) CPU 0 CPU 1 lock(&tsk->pi_lock); tsk->flags |= PF_EXITING; if (tsk->flags & PF_EXITING) { unlock(&tsk->pi_lock); smp_mb(); return; lock(&tsk->pi_lock); } (1) unlock(&tsk->pi_lock); The attacher CANNOT be queued in tsk->pi_state_list Case 2: Attacher does see PF_EXITING after (1) CPU 0 CPU 1 tsk->flags |= PF_EXITING; smp_mb(); lock(&tsk->pi_lock); lock(&tsk->pi_lock); (1) unlock(&tsk->pi_lock); if (tsk->flags & PF_EXITING) { unlock(&tsk->pi_lock); return; } There are no other cases possible. If the attacher can observe !(tsk->flags & PF_EXITING) _after_ (1) then there is something seriously wrong, but not in the futex code. That would be a massive memory ordering issue. Thanks, tglx