Re: vfs: lockdep splat with prepare_bprm_creds
On 03/16, Al Viro wrote:
>
> On Fri, Mar 15, 2013 at 07:19:56PM +0100, Oleg Nesterov wrote:
> > Cough... I am shy to disclose my ignorance, but could you explain how
> > open_exec()->do_filp_open(MAY_EXEC) can succeed in this case? At least
> > acl_permission_check() looks as if open_exec() should fail...
>
> Umm... point. It might be a false positive, actually - some other
> seq_file-based sucker (while chmod +x /proc/self/stack will fail,
> chmod +x /proc/vmstat won't) that could be fed to execve(), leading to
>
> 1) kernel_read() from execve() can grab m.lock for *some* seq_file m,
>    while holding ->cred_guard_mutex

Yes, thanks. I am wondering if lock_trace() is really useful...

Let's ignore proc_pid_syscall() and proc_pid_personality(). Suppose we
change proc_pid_stack():

	int proc_pid_stack(...)
	{
		...
		save_stack_trace_tsk(task, &trace);

		if (!ptrace_may_access(task, PTRACE_MODE_ATTACH))
			return -EPERM;

		for (i = 0; i < trace.nr_entries; i++)
			seq_printf(...);

		return 0;
	}

Sure, without cred_guard_mutex we can race with install_exec_creds().
But is it a problem in practice? In any case lock_trace() can't protect
against commit_creds()...

We can even do

	task_lock(task);
	err = __ptrace_may_access(task, mode);
	if (!err)
		save_stack_trace_tsk(...);
	task_unlock(task);

This way task_lock() protects us against exec_mmap(). And even if
exec_mmap() was already called and the task is going to do
install_exec_creds(), we can't show the stack of this process "after"
exec.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: vfs: lockdep splat with prepare_bprm_creds
On Fri, Mar 15, 2013 at 07:19:56PM +0100, Oleg Nesterov wrote:
> On 03/15, Al Viro wrote:
> >
> > On Fri, Mar 15, 2013 at 12:07:14AM -0400, Sasha Levin wrote:
> > > Hi all,
> > >
> > > While fuzzing with trinity inside a KVM tools guest running latest
> > > -next kernel I've stumbled on the following.
> > >
> > > Dave Jones reported something similar, but that seemed to involve
> > > cgroup's mutex and didn't seem like it was the same issue as this one.
> >
> > Lovely... It's an execve() attempt on a "binary" that is, in fact, a
> > procfs file (/proc//stack),
>
> Cough... I am shy to disclose my ignorance, but could you explain how
> open_exec()->do_filp_open(MAY_EXEC) can succeed in this case? At least
> acl_permission_check() looks as if open_exec() should fail...

Umm... point. It might be a false positive, actually - some other
seq_file-based sucker (while chmod +x /proc/self/stack will fail,
chmod +x /proc/vmstat won't) that could be fed to execve(), leading to

1) kernel_read() from execve() can grab m.lock for *some* seq_file m,
   while holding ->cred_guard_mutex

2) read() on /proc/self/stack tries to grab ->cred_guard_mutex, while
   holding m.lock for a different seq_file m

... with lockdep having no idea that there's a reason why (1) and (2)
can't have the same seq_file involved, said reason being that all files
with ->read() trying to grab ->cred_guard_mutex don't have the exec bit
set *and* are impossible to chmod.
Re: vfs: lockdep splat with prepare_bprm_creds
On 03/15, Al Viro wrote:
>
> On Fri, Mar 15, 2013 at 12:07:14AM -0400, Sasha Levin wrote:
> > Hi all,
> >
> > While fuzzing with trinity inside a KVM tools guest running latest
> > -next kernel I've stumbled on the following.
> >
> > Dave Jones reported something similar, but that seemed to involve
> > cgroup's mutex and didn't seem like it was the same issue as this one.
>
> Lovely... It's an execve() attempt on a "binary" that is, in fact, a
> procfs file (/proc//stack),

probably... other lock_trace() callers can't generate this lockdep
output afaics.

> with its ->read() trying to grab ->cred_guard_mutex.
> The fact that it's seq_file-based is irrelevant here - all that matters
> is that we have ->read() for some file trying to grab ->cred_guard_mutex.

Yes, perhaps the patch below makes sense anyway as a cleanup, but
obviously it can't help.

Cough... I am shy to disclose my ignorance, but could you explain how
open_exec()->do_filp_open(MAY_EXEC) can succeed in this case? At least
acl_permission_check() looks as if open_exec() should fail...

Just curious, thanks in advance.

Oleg.

--- x/fs/proc/base.c
+++ x/fs/proc/base.c
@@ -317,12 +317,12 @@ static int proc_pid_stack(struct seq_fil
 	err = lock_trace(task);
 	if (!err) {
 		save_stack_trace_tsk(task, &trace);
+		unlock_trace(task);
 		for (i = 0; i < trace.nr_entries; i++) {
 			seq_printf(m, "[<%pK>] %pS\n",
 				   (void *)entries[i], (void *)entries[i]);
 		}
-		unlock_trace(task);
 	}
 	kfree(entries);
Re: vfs: lockdep splat with prepare_bprm_creds
On Fri, Mar 15, 2013 at 12:07:14AM -0400, Sasha Levin wrote:
> Hi all,
>
> While fuzzing with trinity inside a KVM tools guest running latest
> -next kernel I've stumbled on the following.
>
> Dave Jones reported something similar, but that seemed to involve
> cgroup's mutex and didn't seem like it was the same issue as this one.

Lovely... It's an execve() attempt on a "binary" that is, in fact, a
procfs file (/proc//stack), with its ->read() trying to grab
->cred_guard_mutex. The fact that it's seq_file-based is irrelevant
here - all that matters is that we have ->read() for some file trying
to grab ->cred_guard_mutex. It's not *quite* a deadlock, though - all
these guys are using mutex_lock_killable()...
vfs: lockdep splat with prepare_bprm_creds
Hi all,

While fuzzing with trinity inside a KVM tools guest running latest -next
kernel I've stumbled on the following.

Dave Jones reported something similar, but that seemed to involve
cgroup's mutex and didn't seem like it was the same issue as this one.

[ 549.400084] ==
[ 549.400084] [ INFO: possible circular locking dependency detected ]
[ 549.400084] 3.9.0-rc2-next-20130314-sasha-00046-g3897511 #295 Tainted: G W
[ 549.400084] ---
[ 549.420729] can: request_module (can-proto-0) failed.
[ 549.400084] trinity-child20/12048 is trying to acquire lock:
[ 549.400084]  (&p->lock){+.+.+.}, at: [] seq_read+0x3a/0x3d0
[ 549.400084]
[ 549.400084] but task is already holding lock:
[ 549.400084]  (&sig->cred_guard_mutex){+.+.+.}, at: [] prepare_bprm_creds+0x31/0x80
[ 549.400084]
[ 549.400084] which lock already depends on the new lock.
[ 549.400084]
[ 549.400084] the existing dependency chain (in reverse order) is:
[ 549.400084]
[ 549.400084] -> #1 (&sig->cred_guard_mutex){+.+.+.}:
[ 549.400084]        [] check_prevs_add+0xba/0x1a0
[ 549.400084]        [] validate_chain.isra.21+0x6d0/0x800
[ 549.400084]        [] __lock_acquire+0xa23/0xb10
[ 549.400084]        [] lock_acquire+0x1ca/0x270
[ 549.400084]        [] __mutex_lock_common+0x5a/0x5a0
[ 549.400084]        [] mutex_lock_killable_nested+0x3f/0x50
[ 549.400084]        [] lock_trace+0x28/0x70
[ 549.400084]        [] proc_pid_stack+0x65/0xf0
[ 549.400084]        [] proc_single_show+0x5a/0xa0
[ 549.400084]        [] seq_read+0x1af/0x3d0
[ 549.400084]        [] do_loop_readv_writev+0x4b/0x90
[ 549.400084]        [] do_readv_writev+0xf6/0x1d0
[ 549.400084]        [] vfs_readv+0x3e/0x60
[ 549.400084]        [] SyS_readv+0x50/0xd0
[ 549.400084]        [] tracesys+0xe1/0xe6
[ 549.400084]
[ 549.400084] -> #0 (&p->lock){+.+.+.}:
[ 549.400084]        [] check_prev_add+0x145/0x710
[ 549.400084]        [] check_prevs_add+0xba/0x1a0
[ 549.400084]        [] validate_chain.isra.21+0x6d0/0x800
[ 549.400084]        [] __lock_acquire+0xa23/0xb10
[ 549.400084]        [] lock_acquire+0x1ca/0x270
[ 549.400084]        [] __mutex_lock_common+0x5a/0x5a0
[ 549.400084]        [] mutex_lock_nested+0x3f/0x50
[ 549.400084]        [] seq_read+0x3a/0x3d0
[ 549.400084]        [] proc_reg_read+0x201/0x230
[ 549.400084]        [] vfs_read+0xb5/0x180
[ 549.400084]        [] kernel_read+0x41/0x60
[ 549.400084]        [] prepare_binprm+0x18d/0x1b0
[ 549.400084]        [] do_execve_common.isra.21+0x1b6/0x380
[ 549.400084]        [] do_execve+0x13/0x20
[ 549.400084]        [] SyS_execve+0x3e/0x60
[ 549.400084]        [] stub_execve+0x69/0xa0
[ 549.400084]
[ 549.400084] other info that might help us debug this:
[ 549.400084]
[ 549.400084]  Possible unsafe locking scenario:
[ 549.400084]
[ 549.400084]        CPU0                    CPU1
[ 549.400084]   lock(&sig->cred_guard_mutex);
[ 549.400084]                                lock(&p->lock);
[ 549.400084]                                lock(&sig->cred_guard_mutex);
[ 549.400084]   lock(&p->lock);
[ 549.400084]
[ 549.400084]  *** DEADLOCK ***
[ 549.400084]
[ 549.400084] 1 lock held by trinity-child20/12048:
[ 549.400084]  #0:  (&sig->cred_guard_mutex){+.+.+.}, at: [] prepare_bprm_creds+0x31/0x80
[ 549.400084]
[ 549.400084] stack backtrace:
[ 549.400084] Pid: 12048, comm: trinity-child20 Tainted: G W 3.9.0-rc2-next-20130314-sasha-00046-g3897511 #295
[ 549.400084] Call Trace:
[ 549.400084]  [] print_circular_bug+0xd3/0xe4
[ 549.400084]  [] check_prev_add+0x145/0x710
[ 549.400084]  [] check_prevs_add+0xba/0x1a0
[ 549.400084]  [] ? sched_clock+0x15/0x20
[ 549.400084]  [] validate_chain.isra.21+0x6d0/0x800
[ 549.400084]  [] __lock_acquire+0xa23/0xb10
[ 549.400084]  [] ? kvm_clock_read+0x38/0x70
[ 549.400084]  [] ? lock_release_holdtime+0x12e/0x140
[ 549.400084]  [] ? sched_clock+0x15/0x20
[ 549.400084]  [] ? sched_clock_local+0x25/0x90
[ 549.400084]  [] ? deactivate_slab+0x7d6/0x820
[ 549.400084]  [] lock_acquire+0x1ca/0x270
[ 549.400084]  [] ? seq_read+0x3a/0x3d0
[ 549.400084]  [] ? sched_clock_local+0x25/0x90
[ 549.400084]  [] ? seq_lseek+0x110/0x110
[ 549.400084]  [] __mutex_lock_common+0x5a/0x5a0
[ 549.400084]  [] ? seq_read+0x3a/0x3d0
[ 549.400084]  [] ? __lock_is_held+0x52/0x80
[ 549.400084]  [] ? seq_read+0x3a/0x3d0
[ 549.400084]  [] ? seq_lseek+0x110/0x110
[ 549.400084]  [] mutex_lock_nested+0x3f/0x50
[ 549.400084]  [] seq_read+0x3a/0x3d0
[ 549.400084]  [] ? delay_tsc+0xdd/0x110
[ 549.400084]  [] ? seq_lseek+0x110/0x110
[ 549.400084]  [] ? seq_lseek+0x110/0x110
[ 549.400084]  [] proc_reg_read+0x201/0x230
[ 549.400084]  [] ? proc_reg_write+0x230/0x230
[ 549.400084]  [] vfs_read+0xb5/0x180
[ 549.400084]  [] kernel_read+0x41/0x60
[ 549.400084]  [] prepare_binprm+0x18d/0x1b0
[ 549.400084]  [] do