On Mon, Nov 6, 2017 at 12:25 PM, Jamie Iles <jamie.i...@oracle.com> wrote:
> Hi Dmitry,
>
> On Mon, Nov 06, 2017 at 12:02:19PM +0100, Dmitry Vyukov wrote:
>> On Thu, Nov 2, 2017 at 6:01 PM, Oleg Nesterov <o...@redhat.com> wrote:
>> > On 11/01, Dmitry Vyukov wrote:
>> >>
>> >> On Tue, Oct 31, 2017 at 7:34 PM, Oleg Nesterov <o...@redhat.com> wrote:
>> >> > Hmm. I do not see reproducer in this email...
>> >>
>> >> Ah, sorry. You can see full thread with attachments here:
>> >> https://groups.google.com/forum/#!topic/syzkaller-bugs/EUmYZU4m5gU
>> >
>> > Heh. I can't say I enjoyed reading the reproducer ;)
>> >
>> >> >> > WARNING: CPU: 0 PID: 1 at kernel/signal.c:340
>> >> >> > task_participate_group_stop+0x1ce/0x230 kernel/signal.c:340
>> >> >> > Kernel panic - not syncing: panic_on_warn set ...
>> >> >> >
>> >> >> > CPU: 0 PID: 1 Comm: init Not tainted 4.13.0-mm1+ #5
>> >> >
>> >> > So this is init process with SIGNAL_UNKILLABLE flag set. And I hope it has
>> >> > the pending SIGKILL, otherwise there is something else.
>> >
>> > From repro.c
>> >
>> > line 111 r[8] = syscall(__NR_ptrace, 0x10ul, r[7]);
>> >
>> > this is PTRACE_ATTACH
>> >
>> > line 115 syscall(__NR_ptrace, 0x4200ul, r[7], 0x40000012ul, 0x100012ul);
>> >
>> > this is PTRACE_SETOPTIONS and "data" includes PTRACE_O_EXITKILL.
>> >
>> > r[7] is initialized at
>> >
>> > line 110 r[7] = *(uint32_t*)0x20f9cffc;
>> >
>> > so if it is eq to 1 then it can attach to init and in this case the problem
>> > can be explained by the wrong SIGNAL_UNKILLABLE/SIGKILL logic.
>> >
>> > But how *(uint32_t*)0x20f9cffc can be 1 ?
>> >
>> > line 108 r[6] = syscall(__NR_fcntl, r[1], 0x10ul, 0x20f9cff8ul);
>> >
>> > this is F_GETOWN_EX, addr = 0x20f9cff8 == 0x20f9cffc - 4, so if fcntl()
>> > actually succeeds then r[7] == f_owner_ex->pid.
>> >
>> > It _can_ be 1, but the reproducer doesn't work for me. If you can reproduce,
>> > could you try the patch below?
>>
>> Hi,
>>
>> I would like to understand why you were not able to reproduce it. I
>> won't be sitting here all the time, and we are tracking hundreds of
>> bugs across different linux kernels and other OSes, so it's
>> problematic to do any extensive work on all of them. That's why we try
>> to provide reproducers.
>>
>> I've just tried the repro on the latest upstream
>> (39dae59d66acd86d1de24294bd2f343fd5e7a625) and it triggered the
>> WARNING within a second.
>> Did you use the config provided? Did you use qemu or real hardware?
>> Can you try in qemu (with -smp>1)?
>
> I'm unable to reproduce the warning in qemu with SMP (on a 32 CPU VM).
> Instead I get the following instant traceback which is different to what
> you report when run as root:

Uh, it seems to be racy. I am getting either the WARNING or "attempt to
kill init" in ~1/5 proportion.

Please try this simplified program, it triggers the WARNING all the
time for me:

// autogenerated by syzkaller (http://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>
#include <string.h>

int main()
{
  long r[11];
  memset(r, -1, sizeof(r));
  r[0] = syscall(__NR_mmap, 0x20000000ul, 0xfec000ul, 0x3ul, 0x32ul,
                 0xfffffffffffffffful, 0x0ul);
  r[1] = syscall(__NR_inotify_init1, 0x80000ul);
  *(uint32_t*)0x20feb000 = (uint32_t)0xc;
  r[3] = syscall(__NR_getsockopt, 0xfffffffffffffffful, 0x1ul, 0x11ul,
                 0x2003cff4ul, 0x20feb000ul);
  if (r[3] != -1)
    r[4] = *(uint32_t*)0x2003cff4;
  r[5] = syscall(__NR_fcntl, r[1], 0x8ul, r[4]);
  r[6] = syscall(__NR_fcntl, r[1], 0x10ul, 0x20f9cff8ul);
  if (r[6] != -1)
    r[7] = *(uint32_t*)0x20f9cffc;
  r[8] = syscall(__NR_ptrace, 0x10ul, r[7]);
  r[9] = syscall(__NR_ioctl, 0xfffffffffffffffful, 0x4b6aul, 0x20f9e000ul);
  r[10] = syscall(__NR_ptrace, 0x4200ul, r[7], 0x40000012ul, 0x100012ul);
  return 0;
}
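
For anyone reading the raw numbers in the program above, here is a small sketch that checks them against the glibc headers. It is only an aid for following Oleg's analysis, not part of the reproducer. It assumes Linux with glibc on x86 (the constants differ on a few other architectures), and the helper names data_has_exitkill() and owner_pid_raw() are made up for illustration. As far as I can tell (this is my reading, not something stated in the thread), the getsockopt() on fd -1 fails, r[4] keeps its initial -1, and F_SETOWN with -1 means "process group 1", which would explain how F_GETOWN_EX can hand back pid == 1.

```c
#define _GNU_SOURCE /* exposes F_GETOWN_EX and struct f_owner_ex in glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ptrace.h>

/* Raw numbers from the reproducer versus their symbolic names
 * (assumption: Linux x86 values as provided by glibc). */
static_assert(F_SETOWN == 0x8, "fcntl cmd 0x8 is F_SETOWN");
static_assert(F_GETOWN_EX == 0x10, "fcntl cmd 0x10 is F_GETOWN_EX");
static_assert(PTRACE_ATTACH == 0x10, "ptrace request 0x10 is PTRACE_ATTACH");
static_assert(PTRACE_SETOPTIONS == 0x4200,
              "ptrace request 0x4200 is PTRACE_SETOPTIONS");

/* fcntl(F_GETOWN_EX) fills a struct f_owner_ex at 0x20f9cff8 and the
 * reproducer reads r[7] from 0x20f9cffc, i.e. the pid field. */
static_assert(offsetof(struct f_owner_ex, pid) == 0x20f9cffc - 0x20f9cff8,
              "r[7] is read from f_owner_ex.pid");

/* Hypothetical helper: does a PTRACE_SETOPTIONS "data" word request
 * PTRACE_O_EXITKILL? */
int data_has_exitkill(unsigned long data)
{
    return (data & PTRACE_O_EXITKILL) != 0;
}

/* Hypothetical helper: read the pid the way the reproducer does, as a
 * 32-bit load 4 bytes past the start of the structure. */
uint32_t owner_pid_raw(const struct f_owner_ex *owner)
{
    uint32_t pid;
    memcpy(&pid, (const char *)owner + 4, sizeof(pid));
    return pid;
}
```

If this compiles, the static assertions confirm the decoding given earlier in the thread; data_has_exitkill(0x100012ul) is non-zero, matching the remark that "data" includes PTRACE_O_EXITKILL.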