-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 01/13/2014 12:21 AM, Richard Weinberger wrote:
> On Sat, Jan 11, 2014 at 11:47 AM, Toralf Förster <toralf.foers...@gmx.de> wrote:
> I do fuzz testing with trinity (latest git version) on a stable 32 bit Gentoo
> Linux user mode linux image.
> The host is a stable 32 bit vanilla 3.12.7 kernel, the guest runs latest git
> tree + 2 patches (attached).
>
> The trinity call in the UML guest is :
> $> trinity -q -l off -N 10000 -C 2 -x move_pages -x mremap -v /mnt/ramdisk
>
> After a while there's no progress on the command line seen at the host system
> - the trinity process seems to just hang/idle. When this does occur I
> can no longer ssh into the system. The system however keeps running. In
> another terminal I still see the output of this command:
>
>> Does it consume 100% CPU?
>
No. It just doesn't allow new ssh connections. Existing ssh connections are still working.
> $> ssh root@trinity "tail -f /var/log/messages"
>
> That's why I do know that the system does not hang completely. The output of
> top at the host system gives me the pid of the linux exe. A gdb call gives
> for that pid :
>
> $ date; sudo gdb /home/tfoerste/devel/linux/linux 25224 -n -batch -ex 'bt full'
> Sat Jan 11 11:36:47 CET 2014
>
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0 0xb7800424 in __kernel_vsyscall ()
> No symbol table info available.
> #1 0x083d63ff in __nanosleep_nocancel ()
> No symbol table info available.
> #2 0x0807266c in idle_sleep (nsecs=602496380195307520) at arch/um/os-Linux/time.c:183
>         ts = {tv_sec = 0, tv_nsec = 8436602}
> #3 0x0805fc0f in arch_cpu_idle () at arch/um/kernel/process.c:208
> No locals.
> #4 0x080a8971 in cpu_idle_loop () at kernel/cpu/idle.c:98
> No locals.
> #5 cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:140
> No locals.
> #6 0x084215e9 in rest_init () at init/main.c:402
>         pid = -516
>         __func__ = "rest_init"
> #7 0x080487e1 in start_kernel () at init/main.c:656
>         command_line = 0x85b8400 <command_line> "earlyprintk ubda=/home/tfoerste/virtual/uml/trinity ubdb=/mnt/ramdisk/trinity_swap eth0=tuntap,tap0,72:ef:3d:9f:c3:5a mem=1025M con0=fd:0,fd:1 con=pts rootfstype=ext4 root=98:0"
> #8 0x08049e42 in start_kernel_proc (unused=0x0) at arch/um/kernel/skas/process.c:48
>         pid = -516
>         __func__ = "start_kernel_proc"
> #9 0x0805f7cb in new_thread_handler () at arch/um/kernel/process.c:129
>         fn = 0x0
> #10 0x00000000 in ?? ()
> No symbol table info available.
>
>
>
> Please note that BUG_ON was not triggered. For completeness here are the gdb
> traces from all linux processes currently running at the host:
>
>> So let's forget the 516 issue for now.
>> What we know for now is that you manage to trigger a lockup within UML.
>
Agreed, especially b/c I added this patch too :

$ cat ~/devel/priv/uml/pid516_2.patch
- --- init/main.c_orig 2014-01-12 16:43:48.585439158 +0100
+++ init/main.c 2014-01-12 16:44:01.706438453 +0100
@@ -389,6 +389,7 @@
 	BUG_ON(pid == -516);
 	rcu_read_lock();
 	kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
+	BUG_ON(pid == -516);
 	rcu_read_unlock();
 	complete(&kthreadd_done);

and this wasn't triggered (/me wonders if the -516 is somehow garbage).

But I can narrow down the problem. In a still-open ssh session I ran :

$ lsof | grep t3
bash      6129 tfoerste  cwd  DIR  98,0  4096  734 /home/tfoerste/t3
logger    6135 tfoerste  cwd  DIR  98,0  4096  734 /home/tfoerste/t3

(t3 is the ~/t3 directory that I cd into before I run trinity.)
And after killing the logger command the trinity batch continues :

$ ps xf -eo pid,start_time,command | grep trinity
 6412 20:48 | \_ grep --colour=auto trinity
 6129 19:17 \_ bash -c cd ~; sudo su -c 'if [[ -d ./t3 ]]; then sudo chmod -R a+rwx ./t3; sudo rm -rf ./t3; fi'; mkdir ./t3; cd ./t3; logger "17#-1, M=/mnt/ramdisk"; if [[ -n /mnt/ramdisk ]]; then if [[ -d /mnt/ramdisk/victims/v1 ]]; then sudo chmod -R a+rwx /mnt/ramdisk/victims/v1; sudo rm -rf /mnt/ramdisk/victims/v1; fi; mkdir -p /mnt/ramdisk/victims/v1/v2; for i in $(seq -w 0 99); do touch /mnt/ramdisk/victims/v1/v2/f$i 2>/dev/null; mkdir /mnt/ramdisk/victims/v1/v2/d$i 2>/dev/null; done; fi; trinity -q -N 10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2
 6390 20:46 \_ trinity -q -N 10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2
 6391 20:46 \_ trinity -q -N 10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2
 6392 20:46 \_ trinity -q -N 10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2
 6408 20:47 \_ trinity -q -N 10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2
 6410 20:48 \_ trinity -q -N 10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2

FWIW a ssh into the UML guest is however still no longer
possible. So I'm pretty sure that trinity really damages something there, but I'd
expect that such damage should be seen somewhere in the logs, or ?

And finally - now the batch trinity command hangs again and now not even
killing logger helps. And a shutdown ("sudo halt; exit") hangs too.

>
>
> $ pgrep linux | xargs -n1 -I {} sudo gdb /home/tfoerste/devel/linux/linux {} -n -batch -ex 'bt'
> warning: process 1613 is already traced by process 25224
> ptrace: Operation not permitted.
> /home/tfoerste/1613: No such file or directory.
> No stack.
> warning: process 21849 is already traced by process 25224
> ptrace: Operation not permitted.
> /home/tfoerste/21849: No such file or directory.
> No stack.
>
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0 0xb7800424 in __kernel_vsyscall ()
> #1 0x083d63ff in __nanosleep_nocancel ()
> #2 0x0807266c in idle_sleep (nsecs=602496380205307520) at arch/um/os-Linux/time.c:183
> #3 0x0805fc0f in arch_cpu_idle () at arch/um/kernel/process.c:208
> #4 0x080a8971 in cpu_idle_loop () at kernel/cpu/idle.c:98
> #5 cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:140
> #6 0x084215e9 in rest_init () at init/main.c:402
> #7 0x080487e1 in start_kernel () at init/main.c:656
> #8 0x08049e42 in start_kernel_proc (unused=0x0) at arch/um/kernel/skas/process.c:48
> #9 0x0805f7cb in new_thread_handler () at arch/um/kernel/process.c:129
> #10 0x00000000 in ?? ()
>
> warning: process 25231 is a cloned process
>
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0 0xb7800424 in __kernel_vsyscall ()
> #1 0x083da446 in syscall ()
> #2 0x0806e861 in io_getevents (events=<optimized out>, ctx_id=<optimized out>, min_nr=<optimized out>, nr=<optimized out>, timeout=<optimized out>) at arch/um/os-Linux/aio.c:49
> #3 aio_thread (arg=0x0) at arch/um/os-Linux/aio.c:109
> #4 0x083db56e in clone ()
>
> warning: process 25232 is a cloned process
>
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0 0xb7800424 in __kernel_vsyscall ()
> #1 0x083d82c2 in __read_nocancel ()
> #2 0x0806f3ff in read (__nbytes=<optimized out>, __buf=<optimized out>, __fd=<optimized out>) at /usr/include/bits/unistd.h:44
> #3 os_read_file (fd=-512, buf=0xfffffe00, len=-512) at arch/um/os-Linux/file.c:253
> #4 0x0806bafc in io_thread (arg=0x0) at arch/um/drivers/ubd_kern.c:1482
> #5 0x083db56e in clone ()
>
> warning: process 25233 is a cloned process
>
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0 0xb7800424 in __kernel_vsyscall ()
> #1 0x083d9132 in __poll_nocancel ()
> #2 0x08071114 in poll (__timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:46
> #3 write_sigio_thread (unused=0x0) at arch/um/os-Linux/sigio.c:61
> #4 0x083db56e in clone ()
> warning: process 25234 is a zombie - the process has already terminated
> ptrace: Operation not permitted.
> /home/tfoerste/25234: No such file or directory.
> No stack.
> ...
>
>
> Please Cc: me I'm not subscribed.
>
>> Wouldn't it make sense to subscribe?
>> You post very often on this list. :)
>
done ;)

>
>
>>
>> ------------------------------------------------------------------------------
>> CenturyLink Cloud: The Leader in Enterprise Cloud Services.
>> Learn Why More Businesses Are Choosing CenturyLink Cloud For
>> Critical Workloads, Development Environments & Everything In Between.
>> Get a Quote or Start a Free Trial Today.
>> http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
>> _______________________________________________
>> User-mode-linux-devel mailing list
>> User-mode-linux-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
>>
>
>
>

- -- 
MfG/Sincerely
Toralf Förster
pgp finger print:1A37 6F99 4A9D 026F 13E2 4DCF C4EA CDDE 0076 E94E
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlLURGIACgkQxOrN3gB26U44RQD+KUqGBeP6/nJk1K/1Wx6nz7ij
/JXcjNN+ZBt8PsMWrV4A/jx7w7Xrl0RPWcwXVFYm+Ixo0dSbtr+zvh/2pdcCNU2c
=uGid
-----END PGP SIGNATURE-----
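
Side note on the pid = -516 seen in the rest_init() and start_kernel_proc() frames: in the kernel's include/linux/errno.h, 516 is ERESTART_RESTARTBLOCK, and 512 (the fd=-512 / len=-512 in the os_read_file() frame, where buf=0xfffffe00 is also -512 as a 32 bit value) is ERESTARTSYS. Both are kernel-internal syscall restart codes, so these values might just be a leftover register or stack slot that gdb shows for the variable. That is only a guess; the traces do not prove it. A minimal shell sketch for decoding such values, assuming the table below matches the errno.h of the tree in use (restart_errno_name is a made-up helper name, not an existing tool):

```sh
# Hypothetical helper: map the kernel-internal codes 512..516 from
# include/linux/errno.h to their names. These codes are meant to stay
# inside the kernel and should never be seen by user programs.
restart_errno_name() {
    # Strip one leading minus sign so that -516 and 516 decode the same way.
    case "${1#-}" in
        512) echo ERESTARTSYS ;;
        513) echo ERESTARTNOINTR ;;
        514) echo ERESTARTNOHAND ;;
        515) echo ENOIOCTLCMD ;;
        516) echo ERESTART_RESTARTBLOCK ;;
        *)   echo unknown ;;
    esac
}
```

For example, `restart_errno_name -516` prints ERESTART_RESTARTBLOCK and `restart_errno_name -512` prints ERESTARTSYS.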