-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 01/13/2014 12:21 AM, Richard Weinberger wrote:
> On Sat, Jan 11, 2014 at 11:47 AM, Toralf Förster <toralf.foers...@gmx.de> 
> wrote:
> I do fuzz testing with trinity (latest git version) in a stable 32 bit Gentoo 
> Linux user mode linux image.
> The host is a stable 32 bit vanilla 3.12.7 kernel, the guest runs latest git 
> tree + 2 patches (attached).
> 
> The trinity call in the UML guest is :
> $> trinity -q -l off -N 10000 -C 2 -x move_pages -x mremap -v /mnt/ramdisk
> 
> After a while there's no more progress on the command line seen at the host 
> system - the trinity process seems to just hang or idle. When this occurs I 
> can no longer ssh into the system. The system however keeps running. In 
> another terminal I still see the output of this command:
> 
>> Does it consume 100% CPU?
> 
No.
It just doesn't allow new ssh connections. Existing ssh connections are still 
working.

> $> ssh root@trinity "tail -f /var/log/messages"
> 
> That's why I do know that the system does not hang completely. The output of 
> top at the host system gives me the pid of the linux exe. A gdb call gives 
> for that pid :
> 
> $ date; sudo gdb /home/tfoerste/devel/linux/linux 25224 -n -batch -ex 'bt 
> full'
> Sat Jan 11 11:36:47 CET 2014
> 
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> No symbol table info available.
> #1  0x083d63ff in __nanosleep_nocancel ()
> No symbol table info available.
> #2  0x0807266c in idle_sleep (nsecs=602496380195307520) at 
> arch/um/os-Linux/time.c:183
>         ts = {tv_sec = 0, tv_nsec = 8436602}
> #3  0x0805fc0f in arch_cpu_idle () at arch/um/kernel/process.c:208
> No locals.
> #4  0x080a8971 in cpu_idle_loop () at kernel/cpu/idle.c:98
> No locals.
> #5  cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:140
> No locals.
> #6  0x084215e9 in rest_init () at init/main.c:402
>         pid = -516
>         __func__ = "rest_init"
> #7  0x080487e1 in start_kernel () at init/main.c:656
>         command_line = 0x85b8400 <command_line> "earlyprintk 
> ubda=/home/tfoerste/virtual/uml/trinity ubdb=/mnt/ramdisk/trinity_swap 
> eth0=tuntap,tap0,72:ef:3d:9f:c3:5a mem=1025M con0=fd:0,fd:1 con=pts 
> rootfstype=ext4  root=98:0"
> #8  0x08049e42 in start_kernel_proc (unused=0x0) at 
> arch/um/kernel/skas/process.c:48
>         pid = -516
>         __func__ = "start_kernel_proc"
> #9  0x0805f7cb in new_thread_handler () at arch/um/kernel/process.c:129
>         fn = 0x0
> #10 0x00000000 in ?? ()
> No symbol table info available.
> 
> 
> 
> Please note that BUG_ON was not triggered. For completeness here are the gdb 
> traces from all linux processes currently running at the host:
> 
>> So let's forget the 516 issue for now.
>> What we know for now is that you manage to trigger a lockup within UML.
> 
Agreed, especially because I also added this patch:
$ cat ~/devel/priv/uml/pid516_2.patch
- --- init/main.c_orig    2014-01-12 16:43:48.585439158 +0100
+++ init/main.c 2014-01-12 16:44:01.706438453 +0100
@@ -389,6 +389,7 @@
        BUG_ON(pid == -516);
        rcu_read_lock();
        kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
+       BUG_ON(pid == -516);
        rcu_read_unlock();
        complete(&kthreadd_done);

and this wasn't triggered (/me wonders if the -516 is somehow garbage).
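
One possible explanation, stated as an assumption and not verified here: 516 
is ERESTART_RESTARTBLOCK in include/linux/errno.h, one of the kernel-internal 
restart codes (512 to 516) that an interrupted nanosleep() can return. gdb 
might therefore just show a leftover register value from the interrupted 
syscall as "pid". The same reading would fit fd=-512 and buf=0xfffffe00 
(ERESTARTSYS) in the io_thread trace. A minimal sketch to decode such values 
(decode_restart_code is a made-up helper name, not an existing tool):

```shell
# Map a value seen in a gdb trace onto the kernel-internal restart codes
# from include/linux/errno.h. These codes normally never reach user space.
decode_restart_code() {
    value=$(( $1 ))
    # Treat 32 bit unsigned values such as 0xfffffe00 as signed.
    if [ "$value" -gt 2147483647 ]; then
        value=$(( value - 4294967296 ))
    fi
    case $(( 0 - value )) in
        512) echo ERESTARTSYS ;;
        513) echo ERESTARTNOINTR ;;
        514) echo ERESTARTNOHAND ;;
        515) echo ENOIOCTLCMD ;;
        516) echo ERESTART_RESTARTBLOCK ;;
        *)   echo unknown ;;
    esac
}

decode_restart_code -516        # prints ERESTART_RESTARTBLOCK
decode_restart_code 0xfffffe00  # prints ERESTARTSYS
```

If that reading is right, the -516 says nothing about the real pid, which 
would match the BUG_ON not firing.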

But I can narrow down the problem. In a still open ssh session I ran:

$ lsof | grep t3
bash      6129      tfoerste  cwd       DIR       98,0     4096    734 
/home/tfoerste/t3
logger    6135      tfoerste  cwd       DIR       98,0     4096    734 
/home/tfoerste/t3

(t3 is the ~/t3 directory that I cd into before I run trinity.)

And after killing the logger command the trinity batch continues :

$ ps xf -eo pid,start_time,command | grep trinity
 6412 20:48  |           \_ grep --colour=auto trinity
 6129 19:17          \_ bash -c cd ~; sudo su -c 'if [[ -d ./t3 ]]; then sudo 
chmod -R a+rwx ./t3; sudo rm -rf ./t3; fi'; mkdir ./t3; cd ./t3; logger "17#-1, 
M=/mnt/ramdisk"; if [[ -n /mnt/ramdisk ]]; then if [[ -d 
/mnt/ramdisk/victims/v1 ]]; then sudo chmod -R a+rwx /mnt/ramdisk/victims/v1; 
sudo rm -rf /mnt/ramdisk/victims/v1; fi; mkdir -p /mnt/ramdisk/victims/v1/v2; 
for i in $(seq -w 0 99); do touch /mnt/ramdisk/victims/v1/v2/f$i 2>/dev/null; 
mkdir /mnt/ramdisk/victims/v1/v2/d$i 2>/dev/null; done; fi;  trinity -q -N 
10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2
 6390 20:46              \_ trinity -q -N 10000 -C 2 -x move_pages -x mremap -V 
/mnt/ramdisk/victims/v1/v2
 6391 20:46                  \_ trinity -q -N 10000 -C 2 -x move_pages -x 
mremap -V /mnt/ramdisk/victims/v1/v2
 6392 20:46                  \_ trinity -q -N 10000 -C 2 -x move_pages -x 
mremap -V /mnt/ramdisk/victims/v1/v2
 6408 20:47                      \_ trinity -q -N 10000 -C 2 -x move_pages -x 
mremap -V /mnt/ramdisk/victims/v1/v2
 6410 20:48                      \_ trinity -q -N 10000 -C 2 -x move_pages -x 
mremap -V /mnt/ramdisk/victims/v1/v2


FWIW, a new ssh connection into the UML guest is still not possible. So I'm 
pretty sure that trinity really damages something there, but I'd expect such 
damage to show up somewhere in the logs, wouldn't I?

And finally: the trinity batch command now hangs again, and this time not even 
killing logger helps.
A shutdown ("sudo halt; exit") hangs too.
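
To pin down when new connections stop being accepted, so that the time can be 
compared with /var/log/messages, a small probe could run on the host next to 
trinity. This is only a sketch: probe_ssh, watch_ssh and the SSH_BIN override 
are names made up for illustration, the 5 second timeout is an arbitrary 
choice, and it assumes key-based login as root@trinity works without a prompt.

```shell
# Try to open a NEW ssh connection and give up quickly.
# SSH_BIN may be overridden for a dry run; it defaults to the real ssh client.
probe_ssh() {
    "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Poll until the first probe fails, then print a timestamped message.
watch_ssh() {
    target=$1
    interval=${2:-10}
    while probe_ssh "$target" >/dev/null 2>&1; do
        sleep "$interval"
    done
    echo "$(date '+%F %T') new ssh connection to $target failed"
}

# Usage (not run here): watch_ssh root@trinity 10
```

BatchMode=yes keeps ssh from waiting for a password, so a hanging guest shows 
up as a failed probe and not as a stuck prompt.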

> 
> 
> $ pgrep linux | xargs -n1 -I {} sudo gdb /home/tfoerste/devel/linux/linux {} 
> -n -batch -ex 'bt'
> warning: process 1613 is already traced by process 25224
> ptrace: Operation not permitted.
> /home/tfoerste/1613: No such file or directory.
> No stack.
> warning: process 21849 is already traced by process 25224
> ptrace: Operation not permitted.
> /home/tfoerste/21849: No such file or directory.
> No stack.
> 
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> #1  0x083d63ff in __nanosleep_nocancel ()
> #2  0x0807266c in idle_sleep (nsecs=602496380205307520) at 
> arch/um/os-Linux/time.c:183
> #3  0x0805fc0f in arch_cpu_idle () at arch/um/kernel/process.c:208
> #4  0x080a8971 in cpu_idle_loop () at kernel/cpu/idle.c:98
> #5  cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:140
> #6  0x084215e9 in rest_init () at init/main.c:402
> #7  0x080487e1 in start_kernel () at init/main.c:656
> #8  0x08049e42 in start_kernel_proc (unused=0x0) at 
> arch/um/kernel/skas/process.c:48
> #9  0x0805f7cb in new_thread_handler () at arch/um/kernel/process.c:129
> #10 0x00000000 in ?? ()
> 
> warning: process 25231 is a cloned process
> 
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> #1  0x083da446 in syscall ()
> #2  0x0806e861 in io_getevents (events=<optimized out>, ctx_id=<optimized 
> out>, min_nr=<optimized out>, nr=<optimized out>, timeout=<optimized out>) at 
> arch/um/os-Linux/aio.c:49
> #3  aio_thread (arg=0x0) at arch/um/os-Linux/aio.c:109
> #4  0x083db56e in clone ()
> 
> warning: process 25232 is a cloned process
> 
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> #1  0x083d82c2 in __read_nocancel ()
> #2  0x0806f3ff in read (__nbytes=<optimized out>, __buf=<optimized out>, 
> __fd=<optimized out>) at /usr/include/bits/unistd.h:44
> #3  os_read_file (fd=-512, buf=0xfffffe00, len=-512) at 
> arch/um/os-Linux/file.c:253
> #4  0x0806bafc in io_thread (arg=0x0) at arch/um/drivers/ubd_kern.c:1482
> #5  0x083db56e in clone ()
> 
> warning: process 25233 is a cloned process
> 
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> #1  0x083d9132 in __poll_nocancel ()
> #2  0x08071114 in poll (__timeout=<optimized out>, __nfds=<optimized out>, 
> __fds=<optimized out>) at /usr/include/bits/poll2.h:46
> #3  write_sigio_thread (unused=0x0) at arch/um/os-Linux/sigio.c:61
> #4  0x083db56e in clone ()
> warning: process 25234 is a zombie - the process has already terminated
> ptrace: Operation not permitted.
> /home/tfoerste/25234: No such file or directory.
> No stack.
> ...
> 
> 
> Please Cc: me I'm not subscribed.
> 
>> Wouldn't it make sense to subscribe?
>> You post very often on this list. :)
> 
done ;)

> 
> 
>>
>> ------------------------------------------------------------------------------
>> CenturyLink Cloud: The Leader in Enterprise Cloud Services.
>> Learn Why More Businesses Are Choosing CenturyLink Cloud For
>> Critical Workloads, Development Environments & Everything In Between.
>> Get a Quote or Start a Free Trial Today.
>> http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
>> _______________________________________________
>> User-mode-linux-devel mailing list
>> User-mode-linux-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
>>
> 
> 
> 

- -- 
MfG/Sincerely
Toralf Förster
pgp finger print:1A37 6F99 4A9D 026F 13E2 4DCF C4EA CDDE 0076 E94E
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlLURGIACgkQxOrN3gB26U44RQD+KUqGBeP6/nJk1K/1Wx6nz7ij
/JXcjNN+ZBt8PsMWrV4A/jx7w7Xrl0RPWcwXVFYm+Ixo0dSbtr+zvh/2pdcCNU2c
=uGid
-----END PGP SIGNATURE-----
