so that the server either crashes (if it is a user mode linux image) or at
least has its reboot functionality broken
- if the NFS server is hammered with scary NFS calls by a fuzzing tool running
on a remote NFS client under a non-privileged user id.

It can be reproduced if
        - the NFS share is an EXT3 or EXT4 directory
        - and it is created on a file located on a tmpfs and mounted via a loop
device
        - and the NFS server is forced to umount the NFS share
        - and the server is forced to restart the NFS service afterwards
        - and trinity is used

I could find a scenario for an automated bisect. Twice it identified this commit
commit 68a3396178e6688ad7367202cdf0af8ed03c8727
Author: J. Bruce Fields <bfie...@redhat.com>
Date:   Thu Mar 21 11:21:50 2013 -0400

    nfsd4: shut down more of delegation earlier


to be the one after which the user mode linux server crashes with a back trace 
like this:


$ cat /mnt/ramdisk/bt.v3.11-rc4-172-g8ae3f1d
[New LWP 14025]
Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk 
ubda=/home/tfoerste/virtual/uml/tr'.
Program terminated with signal 6, Aborted.
#0  0xb77ef424 in __kernel_vsyscall ()
#0  0xb77ef424 in __kernel_vsyscall ()
#1  0x083a33c5 in kill ()
#2  0x0807163d in uml_abort () at arch/um/os-Linux/util.c:93
#3  0x08071925 in os_dump_core () at arch/um/os-Linux/util.c:138
#4  0x080613a7 in panic_exit (self=0x85a1518 <panic_exit_notifier>, unused1=0, 
unused2=0x85d6ce0 <buf.15904>) at arch/um/kernel/um_arch.c:240
#5  0x0809a3b8 in notifier_call_chain (nl=0x0, val=0, v=0x85d6ce0 <buf.15904>, 
nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
#6  0x0809a503 in __atomic_notifier_call_chain (nr_calls=<optimized out>, 
nr_to_call=<optimized out>, v=<optimized out>, val=<optimized out>, 
nh=<optimized out>) at kernel/notifier.c:182
#7  atomic_notifier_call_chain (nh=0x85d6cc4 <panic_notifier_list>, val=0, 
v=0x85d6ce0 <buf.15904>) at kernel/notifier.c:191
#8  0x08400ba8 in panic (fmt=0x0) at kernel/panic.c:128
#9  0x0818edf4 in ext4_put_super (sb=0x4a042690) at fs/ext4/super.c:818
#10 0x081010d2 in generic_shutdown_super (sb=0x4a042690) at fs/super.c:418
#11 0x0810209a in kill_block_super (sb=0x0) at fs/super.c:1028
#12 0x08100f6a in deactivate_locked_super (s=0x4a042690) at fs/super.c:299
#13 0x08101001 in deactivate_super (s=0x4a042690) at fs/super.c:324
#14 0x08118e0c in mntfree (mnt=<optimized out>) at fs/namespace.c:891
#15 mntput_no_expire (mnt=0x0) at fs/namespace.c:929
#16 0x0811a2f5 in SYSC_umount (flags=<optimized out>, name=<optimized out>) at 
fs/namespace.c:1335
#17 SyS_umount (name=134541632, flags=0) at fs/namespace.c:1305
#18 0x0811a369 in SYSC_oldumount (name=<optimized out>) at fs/namespace.c:1347
#19 SyS_oldumount (name=134541632) at fs/namespace.c:1345
#20 0x080618e2 in handle_syscall (r=0x49e919d4) at 
arch/um/kernel/skas/syscall.c:35
#21 0x08073c0d in handle_trap (local_using_sysemu=<optimized out>, 
regs=<optimized out>, pid=<optimized out>) at 
arch/um/os-Linux/skas/process.c:198
#22 userspace (regs=0x49e919d4) at arch/um/os-Linux/skas/process.c:431
#23 0x0805e65c in fork_handler () at arch/um/kernel/process.c:160
#24 0x00000000 in ?? ()
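
The automated bisect mentioned above could be driven by `git bisect run`, which treats exit code 0 as good, 125 as "skip this commit", and any other code from 1 to 127 as bad. A minimal sketch of the verdict logic follows. The function name, the idea of detecting the crash through a core file, and the way the build status is passed in are assumptions for illustration, not the script actually used for this report.

```shell
#!/bin/sh
# Hypothetical verdict helper for a `git bisect run` script.
# $1: exit status of the kernel build for the commit under test
# $2: path where a core file is expected if the UML server crashed (assumed)
bisect_verdict() {
    build_rc=$1
    core_file=$2

    # A commit that does not build cannot be judged: ask git bisect to skip it.
    if [ "$build_rc" -ne 0 ]; then
        return 125
    fi

    # A core file means the server aborted during the test: mark the commit bad.
    if [ -e "$core_file" ]; then
        return 1
    fi

    # Otherwise the commit is considered good.
    return 0
}
```

A wrapper script (hypothetical) would build the kernel, run the reproduction scenario, and end with `bisect_verdict "$build_rc" "$core_file"; exit $?`. It would be started with `git bisect run` after `git bisect start <bad> <good>`.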



A real system, however, would not crash but would give a kernel BUG as reported
here:
http://article.gmane.org/gmane.comp.file-systems.ext4/38915
Furthermore, the server would no longer be able to reboot - it would hang
indefinitely in the reboot phase.
Only the magic SysRq keys still work then.



Steps to reproduce on two 32-bit Gentoo Linux user mode linux images:

1. prepare the server:
        <mount a tmpfs onto /mnt/ramdisk>
        mkdir /mnt/ramdisk/victims
        dd if=/dev/zero of=/mnt/ramdisk/disk1 bs=1M count=257 2>/dev/null
        yes | mkfs.ext4 -q /mnt/ramdisk/disk1 1>/dev/null
        mount -o loop /mnt/ramdisk/disk1 /mnt/ramdisk/victims
        chmod 777 /mnt/ramdisk/victims
        /etc/init.d/nfs restart

2. prepare the client
        mount the NFS share onto the local mount point /mnt/ramdisk/victims/ 
with NFSv4
        
3. run the fuzzing tool trinity on the client:
        while [[ : ]]; do
                <(re-)create and fill /mnt/ramdisk/victims/v1/v2 with 100 empty 
files and 100 empty directories>
                trinity -V /mnt/ramdisk/victims/v1/v2 -C 1 -N 10000 -q
                sleep 3
        done
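
The placeholder inside the loop above (re-creating the directory and filling it with 100 empty files and 100 empty directories) could be sketched as the function below. The function name and the entry names `f1`..`f100` and `d1`..`d100` are assumptions; the report does not say how the entries were named.

```shell
#!/bin/sh
# Hypothetical helper: (re-)create a directory and fill it with
# 100 empty files and 100 empty directories.
populate_victims() {
    dir=$1

    # Refuse to run without a target, so that rm -rf never sees an empty path.
    [ -n "$dir" ] || return 1

    # Start from a clean directory on every iteration.
    rm -rf "$dir"
    mkdir -p "$dir" || return 1

    i=1
    while [ "$i" -le 100 ]; do
        : > "$dir/f$i"      # empty regular file
        mkdir "$dir/d$i"    # empty directory
        i=$((i + 1))
    done
}
```

Inside the loop it would be called as `populate_victims /mnt/ramdisk/victims/v1/v2` before each trinity run.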

4. after 15 min, kill the user mode linux client with kill -9

5. now run on the server:
        umount /mnt/ramdisk/victims || /etc/init.d/nfs restart && \
                umount /mnt/ramdisk/victims && echo ' no issue so far'
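
How the shell groups the one-liner in step 5 matters when interpreting its output. `||` and `&&` have equal precedence and are evaluated left to right, so a command of the shape `A || B && C && D` reads as `((A || B) && C) && D`. The sketch below replaces the real commands with stand-in functions (all names are hypothetical) so the behaviour can be observed without root privileges.

```shell
#!/bin/sh
# Stand-ins for the real commands; they only return a fixed exit status.
umount_ok()   { return 0; }   # umount succeeds
umount_busy() { return 1; }   # umount fails, e.g. because the share is busy
restart_nfs() { return 0; }   # NFS service restart succeeds

# Same shape as the command in step 5: A || B && C && D
# $1: stand-in for the first umount, $2: stand-in for the second umount
check() {
    first=$1
    second=$2
    "$first" || restart_nfs && "$second" && echo ' no issue so far'
}
```

With this grouping the message appears only when the first umount fails and the umount after the NFS restart succeeds. If the first umount already succeeds, the second umount still runs and would be expected to fail because nothing is mounted any more, so no message is printed in that case either.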


You might also need this patch from Oleg Nesterov <o...@redhat.com> (currently
not in mainline).

--- x/kernel/exit.c
+++ x/kernel/exit.c
@@ -783,8 +783,8 @@ void do_exit(long code)
        exit_shm(tsk);
        exit_files(tsk);
        exit_fs(tsk);
-       exit_task_namespaces(tsk);
        exit_task_work(tsk);
+       exit_task_namespaces(tsk);
        check_stack_usage();
        exit_thread();


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
