Hello.

I'm hitting another problematic case here, where unionfs
causes a kernel oops.

The setup is a read-only nfs root union-mounted with a
read-write tmpfs.  The filesystem gets prepared in the
initramfs, and the new init is executed from the finished
root.  No other processes (udev etc.) are started in the
initramfs.

Relevant entries from /proc/mounts:

192.168.88.4:/usr/rb /.r nfs ro,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,nolock,proto=udp,port=65535,timeo=7,retrans=3,sec=sys,mountport=65535,mountproto=,addr=192.168.88.4 0 0
loctmp /.t tmpfs rw,noatime,mode=755 0 0
rootfs / unionfs rw,noatime,dirs=/.t=rw:/.r/root=ro 0 0

So far I can trigger the problem just by logging into
the machine over ssh.  When doing so, I see this:

  unionfs: new lower inode mtime (bindex=0, name=log)

I'm not sure why this happens: the nfs root does not
change, apart from atime being updated on the server.
All writes are done by unionfs itself, into the tmpfs.
So I don't know what the message is referring to.
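
To make it easier to see which branch a bindex refers to, the
dirs= option from the unionfs mount line above can be split into
branches with a few lines of Python.  This is only a sketch: the
function name is made up for illustration, and it assumes that
unionfs numbers branches left to right starting at 0, which is
worth double-checking.  If that assumption holds, bindex=0 in the
message would be /.t (the tmpfs) rather than the nfs branch.

```python
def parse_unionfs_dirs(options):
    """Split a unionfs mount option string into (index, path, mode) tuples.

    Hypothetical helper for illustration; the index is simply the
    left-to-right position in dirs= (assumed to match unionfs bindex).
    """
    for opt in options.split(","):
        if not opt.startswith("dirs="):
            continue
        branches = []
        for index, spec in enumerate(opt[len("dirs="):].split(":")):
            if "=" in spec:
                # "/.t=rw" -> path "/.t", mode "rw"
                path, _, mode = spec.rpartition("=")
            else:
                # No explicit mode given for this branch
                path, mode = spec, ""
            branches.append((index, path, mode))
        return branches
    return []


# Option string copied from the /proc/mounts entry for the root
for index, path, mode in parse_unionfs_dirs("rw,noatime,dirs=/.t=rw:/.r/root=ro"):
    print(index, path, mode)
```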

Before logging in:

$ ls -ld /.t/var/log /.r/root/var/log /var/log
ls: cannot access /.t/var/log: No such file or directory
drwxr-xr-x 5 root root 4096 Apr 16 13:04 /.r/root/var/log
drwxr-xr-x 5 root root 4096 Apr 16 13:04 /var/log

After logging in:
$ ls -ld /.t/var/log /.r/root/var/log /var/log
drwxr-xr-x 5 root root 4096 Apr 16 13:04 /.r/root/var/log
drwxr-xr-x 2 root root   60 May  2 20:23 /.t/var/log
drwxr-xr-x 1 root root   60 May  2 20:23 /var/log

So it creates the directory in the writable branch
correctly.  Something looks wrong there, I think, but I'm
not sure whether it is related to the actual problem.

Now, when I'm logging out, I'm getting this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
IP: [<ffffffff81101274>] fput+0x4/0x20
PGD 6cf6067 PUD 6c53067 PMD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/virtual/net/lo/operstate
CPU 0
Modules linked in: parport_pc parport thermal processor thermal_sys floppy button sg hwmon i2c_piix4 psmouse evdev i2c_core fbcon fbdev font bitblit softcursor fb unionfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc 8139too 8139cp mii sr_mod cdrom ata_piix pata_acpi libata scsi_mod

Pid: 704, comm: bash Not tainted 2.6.34-amd64 #2.6.34~rc6 /Bochs
RIP: 0010:[<ffffffff81101274>]  [<ffffffff81101274>] fput+0x4/0x20
RSP: 0018:ffff880006defcc0  EFLAGS: 00010282
RAX: 0000000000000030 RBX: ffff8800072af540 RCX: ffff8800073afbe0
RDX: 0000000000000000 RSI: 0000000000000200 RDI: 0000000000000000
RBP: 00000000fffffffb R08: 0600000000000000 R09: 8000000000000000
R10: 0000000000000000 R11: ffffffff810ce070 R12: ffff8800067f79c0
R13: ffff8800067f7cc0 R14: ffff8800071a56c0 R15: ffff8800073afb40
FS:  0000000000000000(0000) GS:ffff880001600000(0063) knlGS:00000000f76406c0
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 0000000000000030 CR3: 0000000006dea000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 704, threadinfo ffff880006dee000, task ffff88000746c100)
Stack:
  ffffffffa0194e5f ffff880006559c00 0000000000000018 0000000000000000
<0> 0000000000000001 ffff8800071a9680 ffff88000677b800 0000000100000001
<0> ffff8800072af540 0000000000000000 0000000000000000 0000000000000010
Call Trace:
  [<ffffffffa0194e5f>] ? copyup_dentry+0x42f/0x920 [unionfs]
  [<ffffffffa0195388>] ? copyup_file+0x38/0xc0 [unionfs]
  [<ffffffffa0199168>] ? unionfs_file_revalidate+0x7b8/0xf80 [unionfs]
  [<ffffffffa018e9d3>] ? unionfs_write+0xb3/0x1f0 [unionfs]
  [<ffffffff8110051b>] ? vfs_write+0xcb/0x180
  [<ffffffff811006d3>] ? sys_write+0x53/0xa0
  [<ffffffff81031d0b>] ? cstar_dispatch+0x7/0x32
Code: 88 00 00 00 48 85 c9 0f 84 f6 fe ff ff 31 d2 48 89 ee bf ff ff ff ff ff d1 e9 dc fe ff ff 66 0f 1f 84 00 00 00 00 00 48 8d 47 30 <3e> 48 ff 08 0f 94 c2 84 d2 75 09 f3 c3 0f 1f 80 00 00 00 00 e9
RIP  [<ffffffff81101274>] fput+0x4/0x20
  RSP <ffff880006defcc0>
CR2: 0000000000000030
---[ end trace e16bdf4328bfa01d ]---

At this time the system is stuck.
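
The register dump can be cross-checked with a short Python sketch
(the helper names are made up for illustration, and the check is
specific to this trace, not a general oops decoder).  RDI, which
holds the first argument on x86-64, is 0, and RAX and CR2 are both
0x30.  That is consistent with fput() having been called with a
NULL file pointer and faulting on a field at offset 0x30, though
it does not say why copyup_dentry ended up with a NULL pointer.

```python
import re

# A few lines copied from the oops above
SAMPLE = """\
RAX: 0000000000000030 RBX: ffff8800072af540 RCX: ffff8800073afbe0
RDX: 0000000000000000 RSI: 0000000000000200 RDI: 0000000000000000
CR2: 0000000000000030 CR3: 0000000006dea000 CR4: 00000000000006f0
"""


def parse_registers(oops_text):
    """Return {register name: int} for the general registers and CR2."""
    regs = {}
    pattern = r"\b(R[A-Z0-9]{2}|CR2): ([0-9a-f]{16})\b"
    for name, value in re.findall(pattern, oops_text):
        regs[name] = int(value, 16)
    return regs


def looks_like_null_member_access(regs, max_offset=0x1000):
    """True if the first argument (RDI) is NULL and the faulting
    address (CR2) is a small offset that also sits in RAX, as in
    this particular trace."""
    return (
        regs["RDI"] == 0
        and regs["CR2"] == regs["RAX"]
        and 0 < regs["CR2"] < max_offset
    )


regs = parse_registers(SAMPLE)
print(hex(regs["CR2"]), looks_like_null_member_access(regs))
```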

It is bash doing the logout.  It writes the history file
(a new file written into /.t; there was no .bash_history
in the original root), and checks for ~/.bash_logout
and other similar files (none of which exist).

Importantly, I can't trigger the problem when running
bash under strace (attaching strace to the process and
logging out as usual), only when logging out normally.
So it looks like some timing issue.

The above trace is from a 2.6.34-rc6 kernel with the
2.6.34-rc0 unionfs patch applied.  The same thing
happens on older kernels too (2.6.32 included),
under the same conditions.

Anything we can do about this?

Thanks!

/mjt
_______________________________________________
unionfs mailing list: http://unionfs.filesystems.org/
unionfs@mail.fsl.cs.sunysb.edu
http://www.fsl.cs.sunysb.edu/mailman/listinfo/unionfs