Re: [systemd-devel] /proc and /sys get unmounted during boot from NFS, which results in boot error

2015-02-04 Thread Olaf Leidinger
Hey!

 Maybe something from your initrd is acting weird?

As far as I can tell, nothing from the initrd is running. It's
basically ArchLinux's mkinitcpio run on gentoo - thus, pretty standard stuff.

 It might be interesting to strace PID 1 while you are doing this. And
 if that doesn't help strace the other processes you run, to see if any
 of them unmounts anything.


Good idea! Actually, I found out some more stuff.

1. It's not calling ls /proc, it seems as if the auto-complete
   function of the running shell triggers the problem.

2. It's not restricted to auto-completing /proc, /home triggers it as
   well.

3. But it's not bash's fault, tcsh triggers it, too.

4. Auto-completing one level deeper, e.g. /usr/bin, doesn't trigger
   the unmount.


( note: I've two shells running: one started by the emergency shell and
  one started by the small dropbear server I manually started 
  for debugging purpose. Everything but dropbear is watched by strace.
  The problem, of course, also occurs without dropbear.)

$ strace -f -F -p 1 -p 179 -p191 -p177
Process 1 attached
Process 179 attached
Process 191 attached
Process 177 attached
[pid   177] wait4(-1,  unfinished ...
[pid   191] rt_sigsuspend([] unfinished ...
[pid   179] read(16,  unfinished ...
[pid 1] fstat(14, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
[pid 1] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 
-1, 0) = 0x7faf0bd52000
[pid 1] read(14, strace\n, 1024)  = 7
[pid 1] close(14)   = 0
[pid 1] munmap(0x7faf0bd52000, 4096) = 0
[pid 1] writev(3, [{31, 4}, {systemd, 7}, {[1]: , 5}, {Child 202 
(strace) died (code=ex..., 55}, {\n, 1}], 5) = 72
[pid 1] open(/proc/202/cgroup, O_RDONLY|O_CLOEXEC) = 14
[pid 1] fstat(14, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
[pid 1] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 
-1, 0) = 0x7faf0bd52000
[pid 1] read(14, 8:memory:/\n7:freezer:/\n6:devices..., 1024) = 101
[pid 1] close(14)   = 0
[pid 1] munmap(0x7faf0bd52000, 4096) = 0
[pid 1] waitid(P_PID, 202, {si_signo=SIGCHLD, si_code=CLD_EXITED, 
si_pid=202, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0
[pid 1] waitid(P_ALL, 0, {}, WNOHANG|WEXITED|WNOWAIT, NULL) = 0
[pid 1] epoll_wait(4,  unfinished ...
[pid   179] ... read resumed \n, 1) = 1

[...]

( this was the only output from systemd, so I guess it's not guilty).


 auto-completing /proc

[pid   179] rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
[pid   179] write(2, l, 1)= 1
[pid   179] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid   179] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid   179] read(0, s, 1) = 1
[pid   179] rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
[pid   179] write(2, s, 1)= 1
[pid   179] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid   179] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid   179] read(0,  , 1) = 1
[pid   179] rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
[pid   179] write(2,  , 1)= 1
[pid   179] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid   179] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid   179] read(0, /, 1) = 1
[pid   179] rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
[pid   179] write(2, /, 1)= 1
[pid   179] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid   179] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid   179] read(0, p, 1) = 1
[pid   179] rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
[pid   179] write(2, p, 1)= 1
[pid   179] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid   179] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid   179] read(0, r, 1) = 1
[pid   179] rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
[pid   179] write(2, r, 1)= 1
[pid   179] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid   179] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid   179] read(0, o, 1) = 1
[pid   179] rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
[pid   179] write(2, o, 1)= 1
[pid   179] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid   179] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid   179] read(0, \t, 1)= 1

### tab is pushed here
[pid   179] openat(AT_FDCWD, /, O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
[pid   179] getdents(3, /* 22 entries */, 32768) = 560
[pid   179] getdents(3, /* 0 entries */, 32768) = 0
[pid   179] close(3)= 0
[pid   179] stat(/proc, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid   179] rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
[pid   179] write(2, c/, 2)  



# after a reboot: autocomplete, that doesn't trigger
[pid   179] read(0, \t, 1)= 1
[pid   179] stat(/usr, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid   179] openat(AT_FDCWD, /usr/, 
O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
[pid   179] getdents(3, /* 16 entries */, 32768) = 464
[pid   179] getdents(3, /* 0 entries */, 

Re: [systemd-devel] /proc and /sys get unmounted during boot from NFS, which results in boot error

2015-02-04 Thread Lennart Poettering
On Wed, 04.02.15 15:29, Olaf Leidinger (ol...@mescharet.de) wrote:

 Hey!
 
  Maybe something from your initrd is acting weird?
 
 As far as I can tell, nothing from the initrd is running. It's
 basically ArchLinux's mkinitcpio run on gentoo - thus, pretty
 standard stuff.

ps should tell you what else is running when you run into this.

 So somehow these system calls must trigger the unmount. Probably some process 
 which 
 gets started by systemd performs similar system calls and thus
 spoils my boot.

The dumps you posted show no difference really...

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] /proc and /sys get unmounted during boot from NFS, which results in boot error

2015-02-03 Thread Lennart Poettering
On Tue, 03.02.15 23:00, Olaf Leidinger (ol...@mescharet.de) wrote:

 Dear systemd-devel list,
 
 I'm trying to debug the following problem:
 
 For some unknown reason, /proc and /sys get unmounted during boot
 from a NFS mounted rootfs. Booting to an emergency shell, I can observe
 them disappear by first calling mount (which reads from /proc due to 
 /etc/mtab 
 being a symlink) and then calling ls /proc. After calling ls, mount 
 obviously
 complains about not being able to read mtab, instead of listing the mounts as 
 before.
 
 A 2nd or 3rd call of mount before listing /proc works fine, too.

That's certainly weird. The log you pasted below doesn't show any
other commands running, which is particularly weird.

Maybe something from your initrd is acting weird?

systemd itself mounts /proc and /sys very early during boot, and then
never touches them anymore. They are excluded from the usual .mount
unit logic in order to ensure they never accidentally disappear...

 Booting with debug options yields no further information, no new
 messages appear while the debug shell is running and the messages
 before are not very interesting (as far as I can tell) [-- end of
 mail].

It might be interesting to strace PID 1 while you are doing this. And
if that doesn't help strace the other processes you run, to see if any
of them unmounts anything.

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel