** Description changed:

  When nbd-client is started from the initramfs to provide the root
  filesystem for a diskless boot, it is not persistent: it cannot
  reconnect after the connection is lost. nbd-client is started before
  run-init, so afterwards it can no longer find /dev/nbd0 and
  /sys/block/nbd0/pid, because run-init deleted them.
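  
  One way to check this from the running system is to look at a path
  through /proc/<pid>/root, which resolves it against the root directory
  of that process. The following is a minimal C sketch, not part of
  nbd-client; the helper names proc_root_path and visible_to_pid are made
  up for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build "/proc/<pid>/root<path>", i.e. <path> as resolved against the
 * root directory of process <pid>. Returns 0 on success, -1 if <path>
 * is not absolute or <buf> is too small. */
int proc_root_path(char *buf, size_t len, long pid, const char *path)
{
    if (path == NULL || path[0] != '/')
        return -1;
    int n = snprintf(buf, len, "/proc/%ld/root%s", pid, path);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Returns 1 if <path> exists from the point of view of <pid>, 0 if it
 * does not (or cannot be reached), -1 if the path could not be built.
 * Reading /proc/<pid>/root of another process usually requires root. */
int visible_to_pid(long pid, const char *path)
{
    char buf[4096];
    if (proc_root_path(buf, sizeof buf, pid, path) != 0)
        return -1;
    return access(buf, F_OK) == 0 ? 1 : 0;
}
```

  For the process in this report, visible_to_pid(359, "/dev/nbd0") would
  be expected to return 0 once run-init has finished, assuming the caller
  runs as root.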
  
  The nbd scripts for initramfs start nbd-client like this:
  
  @sbin/nbd-client NBD_SERVER_IP -N root /dev/nbd0 -swap -persist
  -systemd-mark
  
  The -persist option should allow nbd-client to reconnect if the TCP
  session is lost.
  
  Here is a sequence of steps to observe the behaviour:
  
  1. The system is booted ok. nbd-client is active:
  root       359  0.2  0.2   4372  2212 ?        SL   11:16   0:00 @sbin/nbd-client 10.4.104.4 -N root /dev/nbd0 -swap -persist -systemd-mark
  root       362  0.0  0.0      0     0 ?        S<   11:16   0:00 [nbd0]
  
  /dev/nbd0 exists:
  brw-rw---- 1 root disk 43, 0 Nov 24 09:19 /dev/nbd0
  
  and the nbd-client process uses it:
  root@host:~# ls -l /proc/359/fd/
  total 0
  lr-x------ 1 root root 64 Nov 24 09:20 0 -> /dev/null
  lrwx------ 1 root root 64 Nov 24 09:20 1 -> /dev/console (deleted)
  lrwx------ 1 root root 64 Nov 24 09:20 2 -> /dev/console (deleted)
  lrwx------ 1 root root 64 Nov 24 09:20 3 -> socket:[9447]
  lrwx------ 1 root root 64 Nov 24 09:20 4 -> /dev/nbd0
  
  2. If I restart the nbd-server, the nbd-client exits. The only way to
  see what is happening is to strace the nbd-client, but this has a side
  effect: when strace attaches, the ioctl returns, and nbd-client tries
  to reconnect and dies. So just by stracing the nbd-client I force a
  disconnect/reconnect without needing to restart nbd-server. My guess
  is that the same happens when I restart the nbd-server (I just cannot
  see it). Please observe the behavior:
  
- root@GTSRO-S-123456:~# strace -p 359
+ root@host:~# strace -p 359
  strace: Process 359 attached
  getpid()                                = 359
  write(2, "nbd,359: Kernel call returned: 1"..., 34) = 34
  close(3)                                = 0
  close(4)                                = 0
  write(2, " Reconnecting\n", 14)         = 14
  socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE) = 3
  bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
  getsockname(3, {sa_family=AF_NETLINK, pid=359, groups=00000000}, [12]) = 0
  sendto(3, "\24\0\0\0\26\0\1\3Y\2606X\0\0\0\0\0\0\0\0", 20, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
  recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"L\0\0\0\24\0\2\0Y\2606Xg\1\0\0\2\10\200\376\1\0\0\0\10\0\1\0\177\0\0\1"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 256
  recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"H\0\0\0\24\0\2\0Y\2606Xg\1\0\0\n\200\200\376\1\0\0\0\24\0\1\0\0\0\0\0"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 144
  recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\24\0\0\0\3\0\2\0Y\2606Xg\1\0\0\0\0\0\0", 4096}], msg_controllen=0, msg_flags=0}, 0) = 20
  close(3)                                = 0
  socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
  connect(3, {sa_family=AF_INET, sin_port=htons(10809), sin_addr=inet_addr("10.4.104.4")}, 16) = 0
  setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  
  !!!!! this is the problem:
  open("/dev/nbd0", O_RDWR)               = -1 ENOENT (No such file or directory)
  
  open("/etc/localtime", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4
  connect(4, {sa_family=AF_LOCAL, sun_path="/dev/log"}, 110) = -1 ENOENT (No such file or directory)
  close(4)                                = 0
  socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4
  connect(4, {sa_family=AF_LOCAL, sun_path="/dev/log"}, 110) = -1 ENOENT (No such file or directory)
  close(4)                                = 0
  write(2, "Error: Cannot open NBD: No such "..., 59) = 59
  exit_group(1)                           = ?
  +++ exited with 1 +++
  
  3. Now, if I have cached the /sbin/nbd-client file before running
  strace (with cat /sbin/nbd-client > /dev/null), I can start it again:
  root@host:~# /sbin/nbd-client 10.4.104.4 -N root /dev/nbd0 -swap -persist -systemd-mark
  Negotiation: ..size = 32765MB
  bs=1024, sz=34357604352 bytes
  
  4. If I strace the newly launched nbd-client, strace will cause a
  disconnect, as above, but this time the nbd-client will be able to open
  /dev/nbd0 again so it will not die:
  
  strace: Process 1851 attached
  write(2, "nbd,1851: Kernel call returned: "..., 35) = 35
  close(3)                                = 0
  close(4)                                = 0
  write(2, " Reconnecting\n", 14)         = 14
  socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE) = 3
  bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
  getsockname(3, {sa_family=AF_NETLINK, pid=1851, groups=00000000}, [12]) = 0
  sendto(3, "\24\0\0\0\26\0\1\3.\2646X\0\0\0\0\0\0\0\0", 20, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
  recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"L\0\0\0\24\0\2\0.\2646X;\7\0\0\2\10\200\376\1\0\0\0\10\0\1\0\177\0\0\1"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 256
  recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"H\0\0\0\24\0\2\0.\2646X;\7\0\0\n\200\200\376\1\0\0\0\24\0\1\0\0\0\0\0"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 144
  recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\24\0\0\0\3\0\2\0.\2646X;\7\0\0\0\0\0\0", 4096}], msg_controllen=0, msg_flags=0}, 0) = 20
  close(3)                                = 0
  socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
  connect(3, {sa_family=AF_INET, sin_port=htons(10809), sin_addr=inet_addr("10.4.104.4")}, 16) = 0
  setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  
  !!!!! here it works, because the process' VFS is the same as the real /
  open("/dev/nbd0", O_RDWR)               = 4
  
  write(1, "Negotiation: ", 13)           = 13
  read(3, "NBDMAGIC", 8)                  = 8
  write(1, ".", 1)                        = 1
  read(3, "IHAVEOPT", 8)                  = 8
  write(1, ".", 1)                        = 1
  read(3, "\0\3", 2)                      = 2
  write(3, "\0\0\0\3", 4)                 = 4
  write(3, "IHAVEOPT", 8)                 = 8
  write(3, "\0\0\0\1", 4)                 = 4
  write(3, "\0\0\0\4", 4)                 = 4
  write(3, "root", 4)                     = 4
  read(3, "\0\0\0\7\377\337p\0", 8)       = 8
  write(1, "size = 32765MB", 14)          = 14
  read(3, "\0\3", 2)                      = 2
  write(1, "\n", 1)                       = 1
  ioctl(4, NBD_SET_BLKSIZE, 0x1000)       = 0
  ioctl(4, NBD_SET_SIZE_BLOCKS, 0x7ffdf7) = 0
  ioctl(4, NBD_SET_BLKSIZE, 0x400)        = 0
  write(2, "bs=1024, sz=34357604352 bytes\n", 30) = 30
  ioctl(4, NBD_CLEAR_SOCK, 0x7f7daa224770) = 0
  ioctl(4, NBD_SET_FLAGS, 0x3)            = 0
  ioctl(4, BLKROSET, [1])                 = 0
  ioctl(4, NBD_SET_SOCK, 0x3)             = 0
  mlockall(MCL_CURRENT|MCL_FUTURE)        = 0
  rt_sigprocmask(SIG_SETMASK, ~[KILL PIPE TERM RTMIN RT_1], ~[KILL PIPE TERM STOP RTMIN RT_1], 8) = 0
  clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f7daa4419d0) = 2160
  ioctl(4, NBD_DO_IT
  [...]
  
  After I discussed this on the nbd-general mailing list, somebody
  suggested removing the close(nbddev) and the subsequent
  open(nbddev..) from the code. With that change the process no longer
  died, but the forked child, which tries to do some initialization
  magic, could not open the /sys filesystem either, so this workaround
  was not good enough.
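  
  The reason the suggestion helps the parent can be shown in isolation:
  an already-open descriptor keeps working after its path is gone, while
  opening the path again fails with ENOENT. The sketch below is only an
  analogy under stated assumptions. It uses a scratch file in /tmp and
  unlink() instead of a real NBD device and run-init, and the function
  name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Open a scratch file, remove its name, then try both ways of reaching
 * it. On return, *fd_ok is 1 if the already-open descriptor still accepts
 * a write, and *reopen_errno holds errno from opening the path again
 * (0 if that open unexpectedly succeeded). Returns 0 on success, -1 if
 * the scratch file could not be created. */
int fd_survives_missing_path(int *fd_ok, int *reopen_errno)
{
    char path[] = "/tmp/nbd-demo-XXXXXX";  /* scratch name, for illustration */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    unlink(path);                          /* the name is gone, like /dev/nbd0 */

    *fd_ok = (write(fd, "x", 1) == 1);     /* old descriptor: still usable */

    int again = open(path, O_RDWR);        /* what close() + open() relies on */
    *reopen_errno = (again < 0) ? errno : 0;
    if (again >= 0)
        close(again);

    close(fd);
    return 0;
}
```

  This only covers the device descriptor. It does not help a child that
  needs to open new paths under /sys, which matches the limitation
  described above.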
  
  There might be a clean way to make nbd-client reattach to the correct
  VFS tree, but I don't know it. What came to my mind was to make
  nbd-client mount devtmpfs and sysfs again in its own VFS tree, so it
  can continue. I have created this small patch, which makes nbd-client
  persistent when providing the root filesystem from initramfs. If
  there is a cleaner way to do it, I would gladly hear it. If not,
  please include this patch in future nbd-client releases. Not having
  persistence for the / filesystem on a diskless client is not nice at
  all. With this patch, my client switches from the main to the slave
  NBD server (in a corosync/pacemaker/drbd cluster) without any issue.
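  
  The idea of mounting devtmpfs and sysfs again can be sketched with
  mount(2) as follows. This is not the attached patch, only a minimal
  illustration of the approach. The function name ensure_visible and the
  choice of probe paths are assumptions, and mounting needs
  CAP_SYS_ADMIN.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <unistd.h>

/* If <probe> is missing, mount a filesystem of type <fstype> on <target>
 * so that it becomes visible again. Returns 0 if <probe> already exists,
 * 1 if a mount was performed, -1 on error (errno is set). */
int ensure_visible(const char *probe, const char *target, const char *fstype)
{
    if (access(probe, F_OK) == 0)
        return 0;                        /* already visible, nothing to do */
    if (errno != ENOENT)
        return -1;

    /* run-init may have removed the mount point itself. */
    if (mkdir(target, 0755) != 0 && errno != EEXIST)
        return -1;

    /* For devtmpfs and sysfs the source argument is only a label. */
    if (mount(fstype, target, fstype, 0, NULL) != 0)
        return -1;
    return 1;
}
```

  Before reopening the device, a client could call
  ensure_visible("/dev/nbd0", "/dev", "devtmpfs") and
  ensure_visible("/sys/block/nbd0/pid", "/sys", "sysfs"), and give up on
  reconnecting if either call returns -1.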

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1645048

Title:
  nbd-client when started from initramfs (for diskless boot) is not
  persistent

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nbd/+bug/1645048/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
