Some observations:
* growpart calls sfdisk without the --no-tell-kernel option, so sfdisk does
  notify the kernel about partition changes
* growpart later calls partx, which may be redundant and cause no changes or
  events
* as a side note, partprobe and blockdev --rereadpt can also be used to reread
  partition tables; I'm not sure what the difference between them is
* growpart does not take an exclusive lock on the device, which leaves it open
  to races with udev events (sgdisk is known to be racy in this way)

IMHO the sequence of commands should be:
* take an flock on the device, to neutralise udev
* modify the device with sfdisk
* reread the partition table (I would say with blockdev --rereadpt, rather
  than partx/partprobe)
* release the flock
* udevadm trigger --action=add --settle device (or trigger && settle)

This ensures that no udev events are processed for the device while we
are modifying it and rereading its partition table. We then release the
lock, by which point everything should be quiet and steady: trigger,
settle, done.
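
The sequence above could be sketched roughly as follows in Python. This is
only an illustration under assumptions, not growpart's code: exclusive_lock
and locked_resize are hypothetical names, the sfdisk command line is left to
the caller, and --settle is used as the udevadm trigger option that waits for
the triggered events.

```python
import fcntl
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Hold an exclusive flock(2) on `path` for the duration of the block."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)


def locked_resize(disk, modify_cmd, run=subprocess.check_call):
    """Run the proposed sequence against the whole-disk device `disk`.

    `modify_cmd` is the full partitioning command as an argument list
    (hypothetical; the caller decides how sfdisk is invoked).
    `run` executes one command and is injectable for testing.
    """
    with exclusive_lock(disk):
        # While the lock is held, udevd skips event handling for the disk.
        run(list(modify_cmd))
        run(["blockdev", "--rereadpt", disk])
    # Lock released: replay the events and wait for them to be handled.
    run(["udevadm", "trigger", "--action=add", "--settle", disk])
```

A call might look like locked_resize("/dev/sdc", ["sfdisk", "/dev/sdc"]),
mirroring the flock example in the sfdisk man page. The lock is taken on the
whole-disk device rather than on a partition, since that is what the man page
says udevd honours.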


See:
       sfdisk uses BLKRRPART (reread partition table) ioctl to make sure that
       the device is not used by system or another tools (see also
       --no-reread). It's possible that this feature or another sfdisk
       activity races with udevd. The recommended way how to avoid possible
       collisions is to use exclusive flock for the whole-disk device to
       serialize device access. The exclusive lock will cause udevd to skip
       the event handling on the device. For example:

              flock /dev/sdc sfdisk /dev/sdc

       Note, this semantic is not currently supported by udevd for MD and DM
       devices.

at http://manpages.ubuntu.com/manpages/eoan/en/man8/sfdisk.8.html

https://bugs.launchpad.net/bugs/1834875

Title:
  cloud-init growpart race with udev

Status in cloud-init:
  Incomplete
Status in cloud-utils:
  New
Status in linux-azure package in Ubuntu:
  New
Status in systemd package in Ubuntu:
  New

Bug description:
  On Azure, cloud-init's growpart module regularly (20-30% of the time)
  fails to extend the partition to full size.

  For example:

  ========================================

  2019-06-28 12:24:18,666 - util.py[DEBUG]: Running command ['growpart', '--dry-run', '/dev/sda', '1'] with allowed return codes [0] (shell=False, capture=True)
  2019-06-28 12:24:19,157 - util.py[DEBUG]: Running command ['growpart', '/dev/sda', '1'] with allowed return codes [0] (shell=False, capture=True)
  2019-06-28 12:24:19,726 - util.py[DEBUG]: resize_devices took 1.075 seconds
  2019-06-28 12:24:19,726 - handlers.py[DEBUG]: finish: init-network/config-growpart: FAIL: running config-growpart with frequency always
  2019-06-28 12:24:19,727 - util.py[WARNING]: Running module growpart (<module 'cloudinit.config.cc_growpart' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py'>) failed
  2019-06-28 12:24:19,727 - util.py[DEBUG]: Running module growpart (<module 'cloudinit.config.cc_growpart' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py'>) failed
  Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 812, in _run_modules
      freq=freq)
    File "/usr/lib/python3/dist-packages/cloudinit/cloud.py", line 54, in run
      return self._runners.run(name, functor, args, freq, clear_on_fail)
    File "/usr/lib/python3/dist-packages/cloudinit/helpers.py", line 187, in run
      results = functor(*args)
    File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 351, in handle
      func=resize_devices, args=(resizer, devices))
    File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2521, in log_time
      ret = func(*args, **kwargs)
    File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 298, in resize_devices
      (old, new) = resizer.resize(disk, ptnum, blockdev)
    File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 159, in resize
      return (before, get_size(partdev))
    File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 198, in get_size
      fd = os.open(filename, os.O_RDONLY)
  FileNotFoundError: [Errno 2] No such file or directory: '/dev/disk/by-partuuid/a5f2b49f-abd6-427f-bbc4-ba5559235cf3'

  ========================================

  @rcj suggested this is a race with udev. This seems to only happen on
  Cosmic and later.
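
  The traceback ends in an os.open() on a /dev/disk/by-partuuid symlink
  that is missing at that moment. To illustrate how a caller could
  tolerate a symlink that disappears briefly, here is a minimal Python
  sketch. It is an assumption-laden illustration, not cloud-init's code
  or its fix: get_size_with_retry and its timeout and interval parameters
  are hypothetical, and polling only works around the race rather than
  removing it.

```python
import os
import time


def get_size_with_retry(path, timeout=5.0, interval=0.1):
    """Return the size of `path` in bytes, retrying while it does not exist.

    Hypothetical helper: keeps retrying on FileNotFoundError until
    `timeout` seconds have passed, then re-raises the error.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(path, os.O_RDONLY)
        except FileNotFoundError:
            if time.monotonic() >= deadline:
                raise
            # The symlink may be recreated shortly; wait and try again.
            time.sleep(interval)
            continue
        try:
            # Seeking to the end yields the size, also for block devices.
            return os.lseek(fd, 0, os.SEEK_END)
        finally:
            os.close(fd)
```

  If the path never appears within the timeout, the original
  FileNotFoundError is still raised, so a genuine failure is not hidden.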
