There was further discussion in #ubuntu-devel and #ubuntu-release on
this review. See irclogs.ubuntu.com for details.

I have reviewed the patch carefully. As far as I can tell, it does what
it is supposed to do, and I don't see any bugs.

However, this code, particularly the udev_event_apply_format function,
seems to have grown organically and could be significantly improved
IMHO. This patch makes the state of the code worse, and increasingly
fragile. It took hours for me to review the patches for correctness as a
consequence, and I think that this and future changes and reviews are
far more prone to error than they need to be.

Some specifics:

1) The mechanism employed of redirecting s and l (destination buffer and
length remaining count) means that the code inside the redirection is
incredibly fragile. The code inside the switch should be pulled out into
a separate function. Then the redirection would become unnecessary. On
IRC, Dan acknowledged that this may require a large number of parameters
to be provided to the factored out function, but this is in part my
point. Unintended interactions could happen through any of those
parameters. Isolating them would be helpful, since that would identify
and reduce the interactions that need to be checked.

2) IMHO, the patch should use the sizeof(x)/sizeof(x[0]) idiom to get
the buffer lengths, rather than assuming that they are defined by the
same macro. I think likely that someone will attempt to change the array
length by assuming it is only necessary to change it in one place. Dan
suggested on IRC that the opposite point is that changing it from a
buffer to a pointer would cause the sizeof idiom to break, but I think
it's far more likely that a developer will check how a variable is used
before changing its type.

3) The upstream PR refers to tests, but does not add a test for this bug
being fixed. Given the fragility of this fix, I think we should do that.

We have to maintain this code for another four years in Xenial. If we
have to touch this function again in that time, I think we'll need to
spend even more time fixing and reviewing it.

I realise that the fix has already been accepted upstream. But as we (in
Ubuntu) originated the patch, I think we have a responsibility to make
it cleaner. Please refactor it and send that upstream.

I understand that this SRU is necessary to fix an important regression
in Ubuntu, so I don't think it's appropriate to block this given that I
can't find a bug in it. However, I've written this up to register my
unhappiness with this patch.

I appreciate the previous state of the code made it difficult to fix in
a better way; IMHO, this is a reason to refactor it first when sending
the development fix upstream. Perhaps the SRU would then need to have
been a different, more minimal and hacky patch, but that could be
focused on minimising regression risk and fixing just the problem at
hand rather than the general case.

If you add a test for this bug, please bundle it with the next SRU of
this package in Xenial.

I note that Zesty is not Fix Released. However, since this is a
regression fix, I don't think it would be appropriate to wait. Please
get the fix landed in Zesty as soon as possible; I certainly expect the
Zesty fix to be able to land well before we are ready to release this in
to Xenial/Yakkety -updates.

Given that this is a regression fix, we might expect to cut short the
usual 7-day aging period. However, given the fragility of the code
involved and because this is not a straight revert, I don't think that is
warranted in this case.


** Changed in: systemd (Ubuntu Xenial)
       Status: Confirmed => Fix Committed

** Tags added: verification-needed

** Tags removed: verification-needed
** Tags added: regression-update

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1647485

Title:
  NVMe symlinks broken by devices with spaces in model or serial strings

Status in systemd:
  New
Status in systemd package in Ubuntu:
  Confirmed
Status in systemd source package in Trusty:
  Confirmed
Status in systemd source package in Xenial:
  Fix Committed
Status in systemd source package in Yakkety:
  Fix Committed
Status in systemd source package in Zesty:
  Confirmed
Status in systemd package in Debian:
  New

Bug description:
  [Impact]

  After including the patch from bug 1642903, NVMe devices that include spaces 
in their model or serial strings result in incorrect symlinks, e.g. if the 
model string is "XYZ Corp NVMe drive" then instead of creating:
  /dev/disk/by-id/nvme-XYZ Corp NVMe drive_SERIAL -> ../../nvme0n1
  it creates:
  /dev/disk/by-id/nvme-XYZ -> ../../nvme0n1
  /dev/Corp -> nvme0n1
  /dev/NVMe -> nvme0n1
  /dev/drive_SERIAL -> nvme0n1

  This is because of the way udev handles the SYMLINK value strings; by
  default, it does not do any whitespace replacement.  To enable
  whitespace replacement of a symlink value, the rule must also include
  OPTIONS+="string_escape=replace".  This is done for 'md' and 'dm'
  devices in their rules.  However, there are no rules that actually
  want to specify multiple symlinks, and defaulting to not replacing
  whitespace makes no sense; instead, the default should be to replace
  all whitespace in each symlink value, unless the rule explicitly
  specifies OPTIONS+="string_escape=none".

  [Test Case]

  This assumes using udev with the patch from bug 1642903.

  Without this patch, when using a NVMe drive that contains spaces in
  its model and/or serial strings, check the /dev/disk/by-id/ directory.
  It should contain a partially-correct symlink to the NVMe drive, with
  the name up to the first space.  All following space-separated parts
  of the mode/serial string should have symlinks in the /dev/ directory.
  This is the incorrect behavior.

  With this patch, check the /dev/disk/by-id/ directory.  It should
  contain a fully-correct symlink to the NVMe drive, and no part of the
  drive's model/serial number string should be a link in the /dev
  directory.

  An example of the correct/incorrect naming is in the Impact section.

  There should be no other changes to any of the symlinks under /dev
  before and after this patch.  Typical locations for symlinks are
  /dev/, /dev/disk/by-name/, /dev/disk/by-id/, /dev/disk/by-uuid/,
  /dev/disk/by-label/

  [Regression Potential]

  Errors in udev rules can lead to an unbootable or otherwise completely
  broken system if they unintentionally break or clobber existing
  /dev/disks/ symlinks.

  [Other Info]

  This is also tracked with upstream systemd (udev) bug 4833:
  https://github.com/systemd/systemd/issues/4833

  Also note, this can be worked around in individual rules ONLY (i.e.
  not fixed for all rules) by appending OPTIONS+="string_escape=replace"
  to each of the NVMe rules with SYMLINK+="..." assignment, e.g.:

  KERNEL=="nvme*[0-9]n*[0-9]", ENV{DEVTYPE}=="disk", ATTRS{model}=="?*",
  ENV{ID_SERIAL_SHORT}=="?*",
  ENV{ID_SERIAL}="$attr{model}_$env{ID_SERIAL_SHORT}", SYMLINK+="disk
  /by-id/nvme-$env{ID_SERIAL}", OPTIONS+="string_escape=replace"

  Related bugs:
   * bug 1642903: introduce disk/by-id (model_serial) symlinks for NVMe drives
   * bug 1651602: NVMe driver regression for non-smp/1-cpu systems
   * bug 1649635: export nvme drive model/serial strings via sysfs (trusty)

To manage notifications about this bug go to:
https://bugs.launchpad.net/systemd/+bug/1647485/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to