Ryan,

We believe this is a bug, as we expect curtin to wipe the disks. In this case it fails to wipe them, which occasionally causes issues with our automation deploying Ceph on those disks. The underlying problem may be a race condition in LVM when wiping a large number of disks/VGs/LVs sequentially.

To clarify my previous testing: I was mistaken in thinking that MAAS used the commissioning OS as the ephemeral OS to deploy from. That is not the case; MAAS uses the specified deployment OS as the ephemeral image. This means all of my previous testing was actually done with Bionic on the 4.15 kernel, which supports the race-condition theory: the error does not always reproduce, and it was just a coincidence that I was changing the commissioning OS at the time.
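For what it's worth, one way our automation could tolerate this kind of transient LVM race is to retry the removal with a short back-off. This is only an illustrative sketch of that idea, not curtin's actual code; the `retry` helper and the volume-group argument are assumptions for the example.

```shell
#!/bin/sh
# Hypothetical retry wrapper: run a command up to $1 times, backing off
# between attempts, to ride out transient "device busy" style failures.
retry() {
    tries=$1; shift
    n=0
    until "$@"; do
        n=$((n + 1))
        [ "$n" -ge "$tries" ] && return 1
        sleep 1   # back off; in practice one might also run `udevadm settle` here
    done
    return 0
}

# Example (commented out): remove every LV in a VG, retrying each up to
# 3 times. "$1" would be the volume group name.
# for lv in $(lvs --noheadings -o lv_path "$1"); do
#     retry 3 lvremove --force "$lv" || echo "failed to remove $lv" >&2
# done
```

Even with a wrapper like this, the real fix is for curtin to detect and surface the lvremove failure rather than continuing silently.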
I tested this morning and was able to reproduce the issue with Bionic 4.15 and Xenial 4.4; however, I have yet to reproduce it using either the Bionic or Xenial HWE kernels. I will upload the curtin logs and config from my reproducer now.

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1871874

Title:
  lvremove occasionally fails on nodes with multiple volumes and curtin does not catch the failure

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/curtin/+bug/1871874/+subscriptions