The packet dump in comment #35 shows MAAS failing to respond to grub's request for the MAC-specific grub.cfg before grub times out, and then responding immediately to the request for the generic-amd64 grub.cfg. That clearly shows a race condition in MAAS.
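To make the timing concrete, here is a minimal sketch of that race as a toy model. It is not MAAS or grub code. The file names, the `ServerModel` class, and the latency and timeout numbers are assumptions chosen for illustration. The client asks for the MAC-specific file first and falls back to the generic one if no reply arrives before its timeout.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical file names, for illustration only.
GENERIC_CFG = "grub/grub.cfg-default-amd64"


def mac_cfg_path(mac: str) -> str:
    return f"grub/grub.cfg-{mac}"


@dataclass
class ServerModel:
    """Toy server: existing files are served fast, others are rendered on request."""
    render_latency: float         # assumed seconds to generate a per-MAC config
    static_latency: float = 0.01  # assumed seconds to serve a file that already exists
    static_files: Dict[str, str] = field(default_factory=dict)

    def respond(self, path: str) -> Tuple[float, str]:
        """Return (seconds until the reply is sent, file content)."""
        if path in self.static_files:
            return self.static_latency, self.static_files[path]
        # Anything else is treated as a config generated at request time.
        return self.render_latency, f"# rendered on request for {path}"


def fetch(server: ServerModel, path: str, client_timeout: float) -> Optional[str]:
    """Return the content only if the reply arrives within the client's timeout."""
    latency, content = server.respond(path)
    return content if latency <= client_timeout else None


def boot(server: ServerModel, mac: str, client_timeout: float) -> str:
    """Return the path of the config the client ends up loading."""
    for path in (mac_cfg_path(mac), GENERIC_CFG):
        if fetch(server, path, client_timeout) is not None:
            return path
    return "none"
```

With a 60-second render time and an assumed 30-second client timeout, `boot` returns the generic path, which matches the sequence in the packet dump. If the per-MAC file is already present in `static_files`, the same slow server delivers it in time, because nothing has to be generated while the client is waiting.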
MAAS generates the interface-specific grub config dynamically, and only after it receives the TFTP request for it. That design is susceptible to a race condition in which grub times out before MAAS can respond.

It is not the only possible design. All the information required for the interface-specific grub.cfg is available before the machine ever powers on, and the file could be made available on the rack controllers at that time too. Doing so would eliminate the race condition, or at least greatly reduce the opportunity for it, since we can see that MAAS has no problem responding immediately when it serves files that it does not need to generate at request time.

There is still some question about what in the environment is contributing to MAAS not responding faster, and about what MAAS is doing during the 60+ seconds it takes to respond to the request. That does not change the fact that the current MAAS design is racy, and that is a bug. Whatever we change in the environment to reduce the likelihood of hitting this issue there does not solve the underlying race condition in MAAS, and it leaves open the possibility of hitting the issue elsewhere too.

-- 
https://bugs.launchpad.net/bugs/1743249

Title:
  Failed Deployment after timeout trying to retrieve grub cfg
