@Steve,
On Thu, Feb 1, 2018 at 1:49 PM, Steve Langasek <[email protected]> wrote:
> On Thu, Feb 01, 2018 at 06:15:31PM -0000, Andres Rodriguez wrote:
> > @Jason,
> >
> > Packet 90573 doesn't seem to me as an indication of what you are
> > describing. What I see is this:
> >
> > 1. grub makes ~30 requests for PXE config on grub.cfg-<mac>, after which
> > it gives up because it didn't receive a response.
> >
> > 2. grub moves on and requests grub.cfg-default-amd64, and it receives a
> > response from MAAS.
> >
> > Now, the difference between the above, is that 1 does *database*
> > lookups, while 2 does not. In other words, 1 causes a request to obtain
> > the 'node' object based on the MAC to provide, and if grub is making 30+
> > requests, then this can definitely flood the db with requests.
>
> Then as I've said on IRC, this is a bug in maas, because 30 udp retries
> should not generate 30 requests to the database.
>
> GRUB is *not* wrong to retransmit its udp packets when it doesn't get a
> response. If each of these increases the load in MAAS, then MAAS should be
> fixed.
>
> The case where GRUB retrieves the same file multiple times is a GRUB bug,
> but I don't see any evidence linking this GRUB bug to the timeout and
> fallback problem in Jason's latest trace.

I agree with you if we are considering only this one system. But let's not
forget that other systems are booting at around the same time, and each of
the grub systems may be making at least 4 requests that may or may not be
answered right away. If requests are still being served while more requests
come in, I do see how the repeated requests could be causing the degraded
performance. Especially now that we've learned there are multiple VMs on the
same host, together consuming 18 CPUs on a 20-CPU system, while MAAS alone
runs 5 processes, for each of which we typically recommend a dedicated CPU.

> --
> You received this bug notification because you are subscribed to MAAS.
> https://bugs.launchpad.net/bugs/1743249
>
> Title:
>   Failed Deployment after timeout trying to retrieve grub cfg
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/maas/+bug/1743249/+subscriptions
>
> Launchpad-Notification-Type: bug
> Launchpad-Bug: product=maas; milestone=2.4.x; status=Incomplete;
>   importance=Undecided; assignee=None;
> Launchpad-Bug: distribution=ubuntu; sourcepackage=grub2; component=main;
>   status=New; importance=Undecided; assignee=None;
> Launchpad-Bug-Tags: cdo-qa cdo-qa-blocker foundations-engine
> Launchpad-Bug-Information-Type: Public
> Launchpad-Bug-Private: no
> Launchpad-Bug-Security-Vulnerability: no
> Launchpad-Bug-Commenters: andreserl blake-rouse cgregan jason-hobbs vorlon
> Launchpad-Bug-Reporter: Jason Hobbs (jason-hobbs)
> Launchpad-Bug-Modifier: Steve Langasek (vorlon)
> Launchpad-Message-Rationale: Subscriber (MAAS)
> Launchpad-Message-For: andreserl
>

--
Andres Rodriguez (RoAkSoAx)
Ubuntu Server Developer
MSc. Telecom & Networking
Systems Engineer
