On 2018-05-14 09:57:17 -0600 (-0600), Wesley Hayutin wrote:
> On Mon, May 14, 2018 at 10:36 AM Jeremy Stanley <fu...@yuggoth.org> wrote:
[...]
> > Couldn't a significant burst of new packages cause the same
> > symptoms even without it being tied to a minor version increase?
>
> Yes, certainly this could happen outside of a minor update of the
> baseos.

Thanks for confirming. So this is not specifically a CentOS minor
version increase issue, it's just more likely to occur at minor
version boundaries.

> So the only thing out of our control is the package set on the
> base nodepool image. If that suddenly gets updated with too many
> packages, then we have to scramble to ensure the images and
> containers are also updated.

It's still unclear to me why the packages on the test instance image
(i.e. the "container host") are related to the packages in the
container guest images at all. That would seem to be the whole point
of having containers?

> If there is a breaking change in the nodepool image for example
> [a], we have to react to and fix that as well.

I would argue that one is a terrible workaround which happened to
show its warts. We should fix DIB's pip-and-virtualenv element
rather than continue to rely on side effects of pinning RPM
versions. I've commented to that effect on
https://launchpad.net/bugs/1770298 just now.

> > It sounds like a problem with how the jobs are designed
> > and expectations around distros slowly trickling package updates
> > into the series without occasional larger bursts of package deltas.
> > I'd like to understand more about why you upgrade packages inside
> > your externally-produced container images at job runtime at all,
> > rather than relying on the package versions baked into them.
>
> We do that to ensure the gerrit review itself and its
> dependencies are built via rpm and injected into the build. If we
> did not do this the job would not be testing the change at all.
> This is a result of being a package based deployment for better or
> worse.
[...]

Now I'll risk jumping to proposing solutions, but have you
considered building those particular packages in containers too?
That way they're built against the same package versions as will be
present in the other container images you're using rather than
against the package versions on the host, right?
Seems like it would completely sidestep the problem.

> An enhancement could be to stage the new images for say one week
> or so. Do we need the CentOS updates immediately? Is there a
> possible path that does not create a lot of work for infra, but
> also provides some space for projects to prep for the consumption
> of the updates?
[...]

Nodepool builds new images constantly, but at least daily. Part of
this is to prevent the delta of available packages/indices and other
files baked into those images from being more than a day or so stale
at any given point in time. The older the image, the more packages
(on average) jobs will need to download if they want to test with
latest package versions and the more strain it will put on our
mirrors and on our bandwidth quotas/donors' networks.

There's also a question of retention, if we're building images at
least daily but keeping them around for 7 days (storage on the
builders, tenant quotas for Glance in our providers) as well as the
explosion of additional nodes we'd need since we pre-boot nodes with
each of our images (and the idea as I understand it is that you
would want jobs to be able to select between any of them). One
option, I suppose, would be to switch to building images weekly
instead of daily, but that only solves the storage and node count
problem not the additional bandwidth and mirror load. And of course,
nodepool would need to learn to be able to boot nodes from older
versions of an image on record which is not a feature it has right
now.

> Understood, I suspect this will become a more widespread issue as
> more projects start to use containers ( not sure ).

I'm still confused as to what makes this a container problem in the
general sense, rather than just a problem (leaky abstraction) with
how you've designed the job framework in which you're using them.
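
Going back to the retention point for a moment, here is a
back-of-the-envelope sketch of how the counts scale. It is only
illustrative: the function names, the 10 image types and the 2
pre-booted nodes per image are all made-up placeholders, not our
actual configuration, and it ignores the bandwidth and mirror load
entirely.

```python
# Rough sizing sketch; every number passed in below is a hypothetical
# placeholder, not a real deployment figure.

def retained_builds(builds_per_day: int, retention_days: int, image_types: int) -> int:
    """Image builds kept on record (builder storage, provider uploads)."""
    return builds_per_day * retention_days * image_types


def ready_nodes(selectable_versions: int, image_types: int, ready_per_image: int) -> int:
    """Pre-booted nodes needed if jobs may select any retained image version."""
    return selectable_versions * image_types * ready_per_image


if __name__ == "__main__":
    image_types = 10   # hypothetical number of distinct image types
    per_image = 2      # hypothetical pre-booted nodes per image version

    # Daily builds retained for 7 days.
    print(retained_builds(1, 7, image_types))      # 70
    # Jobs selecting among 7 retained versions versus only the latest.
    print(ready_nodes(7, image_types, per_image))  # 140
    print(ready_nodes(1, image_types, per_image))  # 20
```

Both figures grow linearly with the number of versions kept
selectable, which is why building weekly instead of daily helps with
storage and node count but does nothing for the download volume.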

> It's my understanding that there are some mechanisms in place to
> pin packages in the centos nodepool image so there has been some
> thoughts generally in the area of this issue.
[...]

If this is a reference back to bug 1770298, as mentioned already I
think that's a mistake in diskimage-builder's stdlib which should be
corrected, not a pattern we should propagate.
-- 
Jeremy Stanley
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev