It sounds like we merged a bunch of patches last night thanks to the revert, so I went ahead and restored/rechecked everything that had been taken out of the gate. I've checked and nothing was left over, but let me know in case I missed something. I'll keep updating this thread with the progress made to improve the situation. So from now on, the situation is back to "normal": recheck/+W is OK.
Thanks again for your patience,

On Wed, Jun 13, 2018 at 10:39 PM, Emilien Macchi <emil...@redhat.com> wrote:

> https://review.openstack.org/575264 just landed (and didn't time out in
> check or gate without a recheck, so that's a good sign it helped to mitigate).
>
> I've restored and rechecked some patches that I evacuated from the gate,
> please do not restore others or recheck or approve anything for now, and
> let's see how it goes with a few patches.
> We're still working with Steve on his patches to optimize the way we
> deploy containers on the registry and are investigating how we could make
> it faster with a proxy.
>
> Stay tuned and thanks for your patience.
>
> On Wed, Jun 13, 2018 at 5:50 PM, Emilien Macchi <emil...@redhat.com>
> wrote:
>
>> TL;DR: gate queue was 25h+, we put all patches from gate on standby, do
>> not restore/recheck until further announcement.
>>
>> We recently enabled the containerized undercloud for multinode jobs and
>> we believe this was a bit premature, as the container download process
>> wasn't optimized yet, so it still pulls the same containers from the
>> mirrors multiple times.
>> It caused the job runtime to increase, and the docker.io mirrors hosted
>> by OpenStack Infra were probably a bit slower under the load of providing
>> the same containers multiple times. The time taken to prepare containers
>> on the undercloud and then for the overcloud caused the jobs to time out
>> randomly and therefore the gate to fail a high number of times, so we
>> decided to remove all jobs from the gate by abandoning the patches
>> temporarily (I have them in my browser and will restore when things are
>> stable again, please do not touch anything).
>>
>> Steve Baker has been working on a series of patches that optimize the way
>> we prepare the containers, but basically the workflow will be:
>> - pull containers needed for the undercloud into a local registry, using
>>   infra mirror if available
>> - deploy the containerized undercloud
>> - pull containers needed for the overcloud minus the ones already pulled
>>   for the undercloud, using infra mirror if available
>> - update containers on the overcloud
>> - deploy the overcloud
>>
>> With that process, we hope to reduce the runtime of the deployment and
>> therefore reduce the timeouts in the gate.
>> To enable it, we need to land, in this order:
>> https://review.openstack.org/#/c/571613/,
>> https://review.openstack.org/#/c/574485/,
>> https://review.openstack.org/#/c/571631/ and
>> https://review.openstack.org/#/c/568403.
>>
>> In the meantime, we are disabling the containerized undercloud recently
>> enabled on all scenarios (https://review.openstack.org/#/c/575264/) as a
>> mitigation, with the hope of stabilizing things until Steve's patches land.
>> Hopefully, we can merge Steve's work tonight/tomorrow and re-enable the
>> containerized undercloud on scenarios after checking that we don't have
>> timeouts and that deployment runtimes are reasonable.
>>
>> That's the plan we came up with; if you have any questions / feedback
>> please share them.
>> --
>> Emilien, Steve and Wes
>
> --
> Emilien Macchi

--
Emilien Macchi
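
For readers following the thread, the pull steps of the workflow quoted above (pull for the undercloud first, then pull only what the overcloud still needs, using the infra mirror if available) could be sketched roughly like this. This is only an illustration of the idea, not the code from Steve's patches: the function names, the image names and the mirror address are all hypothetical.

```python
from typing import Iterable, List, Optional, Tuple


def image_source(image: str, mirror: Optional[str]) -> str:
    """Return the pull location: the mirror if one is available, else docker.io."""
    registry = mirror if mirror else "docker.io"
    return f"{registry}/{image}"


def plan_pulls(
    undercloud: Iterable[str],
    overcloud: Iterable[str],
    mirror: Optional[str] = None,
) -> Tuple[List[str], List[str]]:
    """Plan two pull phases so that no image is pulled twice.

    Phase 1 covers every image the undercloud needs.
    Phase 2 covers the overcloud images minus the ones already pulled in phase 1.
    """
    # dict.fromkeys() drops duplicates while keeping the original order.
    undercloud_images = list(dict.fromkeys(undercloud))
    already_pulled = set(undercloud_images)
    overcloud_images = [
        img for img in dict.fromkeys(overcloud) if img not in already_pulled
    ]
    return (
        [image_source(img, mirror) for img in undercloud_images],
        [image_source(img, mirror) for img in overcloud_images],
    )


# Hypothetical image names and mirror address, for illustration only.
phase1, phase2 = plan_pulls(
    ["tripleo/keystone", "tripleo/nova-api"],
    ["tripleo/keystone", "tripleo/nova-compute"],
    mirror="mirror.example.org:8082",
)
print(phase1)
print(phase2)
```

In this sketch the shared image (`tripleo/keystone`) appears only in the first phase, which is the point of the "minus the ones already pulled" step: each container is fetched from the mirror once.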