On Mon, Jun 27, 2016 at 3:46 PM, Clay Gerrard <[email protected]> wrote:
> There's probably some minimal gain in cross compatibility testing to
> sticking with the status quo. The Swift API is old and stable, but I
> believe there was some bug in recent history where some return value in
> swiftclient changed from an iterable to a generator or something and some
> aggressive non-duck type checking broke something somewhere....
>
> I find that bug report sorta interesting, the reported memory pressure
> there doesn't make sense. Maybe there's some non-essential middleware
> configured on that proxy that's causing the workers to bloat up like
> that?

Swift proxy pipeline:

pipeline = catch_errors healthcheck cache ratelimit bulk tempurl formpost authtoken keystone staticweb proxy-logging proxy-server

Thanks for your help,

> -clayg
>
> On Mon, Jun 27, 2016 at 12:30 PM, Emilien Macchi <[email protected]> wrote:
>>
>> Hi,
>>
>> Today we're re-investigating a CI failure that we had multiple times [1]:
>> Swift memory usage grows until it is OOM-killed.
>>
>> The scope of this thread is our CI and not production
>> environments.
>> Indeed, our CI is running with limited resources while production
>> environments should not hit this problem.
>>
>> After some investigation on #tripleo, we found out this scenario was
>> happening almost every time since recently:
>>
>> * undercloud is deployed, glance and swift are running. Glance is
>> configured with Swift backend to store images.
>> * tripleo CI uploads the overcloud image into Glance, image is
>> successfully uploaded.
>> * when overcloud starts deploying, some nodes randomly fail to deploy
>> because the undercloud OOM-kills swift-proxy-server that is still
>> sending the overcloud image requested by Glance API. Swift fails,
>> Glance fails, overcloud deployment fails with a "No valid hosts
>> found".
>>
>> It's likely due to performance issues in our CI, and there is nothing
>> we can do but add more resources or reduce the number of
>> environments, something we won't do at this time, because of our recent
>> improvements in our CI (more ram, SSD, etc).
>>
>> As a first iteration, I propose [2] that we stop using Swift as a
>> backend for Glance. Indeed, our undercloud is currently single-node, I
>> see zero value in using Swift to store the overcloud image.
>> If there is a value, then we can add an option for whether or not
>> to use it (and set it to False in our CI to use file backend, which
>> won't lead to OOM).
>>
>> Note: on the overcloud: we currently support file, swift and rbd
>> backends, that you can easily select during your deployment.
>>
>> [1] https://bugs.launchpad.net/tripleo/+bug/1595916
>> [2] https://review.openstack.org/#/c/334555/
>> --
>> Emilien Macchi
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: [email protected]?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Emilien Macchi
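
For reference, the kind of breakage Clay mentions above (a return value changing from a list to a generator and tripping a strict type check) can be sketched roughly as below. This is only an illustration under stated assumptions: the function names and return values are hypothetical, they are not swiftclient's API, and the sketch does not reproduce the actual bug.

```python
def list_names_as_list():
    """Hypothetical client call, old behaviour: returns a concrete list."""
    return ["image-a", "image-b"]


def list_names_as_generator():
    """Same hypothetical call after a refactor: yields items lazily."""
    yield "image-a"
    yield "image-b"


def strict_count(names):
    """Non-duck-typed caller: accepts only a list, rejects a generator."""
    if not isinstance(names, list):
        raise TypeError("expected list, got %s" % type(names).__name__)
    return len(names)


def duck_count(names):
    """Duck-typed caller: only requires that the value can be iterated."""
    return sum(1 for _ in names)
```

The duck-typed caller keeps working with either return type, while the strict one raises TypeError as soon as the callee starts returning a generator.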
