I'm kind of hijacking Dan's e-mail, but I would like to propose some technical improvements to reduce the number of CI failures.

1/ Stop creating swap files.

We don't have SSDs, and IMHO it is a terrible mistake to swap to files just because we don't have enough RAM. In my experience, swapping on non-SSD disks is even worse than not having enough RAM. I think we should stop doing that.

2/ Split CI jobs into scenarios.

Currently we have CI jobs for ceph, HA, non-ha and containers, and the current situation is that jobs fail randomly due to performance issues.

Puppet OpenStack CI had the same issue: we had one integration job and we never stopped adding more services to it, until everything became *very* unstable. We solved that issue by splitting the job and creating scenarios:

https://github.com/openstack/puppet-openstack-integration#description

What I propose is to split the TripleO jobs into more jobs, each with fewer services.

The benefits:

* more services coverage
* jobs will run faster
* fewer random issues due to bad performance

The cost is of course that it will consume more resources. That's why I suggest 3/.

We could have:

* HA job with ceph and a full compute scenario (glance, nova, cinder, ceilometer, aodh & gnocchi).
* Same with IPv6 & SSL.
* HA job without ceph and a full compute scenario too.
* HA job without ceph and basic compute (glance and nova), with extra services like Trove, Sahara, etc.
* ...

(note: all jobs would have network isolation, which is to me a requirement when testing an installer like TripleO).

3/ Drop the non-ha job.

I'm not sure why we have it, or what the benefit of testing it is compared to HA.

Any comment / feedback is welcome,
-- 
Emilien Macchi
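
PS: to make 2/ a bit more concrete, here is a rough sketch of the scenario list above written as a small Python matrix. It only shows how splitting keeps each job small while the overall coverage grows. The job names, the field names (ceph, ipv6_ssl, services) and the two helper functions are made up for illustration; they are not existing TripleO CI definitions, and only the services named above are listed.

```python
# Hypothetical scenario matrix: names and fields are illustrative only,
# not real TripleO CI job definitions.
FULL_COMPUTE = {"glance", "nova", "cinder", "ceilometer", "aodh", "gnocchi"}
BASIC_COMPUTE = {"glance", "nova"}

SCENARIOS = {
    "ha-ceph-full": {
        "ceph": True, "ipv6_ssl": False, "services": FULL_COMPUTE,
    },
    "ha-ceph-full-ipv6-ssl": {
        "ceph": True, "ipv6_ssl": True, "services": FULL_COMPUTE,
    },
    "ha-noceph-full": {
        "ceph": False, "ipv6_ssl": False, "services": FULL_COMPUTE,
    },
    "ha-noceph-basic-extra": {
        "ceph": False, "ipv6_ssl": False,
        "services": BASIC_COMPUTE | {"trove", "sahara"},
    },
}


def coverage(scenarios):
    """Return the set of services exercised by at least one job."""
    covered = set()
    for scenario in scenarios.values():
        covered |= scenario["services"]
    return covered


def largest_job(scenarios):
    """Return the number of services in the heaviest single job."""
    return max(len(s["services"]) for s in scenarios.values())


if __name__ == "__main__":
    print("services covered:", sorted(coverage(SCENARIOS)))
    print("largest job size:", largest_job(SCENARIOS))
```

With this layout the four jobs cover eight services between them, while no single job deploys more than six. A new service can then go into the smallest scenario instead of making every job heavier.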
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev