Hello,

I think the various pieces around infra have stabilized enough for us to think about this. I suggest we set up a Gerrit replica in the cloud (in whichever clouds the CI consumes). This gives us a fallback option in case the cage has problems. It also gives us a good way to reduce the CI-related load on the main Gerrit server. In the near future, when we run distributed testing, we're going to clone 10 times as much as we do now. Right now we clone over git to take the load away from Gerrit, but once we have a replica, I vote we clone over HTTP(S).
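To make the fallback idea concrete, here is a rough sketch of how a CI job might prefer the replica and only touch the main Gerrit server if the replica is unreachable. This is an illustration, not a proposal for the actual job scripts: both URLs and the function names are made up, and the real jobs would need their own retry and cleanup logic.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical URLs: neither host name comes from this thread.
REPLICA_URL = "https://gerrit-replica.example.org/glusterfs"
PRIMARY_URL = "https://gerrit-primary.example.org/glusterfs"


def run_git_clone(url: str, dest: str) -> bool:
    """Run `git clone` for one URL and report whether it succeeded."""
    result = subprocess.run(["git", "clone", "--quiet", url, dest])
    return result.returncode == 0


def clone_with_fallback(
    urls: Sequence[str],
    dest: str,
    clone: Callable[[str, str], bool] = run_git_clone,
) -> str:
    """Try each URL in order and return the first one that cloned cleanly."""
    for url in urls:
        if clone(url, dest):
            return url
    raise RuntimeError("could not clone from any of: " + ", ".join(urls))


if __name__ == "__main__":
    # Replica first, so the main Gerrit server only sees CI traffic
    # when the replica is down.
    used = clone_with_fallback([REPLICA_URL, PRIMARY_URL], "glusterfs")
    print("cloned from", used)
```

Listing the replica first is what moves the CI load off the main server; the same list reversed would give the behaviour we'd want if the cage were the healthy side.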
I would also recommend an offsite PostgreSQL replica to give us some fault tolerance. If the cage has a multi-hour unexplained outage, we'd be able to bring essential services back up. This is a suggestion. We'll need to estimate the cost of the work involved plus the cost of operating both of these hot standbys.

-- nigelb
_______________________________________________
Gluster-infra mailing list
Gluster-infra@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-infra