One of the ramifications of *all* of the proposed 'release features when they are ready' workflows is more production rollouts. As such, I went over the proposed plan with James Troup looking for holes: we can't increase downtime, because we spend too much time down already :)
As a result I've filed a few RTs to get redundant instances (probably on the same servers) of things like the xmlrpc server, codehosting server etc, so we can keep one instance live and upgrade the other in parallel.

There are three key things that are not yet prepped for highly available rollouts:
 - cronscripts (probably including the job system)
 - buildd master/slaves
 - importds

I've filed a bug for the cronscripts as a whole and for the buildds - I had the temerity to mark these as high, since our ability to increase our velocity safely will be limited until those are fixed.

I don't know enough about the job system or the importd system to sensibly talk about highly available upgrades there yet. I'd love it if someone were to just file bugs / RTs as appropriate to get such a process in place - but failing that, I hope to discuss them with whoever knows most in the next day or two.

This effort ties into performance improvements as an enabler: the more quickly we can deploy improvements, the faster we can react to timeout issues, and thus the lower we can safely make the timeouts without causing extended downtime for users.

It's all about cycle time :)

-Rob

_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~launchpad-dev
More help   : https://help.launchpad.net/ListHelp

