======================
The State of the Soyuz
======================

Progress on Soyuz can be largely categorised into three items:

1. Feature work
2. Ongoing important bugs (most are tagged 'boobytrap')
3. Firefighting

I shall report on each of these below.

Feature Work
============

We've worked on two main features in the last few months.

Buildd-manager scalability
--------------------------

This feature is largely done barring any last-moment problems. The
buildd-manager has been extensively and invasively re-written to be cleaner,
clearer and, most importantly, fully asynchronous, which finally allows
events from all the builders to overlap. We also moved the build upload
processing to an external queue so it's not done in a blocking fashion inside
the manager itself.

The result is a lean, mean build farm which rarely sees the kind of massive
build queues seen in the past. There's a peak in queue length around 23:30
UTC each day when the daily recipe builds kick off, but these are dealt with
very swiftly now.

Derived distributions
---------------------

Work on derived distros is still in full swing. Approximately two thirds of
the UI is done (mostly the page that shows the differences between child and
parent series), but more changes are often being identified as necessary. A
design decision in the LEP to simultaneously open and initialise a new distro
series needs to be redesigned because the Ubuntu team wants to do these steps
separately now. We also need to add UI parts to show progress indications of
things like sync operations and diff requests.

The backend for asynchronously initialising a distroseries from a parent is
finished (thanks to Steve's hard work) and can be initiated from the API.
Initiating from the web UI won't be possible until the above redesign is done
and implemented.

The backend for doing sync operations is nearly finished, and Jelmer assures
me it will be done before he absconds to the Bazaar team in January!

In progress is the very complicated code that we need to determine the
differences between two distroseries.
This necessitated some changes to Gina so that we have access to the
changelog in the database, where it can be probed for releases that were
never separately imported.

Booby Trap Bugs
===============

Any bugs that will cause us to drop everything and brandish fire
extinguishers if they go off are tagged with 'boobytrap'. We've been making
fairly slow but steady progress fixing these (being a man down on the team
has not helped).

The main bugs that were fixed are to do with the publisher, which used to
hate uninitialised distroseries (fixing this has enabled Ubuntu to do early
opening of future series), the buildd-manager (which was all part of its
re-write), and package copying. Package copying bugs are a particular
annoyance since we've had a few that have made the publisher completely fail
and block all PPAs from getting published.

There are a few more of these in progress now, such as preventing files from
getting re-uploaded once they've been deleted (which has horrible knock-on
effects when people then copy those packages to other PPAs) and some
buildd-manager improvements to better tolerate transient builder/network
failures.

Finally, we've got a few publisher performance issues caused by a few
different bugs that end up with superseded/deleted sources that can never be
condemned for removal. We've got a good handle on those and they will be
fixed soon.

Firefighting
============

Soyuz has had an unfortunate number of production incidents over the last few
months. These were all either buildfarm issues or PPA publisher issues, both
of which are very high profile and high impact.

* 2010-06-17 - PPA publisher complete failure. This was caused by it trying
  to write an OOPS file to somewhere it didn't have permission to.
* 2010-08-12 - after the first stage of the buildd-manager re-write, it
  ended up not catching EINTR properly, which caused the running job to be
  instantly failed.
* 2010-10-07 - death row processing (removing condemned files) was failing
  and causing many PPAs to go over quota with no way of fixing that. It was
  caused by the Postgres 8.4 upgrade making a particular query an order of
  magnitude slower.
* 2010-10-28 - failure in the build farm to dispatch any builds, caused in
  part by the efforts to re-write the buildd-manager and hitting problems
  that don't occur in the test environment.
* 2010-11-17 - Apache returning "500" error when accessing Private PPAs.
  This was caused by the .htaccess files being written with incorrect
  permissions.

The Future
==========

Who knows what the future holds, other than goodbye Soyuz team, hello Squad
Red!

