I've got Compute functionality working with the OpenStack Jenkins plugin, so it can launch nova instances as on-demand slaves now, run builds on them, and archive the results into swift. I'd like to open GitHub issues to track your requirements, but I have a few questions.
> We need disposable machines that are only used for one test, which means spinning up and terminating hundreds of machines per day.

Sounds like we want a function to terminate the machine after the job has run.
https://github.com/platformlayer/openstack-jenkins/issues/1

> We need to use machines from multiple providers simultaneously so that we're resilient against errors with one provider.

Label expressions should work here; you would apply a full set of axis labels to each machine ("rax oneiric python26") but then you would filter based only on the required axes ("oneiric python26"). Are labels sufficient for this?

> We need to pull nodes from a pool of machines that have been spun up ahead of time for speed.

This sounds like a custom NodeProvisioner implementation. The current implementation is optimized around minimizing CPU hours, by doing load prediction. You have a different criterion: minimizing launch latency. It looks like it should be relatively easy to implement a new algorithm, although perhaps a bit tricky to figure out how to plug it in.
https://github.com/platformlayer/openstack-jenkins/issues/2

> We need to be able to select from different kinds of images for certain tests.

Are labels sufficient for this?

> Machines need to be checked for basic functionality before being added to the pool (we frequently get nodes without a functioning network).

I believe Jenkins does this anyway; a node which doesn't have networking won't be able to get the agent. And you can run your own scripts after the slave boots up ("apt-get install openjdk", for example). Those scripts can basically do any checks you want. Is that enough?

> They need to be started from snapshots with cached data on them to avoid false negatives from network problems.

Can you explain this a bit more? Is this to protect against the apt repositories / python sources / github repos being down? Would an http proxy be enough?
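To make the label question above concrete, here is a minimal Python sketch of the matching I have in mind. It models only AND-style expressions (every required label must be present on the node); it is not Jenkins' actual label parser. The node names, the "hp" provider label and the `matching_nodes` helper are all made up for illustration.

```python
def matching_nodes(nodes, required):
    """Return the names of nodes whose label set contains every required label."""
    needed = set(required.split())
    return sorted(
        name for name, labels in nodes.items()
        if needed <= set(labels.split())
    )


# Hypothetical nodes; each one carries the full set of axis labels.
nodes = {
    "rax-1": "rax oneiric python26",
    "hp-1": "hp oneiric python26",
    "rax-2": "rax natty python27",
}

# Filter only on the axes the job cares about; the provider label is ignored.
print(matching_nodes(nodes, "oneiric python26"))
```

A job restricted to "oneiric python26" could then land on a machine from either provider, which I think is the resilience you're after.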
> We need to keep them around after failures to diagnose problems, and we need to delete those after a certain amount of time.

From the github docs, it sounds like you don't get access anyway because of the providers' policies. Would it not therefore be better to take a ZIP or disk snapshot after a failed test, and then shut down the machine as normal?

Also... You currently auto-update your images, which is cool (devstack-update-vm-image). Do you think this is something a plugin should do, or do you think this is better done through scripts and a matrix job? I'm leaning towards keeping it in scripts. The one thing I think we definitely need here is some sort of 'best match' image launching, rather than hard-coding to a particular ID, so that the cloud plugin will always pick up the newest image.
https://github.com/platformlayer/openstack-jenkins/issues/3

Justin
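P.S. By 'best match' I mean something like the following Python sketch: given the images the cloud reports, pick the newest one whose name starts with a configured prefix, instead of a fixed ID. The record fields (`id`, `name`, `created`), the sample data and the `newest_matching_image` helper are assumptions for illustration, not the nova API.

```python
from datetime import datetime


def newest_matching_image(images, name_prefix):
    """Return the most recently created image whose name starts with
    name_prefix, or None if no image matches."""
    candidates = [img for img in images if img["name"].startswith(name_prefix)]
    return max(candidates, key=lambda img: img["created"], default=None)


# Made-up sample data standing in for whatever the cloud's image list returns.
images = [
    {"id": "a1", "name": "devstack-oneiric", "created": datetime(2012, 1, 10)},
    {"id": "b2", "name": "devstack-oneiric", "created": datetime(2012, 2, 3)},
    {"id": "c3", "name": "devstack-natty", "created": datetime(2012, 2, 20)},
]

best = newest_matching_image(images, "devstack-oneiric")
print(best["id"])
```

With something like this, the update script only has to publish a new image under the same name prefix, and the next slave launch would pick it up.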
_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp