[Openstack] [Continuous-Integration] What else is running on the Jenkins slaves?
Folks,

A question for the CI side-of-the-house ...

What else is running on the Jenkins slaves, concurrently with the gating CI tests?

The background is the intermittent glance service launch failure - the recently added strace-on-failure logic reveals the issue to be an EADDRINUSE when the registry service listen socket is bound to a supposedly unused port.

Two possible explanations for this:

1. A race whereby some other process jumps in and grabs this port before the registry service is launched (the window of opportunity is not too narrow, as the API service is being launched in the meantime).

2. We identify the unused port by quickly opening and closing a socket on port zero - there could I guess be some lag in recycling the port, but this seems unlikely as no connections were established, hence no need for TIME_WAIT.

Option #1 seems the more likely, so I wanted to confirm there is indeed other port-grabbing stuff running on the Jenkins slaves.

Cheers,
Eoghan

___
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp
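For readers following along, the port-zero probe and the race in explanation 1 can be sketched roughly as below. This is only an illustration, not the actual glance test code: the helper names `get_unused_port` and `try_bind` are hypothetical, and the "squatter" socket merely stands in for whatever other process might grab the port inside the window.

```python
import errno
import socket


def get_unused_port(host="127.0.0.1"):
    """Hypothetical helper: bind to port 0 so the kernel picks a free port,
    read the port number back, then release the socket straight away."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, 0))
        return probe.getsockname()[1]
    finally:
        probe.close()


def try_bind(port, host="127.0.0.1"):
    """Hypothetical helper: bind and listen on the given port.
    Returns (socket, None) on success or (None, errno) on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock, None
    except OSError as exc:
        sock.close()
        return None, exc.errno


port = get_unused_port()

# The race window: nothing reserves the port between the probe closing
# and the real service binding. Simulate another process getting in first.
squatter, _ = try_bind(port)

# The service that was promised the "unused" port now fails to bind.
late, err = try_bind(port)
print("late bind failed with EADDRINUSE:", err == errno.EADDRINUSE)

if squatter is not None:
    squatter.close()
```

The probe only proves that the port was free at the moment of the probe, so the longer the gap before the real bind, the wider the window.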
Re: [Openstack] [Continuous-Integration] What else is running on the Jenkins slaves?
Hi Eoghan,

On 26/06/12 12:30, Eoghan Glynn wrote:
> A question for the CI side-of-the-house ...
>
> What else is running on the Jenkins slaves, concurrently with the gating CI tests?

Very basic things, not much other than the Jenkins Slave service and SSH. Nothing that should cause the conflicts that you are seeing. We also intentionally only run one test run per slave at a time.

> The background is the intermittent glance service launch failure - the recently added strace-on-failure logic reveals the issue to be an EADDRINUSE when the registry service listen socket is bound to a supposedly unused port.

Are you closing ports with SO_REUSEADDR? If the registry service or something else isn't, then I guess that could cause it.

Kind Regards
--
Andrew Hutchings - LinuxJedi - http://www.linuxjedi.co.uk/
Re: [Openstack] [Continuous-Integration] What else is running on the Jenkins slaves?
Thanks for the quick response ...

> Very basic things, not much other than the Jenkins Slave service and SSH. Nothing that should cause conflicts that you are seeing. We also intentionally only run one test run per slave at a time.

Interesting, it seems the alternate explanation of a lag-on-closure is the more likely in that case.

> Are you closing ports with SO_REUSEADDR? If the registry service or something else isn't then I guess that could cause it.

We do set SO_REUSEADDR on the registry server socket, but not on the dummy socket used to identify an unused port.

But I think setting SO_REUSEADDR on the latter would defeat the purpose of the dummy socket, by breaking the constraint that the port should be previously unused.

Cheers,
Eoghan
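As a rough illustration of the SO_REUSEADDR point, the sketch below sets the option on a listen socket before binding, the way the registry server socket is described as doing. The function name `open_listen_socket` is hypothetical and this is not the glance code. It also shows that, at least on Linux, SO_REUSEADDR does not let a second socket bind a port that already has an active listener, so the option would not mask a port that is genuinely held by another process.

```python
import errno
import socket


def open_listen_socket(port, host="127.0.0.1"):
    """Hypothetical helper: create a listen socket with SO_REUSEADDR set
    before bind(), as is usual for a server's listen socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(5)
        return sock
    except OSError:
        sock.close()
        raise


# First listener: let the kernel choose the port (port 0).
first = open_listen_socket(0)
port = first.getsockname()[1]

# Second listener on the same port, also with SO_REUSEADDR set.
# Assumption: Linux semantics, where an active listener still blocks the bind.
try:
    second = open_listen_socket(port)
    second_errno = None
    second.close()
except OSError as exc:
    second_errno = exc.errno

print("second bind refused with EADDRINUSE:", second_errno == errno.EADDRINUSE)
first.close()
```

What SO_REUSEADDR does relax is binding to a port whose earlier connections are still lingering in TIME_WAIT, which is the lag-on-closure case rather than the live-listener case shown here.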