Hi all,

This is an update on using the managed GitHub Windows runners for the BuildStream tests.
The high-level summary is that, sadly, whilst they might be a great alternative in the future, in my opinion they're currently too unreliable to use. My suggestion is that we look at provisioning our own runners and registering them as self-hosted runners on GitHub.

For me there are 3 main issues. Two are specific to the GitHub-managed Windows runners; the third is more general.

1. When there are suspected failures in the runner (as opposed to the job itself), debugging output is minimal. For an example, please see
https://pipelines.actions.githubusercontent.com/S7UY0H0LfuguOprY6emum9WfprZwRFz37GpdRckXdprjGWJcsM/_apis/pipelines/1/runs/35/signedlogcontent/23?urlExpires=2020-08-21T10%3A11%3A11.3474245Z&urlSigningMethod=HMACV1&urlSignature=kEhRelOH2ean%2BD%2FpMG9tpvI%2BYSYCcZhDYjWJuQHu2FY%3D
which is all I had to go on when attempting to debug the plugins tests running on WSL
(https://github.com/cphang99/buildstream/runs/1008287749?check_suite_focus=true).
This lack of logging output makes debugging issues difficult now, and I suspect it will remain difficult in the future.

2. Whatever platform is used to provision and run the infrastructure the VMs sit on, the tests themselves are flaky. To determine whether the test failures I was observing could be localised to a specific set of tests, I split the tests out into parallel jobs. Unfortunately, this showed that the same test run sometimes completes in 4-5 minutes, and at other times silently fails after 30 minutes have elapsed, with the failure mode described in 1) above. For an example, please compare
https://github.com/cphang99/buildstream/runs/1008374458?check_suite_focus=true#step:7:1
where "sources tests" completed in 5 minutes, with
https://github.com/cphang99/buildstream/runs/1008395831?check_suite_focus=true
which doesn't complete at all.

3. There are UI issues with GitHub Actions. With GitLab CI you can rerun individual jobs in a pipeline, with full traceability of that run and the previous reruns. With GitHub Actions, you can only rerun the entire pipeline, with no finer granularity, and previous reruns get overwritten. Chandan raised the point that a workaround for this is to wipe the whole config and add only the new job that you want to test. I agree that this is a workflow that would allow for more effective debugging.

In the meantime, I'll start work on porting over the overnight tests.

On this or anything else, thoughts are welcome!

Many thanks,
Chris
