Re: MIgrating source code and CI

Christopher Phang Fri, 21 Aug 2020 03:46:54 -0700

Hi all,

This is an update on using the GitHub-managed Windows runners
for BuildStream tests.

The high-level summary is that, sadly, whilst they might be a great alternative
for the future, in my opinion they're too unreliable to use currently. My
suggestion would be that we look at provisioning our own runners and
registering them as self-hosted runners on GitHub.

For me there are three main issues. Two are specific to the GitHub-managed
Windows runners; the third is more general.

1. When there are suspected failures in the runner (as opposed to the job
itself) debugging output is minimal.

For an example please see

https://pipelines.actions.githubusercontent.com/S7UY0H0LfuguOprY6emum9WfprZwRFz37GpdRckXdprjGWJcsM/_apis/pipelines/1/runs/35/signedlogcontent/23?urlExpires=2020-08-21T10%3A11%3A11.3474245Z&urlSigningMethod=HMACV1&urlSignature=kEhRelOH2ean%2BD%2FpMG9tpvI%2BYSYCcZhDYjWJuQHu2FY%3D

from an attempt to debug the plugins tests running on WSL
(https://github.com/cphang99/buildstream/runs/1008287749?check_suite_focus=true)

This lack of logging output means debugging issues is difficult now, and I
suspect difficult in the future.

2. Whatever platform is used to provision and run the infrastructure underneath
the VMs, the test runs themselves are flaky.

To determine whether the test failures I was observing could be localised to a
specific set of tests, I split the tests out into parallel jobs. Unfortunately,
this showed that a test run sometimes completes in 4-5 minutes, and at other
times fails silently after 30 minutes have elapsed, with the failure mode
described in 1).

For an example please compare:

https://github.com/cphang99/buildstream/runs/1008374458?check_suite_focus=true#step:7:1
with "sources tests" completed in 5 minutes

https://github.com/cphang99/buildstream/runs/1008395831?check_suite_focus=true
which doesn't complete at all
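
A minimal sketch of the kind of split described above, for illustration only.
It is not the job configuration actually used in these runs: the function name
shard_tests and the directory names are hypothetical. It distributes test
directories round-robin so that each parallel job runs a fixed, reproducible
subset, which makes it easier to see whether a failure follows a particular
group of tests or the runner itself.

```python
from typing import Dict, List


def shard_tests(test_dirs: List[str], num_jobs: int) -> Dict[int, List[str]]:
    """Distribute test directories round-robin across num_jobs parallel jobs.

    The input is sorted first so that the same directory always lands in the
    same job, regardless of the order in which directories were discovered.
    """
    if num_jobs < 1:
        raise ValueError("num_jobs must be at least 1")
    shards: Dict[int, List[str]] = {job: [] for job in range(num_jobs)}
    for index, test_dir in enumerate(sorted(test_dirs)):
        shards[index % num_jobs].append(test_dir)
    return shards


if __name__ == "__main__":
    # Hypothetical directory names, used purely as an example.
    example_dirs = ["tests/sources", "tests/plugins", "tests/frontend"]
    for job, dirs in shard_tests(example_dirs, 2).items():
        print(f"job {job}: {' '.join(dirs)}")
```

Each job would then pass only its own list to the test runner. If a job with a
given subset passes on one run and hangs on the next, that points at the runner
rather than at the tests in that subset.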

3. There are UI issues with GitHub Actions.

With GitLab CI you can rerun individual jobs in a pipeline, with full
traceability of that job and its previous reruns.

With GitHub Actions, you can only rerun the entire pipeline; there is no finer
granularity, and previous reruns get overwritten.

Chandan raised the point that a workaround for this is to wipe the whole config
and add only the new job that you want to test. I agree that this is a workflow
that would allow for more effective debugging.

In the meantime, I'll start work on porting over the overnight tests.

On this or anything else, thoughts are welcome!

Many thanks,

Chris
