On 2/23/21 7:09 PM, Cleber Rosa wrote:
> On Tue, Feb 23, 2021 at 06:34:07PM +0100, Philippe Mathieu-Daudé wrote:
>> On 2/23/21 6:24 PM, Philippe Mathieu-Daudé wrote:
>>> On 2/23/21 5:47 PM, Cleber Rosa wrote:
>>>> On Tue, Feb 23, 2021 at 05:37:04PM +0100, Philippe Mathieu-Daudé wrote:
>>>>> On 2/23/21 12:25 PM, Thomas Huth wrote:
>>>>>> On 19/02/2021 22.58, Cleber Rosa wrote:
>>>>>>> As described in the included documentation, the "custom runner" jobs
>>>>>>> extend the GitLab CI jobs already in place.  One of their primary
>>>>>>> goals is catching and preventing regressions on a wider number of
>>>>>>> host systems than the ones provided by GitLab's shared runners.
>>>>>>>
>>>>>>> This sets the stage in which other community members can add their own
>>>>>>> machine configuration documentation/scripts, and accompanying job
>>>>>>> definitions.  As a general rule, those newly added contributed jobs
>>>>>>> should run as "non-gating", until their reliability is verified (AKA
>>>>>>> "allow_failure: true").
>>>>>>>
>>>>>>> Signed-off-by: Cleber Rosa <cr...@redhat.com>
>>>>>>> ---
>>>>>>>  .gitlab-ci.d/custom-runners.yml | 14 ++++++++++++++
>>>>>>>  .gitlab-ci.yml                  |  1 +
>>>>>>>  docs/devel/ci.rst               | 28 ++++++++++++++++++++++++++++
>>>>>>>  docs/devel/index.rst            |  1 +
>>>>>>>  4 files changed, 44 insertions(+)
>>>>>>>  create mode 100644 .gitlab-ci.d/custom-runners.yml
>>>>>>>  create mode 100644 docs/devel/ci.rst
>>>>>>>
>>>>>>> diff --git a/.gitlab-ci.d/custom-runners.yml b/.gitlab-ci.d/custom-runners.yml
>>>>>>> new file mode 100644
>>>>>>> index 0000000000..3004da2bda
>>>>>>> --- /dev/null
>>>>>>> +++ b/.gitlab-ci.d/custom-runners.yml
>>>>>>> @@ -0,0 +1,14 @@
>>>>>>> +# The CI jobs defined here require GitLab runners installed and
>>>>>>> +# registered on machines that match their operating system names,
>>>>>>> +# versions and architectures.  This is in contrast to the other CI
>>>>>>> +# jobs that are intended to run on GitLab's "shared" runners.
>>>>>>> +
>>>>>>> +# Different than the default approach on "shared" runners, based on
>>>>>>> +# containers, the custom runners have no such *requirement*, as those
>>>>>>> +# jobs should be capable of running on operating systems with no
>>>>>>> +# compatible container implementation, or no support from
>>>>>>> +# gitlab-runner.  To avoid problems that gitlab-runner can cause while
>>>>>>> +# reusing the GIT repository, let's enable the recursive submodule
>>>>>>> +# strategy.
>>>>>>> +variables:
>>>>>>> +  GIT_SUBMODULE_STRATEGY: recursive
>>>>>>
>>>>>> Is it really necessary? I thought our configure script would take care
>>>>>> of the submodules?
>>>>>
>>>>
>>>> I've done a lot of testing on bare metal systems, and the problems
>>>> that come from reusing the same system and failed cleanups can be very
>>>> frustrating.  It's unfortunate that we need this, but it was the
>>>> simplest and most reliable solution I found. :/
>>>>
>>>> Having said that, I noticed after I posted this series that this is
>>>> affecting all other jobs.  We don't need it in the jobs based on
>>>> containers (for obvious reasons), so I see two options:
>>>>
>>>> 1) have it enabled on all jobs for consistency
>>>>
>>>> 2) have it enabled only on jobs that will reuse the repo
>>>>
>>>>> Well, if there is a failure during the first clone (I got one network
>>>>> timeout in the middle)
>>>
>>> [This network failure is pasted at the end]
>>>
>>>>> then next time it doesn't work:
>>>>>
>>>>> Updating/initializing submodules recursively...
>>>>> Synchronizing submodule url for 'capstone'
>>>>> Synchronizing submodule url for 'dtc'
>>>>> Synchronizing submodule url for 'meson'
>>>>> Synchronizing submodule url for 'roms/QemuMacDrivers'
>>>>> Synchronizing submodule url for 'roms/SLOF'
>>>>> Synchronizing submodule url for 'roms/edk2'
>>>>> Synchronizing submodule url for 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
>>>>> Synchronizing submodule url for 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli'
>>>>> Synchronizing submodule url for 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/esaxx'
>>>>> Synchronizing submodule url for 'roms/edk2/BaseTools/Source/C/BrotliCompress/brotli/research/libdivsufsort'
>>>>> Synchronizing submodule url for 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl'
>>>>> Synchronizing submodule url for 'roms/edk2/MdeModulePkg/Library/BrotliCustomDecompressLib/brotli'
>>>>> Synchronizing submodule url for 'roms/edk2/MdeModulePkg/Universal/RegularExpressionDxe/oniguruma'
>>>>> Synchronizing submodule url for 'roms/edk2/UnitTestFrameworkPkg/Library/CmockaLib/cmocka'
>>
>> So far, besides the repositories useful for QEMU, I cloned:
>>
>> - boringssl
>> - krb5
>> - pyca-cryptography
>> - esaxx
>> - libdivsufsort
>> - oniguruma
>> - openssl
>> - brotli
>> - cmocka
>>
>
> Hi Phil,
>
> I'm not following what you meant by "I cloned"... Are you experimenting
> with this on a machine of your own and manually cloning the submodules?
I meant "my test runner has been cloning ..."

>> But it reached the runner time limit of 2h. The first failure was at
>> 1h, so I raised the job limit to the maximum I could use for this
>> runner, 2h.
>> The directory reports 3GB of source code.
>>
>> I don't think the series has been tested enough before posting,
>
> Please take into consideration that this series, although simple in
> content, touches and interacts with a lot of moving pieces, and
> possibly with personal systems that I did not have, or will not have,
> access to.  As far as public testing proof goes, you can see a
> pipeline with this version of this series here:
>
>   https://gitlab.com/cleber.gnu/qemu/-/pipelines/258982039/builds

I expanded the timeout and retried the same job on the same runner
several times:

  diff --git a/.gitlab-ci.d/custom-runners.yml b/.gitlab-ci.d/custom-runners.yml
  @@ -17,6 +17,7 @@ variables:
   # setup by the scripts/ci/setup/build-environment.yml task
   # "Install basic packages to build QEMU on Ubuntu 18.04/20.04"
   ubuntu-18.04-s390x-all-linux-static:
  +  timeout: 2h 30m
     allow_failure: true
     needs: []
     stage: build

Each time it clones more submodules. I stopped at the 3rd attempt.

> As I said elsewhere, I only noticed the recursive submodule being
> applied to the existing jobs after I submitted the series.  Mea culpa.
> But:
>
> * none of the jobs took noticeably longer than the previous baseline
> * there was one *container build failure* (safe to say it's not
>   related)
> * all other jobs passed successfully

I had less luck then (see the docker-dind jobs started on the custom
runner, commented on elsewhere in this thread).

> And, along with the previous versions, this series was tested on all
> the previously included architectures and operating systems.  It's
> unfortunate that because of your experience at this time (my
> apologies), you don't realize the amount of testing done so far.

As I commented to Erik on IRC, the only difference on my side is that I
used the distribution runner, not the official one:

  $ sudo apt-get install gitlab-runner docker.io

Then I registered it, changing the path (/usr/bin/gitlab-runner instead
of /usr/local/bin/gitlab-runner). Everything else was left unchanged.

>> I'm stopping my experiments here.
>>
>> Regards,
>>
>> Phil.
>>
>
> I honestly appreciate your help here up to this point.
>
> Regards,
> - Cleber.
>
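
For reference, a minimal sketch of Cleber's option 2 above (enable the
recursive submodule strategy only on the jobs that reuse the repository,
instead of through a top-level "variables:" block that affects every
job) could look like the following. This is not the posted patch; it
only rearranges pieces already shown in this thread (the job stanza from
the timeout diff plus the GIT_SUBMODULE_STRATEGY setting from the
original hunk), and the remaining job keys are omitted:

  # .gitlab-ci.d/custom-runners.yml (sketch, not the posted patch)
  # Job-level variables apply only to this job, so container-based jobs
  # keep GitLab's default checkout behaviour while the bare-metal runner
  # that reuses the git repository gets a clean recursive submodule
  # checkout on every run.
  ubuntu-18.04-s390x-all-linux-static:
    allow_failure: true
    needs: []
    stage: build
    timeout: 2h 30m
    variables:
      GIT_SUBMODULE_STRATEGY: recursive

Jobs without such a job-level entry would then leave submodule handling
to the configure script, as Thomas suggested, avoiding the extra clone
time and disk usage Philippe observed on the shared-runner style jobs.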