On 09/04/2019 19:44, Alexis King wrote:
> Hi Paulo,
>

Hi Alexis,

> The work you’re doing is really cool, though I admit most of it is over my
> head. Thank you for putting in the time to set it all up. One thing I have
> noticed, however, is that the GitLab pipeline seems to almost always fail or
> timeout, which causes almost every commit on the commits page of the GitHub
> repo[1] to be marked with a loud, red failure indicator.
>

Thanks for your email. What you say is correct, and this is an issue
close to my heart that I have wanted to see sorted. This email is finally
the poke that will get me to sort it out. Apologies that I haven't done
it earlier.

> I don’t understand what you’re doing well enough to say whether or not this
> is because something is going wrong in the CI scripts itself or because they
> are (correctly) detecting that Racket doesn’t currently support some of the
> tested architectures. But in either case, while the testing of those
> architectures is very nice to have, it seems extreme to cause the whole
> commit to be marked as a failure every time for things that (correct me if
> I’m wrong) seem unlikely to be changed/fixed in the immediate future.
>

There are a few issues with compiling on other archs, and that's what
these jobs capture. #2018 is one of the main issues and I have been
looking at it with Matthew and Sam, but it's turning out to be a major
pain. Other archs reveal similar behaviour. As you say, this shouldn't
cause commits to get the red cross.

> For the Travis builds, we have a job that tests RacketCS, which currently
> always fails, but we have the CI configured to ignore the failure of that
> particular job when deciding whether or not to say the overall commit passed.
> Is there some way something similar could be done with the GitLab pipeline?
> Running all those jobs is valuable, in the same way that the RacketCS build
> is, it’d just be nice to avoid making the at-a-glance commit status
> meaningless.
> And just as we will surely promote the RacketCS job from an
> “allowed failure” to an ordinary job once it passes consistently, we would of
> course do the same for the various architecture jobs as well.
>

Yes, that's partially the solution. Currently I don't have enough
machines or AWS time to dedicate to Racket builds, so I will instead do
the following straight away:

- Regularly failing jobs will be marked as 'can fail' until they stop
  failing, and then I will remove the flag.
- Long-running jobs, or jobs for which I don't have enough machines
  available right away, will be moved to run nightly only.

In the long term I would like CI jobs to finish in a respectable time:
<1h or even <30mins. I would like all archs tested and no failures. This
will take some time but we'll get there.

> Thanks,
> Alexis
>

Thanks for the suggestions and the poke. Now I am off to make Racket
green again.

> [1]: https://github.com/racket/racket/commits/master
>
>> On Apr 2, 2019, at 02:59, 'Paulo Matos' via Racket Developers
>> <[email protected]> wrote:
>>
>> Hello,
>>
>> Short Summary: I have added in 35d269c29 [1] cross architectural testing
>> using virtualized qemu machines. There are problems - we need to fix those.
>>
>> Long Story:
>>
>> For months now, I have been wishing I could get cross-arch testing done
>> on a regular basis on Racket. Initially I had something set up privately
>> for RISC-V but I quickly noticed that the framework could be extended to
>> other architectures.
>>
>> Thanks to Sam I got permission to get gitlab.com/racket/racket set up and
>> get things moving. It took a couple of months to get everything right.
>> Not necessarily due to inherent CI problems but I had to report a couple
>> of GitLab issues first, debug qemu as well and set up a few of my
>> machines for this.
>>
>> The important things are:
>> - with testing running on GitLab, people who would like to contribute
>> CPU time to Racket can do so by setting up a GitLab runner on said
>> machine (contact me for help). Because GitLab CI free machines have a
>> maximum timeout that's enough for normal testing but not enough for
>> virtualization, I needed to add some extra machines to do these specific
>> jobs. Besides the GitLab CI machines, we have a 4 CPU x86_64, a 16 CPU
>> x86_64 and a rpi3 running in my server room. Of course, with more
>> machines, more tests can run simultaneously and therefore provide
>> quicker feedback.
>> - Matthew pointed out to me a few archs Racket should support so I added
>> those:
>> Testing added for Racket:
>> Native: armv7l (running on rpi3), x86_64
>> Emulated: arm64, armel, armhf, i386, mips, mips64el, mipsel, ppc64el,
>> s390x
>>
>> Testing added for Racket CS:
>> Native: x86_64
>> Emulated: i386
>>
>> - There are problems, and initially, because so many of the architectures
>> fail either to compile or to test, I assumed that this was a qemu bug.
>> Since I am not a virtualization expert it took me a few days and some
>> help from the qemu people to set up an environment to debug qemu inside a
>> chroot inside a docker container running Racket in a different arch.
>> After some analysis, it turned out the segfault during compilation was
>> definitely coming from Racket [5]. In a discussion with Matthew he
>> proposed I could disable generational GC to ease debugging of the
>> problem. It turns out that disabling it caused the sigsegv not to occur
>> any more. So, at this point I think we are in the realm of a problem in
>> Racket. I haven't gotten to the bottom of this yet, but hopefully when I
>> do we can get all the lights green in the cross-arch testing.
>>
>> There are a few things I would like to do in the future, like running
>> benchmarks on a regular basis on Racket and RacketCS and having these
>> displayed on a dashboard, but these will come later. First I would like
>> to look into these failures, which might be related to #2018 [2] and
>> #1749 [3].
>>
>> Lastly, this is another opportunity to help fix some Racket issues and
>> get involved. If you are into different archs, debugging and
>> contributing, take a look at the logs coming out of the pipelines [4].
>>
>> If you need some help or clarification on any of this, let me know.
>>
>> [1]
>> https://github.com/racket/racket/commit/35d269c29eee6f6f7f3f83ea6f01b92ae1db180a
>> [2] https://github.com/racket/racket/issues/2018
>> [3] https://github.com/racket/racket/issues/1749
>> [4] https://gitlab.com/racket/racket/pipelines/
>> [5] https://gitlab.com/racket/racket/-/jobs/188658454
>>
>> --
>> Paulo Matos
>

--
Paulo Matos
Re: [racket-dev] CI improved for Racket
'Paulo Matos' via Racket Developers Wed, 10 Apr 2019 00:15:37 -0700
