Bug#930487: lintian: speed up test suite CI
[2019-08-22 11:14] "Chris Lamb" > Felix Lechner wrote: > > > We are still working on this bug for you. Meanwhile, the running time > > was reduced by about 20 minutes. > > FYI I hope to finish/refine my CI caching stuff over the next week or > so and get that onto master. Almost 1/3. Impressive. Thank you for your work. -- Note that I send and fetch email in batches, once every few days. Please mention in the body of your reply when you add or remove recipients.
Bug#930487: lintian: speed up test suite CI
Felix Lechner wrote: > We are still working on this bug for you. Meanwhile, the running time > was reduced by about 20 minutes. FYI I hope to finish/refine my CI caching stuff over the next week or so and get that onto master. Best wishes, -- ,''`. : :' : Chris Lamb `. `'` la...@debian.org 🍥 chris-lamb.co.uk `-
Bug#930487: lintian: speed up test suite CI
Hi, On Thu, Jun 13, 2019 at 8:51 AM Dmitry Bogatov wrote: > > Gitlab CI jobs take very long to complete: around 1.5 hours. We are still working on this bug for you. Meanwhile, the running time was reduced by about 20 minutes. The change is described here: https://lists.debian.org/debian-lint-maint/2019/08/msg00280.html Kind regards Felix Lechner
Bug#930487: lintian: speed up test suite CI
Hi Dmitry,

On Sat, Jul 6, 2019 at 11:27 PM Dmitry Bogatov wrote:
>
> > I think the most common use-case is that I make some change, possibly
> > extremely minor, in, say, checks/foo.pm and I want to re-run the
> > testsuite to check that I've fixed whatever false-positive or edge-case
> > in a tag that I am in the process of implementing. We should optimise
> > for this kind of pattern. What do you think?

In that scenario, I usually restrict the run with "--onlyrun=check:manpages" (after editing checks/manpages.pm). You can also use the selectors 'test:' or 'tag:'.

> Ideally, I want the test suite to automatically detect:
>
>  * test directory edited, test.deb needs to be rebuilt
>    + I do not want to re-run perl-critic and perltidy checks against
>      files that were not changed

I have not found a way to implement that. An incremental builder I wrote based on file modification times (in lib/Test/StagedFileProducer.pm) cannot tell the difference between two files that were built within one second of one another. That is the granularity of Linux filesystems. The templating mechanism relies on many intermediate build products. There should be a way to rebuild a test case if any of the inputs have changed, but I haven't found it. Patches are welcome.

> In short, I want the test suite to be a dependable build system from
> source files and test files into a tarball of test logs. (~950 files or so)

I think it's more or less dependable. Our issue is speed. You can find logs for each test case in ./debian/test-out/.../log.

> Right now it feels like a build system written in Make -- underconstrained
> in some cases (you can easily get stale results), overconstrained in
> other cases (you build more than necessary -- perlcritic tests, for
> example).

I think the test suite builds exactly what is necessary. It just takes a long time. :) Perlcritic (or more often perltidy) is something we impose on ourselves. There are maddening moments, particularly when the software corrects itself, but on balance I am in favor.

> To get a feeling for what a perfect (in the sense of dependency
> correctness) build system is, take a look at `tup'[1]. Read the
> tutorial, it is worth reading.

How does 'tup' deal with the timestamp granularity? Is that solved with a 'directed acyclic graph (DAG)'?

> That is why I talk about Shake -- it is more complicated, compared to
> tup, but it supports dynamic (monadic) rules, so the previous example
> with `foo.tar' is a basic example of Shake.

Perl is a general-purpose programming language (and Debian runs on it). Everyone can contribute. I think the best path forward is to try caching, as Chris suggested. If that does not work, we should consider building the test packages separately.

Thank you for your interest in the Lintian test suite. The long build times bother many people. I would very much like to reduce them.
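One possible way around the timestamp granularity mentioned above is to compare file contents instead of modification times. The following is only a minimal sketch in Python, not the approach used by lib/Test/StagedFileProducer.pm; the function names and the JSON stamp file are assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, Iterable


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(inputs: Iterable[Path]) -> Dict[str, str]:
    """Map each input path to the digest of its current contents."""
    return {str(p): file_digest(p) for p in inputs}


def needs_rebuild(inputs: Iterable[Path], stamp: Path) -> bool:
    """True if there is no stamp yet, or any input differs from the stamp.

    The stamp file (hypothetical) holds the snapshot taken at the last build.
    """
    if not stamp.exists():
        return True
    return json.loads(stamp.read_text()) != snapshot(inputs)


def record_build(inputs: Iterable[Path], stamp: Path) -> None:
    """Store the current snapshot after a successful build."""
    stamp.write_text(json.dumps(snapshot(inputs), sort_keys=True))
```

Because the decision depends only on content, two edits made within the same second are still told apart. The cost is reading every input on each check, which may or may not be acceptable for a tree with many intermediate build products.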
Bug#930487: lintian: speed up test suite CI
@Felix I think that your proposal of caching the test binary packages separately from the test output would be an improvement. Maybe it would even be "good enough". I will take a look (no promises made) in this direction. But ideally...

[2019-07-04 18:00] "Chris Lamb"
> I think the most common use-case is that I make some change, possibly
> extremely minor, in, say, checks/foo.pm and I want to re-run the
> testsuite to check that I've fixed whatever false-positive or edge-case
> in a tag that I am in the process of implementing. We should optimise
> for this kind of pattern. What do you think?

Ideally, I want the test suite to automatically detect:

 * new test directory created, test.deb needs to be built
 * test directory edited, test.deb needs to be rebuilt
 * this configuration was already checked, nothing to do
   + I do not want to re-run perl-critic and perltidy checks against
     files that were not changed
   + ...

In short, I want the test suite to be a dependable build system from source files and test files into a tarball of test logs. (~950 files or so)

Right now it feels like a build system written in Make -- underconstrained in some cases (you can easily get stale results), overconstrained in other cases (you build more than necessary -- perlcritic tests, for example).

To get a feeling for what a perfect (in the sense of dependency correctness) build system is, take a look at `tup'[1]. Read the tutorial, it is worth reading.

Unfortunately for our case, `tup' only supports static dependencies and outputs (applicative), so you can't do the following with it:

 * foo.txt contains a list of files
 * foo.tar contains the files listed in foo.txt
 * foo.tar depends precisely on foo.txt and the files listed in it.

That is why I talk about Shake -- it is more complicated, compared to tup, but it supports dynamic (monadic) rules, so the previous example with `foo.tar' is a basic example of Shake.

In theory, it is possible to implement all the features I mentioned with ad-hoc Perl code, but I wouldn't volunteer to do so.

@Felix: As to the question of maturity, Shake is a Haskell library, implemented and maintained by a well-known and well-esteemed developer. I do not expect it to disappear anytime soon. Of course, I do not plan (or propose) to re-implement `dpkg-buildpackage', I just want it to be called when and only when it is necessary.

[1] http://gittup.org/tup

-- 
Note that I send and fetch email in batches, once every few days. Please mention in the body of your reply when you add or remove recipients.
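The `foo.tar' example can be illustrated outside Shake as well. The sketch below is Python, not Shake or Perl, and only mimics the idea of a dynamic dependency: the list of inputs is discovered by reading foo.txt at build time. The function name `build_tar` and the stamp file are hypothetical, and the sketch assumes the manifest lists paths relative to its own directory.

```python
import hashlib
import json
import tarfile
from pathlib import Path


def digest(path: Path) -> str:
    """SHA-256 hex digest of a (small) file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_tar(manifest: Path, archive: Path, stamp: Path) -> bool:
    """Rebuild `archive` only if the manifest or a listed file changed.

    Returns True if the archive was rebuilt, False if it was up to date.
    """
    base = manifest.parent
    # Dynamic step: the dependency list is only known after reading the manifest.
    members = [base / line.strip()
               for line in manifest.read_text().splitlines()
               if line.strip()]
    state = {str(p): digest(p) for p in [manifest, *members]}

    if archive.exists() and stamp.exists():
        if json.loads(stamp.read_text()) == state:
            return False  # nothing relevant changed

    with tarfile.open(archive, "w") as tar:
        for member in members:
            tar.add(member, arcname=member.relative_to(base).as_posix())
    stamp.write_text(json.dumps(state, sort_keys=True))
    return True
```

Editing a file that foo.txt does not list leaves the archive alone, while adding that file to foo.txt triggers a rebuild. A real build system would additionally track this across many interdependent rules, which is the part that ad-hoc code tends to get wrong.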
Bug#930487: lintian: speed up test suite CI
Hi Felix, > > I'm very much in favour of us exhausting all caching opportunities, both > > on our CI system and locally, > > Would it help to separate the build product for each test package, > which will presumably be cached, from the test output (tags.actual, > tagdiff and friends)? Let's zoom out here so we are not asking "XY questions" of each other. I think the most common use-case is that I make some change, possibly extremely minor, in, say, checks/foo.pm and I want to re-run the testsuite to check that I've fixed whatever false-positive or edge-case in a tag that I am in the process of implementing. We should optimise for this kind of pattern. What do you think? What this means in terms of the implementation detail of the testsuite (which I'm afraid I have let get beyond my insight) I would have to leave up to you, alas. Regards, -- ,''`. : :' : Chris Lamb `. `'` la...@debian.org 🍥 chris-lamb.co.uk `-
Bug#930487: lintian: speed up test suite CI
Hi Chris, On Thu, Jul 4, 2019 at 12:31 PM Chris Lamb wrote: > > I'm very much in favour of us exhausting all caching opportunities, both > on our CI system and locally, Would it help to separate the build product for each test package, which will presumably be cached, from the test output (tags.actual, tagdiff and friends)? Kind regards Felix
Bug#930487: lintian: speed up test suite CI
Hi Dmitry, > What is Lintian's policy on the use of languages other than Perl? I'm very much in favour of us exhausting all caching opportunities, both on our CI system and locally, before we introduce another language and all the complications that would entail. Regards, -- ,''`. : :' : Chris Lamb `. `'` la...@debian.org 🍥 chris-lamb.co.uk `-
Bug#930487: lintian: speed up test suite CI
Hi Dmitry, On Thu, Jul 4, 2019 at 10:41 AM Dmitry Bogatov wrote: > > It feels that the build system for the test suite is overconstrained now > (rebuilds more than necessary), and I am considering some experiments > with Shake[1]. Should the experiments succeed, will they be accepted? The large majority of test packages are built using 'dpkg-buildpackage'. Do you plan to rewrite it? I looked into using golang for the test suite but, for the things we are doing, Perl is quite fast. It's also mature and available everywhere. How does Shake compare? Kind regards, Felix
Bug#930487: lintian: speed up test suite CI
[2019-06-13 21:29] "Chris Lamb" > GitLab has support for saving various parts of a successful build > for the next one. I believe the idea is that we would build the test > packages and then push them to this cache, re-using them on any subsequent > test runs. People often use this to cache "pip" Python dependencies > but I don't see any obvious reason why we can't use it here. What is Lintian's policy on the use of languages other than Perl? It feels that the build system for the test suite is overconstrained now (rebuilds more than necessary), and I am considering some experiments with Shake[1]. Should the experiments succeed, will they be accepted? [1] https://shakebuild.com -- Note that I send and fetch email in batches, once every few days. Please mention in the body of your reply when you add or remove recipients.
Bug#930487: lintian: speed up test suite CI
On Mon, Jun 17, 2019 at 4:03 PM Chris Lamb wrote: > > Chris Lamb wrote: > > The naïve solution here might be to save & restore debian/test-out > between runs. I do not think that will work for very long, although there is a better chance if we separate the expected tags from the artifact directory. (The expected tags change relatively often as new checks are implemented or old ones are tweaked.) > > There's currently no magical command in the > fancy test runner of yours that will rebuild any missing or otherwise > changed test packages is there…? We used filesystem timestamps for a while, but the standard resolution (1 sec) was not granular enough. AFAIR, we now generate everything every time. We instead split the generation of test packages from the test runs, although they currently just run consecutively. We could probably skip the generation of test packages if they are already present and nothing in t/ has changed. > In other > words, are we barking up the wrong tree here and what we need to do is > use a different GitLab CI stage altogether and pass "artifacts" around > instead? > > https://docs.gitlab.com/ee/ci/caching/index.html#cache-vs-artifacts Artifacts may work, but uploading them separately without a dependency scheme seems to invite other problems. Also, your local build architecture and environment (which may figure into the artifacts you upload) may not match what the runner needs. (I am thinking about stable or ubuntu-devel.)
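The idea of skipping generation when the packages are present and nothing in t/ has changed could be approximated with a digest over the whole source tree, which would also serve as a cache key. This is a rough Python sketch under stated assumptions, not something the test runner provides: the function names, the key file, and the directory arguments are all hypothetical, and the key file is assumed to live outside the hashed tree.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Digest of every regular file under `root` (relative path plus contents)."""
    h = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        h.update(rel.encode("utf-8"))
        h.update(b"\0")  # separator so names and contents cannot run together
        h.update(hashlib.sha256(path.read_bytes()).digest())
    return h.hexdigest()


def can_skip_generation(source_root: Path, output_root: Path, key_file: Path) -> bool:
    """True if outputs exist and were generated from the current source tree."""
    if not output_root.is_dir() or not key_file.is_file():
        return False
    return key_file.read_text().strip() == tree_digest(source_root)


def record_generation(source_root: Path, key_file: Path) -> None:
    """Remember which source tree the existing outputs were generated from."""
    key_file.write_text(tree_digest(source_root) + "\n")
```

This is all-or-nothing: any change under the source tree invalidates every package. It also ignores the build environment, so outputs built on a different architecture or release would still look valid, which is the mismatch concern raised above.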
Bug#930487: lintian: speed up test suite CI
Chris Lamb wrote: > > GitLab has support for saving various parts of a successful build > > for the next one. I believe the idea is that we would build the test > > packages and then push them to this cache, re-using them on any subsequent > > test runs. [..] The naïve solution here might be to save & restore debian/test-out between runs. However, I'm not sure whether this would result in changes to the testsuite itself resulting in the changed versions being tested, resulting in all manner of false positives and false negatives. Felix, any insight here? There's currently no magical command in the fancy test runner of yours that will rebuild any missing or otherwise changed test packages is there…? Or, before we implement that, am I asking an XY question? In other words, are we barking up the wrong tree here and what we need to do is use a different GitLab CI stage altogether and pass "artifacts" around instead? https://docs.gitlab.com/ee/ci/caching/index.html#cache-vs-artifacts I'm not entirely sure, alas. (Although that magical command might be useful locally...) Regards, -- ,''`. : :' : Chris Lamb `. `'` la...@debian.org 🍥 chris-lamb.co.uk `-
Bug#930487: lintian: speed up test suite CI
[2019-06-13 21:29] "Chris Lamb" > Felix Lechner wrote: > > > For Lintian, however, I would prefer to upload the test packages > > separately to Debian's regular build infrastructure. > > Hm, I think you are talking at cross-purposes to Dmitry here. Nobody > was suggesting we upload the test packages to Debian; that would > surely be impossible. > > > GitLab has support for saving various parts of a successful build > for the next one. I believe the idea is that we would build the test > packages and then push them to this cache, re-using them on any subsequent > test runs. People often use this to cache "pip" Python dependencies > but I don't see any obvious reason why we can't use it here. Thank you. That is exactly what I meant. -- Note that I send and fetch email in batches, once every few days.
Bug#930487: lintian: speed up test suite CI
Felix Lechner wrote: > For Lintian, however, I would prefer to upload the test packages > separately to Debian's regular build infrastructure. Hm, I think you are talking at cross-purposes to Dmitry here. Nobody was suggesting we upload the test packages to Debian; that would surely be impossible. GitLab has support for saving various parts of a successful build for the next one. I believe the idea is that we would build the test packages and then push them to this cache, re-using them on any subsequent test runs. People often use this to cache "pip" Python dependencies but I don't see any obvious reason why we can't use it here. (This is indexed by some cache key so that changing the testsuite itself could/would rebuild the packages as appropriate.) Regards, -- ,''`. : :' : Chris Lamb `. `'` la...@debian.org 🍥 chris-lamb.co.uk `-
Bug#930487: lintian: speed up test suite CI
On Thu, Jun 13, 2019 at 8:51 AM Dmitry Bogatov wrote: > > It would be great to ... avoid rebuilding packages at every job run. It would be nice to see how other projects deal with this issue. There is some support for uploading the test packages separately. [1] For Lintian, however, I would prefer to upload the test packages separately to Debian's regular build infrastructure. The test packages can and do have conflicting build dependencies and architectures. These conflicts are not addressed currently, and may require separate chroot build environments. It would be difficult to implement even in a single bulk package. Debian's infrastructure, on the other hand, is designed to build the packages. At the same time, separate uploads would place an undue burden on the archive's namespace and on the NEW queue. There would also be delays for new tags, as Lintian may at some point require that tags are tested. All test packages would have to be in the archive before the lintian source is uploaded. Right now, my favorite solution would be for the archive to offer dependent namespaces for source packages (such as lintian/...). Such internal packages could be uploaded separately and would not have to go through the NEW queue. Outside packages could not depend on them, but they would be installed if their source package requires them. This idea will likely generate much opposition. Let me just say that I am not sure my suggestion is worth the effort, or useful for anyone else. Kind regards, Felix [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=926409#42
Bug#930487: lintian: speed up test suite CI
Package: lintian Version: 2.15.0 Severity: wishlist Dear Maintainer, GitLab CI jobs take very long to complete: around 1.5 hours. No wonder -- the test suite generates and builds more than 900 source packages, and after that it runs Lintian on those source packages. As can be seen here[1], building the source packages takes a significant portion of the total run time: around 33 minutes. It would be great to use GitLab caching to avoid rebuilding packages at every job run. [1] https://salsa.debian.org/kaction/lintian/-/jobs/195835