Bug#930487: lintian: speed up test suite CI

2019-08-24 Thread Dmitry Bogatov


[2019-08-22 11:14] "Chris Lamb" 
> Felix Lechner wrote:
>
> > We are still working on this bug for you. Meanwhile, the running time
> > was reduced by about 20 minutes.
>
> FYI I hope to finish/refine my CI caching stuff over the next week or
> so and get that onto master.

Almost 1/3. Impressive. Thank you for your work.
-- 
Note, that I send and fetch email in batch, once in a few days.
Please, mention in body of your reply when you add or remove recipients.



Bug#930487: lintian: speed up test suite CI

2019-08-22 Thread Chris Lamb
Felix Lechner wrote:

> We are still working on this bug for you. Meanwhile, the running time
> was reduced by about 20 minutes.

FYI I hope to finish/refine my CI caching stuff over the next week or
so and get that onto master.


Best wishes,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org 🍥 chris-lamb.co.uk
   `-



Bug#930487: lintian: speed up test suite CI

2019-08-22 Thread Felix Lechner
Hi,

On Thu, Jun 13, 2019 at 8:51 AM Dmitry Bogatov  wrote:
>
> Gitlab CI jobs take very long to complete: around 1.5 hours.

We are still working on this bug for you. Meanwhile, the running time
was reduced by about 20 minutes. The change is described here:

https://lists.debian.org/debian-lint-maint/2019/08/msg00280.html

Kind regards

Felix Lechner



Bug#930487: lintian: speed up test suite CI

2019-07-07 Thread Felix Lechner
Hi Dmitry,

On Sat, Jul 6, 2019 at 11:27 PM Dmitry Bogatov  wrote:
>
> > I think the most common use-case is that I make some change — possibly
> > extremely minor — in, say, checks/foo.pm and I want to re-run the
> > testsuite to check that I've fixed whatever false-positive or edge-case
> > in a tag that I am in the process of implementing. We should optimise
> > for this kind of pattern. What do you think?

In that scenario, I usually restrict the run with
"--onlyrun=check:manpages" (after editing checks/manpages.pm). You can
also use the selectors 'test:' or 'tag:'.

> Ideally, I want test suite automatically detect:
>
>  * test directory edited, test.deb needs to be rebuilt
>  + I do not want to re-run perl-critic and perltidy checks against files
>    that were not changed

I have not found a way to implement that. An incremental builder I
wrote based on file modification times (in
lib/Test/StagedFileProducer.pm) cannot tell the difference between two
files that were built within one second of one another. That is the
granularity of Linux filesystems. The templating mechanism relies on
many intermediate build products. There should be a way to rebuild a
test case if any of the inputs have changed, but I haven't found it.
Patches are welcome.
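
One possible way around the timestamp problem is to compare content digests instead of modification times. The following is only a minimal sketch in Python, not code from lib/Test/StagedFileProducer.pm; the function names and the JSON stamp file are hypothetical, and it assumes the complete list of inputs for a test case is known up front.

```python
import hashlib
import json
from pathlib import Path


def digest_inputs(paths):
    """Return one SHA-256 hex digest over the names and contents of paths."""
    h = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        data = path.read_bytes()
        # Hash the name and the length too, so renames and moved
        # file boundaries also change the digest.
        h.update(str(path).encode())
        h.update(b"\0")
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def needs_rebuild(inputs, stamp_file):
    """True if no stamp exists or the inputs differ from the recorded ones."""
    stamp = Path(stamp_file)
    if not stamp.exists():
        return True
    recorded = json.loads(stamp.read_text())["digest"]
    return recorded != digest_inputs(inputs)


def record_build(inputs, stamp_file):
    """Remember the digest of the inputs used for a successful build."""
    Path(stamp_file).write_text(json.dumps({"digest": digest_inputs(inputs)}))
```

A build step would call needs_rebuild() first and record_build() after success. Two edits made within the same second still produce different digests, at the price of reading every input file on each run.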

> In short, I want test suite to be dependable build system from source
> files and test files into tarball of test logs. (~950 files or so)

I think it's more or less dependable. Our issue is speed.

You can find logs for each test case in ./debian/test-out/.../log.

> Right now it feels like a build system written in Make -- underconstrained
> in some cases (you can easily get stale results), overconstrained in
> other cases (you build more than necessary -- perlcritic tests, for
> example).

I think the test suite builds exactly what is necessary. It just takes
a long time. :)

Perlcritic (or more often perltidy) is something we impose on
ourselves. There are maddening moments, particularly when the software
corrects itself, but on balance I am in favor.

> To get a feeling for what a perfect (in the sense of dependency
> correctness) build system is, take a look at `tup'[1]. Read the
> tutorial; it is worth reading.

How does 'tup' deal with the timestamp granularity? Is that solved
with a 'directed acyclic graph (DAG)'?

> That is why I talk about Shake -- it is more complicated, compared to
> tup, but it supports dynamic (monadic) rules, so previous example with
> `foo.tar' is basic example of Shake.

Perl is a generalized programming language (and Debian runs on it).
Everyone can contribute. I think the best path forward is to try
caching, as Chris suggested. If that does not work, we should consider
building the test packages separately.

Thank you for your interest in the Lintian test suite. The long build
times bother many people. I would very much like to reduce them.



Bug#930487: lintian: speed up test suite CI

2019-07-06 Thread Dmitry Bogatov


@Felix

  I think that your proposal of caching the test binary packages separately
  from the test output would be an improvement. Maybe it would even be "good
  enough".

I will take a look (no promises made) into this direction. But
ideally...

[2019-07-04 18:00] "Chris Lamb" 
> I think the most common use-case is that I make some change — possibly
> extremely minor — in, say, checks/foo.pm and I want to re-run the
> testsuite to check that I've fixed whatever false-positive or edge-case
> in a tag that I am in the process of implementing. We should optimise
> for this kind of pattern. What do you think?

Ideally, I want test suite automatically detect:

 * new test directory created, test.deb needs to be built
 * test directory edited, test.deb needs to be rebuilt
 * this configuration was already checked, nothing to do

 + I do not want to re-run perl-critic and perltidy checks against files
   that were not changed
 + ...

In short, I want test suite to be dependable build system from source
files and test files into tarball of test logs. (~950 files or so)

Right now it feels like a build system written in Make -- underconstrained
in some cases (you can easily get stale results), overconstrained in
other cases (you build more than necessary -- perlcritic tests, for
example).

To get a feeling for what a perfect (in the sense of dependency
correctness) build system is, take a look at `tup'[1]. Read the
tutorial; it is worth reading.

Unfortunately for our case, `tup' only supports static dependencies and
outputs (applicative), so you can't do the following with it:

 * foo.txt contains list of files
 * foo.tar contains files, listed in foo.txt
 * foo.tar perfectly depends on foo.txt and files, listed in it.

That is why I talk about Shake -- it is more complicated, compared to
tup, but it supports dynamic (monadic) rules, so previous example with
`foo.tar' is basic example of Shake.
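
To make the `foo.tar' example concrete, here is a minimal sketch in Python (not Shake, and not anything from the Lintian tree; all function names are hypothetical). The point is that the dependency list of foo.tar is only known after foo.txt has been read, which is what "dynamic" means here. It assumes foo.txt holds one path per line, relative to its own directory.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 hex digest of a single file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def tar_dependencies(list_file):
    """Dependencies of foo.tar: foo.txt itself plus every file it names."""
    list_path = Path(list_file)
    members = [
        list_path.parent / line.strip()
        for line in list_path.read_text().splitlines()
        if line.strip()
    ]
    return [list_path, *members]


def tar_fingerprint(list_file):
    """Digest that changes if and only if some dependency of foo.tar changes."""
    h = hashlib.sha256()
    for dep in tar_dependencies(list_file):
        h.update(f"{dep}\0{file_digest(dep)}\n".encode())
    return h.hexdigest()
```

foo.tar would be rebuilt only when tar_fingerprint("foo.txt") differs from the value stored at the previous build. Editing a file that foo.txt does not list leaves the fingerprint unchanged, while adding a line to foo.txt changes both the dependency set and the fingerprint.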

In theory, it is possible to implement all features I mentioned with
ad-hoc Perl code, but I wouldn't volunteer to do so.

@Felix:

  As to the question of maturity, Shake is a Haskell library, implemented and
  maintained by a well-known and well-regarded developer. I do not expect
  it to disappear anytime soon.

  Of course, I do not plan (or propose) to re-implement `dpkg-buildpackage',
  I just want it to be called when and only when it is necessary.

[1] http://gittup.org/tup
-- 
Note, that I send and fetch email in batch, once in a few days.
Please, mention in body of your reply when you add or remove recipients.



Bug#930487: lintian: speed up test suite CI

2019-07-04 Thread Chris Lamb
Hi Felix,

> > I'm very much in favour of us exhausting all caching opportunities, both
> > on our CI system and locally,
> 
> Would it help to separate the build product for each test package,
> which will presumably be cached, from the test output (tags.actual,
> tagdiff and friends)?

Let's zoom out here so we are not asking "XY questions" of each other.

I think the most common use-case is that I make some change — possibly
extremely minor — in, say, checks/foo.pm and I want to re-run the
testsuite to check that I've fixed whatever false-positive or edge-case
in a tag that I am in the process of implementing. We should optimise
for this kind of pattern. What do you think?

What this means in terms of the implementation detail of the testsuite
(which I'm afraid I have let get beyond my insight) I would have to
leave up to you, alas.


Regards,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org 🍥 chris-lamb.co.uk
   `-



Bug#930487: lintian: speed up test suite CI

2019-07-04 Thread Felix Lechner
Hi Chris,

On Thu, Jul 4, 2019 at 12:31 PM Chris Lamb  wrote:
>
> I'm very much in favour of us exhausting all caching opportunities, both
> on our CI system and locally,

Would it help to separate the build product for each test package,
which will presumably be cached, from the test output (tags.actual,
tagdiff and friends)?

Kind regards
Felix



Bug#930487: lintian: speed up test suite CI

2019-07-04 Thread Chris Lamb
Hi Dmitry,

> What is Lintian policy on usage of languages other than Perl?

I'm very much in favour of us exhausting all caching opportunities, both
on our CI system and locally, before we introduce another language and
all the complications that would entail.


Regards,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org 🍥 chris-lamb.co.uk
   `-



Bug#930487: lintian: speed up test suite CI

2019-07-04 Thread Felix Lechner
Hi Dmitry,

On Thu, Jul 4, 2019 at 10:41 AM Dmitry Bogatov  wrote:
>
> It feels that the build system for the test suite is overconstrained now
> (rebuilds more than necessary), and I am considering some experiments
> with Shake[1]. Should the experiments succeed, will they be accepted?

The large majority of test packages is built using
'dpkg-buildpackage'. Do you plan to rewrite it?

I looked into using golang for the test suite but, for the things we
are doing, Perl is quite fast. It's also mature and available
everywhere. How does Shake compare?

Kind regards,
Felix



Bug#930487: lintian: speed up test suite CI

2019-07-04 Thread Dmitry Bogatov


[2019-06-13 21:29] "Chris Lamb" 
> GitLab has support for saving various parts of a successful build
> for the next one. I believe the idea is that we would build the test
> packages and then push them to this cache re-using them on any subsequent
> test runs. People often use this to cache "pip" Python dependencies
> but I don't see any obvious reason why we can't use it here.

What is Lintian policy on usage of languages other than Perl?
It feels that the build system for the test suite is overconstrained now
(rebuilds more than necessary), and I am considering some experiments
with Shake[1]. Should the experiments succeed, will they be accepted?

 [1] https://shakebuild.com
-- 
Note, that I send and fetch email in batch, once in a few days.
Please, mention in body of your reply when you add or remove recipients.



Bug#930487: lintian: speed up test suite CI

2019-06-17 Thread Felix Lechner
On Mon, Jun 17, 2019 at 4:03 PM Chris Lamb  wrote:
>
> Chris Lamb wrote:
>
> The naïve solution here might be to save & restore debian/test-out
> between runs.

I do not think that will work for very long, although there is a
better chance if we separate the expected tags from the artifact
directory. (The expected tags change relatively often as new checks are
implemented or old ones are tweaked.)

>
> There's currently no magical command in the
> fancy test runner of yours that will rebuild any missing or otherwise
> changed test packages is there…?

We used filesystem timestamps for a while, but the standard resolution
(1 sec) was not granular enough. AFAIR, we now generate everything
every time. We instead split the generation of test packages from the
test runs, although they currently just run consecutively. We could
probably skip the generation of test packages if they are already
present and nothing in t/ has changed.

> In other
> words, are we barking up the wrong tree here and what we need to do is
> use different GitLab CI stage altogether and pass "artifacts" around
> instead?
>
>   https://docs.gitlab.com/ee/ci/caching/index.html#cache-vs-artifacts

Artifacts may work, but uploading them separately without a dependency
scheme seems to invite other problems. Also, your local build
architecture and environment—which may figure into the artifacts you
upload—may not match what the runner needs. (I am thinking about
stable or ubuntu-devel.)



Bug#930487: lintian: speed up test suite CI

2019-06-17 Thread Chris Lamb
Chris Lamb wrote:

> > GitLab has support for saving various parts of a successful build
> > for the next one. I believe the idea is that we would build the test
> > packages and then push them to this cache re-using them on any subsequent
> > test runs.
[..]

The naïve solution here might be to save & restore debian/test-out
between runs. However, I'm not sure whether this would result in
changes to the testsuite itself resulting in the changed versions
being tested, resulting in all manner of false positive and false
negatives.

Felix, any insight here? There's currently no magical command in the
fancy test runner of yours that will rebuild any missing or otherwise
changed test packages is there…?

Or, before we implement that, am I asking an XY question? In other
words, are we barking up the wrong tree here and what we need to do is
use different GitLab CI stage altogether and pass "artifacts" around
instead?

  https://docs.gitlab.com/ee/ci/caching/index.html#cache-vs-artifacts

I'm not entirely sure, alas. (Although that magical command might be
useful locally...)


Regards,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org 🍥 chris-lamb.co.uk
   `-



Bug#930487: lintian: speed up test suite CI

2019-06-16 Thread Dmitry Bogatov


[2019-06-13 21:29] "Chris Lamb" 
> Felix Lechner wrote:
>
> > For Lintian, however, I would prefer to upload the test packages
> > separately to Debian's regular build infrastructure.
>
> Hm, I think you are talking at cross-purposes to Dmitry here. Nobody
> was suggesting we upload the test packages to Debian; that would
> surely be impossible.
>
>
> GitLab has support for saving various parts of a successful build
> for the next one. I believe the idea is that we would build the test
> packages and then push them to this cache re-using them on any subsequent
> test runs. People often use this to cache "pip" Python dependencies
> but I don't see any obvious reason why we can't use it here.

Thank you. That is exactly what I meant.
-- 
Note, that I send and fetch email in batch, once in a few days.



Bug#930487: lintian: speed up test suite CI

2019-06-13 Thread Chris Lamb
Felix Lechner wrote:

> For Lintian, however, I would prefer to upload the test packages
> separately to Debian's regular build infrastructure.

Hm, I think you are talking at cross-purposes to Dmitry here. Nobody
was suggesting we upload the test packages to Debian; that would
surely be impossible.

GitLab has support for saving various parts of a successful build
for the next one. I believe the idea is that we would build the test
packages and then push them to this cache re-using them on any subsequent
test runs. People often use this to cache "pip" Python dependencies
but I don't see any obvious reason why we can't use it here.

(This is indexed by some cache key so that changing the testsuite
itself could/would rebuild the packages as appropriate.)
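
As an illustration of such a cache key, the following minimal Python sketch derives one string from the contents of a directory tree. It is an assumption about how a key could be computed, not how GitLab computes one; the function name and the "extra" tags (for example a release name and an architecture) are hypothetical.

```python
import hashlib
from pathlib import Path


def tree_cache_key(root, extra=()):
    """Short key that changes when any file under root, or any tag, changes."""
    root = Path(root)
    h = hashlib.sha256()
    # Tags let builds for different environments use separate caches.
    for tag in extra:
        h.update(f"tag:{tag}\n".encode())
    # Walk in sorted order so the key does not depend on directory order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        h.update(f"{rel}\0".encode())
        h.update(hashlib.sha256(path.read_bytes()).digest())
    return h.hexdigest()[:16]
```

Something like tree_cache_key("t", extra=("unstable", "amd64")) could then be handed to the CI system as the key. Any edit, addition, or removal of a file below t/ yields a new key, so stale test packages are never picked up, although a one-line change also invalidates the whole cache.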


Regards,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org 🍥 chris-lamb.co.uk
   `-



Bug#930487: lintian: speed up test suite CI

2019-06-13 Thread Felix Lechner
On Thu, Jun 13, 2019 at 8:51 AM Dmitry Bogatov  wrote:
>
> It would be great to ... avoid rebuilding package at every job run.

It would be nice to see how other projects deal with this issue. There
is some support for uploading the test packages separately. [1]

For Lintian, however, I would prefer to upload the test packages
separately to Debian's regular build infrastructure. The test packages
can and do have conflicting build dependencies and architectures.
These conflicts are not addressed currently, and may require separate
chroot build environments. It would be difficult to implement even in
a single bulk package. Debian's infrastructure, on the other hand, is
designed to build the packages.

At the same time, separate uploads would place an undue burden on the
archive's namespace and on the NEW queue. There would also be delays
for new tags, as Lintian may at some point require that tags are
tested. All test packages would have to be in the archive before the
lintian source is uploaded.

Right now, my favorite solution would be for the archive to offer
dependent namespaces for source packages (such as lintian/...). Such
internal packages could be uploaded separately and would not have to
go through the NEW queue. Outside packages could not depend on them,
but they would be installed if their source package requires them.

This idea will likely generate much opposition. Let me just say that I
am not sure my suggestion is worth the effort, or useful for anyone
else.

Kind regards,
Felix

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=926409#42



Bug#930487: lintian: speed up test suite CI

2019-06-13 Thread Dmitry Bogatov

Package: lintian
Version: 2.15.0
Severity: wishlist

Dear Maintainer,

Gitlab CI jobs take very long to complete: around 1.5 hours. No wonder
-- the test suite generates and builds more than 900 source packages,
and after that it runs Lintian on those packages.

As can be seen here[1], the build of the source packages takes a significant
portion of the total run time: around 33 minutes. It would be great to use
Gitlab caching to avoid rebuilding the packages at every job run.

 [1] https://salsa.debian.org/kaction/lintian/-/jobs/195835


