Re: reprotest: inadvertent misconfiguration in salsa-ci config

2024-03-05 Thread Chris Lamb
James Addison wrote:

> I've opened a merge request[1] to explore this error-treatment approach; it
> lacks useful error messaging so far, but I'll attempt to add that soon.

In your enthusiasm I think you neglected to included the actual "[1]"
URL later in your mail. However, allow me to do that for you:

  [1] https://salsa.debian.org/reproducible-builds/reprotest/-/merge_requests/23


Best wishes,

-- 
  o
⬋   ⬊  Chris Lamb
   o o reproducible-builds.org 💠
⬊   ⬋
  o



Re: reprotest: inadvertent misconfiguration in salsa-ci config

2024-03-04 Thread James Addison via rb-general
Hi Chris, Vagrant,

On Tue, 27 Feb 2024 at 17:44, Vagrant Cascadian
 wrote:
>
> On 2024-02-27, Chris Lamb wrote:
> >> * Update reprotest to handle a single-disabled-varations-value as a
> >>   special case - treating it as vary and/or emitting a warning.
>
> Well, I would broaden this to include an arbitrary number of negating
> options:
>
>   --variations=-time,-build_path
>
> That seems just as invalid.
>
> The one special case I could see is "--variations=-all" where you might
> want to be normalizing as much as possible.

Hmm, yep.  So when there are only subtractions, we _could_ imply that
there is an
implicit '+all' at the beginning of the 'variations' argument.

And along that line of thinking, we could emit a warning to stderr:

  $ reprotest auto --dry-run --variations=-timezone
  Implicitly expanding variations '-timezone' to '+all,-timezone'
  ...

> > On whether to magically/transparently fix this, needless to say, it's
> > considered bad practice to change the behaviour of software that has
> > already been released — I would, as a rule, subscribe to that idea.
> > However, we should bear in mind that this idea revolves around what
> > users are *expecting*, not necessarily what the software actually
> > does.
> >
> > I say that because I hazard that all 400 usages are indeed expecting
> > that `--variations=-foo` functions the same as `--variations=all,-foo`
> > (or `--vary=-foo`), and so this proposed change would merely be
> > modifying reprotest to reflect their existing expectations. It would
> > not therefore be a violation of the "don't break existing
> > functionality" dictum.
> >
> > (Saying that, the addition of a warning that we are doing so would
> > definitely not go amiss.)
>
> Hrm. Less inclined toward this approach; expectations can shift with
> time and context and culture and whatnot. That said, I agree the current
> behavior is confusing, and we should change something explicitly, rather
> than implicitly...

Changing-existing-behaviours could arguably be even more problematic for
cases like this where we're talking about continuous integration checks.

Breaking/unbreaking unrelated CI pipelines seems like something we should be
careful to avoid.

> >> * Treat removal of a variance factor from an already-empty-context
> >> as an error.
> >
> > I'm also tempted by this as well. :)  How would this be experienced by
> > most DDs? Would their new pushes to Salsa now suddenly fail in the
> > reprotest job of the pipeline? If so, that's not too awful, given that
> > the prominent error message would presumably let them know precisely
> > how to fix it.
>
> I would much prefer an error message if we can correctly identify this.

That'd be nice - perhaps something like:

  Failed to parse variations: '-timezone'; did you mean '+all,-timezone'?

I've opened a merge request[1] to explore this error-treatment approach; it
lacks useful error messaging so far, but I'll attempt to add that soon.

> Some possible expected behaviors to consider treating as invalid, and
> issue an error:
>
>   --variations=-build_path
>
>   --variations=-time,-build_path
>
> This almost makes me want to entirely deprecate --variations, and switch
> to recommending "--vary=-all,+whatever" or "--vary=-all
> --vary=+whatever" instead of ever using --variations.
>
> I'm not sure the variations syntax enables much that cannot be more
> unambiguously expressed with --vary.

I do think that supporting two command-line argument names that provide
similar operations (and use similar names!) is confusing.

However I'm inclined to limit the effect of any behaviour changes here to the
specific cases that we know are problematic (ref previous thoughts about CI
infrastructure).

> That said, the reprotest code is a bit hairy, and I am not sure what
> sort of refactoring will be needed to make this possible. In particular,
> how --auto-build is implemented, where it systematically tests each
> variation one at a time. That said, Refactoring might be needed
> regardless. :)

That's a neat bit of functionality in auto-build.  As far as I can tell, it
seems agnostic of whether the build specifications are provided by 'vary' or
'variations' -- but test coverage would be better at confirming that.

Regards,
James


Re: reprotest: inadvertent misconfiguration in salsa-ci config

2024-02-28 Thread Chris Lamb
Vagrant Cascadian wrote:

> This almost makes me want to entirely deprecate --variations, and switch
> to recommending "--vary=-all,+whatever" or "--vary=-all
> --vary=+whatever" instead of ever using --variations.

This is also a very tempting option. I mean, if we're going to emit an
error (ie. break some existing configurations), then we might as
properly fix the core of this UI issue. And this would also save us
working out which --variations invocations are "bad" and which are
acceptable.

> I'm not sure the variations syntax enables much that cannot be more
> unambiguously expressed with --vary.

Indeed. And, y'know, if there was a call for it, we could add a new
and less confusing version of the --variations option under a
different, unambiguous name — perhaps something like --only-vary=a,b.

> I am not sure what sort of refactoring will be needed to make this
> possible. In particular, how --auto-build is implemented […]

(I think this is implemented internally to reprotest, on a different
abstraction layer to the command-line argument handling.)

Any strong opinions from elsewhere...?


Best wishes,

-- 
  o
⬋   ⬊  Chris Lamb
   o o reproducible-builds.org 💠
⬊   ⬋
  o


Re: reprotest: inadvertent misconfiguration in salsa-ci config

2024-02-27 Thread Vagrant Cascadian
On 2024-02-27, Chris Lamb wrote:
>> * Update reprotest to handle a single-disabled-varations-value as a
>>   special case - treating it as vary and/or emitting a warning.

Well, I would broaden this to include an arbitrary number of negating
options:

  --variations=-time,-build_path

That seems just as invalid.

The one special case I could see is "--variations=-all" where you might
want to be normalizing as much as possible.


> On whether to magically/transparently fix this, needless to say, it's
> considered bad practice to change the behaviour of software that has
> already been released — I would, as a rule, subscribe to that idea.
> However, we should bear in mind that this idea revolves around what
> users are *expecting*, not necessarily what the software actually
> does.
>
> I say that because I hazard that all 400 usages are indeed expecting
> that `--variations=-foo` functions the same as `--variations=all,-foo`
> (or `--vary=-foo`), and so this proposed change would merely be
> modifying reprotest to reflect their existing expectations. It would
> not therefore be a violation of the "don't break existing
> functionality" dictum.
>
> (Saying that, the addition of a warning that we are doing so would
> definitely not go amiss.)

Hrm. Less inclined toward this approach; expectations can shift with
time and context and culture and whatnot. That said, I agree the current
behavior is confusing, and we should change something explicitly, rather
than implicitly...


>> * Treat removal of a variance factor from an already-empty-context
>> as an error.
>
> I'm also tempted by this as well. :)  How would this be experienced by
> most DDs? Would their new pushes to Salsa now suddenly fail in the
> reprotest job of the pipeline? If so, that's not too awful, given that
> the prominent error message would presumably let them know precisely
> how to fix it.

I would much prefer an error message if we can correctly identify this.

Some possible expected behaviors to consider treating as invalid, and
issue an error:

  --variations=-build_path

  --variations=-time,-build_path

This almost makes me want to entirely deprecate --variations, and switch
to recommending "--vary=-all,+whatever" or "--vary=-all
--vary=+whatever" instead of ever using --variations.

I'm not sure the variations syntax enables much that cannot be more
unambiguously expressed with --vary.

That said, the reprotest code is a bit hairy, and I am not sure what
sort of refactoring will be needed to make this possible. In particular,
how --auto-build is implemented, where it systematically tests each
variation one at a time. That said, Refactoring might be needed
regardless. :)


live well,
  vagrant


signature.asc
Description: PGP signature


Re: reprotest: inadvertent misconfiguration in salsa-ci config

2024-02-27 Thread Chris Lamb
Hi James,

Great post, thank you. So, I'm in two minds re. the way forward:

> * Update reprotest to handle a single-disabled-varations-value as a
>   special case - treating it as vary and/or emitting a warning.

On whether to magically/transparently fix this, needless to say, it's
considered bad practice to change the behaviour of software that has
already been released — I would, as a rule, subscribe to that idea.
However, we should bear in mind that this idea revolves around what
users are *expecting*, not necessarily what the software actually
does.

I say that because I hazard that all 400 usages are indeed expecting
that `--variations=-foo` functions the same as `--variations=all,-foo`
(or `--vary=-foo`), and so this proposed change would merely be
modifying reprotest to reflect their existing expectations. It would
not therefore be a violation of the "don't break existing
functionality" dictum.

(Saying that, the addition of a warning that we are doing so would
definitely not go amiss.)

> * Treat removal of a variance factor from an already-empty-context
> as an error.

I'm also tempted by this as well. :)  How would this be experienced by
most DDs? Would their new pushes to Salsa now suddenly fail in the
reprotest job of the pipeline? If so, that's not too awful, given that
the prominent error message would presumably let them know precisely
how to fix it.


Best wishes,

-- 
  o
⬋   ⬊  Chris Lamb
   o o reproducible-builds.org 💠
⬊   ⬋
  o



reprotest: inadvertent misconfiguration in salsa-ci config

2024-02-26 Thread James Addison via rb-general
Hello,

A few hundred packages that use reprotest in Salsa-CI appear to be
misconfigured; the remainder of this message explains the problem, and
asks for help figuring out what to do.

Context
---
The reprotest[1] utility tests reproducibility of .deb package builds
by performing two comparative builds with selective differences in the
environment.

As documented[2], the extent of build-env difference can be customized
using the 'variations' command-line argument, that has a default value
of 'all', or similarly the 'vary' argument.  These arguments can be
used together, and they support plus-or-minus symbols as value
prefixes (+/-) to indicate whether a variance factor is being added or
removed.

The reprotest commandline is parsed in sequence from left-to-right,
with each 'vary' argument applied like a patch -- amending existing
settings -- while in contrast each 'variations' argument performs a
complete reset of the variance context.

To examine/confirm reprotest's behaviour locally I can recommend its
'--dry-run' argument, instructing it to print what it would do without
performing any build actions.

Problem: misconfiguration case
--
Although the single argument '--variations=-timezone' could reasonably
be expected to disable a single form of variance (timezone) during a
test, in fact it resets the variance context to empty (it does not
contain 'all', begins with an empty context, and then attempts
performs a no-op removal of timezone from that).

This could allow packages to succeed when they would otherwise fail if
the intended level of build variation was enabled.

This misconfiguration has occurred in practice, and based on some code
searches (example[3]) I believe that around 400 Debian packages are
affected by this.

Resolution
--
My working assumption is that packages that have a single
negative-variations entry (like the -timezone example above) intended
to disable solely the named factor during reprotest testing.

To resolve this it seems that we could:

  * Update the salsa-ci.yml files in each affected case to replace
'--variations=-' with '--vary=-'.
  * Update reprotest to handle a single-disabled-varations-value as a
special case - treating it as vary and/or emitting a warning.
  * Treat removal of a variance factor from an already-empty-context
as an error.
  * Radically, remove the ability for packages to customize their
reprotest arguments at all.

To readers of these lists: does this analysis and set of assumptions
make sense, and if so: do you prefer/recommend any of the suggested
approaches, or have alternative suggestions of your own?

Thank you,
James

[1] - https://salsa.debian.org/reproducible-builds/reprotest

[2] - 
https://salsa.debian.org/reproducible-builds/reprotest/-/blob/6cb0328ea422e12d115737714627850745f93a71/README.rst?plain=1#L299-311

[3] - 
https://codesearch.debian.net/search?q=path%3Asalsa-ci.yml+SALSA_CI_REPROTEST_ARGS%3A+%27--variations%3D-build-path%27&literal=1