Hi,

I agree that we have problems with our release verification
script, but I still think that it's useful for us.

> what the verification script is testing for

1. Verify cryptographic signatures of all artifacts (a sketch of
   this step follows the list)
2. Compile the source code and run unit tests
3. Install binary packages and test them
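
For reference, here is a minimal sketch of step 1 in Python. The
artifact name is hypothetical; the real script iterates over all
release artifacts:

  # Verify a detached GPG signature and a SHA-512 checksum for
  # one artifact (hypothetical file name).
  import hashlib
  import subprocess

  artifact = "apache-arrow-13.0.0.tar.gz"

  # Check the detached .asc signature. The signer's key must
  # already be imported from the project's KEYS file.
  subprocess.run(["gpg", "--verify", artifact + ".asc", artifact],
                 check=True)

  # Check the .sha512 checksum ("<digest>  <file name>" format).
  with open(artifact, "rb") as f:
      digest = hashlib.sha512(f.read()).hexdigest()
  with open(artifact + ".sha512") as f:
      expected = f.read().split()[0]
  assert digest == expected, "checksum mismatch"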

> *why* it is testing for what it is testing

It's for legal protection. (Only 1. focuses on that. I noticed
that the license check by RAT isn't executed in the verification
script; a sketch of such a check follows the quote below.)

https://www.apache.org/legal/release-policy.html#why

> The purpose of a clear line is to inform our legal
> strategy of providing protection for formal participants
> involved in producing releases
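
By the way, the missing RAT check could be as simple as the
following sketch. (The jar file name and the extracted source
directory are assumptions on my part.)

  # Run Apache RAT over the extracted source release and inspect
  # its report for files with unknown/unapproved licenses.
  import subprocess

  report = subprocess.run(
      ["java", "-jar", "apache-rat-0.15.jar", "apache-arrow-13.0.0/"],
      check=True, capture_output=True, text=True,
  ).stdout
  print(report)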

---

> * platform compatibility is (supposed to be) exercised on Continuous
>   Integration; there is no understandable reason why it should be
>   ceremoniously tested on each developer's machine before the release

Our CI covers many cases, but it can't cover all of them. For
example, my environment (Debian GNU/Linux sid with a GPU,
without conda) isn't covered.

If a verifier's environment is covered by CI, I agree with you:
the work is duplicated. In that case, the verifier only needs to
verify signatures and licenses.

Note that the verification script is used in CI too, so I think
that it's still useful.

> * just before a release is probably the wrong time to be testing
>   platform compatibility, and fixing compatibility bugs (though, of
>   course, it might still be better than not noticing?)

It would be better if we caught such problems in a more timely
manner, yes.

I think that verifiers should also verify an RC with their
downstream projects. For example, I verify RCs with Groonga, a
full-text search engine.

Some downstream projects such as pandas and Dask are covered by
our CI, but we can't cover all downstream projects.

> * home environments are unstable, and not all developers run the
>   verification script for each release, so each release is actually
>   verified on different, uncontrolled, platforms

I think that it's a useful property for testing: it lets us
cover more cases. (Rare environments such as mine may not be
that valuable...) Users run releases in many different
environments.

But again, it would be better if we could catch such problems in
a timely manner.

> * maintaining the verification scripts is a thankless task, in part due
>   to their nature (they need to track and mirror changes made in each
>   implementation's build chain), in part due to implementation choices

I agree with you.

I think that one of the useful tests in the verification script
is the integration test without conda. (Our CI always uses conda
for integration tests.) I hope that we can implement it with a
low maintenance cost.
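
For example, something like this sketch could work (the wheel
name and the paths are assumptions on my part):

  # Conda-free integration smoke test: install an RC wheel into
  # a fresh venv and exercise a basic API.
  import subprocess
  import venv

  env_dir = "/tmp/arrow-rc-venv"
  venv.create(env_dir, with_pip=True)

  wheel = "pyarrow-13.0.0-cp311-cp311-manylinux_2_17_x86_64.whl"
  subprocess.run([env_dir + "/bin/pip", "install", wheel],
                 check=True)
  subprocess.run([env_dir + "/bin/python", "-c",
                  "import pyarrow as pa; print(pa.table({'a': [1, 2, 3]}))"],
                 check=True)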

> * due to the existence of the verification scripts, the release vote is
>   focussed on getting the script to run successfully (a very contextual
>   and non-reproducible result), rather than the actual *contents* of the
>   release

Oh, really?
I have added checks to the verification script whenever I found
missing checks before we release a new version. For example, I
added a .deb/.rpm install check like the sketch below.
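
(The image tag and the package path below are assumptions on my
part, just to show the idea.)

  # Install the candidate .deb packages into a disposable Docker
  # container and check that the shared library is visible.
  import subprocess

  subprocess.run([
      "docker", "run", "--rm",
      "-v", "/path/to/debs:/debs",
      "debian:bookworm",
      "bash", "-c",
      "apt-get update && apt-get install -y /debs/*.deb"
      " && ldconfig -p | grep libarrow",
  ], check=True)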

But others may not think about changing the verification
script...


Thanks,
-- 
kou

In <58ed893d-1e4d-0a67-c037-15a70eb33...@python.org>
  "[Discuss] Do we need a release verification script?" on Tue, 22 Aug 2023 
10:54:36 +0200,
  Antoine Pitrou <anto...@python.org> wrote:

> 
> Hello,
> 
> Abiding by the Apache Software Foundation's guidelines, every Arrow
> release is voted on and requires at least 3 "binding" votes to be
> approved.
> 
> Also, every Arrow release vote is accompanied by a little ceremonial
> where contributors and core developers run a release verification
> script on their machine, wait for long minutes (sometimes an hour) and
> report the results.
> 
> This ceremonial has gone on for years, and it has not really been
> questioned. Yet, it's not obvious to me what it is achieving
> exactly. I've been here since 2018, but I don't really understand what
> the verification script is testing for, or, more importantly, *why* it
> is testing for what it is testing. I'm probably not the only one?
> 
> I would like to bring the following points:
> 
> * platform compatibility is (supposed to be) exercised on Continuous
>   Integration; there is no understandable reason why it should be
>   ceremoniously tested on each developer's machine before the release
> 
> * just before a release is probably the wrong time to be testing
>   platform compatibility, and fixing compatibility bugs (though, of
>   course, it might still be better than not noticing?)
> 
> * home environments are unstable, and not all developers run the
>   verification script for each release, so each release is actually
>   verified on different, uncontrolled, platforms
> 
> * as for sanity checks on binary packages, GPG signatures, etc., there
>   shouldn't be any need to run them on multiple different machines, as
>   they are (should be?) entirely deterministic and platform-agnostic
> 
> * maintaining the verification scripts is a thankless task, in part due
>   to their nature (they need to track and mirror changes made in each
>   implementation's build chain), in part due to implementation choices
> 
> * due to the existence of the verification scripts, the release vote is
>   focussed on getting the script to run successfully (a very contextual
>   and non-reproducible result), rather than the actual *contents* of the
>   release
> 
> The most positive thing I can personally say about the verification
> scripts is that they *may* help us trust the release is not broken?
> But that's a very unqualified statement, and is very close to
> cargo-culting.
> 
> Regards
> 
> Antoine.
