git quirk: core.autocrlf

2024-04-21 Thread James Addison via rb-general
Hi folks,

This message isn't _directly_ related to reproducible builds, but it
does relate to unexpected differences in text (including, potentially,
source code) checked out from git repositories, and I think that that
could be relevant to the audience here.

Some of the code within the Sphinx documentation generator removes
carriage-return ('\r' in Python string literal escape code notation)
characters from input documents before checksumming them, and that
part of the code puzzled me - generally any kind of content
modification before checksumming seems like a code smell to me.

The relevant code removes those carriage-returns so that the checksums
produced are in a sense cross-platform compatible; that is, the 'same
content' produces the same checksum whether the platform uses CRLF or
LF line-endings.
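
As a rough sketch of that idea (this is not Sphinx's actual code; the helper
name and the choice of SHA-256 are my own assumptions for illustration),
removing carriage returns before hashing makes the CRLF and LF variants of the
same text produce the same digest:

```python
import hashlib


def normalized_checksum(data: bytes) -> str:
    """Hash content after removing carriage returns (hypothetical helper)."""
    # Dropping every '\r' makes CRLF and LF variants of a file hash identically.
    return hashlib.sha256(data.replace(b"\r", b"")).hexdigest()


lf_content = b"first line\nsecond line\n"
crlf_content = b"first line\r\nsecond line\r\n"

# The raw digests differ between the two variants...
raw_match = (
    hashlib.sha256(lf_content).hexdigest()
    == hashlib.sha256(crlf_content).hexdigest()
)
# ...but the normalized digests agree.
normalized_match = normalized_checksum(lf_content) == normalized_checksum(crlf_content)

print(raw_match, normalized_match)
```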

Now, Python itself does include some functionality[1] to handle what
it refers to as 'universal newlines'; newlines in strings are
generally represented using a single '\n' character, which is
serialized and deserialized to CRLF or LF as platform-appropriate.
This is stable, mature and well-established behaviour at this point.
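
A small illustration of the read side of that behaviour, using only the
standard library (the temporary file and its contents are made up for the
example): reading in the default text mode translates '\r\n' to '\n', whereas
passing newline='' to open() leaves the original line endings untouched.

```python
import os
import tempfile

# Write raw bytes containing one CRLF and one LF line ending.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"one\r\ntwo\n")
    path = handle.name

# Default text mode (newline=None) translates '\r\n' to '\n' on read.
with open(path, "r", encoding="utf-8") as f:
    translated = f.read()

# newline='' disables translation, so the original endings survive.
with open(path, "r", encoding="utf-8", newline="") as f:
    untranslated = f.read()

os.remove(path)
print(repr(translated))
print(repr(untranslated))
```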

That universal newline handling may cause problems in some cases if
not handled carefully, but surprisingly -- at least to me -- 'git'
itself also automatically converts the line-endings of files to the
local platform's standard.

I suppose this makes sense so that developer tooling designed for each
platform works as-expected with text stored in git repositories
(which, internally, store the newlines using LF).

However it does mean that the checksums of files checked out from the
same origin git repository can differ on different OS platforms.

Overriding this behaviour on a per-file basis is possible using
.gitattributes config[2] file(s) within the repository, or
alternatively a git client system can use the 'core.autocrlf'
configuration setting[3] to specify the desired line-ending-conversion
method.

Again: this is probably slightly off-topic and perhaps not of direct
relevance to anyone on the list today.  However, it seems like the
kind of issue that is useful to be aware of if-and-when puzzling over
unexpected git content / checksum issues (situations that I _do_
expect people on this list encounter from time-to-time).
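
When investigating a mismatch like this, one quick check is to count the
line-ending styles present in the raw bytes of a checked-out file.  The helper
below is only a sketch (its name and return format are my own invention, not
part of git or Sphinx):

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF and bare CR sequences in raw file content."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains one '\n' and one '\r', so subtract those.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"crlf": crlf, "lf": lf, "cr": cr}


print(count_line_endings(b"alpha\r\nbeta\r\n"))
print(count_line_endings(b"alpha\nbeta\n"))
```

Reading the file in binary mode (open(path, 'rb')) matters here, because text
mode would apply the universal-newline translation before the bytes could be
counted.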

Regards,
James

[1] - https://docs.python.org/3.12/glossary.html#term-universal-newlines

[2] - https://git-scm.com/docs/gitattributes

[3] - https://git-scm.com/docs/git-config#Documentation/git-config.txt-coresafecrlf

PS: For anyone concerned that this might inadvertently expose some
kind of checksumming vulnerability; I briefly worried about that after
determining the line-ending behaviour to be the cause.  Padding of
source files with carriage-returns could be a way for bad actors to
attempt to find checksum collisions, yes; but equally, newlines -- or
spaces -- are available to achieve the same.  Are there any languages
that attempt to prevent arbitrary source code padding so that
checksum-space-exploration from a known code plaintext is constrained?
Golang and other languages that require or support autoformatting may
be the safest bets.


Re: Arch Linux minimal container userland 100% reproducible - now what?

2024-04-02 Thread James Addison via rb-general
Hi John,

On Fri, 29 Mar 2024 at 19:29, John Gilmore  wrote:
>
> kpcyrd  wrote:
> > 1) There's currently no way to tell if a package can be built offline
> > (without trying yourself).
>
> Packages that can't be built offline are not reproducible, by
> definition.  They depend on outside events and circumstances
> in order for a third party to reproduce them successfully.
>
> So, fixing that in each package would be a prerequisite to making a
> reproducible Arch distro (in my opinion).

This perspective is valuable because it is certainly true that unreliable
or unexpected responses from a network adapter could cause software builds to
fail, be delayed, or contain errors.

However I fail to see why any of those circumstances would not be equally
possible in the case of equivalent responses from physically or locally
attached I/O devices.

A storage device could be considered a node on a local network that no other
host is able to communicate with directly; and to my knowledge it's rarely the
case that traffic to-and-from local storage devices is inspected for integrity
by hardware/software outside of the device that it is connected to (which
isn't necessarily the place that it makes sense to run those checks).

My guess is that we could get into near-unsolvable philosophical territory
along this path, but I think it's worth being skeptical of the notions that
local-storage is always trustworthy and that the network should always be
avoided.

Regards,
James


Re: Two questions about build-path reproducibility in Debian

2024-04-02 Thread James Addison via rb-general
Thanks, Chris,

On Sun, 31 Mar 2024 at 13:01, Chris Lamb  wrote:
>
> Hi James,
>
> > Approximately thirty are still set to other severity levels, and I plan to
> > update those with the following adjusted messaging […]
>
> Looks good to me. :)
>
> Completely out of interest, are any of those 30 bugs tagged both
> "buildpath" and "toolchain"? It's written nowhere in Policy (and I
> can't remember if it's ever been discussed before), but if package X
> is causing package Y to be unreproducible, I feel that has some
> bearing on the severity of the bug for that issue filed against X…
> completely independent of whether package X is reproducible itself or
> not.  :)

None of the remaining thirty-or-so (and in fact, none of the 66 updated so far)
are usertagged both 'buildpath' and 'toolchain'.

I would say that a few of them _are_ 'toolchain packages' -- mono, binutils-dev
and a few others -- but for these bugs the buildpath issues are internal to
each package at build-time and do not affect the construction of other
packages in their ecosystem.

> Just to underscore that this is simply my curiosity before you
> reassign: in the particular case of *buildpath* AND toolchain, these
> should almost certainly be wishlist anyway because, as discussed, we
> "aren't testing buildpath".

Mostly agree.  Of the bugs in Debian that _are_ usertagged both buildpath and
also toolchain, a few of them appear to have possible known/tested fixes, but in
some cases are awaiting maintainer/upstream support.  Using a static buildpath
seems like it should mitigate most concern there, but if that were not the case,
then the severity of those could perhaps be re-argued based on the quantity,
popularity and importance of affected software (packaged or otherwise).

Regards,
James


Re: Two questions about build-path reproducibility in Debian

2024-03-29 Thread James Addison via rb-general
Hi again,

On Mon, 11 Mar 2024 at 18:24, James Addison  wrote:
>
> Hi folks,
>
> On Wed, 6 Mar 2024 at 01:04, James Addison  wrote:
> > [ ... snip ...]
> >
> > The Debian bug severity descriptions[1] provide some more nuance, and that
> > reassures me that wishlist should be appropriate for most of these bugs
> > (although I'll inspect their contents before making any changes).
>
> Please find below a draft of the message I'll send to each affected bugreport.
>
> Note: I confused myself when writing this; in fact Salsa-CI reprotest _does_
> continue to test build-path variance, at least until we decide otherwise.
>
> --- BEGIN DRAFT ---
> Because Debian builds packages from a fixed build path, customized build paths
> are _not_ currently evaluated by the 'reprotest' utility in Salsa-CI, or during
> package builds on the Reproducible Builds team's package test infrastructure
> for Debian[1].
>
> This means that this package will pass current reproducibility tests; however
> we still believe that source code and/or build steps embed the build path into
> binary package output, making it more difficult than necessary for independent
> consumers to confirm whether their local compilations produce identical binary
> artifacts.
>
> As a result, this bugreport will remain open and be assigned the 'wishlist'
> severity[2].
>
> ...
>
> [1] - https://tests.reproducible-builds.org/debian/reproducible.html
>
> [2] - https://www.debian.org/Bugs/Developer#severities
> --- END DRAFT ---

Most of the remaining buildpath bugs have been updated to severity 'wishlist'.

Approximately thirty are still set to other severity levels, and I plan to
update those with the following adjusted messaging:

--- BEGIN DRAFT ---
Control: severity -1 wishlist

Dear Maintainer,

Currently, Debian's buildd and also the Reproducible Builds team's testing
infrastructure[1] both use a fixed build path when building binary packages.

This means that your package will pass current reproducibility tests; however
we believe that varying the build path still produces undesirable changes in
the binary package output, making it more difficult than necessary for
independent consumers to check the integrity of those packages by rebuilding
them themselves.

As a result, this bugreport will remain open and be re-assigned the 'wishlist'
severity[2].

You can use the 'reprotest' package build utility - either locally, or as
provided in Debian's Salsa continuous integration pipelines - to assist in
uncovering reproducibility failures due to build-path variance.

For more information about build paths and how they can affect reproducibility,
please refer to: https://reproducible-builds.org/docs/build-path/

...

[1] - https://tests.reproducible-builds.org/debian/reproducible.html

[2] - https://www.debian.org/Bugs/Developer#severities
--- END DRAFT ---

Thanks for your feedback and suggestions,
James


Re: Two questions about build-path reproducibility in Debian

2024-03-12 Thread James Addison via rb-general
Hi folks,

On Wed, 6 Mar 2024 at 01:04, James Addison  wrote:
> [ ... snip ...]
>
> The Debian bug severity descriptions[1] provide some more nuance, and that
> reassures me that wishlist should be appropriate for most of these bugs
> (although I'll inspect their contents before making any changes).

Please find below a draft of the message I'll send to each affected bugreport.

Note: I confused myself when writing this; in fact Salsa-CI reprotest _does_
continue to test build-path variance, at least until we decide otherwise.

--- BEGIN DRAFT ---
Because Debian builds packages from a fixed build path, customized build paths
are _not_ currently evaluated by the 'reprotest' utility in Salsa-CI, or during
package builds on the Reproducible Builds team's package test infrastructure
for Debian[1].

This means that this package will pass current reproducibility tests; however
we still believe that source code and/or build steps embed the build path into
binary package output, making it more difficult than necessary for independent
consumers to confirm whether their local compilations produce identical binary
artifacts.

As a result, this bugreport will remain open and be assigned the 'wishlist'
severity[2].

...

[1] - https://tests.reproducible-builds.org/debian/reproducible.html

[2] - https://www.debian.org/Bugs/Developer#severities
--- END DRAFT ---


Re: Two questions about build-path reproducibility in Debian

2024-03-06 Thread James Addison via rb-general
Hi Vagrant,

Narrowing in on (or perhaps nitpicking) a detail:

On Mon, 4 Mar 2024 at 20:41, Vagrant Cascadian
 wrote:
>
> On 2024-03-04, John Gilmore wrote:
> > Vagrant Cascadian wrote:
> >> > > to make it easier to debug other issues, although deprioritizing them
> >> > > makes sense, given buildd.debian.org now normalizes them.
> >
> > James Addison via rb-general wrote:
> >> Ok, thank you both.  A number of these bugs are currently recorded at severity
> >> level 'normal'; unless told not to, I'll spend some time to double-check their
> >> details and - assuming all looks OK - will bulk downgrade them to 'wishlist'
> >> severity a week or so from now.
>
> Well, I think we should change it to "minor" rather than "wishlist"
> severity, but that may be splitting hairs; I do not find a huge amount
> of difference between debian bug severities... they are pretty much
> either critical/serious/grave and thus must be fixed, or
> normal/minor/wishlist and fixed when someone feels like it.

The Debian bug severity descriptions[1] provide some more nuance, and that
reassures me that wishlist should be appropriate for most of these bugs
(although I'll inspect their contents before making any changes).

Regards,
James

[1] - https://www.debian.org/Bugs/Developer#severities


Re: Two questions about build-path reproducibility in Debian

2024-03-04 Thread James Addison via rb-general
On Wed, 28 Feb 2024 at 12:06, Chris Lamb  wrote:
>
> Vagrant Cascadian wrote:
>
> > There are real-world build path issues, and while it is possible to work
> > around them in various ways, I think they are still issues worth fixing
> > to make it easier to debug other issues, although deprioritizing them
> > makes sense, given buildd.debian.org now normalizes them.
>
> +1.
>
> And for this reason, I think we should keep the buildpath-related
> bugs as well. They should all be 'wishlist' priority anyway, and I
> wouldn't like to bet my hat that the usertag metadata is accurate and
> comprehensive enough to blindly close them in the first place. (We
> only really used the usertags to do some rough-and-ready statistics
> on broad issue categories.)

Ok, thank you both.  A number of these bugs are currently recorded at severity
level 'normal'; unless told not to, I'll spend some time to double-check their
details and - assuming all looks OK - will bulk downgrade them to 'wishlist'
severity a week or so from now.


Re: reprotest: inadvertent misconfiguration in salsa-ci config

2024-03-04 Thread James Addison via rb-general
Hi Chris, Vagrant,

On Tue, 27 Feb 2024 at 17:44, Vagrant Cascadian
 wrote:
>
> On 2024-02-27, Chris Lamb wrote:
> >> * Update reprotest to handle a single-disabled-variations-value as a
> >>   special case - treating it as vary and/or emitting a warning.
>
> Well, I would broaden this to include an arbitrary number of negating
> options:
>
>   --variations=-time,-build_path
>
> That seems just as invalid.
>
> The one special case I could see is "--variations=-all" where you might
> want to be normalizing as much as possible.

Hmm, yep.  So when there are only subtractions, we _could_ imply that
there is an implicit '+all' at the beginning of the 'variations' argument.

And along that line of thinking, we could emit a warning to stderr:

  $ reprotest auto --dry-run --variations=-timezone
  Implicitly expanding variations '-timezone' to '+all,-timezone'
  ...
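
To make the idea concrete, here is a sketch in Python.  It is not taken from
the reprotest codebase: the function name is hypothetical, and leaving '-all'
untouched is my reading of the special case mentioned above.

```python
import sys


def expand_subtractions(spec: str) -> str:
    """Prefix '+all' when a variations spec contains only subtractions."""
    tokens = [t for t in spec.split(",") if t]
    only_subtractions = bool(tokens) and all(t.startswith("-") for t in tokens)
    # '-all' is left alone: it plausibly means "normalize as much as possible".
    if only_subtractions and "-all" not in tokens:
        expanded = "+all," + spec
        print(
            f"Implicitly expanding variations '{spec}' to '{expanded}'",
            file=sys.stderr,
        )
        return expanded
    return spec


print(expand_subtractions("-timezone"))
print(expand_subtractions("-all"))
```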

> > On whether to magically/transparently fix this, needless to say, it's
> > considered bad practice to change the behaviour of software that has
> > already been released — I would, as a rule, subscribe to that idea.
> > However, we should bear in mind that this idea revolves around what
> > users are *expecting*, not necessarily what the software actually
> > does.
> >
> > I say that because I hazard that all 400 usages are indeed expecting
> > that `--variations=-foo` functions the same as `--variations=all,-foo`
> > (or `--vary=-foo`), and so this proposed change would merely be
> > modifying reprotest to reflect their existing expectations. It would
> > not therefore be a violation of the "don't break existing
> > functionality" dictum.
> >
> > (Saying that, the addition of a warning that we are doing so would
> > definitely not go amiss.)
>
> Hrm. Less inclined toward this approach; expectations can shift with
> time and context and culture and whatnot. That said, I agree the current
> behavior is confusing, and we should change something explicitly, rather
> than implicitly...

Changing-existing-behaviours could arguably be even more problematic for
cases like this where we're talking about continuous integration checks.

Breaking/unbreaking unrelated CI pipelines seems like something we should be
careful to avoid.

> >> * Treat removal of a variance factor from an already-empty-context
> >> as an error.
> >
> > I'm also tempted by this as well. :)  How would this be experienced by
> > most DDs? Would their new pushes to Salsa now suddenly fail in the
> > reprotest job of the pipeline? If so, that's not too awful, given that
> > the prominent error message would presumably let them know precisely
> > how to fix it.
>
> I would much prefer an error message if we can correctly identify this.

That'd be nice - perhaps something like:

  Failed to parse variations: '-timezone'; did you mean '+all,-timezone'?

I've opened a merge request[1] to explore this error-treatment approach; it
lacks useful error messaging so far, but I'll attempt to add that soon.
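
For discussion purposes, the error-treatment alternative might look roughly
like the following.  This is a standalone sketch and not the code in the merge
request; the function name is hypothetical and the '-all' exemption is an
assumption on my part.

```python
def check_variations(spec: str) -> str:
    """Reject a variations spec that only subtracts from an empty context."""
    tokens = [t for t in spec.split(",") if t]
    only_subtractions = bool(tokens) and all(t.startswith("-") for t in tokens)
    # Assumption: '-all' stays valid, since it may be used deliberately.
    if only_subtractions and "-all" not in tokens:
        raise ValueError(
            f"Failed to parse variations: '{spec}'; did you mean '+all,{spec}'?"
        )
    return spec


try:
    check_variations("-timezone")
except ValueError as error:
    message = str(error)
    print(message)
```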

> Some possible expected behaviors to consider treating as invalid, and
> issue an error:
>
>   --variations=-build_path
>
>   --variations=-time,-build_path
>
> This almost makes me want to entirely deprecate --variations, and switch
> to recommending "--vary=-all,+whatever" or "--vary=-all
> --vary=+whatever" instead of ever using --variations.
>
> I'm not sure the variations syntax enables much that cannot be more
> unambiguously expressed with --vary.

I do think that supporting two command-line argument names that provide
similar operations (and use similar names!) is confusing.

However I'm inclined to limit the effect of any behaviour changes here to the
specific cases that we know are problematic (ref previous thoughts about CI
infrastructure).

> That said, the reprotest code is a bit hairy, and I am not sure what
> sort of refactoring will be needed to make this possible. In particular,
> how --auto-build is implemented, where it systematically tests each
> variation one at a time. That said, Refactoring might be needed
> regardless. :)

That's a neat bit of functionality in auto-build.  As far as I can tell, it
seems agnostic of whether the build specifications are provided by 'vary' or
'variations' -- but test coverage would be better at confirming that.

Regards,
James


reprotest: inadvertent misconfiguration in salsa-ci config

2024-02-26 Thread James Addison via rb-general
Hello,

A few hundred packages that use reprotest in Salsa-CI appear to be
misconfigured; the remainder of this message explains the problem, and
asks for help figuring out what to do.

Context
---
The reprotest[1] utility tests reproducibility of .deb package builds
by performing two comparative builds with selective differences in the
environment.

As documented[2], the extent of build-env difference can be customized
using the 'variations' command-line argument, that has a default value
of 'all', or similarly the 'vary' argument.  These arguments can be
used together, and they support plus-or-minus symbols as value
prefixes (+/-) to indicate whether a variance factor is being added or
removed.

The reprotest commandline is parsed in sequence from left-to-right,
with each 'vary' argument applied like a patch -- amending existing
settings -- while in contrast each 'variations' argument performs a
complete reset of the variance context.
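
The difference can be modelled with a toy example.  To be clear, this is my own
simplified model of the semantics described above and not reprotest's parser;
the function names are hypothetical and the three factors are an illustrative
subset.

```python
# Illustrative subset of variance factors; not reprotest's full list.
ALL_FACTORS = frozenset({"build_path", "time", "timezone"})


def apply_spec(current: set, spec: str) -> set:
    """Apply a comma-separated spec such as '+all,-timezone' to a context."""
    result = set(current)
    for token in spec.split(","):
        if not token:
            continue
        remove = token.startswith("-")
        name = token.lstrip("+-")
        names = ALL_FACTORS if name == "all" else {name}
        if remove:
            result -= names  # removing from an empty context is a no-op
        else:
            result |= names
    return result


def resolve(arguments: list) -> set:
    """Process (option, spec) pairs in order, left to right."""
    context = set(ALL_FACTORS)  # 'variations' defaults to 'all'
    for option, spec in arguments:
        if option == "variations":
            context = apply_spec(set(), spec)  # complete reset, then apply
        elif option == "vary":
            context = apply_spec(context, spec)  # patch the existing context
        else:
            raise ValueError(f"unknown option: {option}")
    return context


print(sorted(resolve([("vary", "-timezone")])))        # ['build_path', 'time']
print(sorted(resolve([("variations", "-timezone")])))  # []
```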

To examine/confirm reprotest's behaviour locally I can recommend its
'--dry-run' argument, instructing it to print what it would do without
performing any build actions.

Problem: misconfiguration case
--
Although the single argument '--variations=-timezone' could reasonably
be expected to disable a single form of variance (timezone) during a
test, in fact it resets the variance context to empty (it does not
contain 'all', so it begins with an empty context and then performs a
no-op removal of timezone from that).

This could allow packages to succeed when they would otherwise fail if
the intended level of build variation was enabled.

This misconfiguration has occurred in practice, and based on some code
searches (example[3]) I believe that around 400 Debian packages are
affected by this.

Resolution
--
My working assumption is that packages that have a single
negative-variations entry (like the -timezone example above) intended
to disable solely the named factor during reprotest testing.

To resolve this it seems that we could:

  * Update the salsa-ci.yml files in each affected case to replace
'--variations=-' with '--vary=-'.
  * Update reprotest to handle a single-disabled-variations-value as a
special case - treating it as vary and/or emitting a warning.
  * Treat removal of a variance factor from an already-empty-context
as an error.
  * Radically, remove the ability for packages to customize their
reprotest arguments at all.
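
For the first option, the rewrite could probably be automated.  The following
is a sketch only: the regular expression and function name are my own, and it
deliberately touches only those '--variations' values made up entirely of
subtractions.

```python
import re

# Match '--variations=' followed solely by '-name' entries, for example
# '-timezone' or '-time,-build_path'; mixed specs are left alone.
ONLY_SUBTRACTIONS = re.compile(
    r"--variations=(-[\w-]+(?:,-[\w-]+)*)(?![\w,+-])"
)


def rewrite_args(text: str) -> str:
    """Replace subtraction-only '--variations=' arguments with '--vary='."""
    return ONLY_SUBTRACTIONS.sub(r"--vary=\1", text)


before = "SALSA_CI_REPROTEST_ARGS: '--variations=-build-path'"
print(rewrite_args(before))
```

Any such bulk edit would still need review in each affected package, since the
assumption about maintainer intent may not hold everywhere.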

To readers of these lists: does this analysis and set of assumptions
make sense, and if so: do you prefer/recommend any of the suggested
approaches, or have alternative suggestions of your own?

Thank you,
James

[1] - https://salsa.debian.org/reproducible-builds/reprotest

[2] - https://salsa.debian.org/reproducible-builds/reprotest/-/blob/6cb0328ea422e12d115737714627850745f93a71/README.rst?plain=1#L299-311

[3] - https://codesearch.debian.net/search?q=path%3Asalsa-ci.yml+SALSA_CI_REPROTEST_ARGS%3A+%27--variations%3D-build-path%27=1


Two questions about build-path reproducibility in Debian

2024-02-26 Thread James Addison via rb-general
Hi folks,

A quick recap: in July 2023, Debian's package build infrastructure
(buildd) intentionally began using a fixed directory path during
package builds (bug #1034424).  Previously, some string randomness
existed within each source build directory path.

I've two questions related to buildpaths - one relevant to the
Salsa-CI team, and the other a RB-team housekeeping question:

  1. [Salsa] Recently Debian's CI pipeline was reconfigured[1] to
enable more variance in builds.  However: I think that change also
(inadvertently?) enabled buildpath variation.  Is that useful and/or
aligned with Debian package migration incentives[2] -- or should we
disable that buildpath variance?

  2. [RB] Housekeeping: we use Debian's bugtracker to record packages
with buildpath-related build problems[3].  Do we want to keep those
bugs open, or should we close them?

Thanks,
James

[1] - https://salsa.debian.org/salsa-ci-team/pipeline/-/merge_requests/468

[2] - "Reproducibility migration policy" @
https://lists.debian.org/debian-devel-announce/2023/12/msg3.html

[3] - https://udd.debian.org/bugs/?release=any=ign=ign=ign=7=7=only=buildpath=reproducible-builds%40lists.alioth.debian.org=1=id=asc=html#results


Re: Introducing: Semantically reproducible builds

2023-05-28 Thread James Addison via rb-general
Hi David,

Thanks for sharing this.

I think that the problems with this idea and name are:

- That it does not allow two or more people to share and confirm that
they have the same build of some software.
- That it does not allow tests to fail-early, catching and preventing
reproducibility regressions (semantic or otherwise).
- That the naming terminology is easily conflated with true reproducible
builds, creating the potential for misunderstanding among consumers.

Cheers,
James


Re: Sphinx: localisation changes / reproducibility

2023-04-26 Thread James Addison via rb-general
On Wed, 26 Apr 2023 at 18:48, Vagrant Cascadian
 wrote:
>
> On 2023-04-26, James Addison wrote:
> > On Tue, 18 Apr 2023 at 18:51, Vagrant Cascadian
> >  wrote:
> >> > James Addison  wrote:
> >> This is why in the reproducible builds documentation on timestamps,
> >> there is a paragraph "Timestamps are best avoided":
> >>
> >>   https://reproducible-builds.org/docs/timestamps/
> >>
> >> Or as I like to say "There are no timestamps quite like NO timestamps!"
> >
> > I see a parallel between the use of timestamps as a key for
> > data-lookup (as in Holger's developers-reference package), and the use
> > of locale as a similar data-lookup key (as in the case of localised
> > documentation builds).
>
> > I'm not sure what the equivalent approach is for localisation, though.
> > Command-line software, for example, requires at least one written
> > natural-language to be usable, and as a second use case, providing
> > natural-language documentation with software is highly recommended (is
> > it part of the software?  maybe not.  but a sufficiently-confusing
> > poorly-translated error message could be as serious as a code-related
> > bug, I think?).
> >
> > Linking back to my recent experience with Sphinx, and from the
> > perspective of allowing-users-to-verify-their-software, I'd tend to
> > think that an ideally-produced, reproducible, localised software would
> > include _all_ available translations in the build artifact.  Some of
> > that could be retrieved at runtime (gettext, for example), and some
> > could be static (file-backed HTML documentation, where runtime lookups
> > might not be so straightforward).
>
> I struggle to see the parallel. A timestamp is an arbitrary value based
> on when you built it, whereas the locale-rendered document should be
> reproducibly translated based on the translations you have available at
> the time you run whatever process generates the translated version of
> the document/binary, and regardless of the locale of the build
> environment.

Ok, I think I understand.  Please check my understanding, though: I
interpret your perspective as matching the ideal-world scenario that
John outlined, where the SOURCE_DATE_EPOCH value has no effect at all
on the output of the build.

Until then, I see both the build-time (SOURCE_DATE_EPOCH) and
build-locale as inputs that do affect the output of software build
systems, and believe that relevant guidance could help projects
migrate towards reproducibility.

> With runtime translation, you would be desiring translation from the
> source language to the operating locale of the environment you've called
> it in... but that should still be systematic, no?

Runtime translation should be systematic, yes.  So recommending that
projects use runtime translation (instead of compiling-in separate
source files for each language) is good advice.

> While there almost certainly might be more than one legitimate
> translation for a given work, your process for rendering it should
> really only have one particular output given a particular input
> (e.g. the source language input and the descriptions of how to translate
> it to the desired language)... barring, of course, bugs in the system
> ... or am i missing something entirely?

No, I don't think you missed anything, and I think we have the same
understanding of the components.  We're likely arriving from different
perspectives on the problem space.

My question is approximately this: for some source software developed
in a natural language that I don't read or understand, and that
includes statically-built documentation (say, HTML files for example),
could I determine that the distributed software (an installer file
downloaded from the web, for example) recommended to me because it
includes support for a natural language that I _do_ understand is
identical to the one in the developers' own natural language?

(and I think that yes, it's possible: build the source to include the
content from all available languages, and distribute that single copy;
the translations may be better or worse in some areas, but we can all
agree that it is not only the same source, but the same build of that
source)

> Unless, I guess, you're using some Machine Learning model to produce
> your translations?

... well, in honesty I think that Machine Learning could -- and in
many cases, perhaps should -- be encouraged towards
deterministic/repeatable behaviour.  But that's probably a
conversation for another thread.


Re: Sphinx: localisation changes / reproducibility

2023-04-18 Thread James Addison via rb-general
On Sun, 16 Apr 2023 at 00:25, John Gilmore  wrote:
>
> James Addison via rb-general  wrote:
> >  In general, we should be able to
> > pick two times, "s" and "t", s <= t, where "s" is the
> > source-package-retrieval time, and "t" is the build-time, and using
> > those, any two people should be able to create exactly the same
> > (bit-for-bit) documentation.  I think that SOURCE_DATE_EPOCH generally
> > refers to "t".
>
> I think that SOURCE_DATE_EPOCH generally refers to the check-IN time of
> each of the source package(s) being rebuilt.  You can retrieve the
> packages anytime later than that, and you can do the build at any time
> later, and SOURCE_DATE_EPOCH should not change (and the built binaries
> and docs should also not change).

When the goal is to build the software as it was available to the
author at the time of code commit/check-in - and I think that that is
a valid use case - then that makes sense.

(this ignores a subtlety in the hypothetical case where multiple
independent software authors could be tasked with writing to a
specification but without access to each others' source dependencies
-- in that case there is no notion of the "software as it was
available to [each] author".  using FOSS licensing and publication
solves most of that problem, excepting situations where individual
authors are out-of-contact during software development)

Inverting the question somewhat: if a single source-base is rebuilt
using two different SOURCE_DATE_EPOCH values (let's say, 1970-01-01
and 2023-04-18), then what are expected/valid differences in the
resulting output?


Re: Sphinx: localisation changes / reproducibility

2023-04-15 Thread James Addison via rb-general
On Fri, 14 Apr 2023 at 19:51, Holger Levsen  wrote:
>
> Dear James,
>
> many thanks also from me for your work on this and sharing your findings here.
>
> I'm another happy sphinx user affected by those problems. :)

Thanks, Holger - I think I made a bit of a (verbose) mess of this
particular bugfix attempt, but it's a learning experience I suppose :)

> somewhat related:
>
> i'm wondering whether distro-info should respect SOURCE_DATE_EPOCH:
> src:developers-reference builds different content based on the build
> date, due to using distro-info and distro-info knows that in 398 days
> trixie will be released :)))
> see https://tests.reproducible-builds.org/debian/rb-pkg/bookworm/arm64/diffoscope-results/developers-reference.html

Although it's probably an alternative understanding of the goal, I
(possibly mis)interpret the end result of reproducible builds as:
reliable and complete support for time-traveling software users.

So: it's a slow, rainy Tuesday in early Y2045, and there's a report
that someone's saved game file from Y2019 doesn't load correctly.

Well, we could start by retrieving the sources and building the game
as it existed back in Y2019 - does it load in that version?

In practice, built software could depend on the software and tools
installed locally (in a true-ish sense of location), not only time --
but by using time as a universal and ordered clothesrack (?) onto
which software can be unambiguously placed (and then non-destructively
and freely copied-from), we can all achieve matching software costumes
if and when we want to.

To return more directly to your question, though: the release date
information seems to be in distro-info-data, and I guess that that
itself is also updated over time.  In general, we should be able to
pick two times, "s" and "t", s <= t, where "s" is the
source-package-retrieval time, and "t" is the build-time, and using
those, any two people should be able to create exactly the same
(bit-for-bit) documentation.  I think that SOURCE_DATE_EPOCH generally
refers to "t".  In practice some variance of "s" is allowed for the
build dependencies, in the absence of output-affecting bugs/changes
relevant to the build.

If something is producing different results across builds that share
the same fixed "s" and "t" (and I think that the RB dashboard
indicates that that is happening for developers-reference?), then
either something isn't respecting SOURCE_DATE_EPOCH, or something else
in the build environment is affecting the build output.
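
For reference, "respecting SOURCE_DATE_EPOCH" in a Python-based build step
usually amounts to something like the sketch below.  The function name is
hypothetical and the epoch value is just an example; I'm not suggesting that
distro-info or developers-reference contain this code.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp() -> datetime:
    """Return the timestamp that a build step should embed in its output."""
    # Prefer SOURCE_DATE_EPOCH (seconds since the Unix epoch) when it is set;
    # fall back to the current time only when it is absent.
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", time.time()))
    # Use UTC so that the build machine's timezone does not leak into output.
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


os.environ["SOURCE_DATE_EPOCH"] = "1681776000"  # example value: 2023-04-18 UTC
print(build_timestamp().isoformat())
```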

Long story short: that's probably not very helpful -- I have retrieved
a few of the sources, but I don't know where the problem is yet.  I'll
take more of a look soon.

(also: although it's not DFSG-compatible, a game that springs to mind
in my case, albeit Y2009 or so probably, would be the original Knights
of The Old Republic game.  not so much a load-game problem, more of a
logic error in the game data that meant that I was stuck on some
planet without being able to achieve the assigned objectives.  I would
_probably_ still remember enough of the details to figure out
replication details for that bug, given a refresher on some of the
levels and objectives involved.  that'll have to be a very rainy day)

Cheers,
James


Re: Sphinx: localisation changes / reproducibility

2023-04-09 Thread James Addison via rb-general
A follow-up: after doing more work to try to confirm the behaviour of
the fix -- something I should have done before even starting
development! -- I was confused that I couldn't replicate the original
problem when using a version of the codebase _before_ my proposed fix
pr#10949 was applied.

I now believe that the issue had, in fact, already been fixed between
versions v4.5.0 and v5.0.0 of Sphinx - so I've offered a revert of my
changes.

Details about tracing the location of the existing fix (using 'git
bisect') can be found here:
https://github.com/sphinx-doc/sphinx/issues/9778#issuecomment-1501172176
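
For readers unfamiliar with it, the search that 'git bisect' automates is a binary search over an ordered history. The sketch below is illustrative only: the function name first_fixed, the commit labels and the is_broken callback are all hypothetical. It assumes the first commit shows the bug, the last one does not, and that the bug does not reappear once fixed.

```python
def first_fixed(commits, is_broken):
    """Return the index of the first commit where the bug is gone.

    Assumes commits[0] is broken, commits[-1] is fixed, and that the
    history switches from broken to fixed exactly once.
    """
    lo, hi = 0, len(commits) - 1  # lo: known broken, hi: known fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_broken(commits[mid]):
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical history of eight commits; the fix landed in "c5".
history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
fixed_at = first_fixed(history, lambda commit: commit < "c5")
```

Each step halves the remaining range, so even a long history between two releases needs only a handful of builds to locate the commit that changed the behaviour.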

Attempting to learn from this: I find the experience interesting
because it seems that I deluded myself into believing that I'd
resolved a bug -- and unfortunately drew in a bunch of other people's
time doing that -- when, in fact, following good time-and-version-aware
engineering practices (not always easy when keen to contribute to a
project, but as demonstrated here, potentially very important) could
have avoided much of that work in the first place.

The original bugreport was detailed and quite clear, so I think that
much of the fault here was from me not carefully reading and
considering how to proceed.  There could be other learnings /
recommendations (I'm mulling over some ideas related to
continuous-bug-checking), but I'm unable to make any clear
recommendations there yet, partly because I think that
bug-checking-and-verification, while valuable, is not always
considered desirable or economically-beneficial work -- and partly
because writing test cases to reproduce bugs (which could make
continuous evaluation of stale bugs easier) often performs 80%+
(guesstimate, in my opinion) of the work to find the cause of the bug
-- meaning that it's frequently worthwhile to combine with fixing the
bug (in other words: there's often a significant overlap between
developing a bug reproduction test case and fixing the bug).

Probably nothing new to many of the folks on this mailing list and/or
seasoned software engineers generally, but I figured I'd try to
document my findings :)

On Sat, 8 Apr 2023 at 11:10, James Addison wrote:
>
> Hi folks,
>
> A set of reproducible-build-related changes[1] that I've developed for
> sphinx (a documentation project generator) have been accepted for
> inclusion in v6.2.0 of sphinx.
>
> I'm optimistic that those changes can address a sizable category[2] of
> reproducible build failures related to translation of documentation
> during software builds (reproducible build testing intentionally
> varies the host LANGUAGE setting to shake out unintended sources of
> build variation, and some sphinx projects fail rb-tests due to that).
>
> However.. with the changes merged (although not yet released) I'm
> beginning to have some doubts about them.
>
> The positive effect of the changes is that I expect they'll help to
> confirm and achieve reproducibility for a good chunk of remaining
> non-reproducible software.
>
> The downside is: disabling localization -- or perhaps more accurately:
> emitting all documentation for each project using 'null'[3]
> translation -- seems like a fairly blunt, and perhaps unwelcome (for
> consumers) way to achieve reproducibility.
>
>
> A longer, better path to achieve reproducibility would be to support
> building documentation in _all_ available translated locales during
> Sphinx project builds (something that is not yet supported -- and with
> at least one component that I'm aware of (objects.inv) that doesn't
> seem to support multi-language content).  Doing that should produce
> output artifacts that users of any supported locale can access in a
> relevant localised way, and that can be made bit-for-bit consistent.
>
> In summary: I'm writing partly optimistically because I think the
> merged changes could improve and help to confirm reproducibility of
> software during testing.  But I also feel a bit conflicted about the
> way I've approached the changes and their implications, so I'm also
> keen to gather feedback and thoughts.
>
> Thank you,
> James
>
> [1] - https://github.com/sphinx-doc/sphinx/pull/10949
>
> [2] - 
> https://tests.reproducible-builds.org/debian/issues/unstable/sphinxdoc_translations_issue.html
>
> [3] - 
> https://docs.python.org/3/library/gettext.html#the-nulltranslations-class


Sphinx: localisation changes / reproducibility

2023-04-08 Thread James Addison via rb-general
Hi folks,

A set of reproducible-build-related changes[1] that I've developed for
sphinx (a documentation project generator) have been accepted for
inclusion in v6.2.0 of sphinx.

I'm optimistic that those changes can address a sizable category[2] of
reproducible build failures related to translation of documentation
during software builds (reproducible build testing intentionally
varies the host LANGUAGE setting to shake out unintended sources of
build variation, and some sphinx projects fail rb-tests due to that).

However.. with the changes merged (although not yet released) I'm
beginning to have some doubts about them.

The positive effect of the changes is that I expect they'll help to
confirm and achieve reproducibility for a good chunk of remaining
non-reproducible software.

The downside is: disabling localization -- or perhaps more accurately:
emitting all documentation for each project using 'null'[3]
translation -- seems like a fairly blunt, and perhaps unwelcome (for
consumers) way to achieve reproducibility.
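
To illustrate what 'null' translation means here, the following is a small sketch using Python's standard gettext module directly. It is not the code from the Sphinx pull request, and the message strings are made up. NullTranslations returns every message unchanged, so the output no longer depends on which catalogs or locale settings the build host has.

```python
import gettext

# A translations object with no catalog behind it: lookups fall through
# and the original (untranslated) message is returned as-is.
null = gettext.NullTranslations()

heading = null.gettext("Table of Contents")
singular = null.ngettext("%d page", "%d pages", 1)
plural = null.ngettext("%d page", "%d pages", 3)
```

The cost is the one noted above: readers in every locale get the untranslated source strings.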


A longer, better path to achieve reproducibility would be to support
building documentation in _all_ available translated locales during
Sphinx project builds (something that is not yet supported -- and with
at least one component that I'm aware of (objects.inv) that doesn't
seem to support multi-language content).  Doing that should produce
output artifacts that users of any supported locale can access in a
relevant localised way, and that can be made bit-for-bit consistent.

In summary: I'm writing partly optimistically because I think the
merged changes could improve and help to confirm reproducibility of
software during testing.  But I also feel a bit conflicted about the
way I've approached the changes and their implications, so I'm also
keen to gather feedback and thoughts.

Thank you,
James

[1] - https://github.com/sphinx-doc/sphinx/pull/10949

[2] - 
https://tests.reproducible-builds.org/debian/issues/unstable/sphinxdoc_translations_issue.html

[3] - https://docs.python.org/3/library/gettext.html#the-nulltranslations-class


Re: alembic / sphinx puzzler

2023-02-18 Thread James Addison via rb-general
On Thu, Feb 16, 2023 at 6:17 PM Chris Lamb wrote:

> Thanks. Please feel free to quote my previous email, as well as link
> to my WIP patch.



> Let us know when you have an issue number/URL.


D'oh - unfortunately I only read these after filing the issue, thanks
though.  It is reported at:
https://github.com/sphinx-doc/sphinx/issues/11198 (and I see you've added
context there)


> Hm, isn't this just probability at work? As in, because there is 50%
> chance that the 2-item set is serialised in any given order, it's only
> going to be detected as unreproducible 50% of the time:
>
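>  +---------+---------+----------------+
>  | Build A | Build B |     Result     |
>  +---------+---------+----------------+
>  |  a, b   |  a, b   | "Reproducible" |
>  +---------+---------+----------------+
>  |  b, a   |  a, b   | Unreproducible |
>  +---------+---------+----------------+
>  |  a, b   |  b, a   | Unreproducible |
>  +---------+---------+----------------+
>  |  b, a   |  b, a   | "Reproducible" |
>  +---------+---------+----------------+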
>
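
The arithmetic in the quoted table can be checked by enumeration. This sketch assumes, as the quoted text does, that each build independently picks one of the two orderings with equal probability; the variable names are illustrative.

```python
from itertools import product

# The two possible serialisation orders of a 2-item set.
orders = [("a", "b"), ("b", "a")]

# Every (Build A, Build B) combination, each equally likely under the
# independence assumption.
outcomes = list(product(orders, repeat=2))

# A pair of builds is detected as unreproducible when the orders differ.
mismatches = [pair for pair in outcomes if pair[0] != pair[1]]
detection_rate = len(mismatches) / len(outcomes)
```

Under that assumption two of the four combinations differ, which gives the 50% figure.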

I'm not sure; for an event that is truly a random binary choice, that
would make sense.  In this case, though, I think there may be something
about the system initialization prior to the object description code
running that produces a predictable, yet differing, result based on
environmental factor(s).

(I'd prefer to be replying with some detailed findings as a result of
experimenting with repeated attempts to generate the documentation during
from-scratch builds.. I haven't gotten around to that here, though)


Re: alembic / sphinx puzzler

2023-02-16 Thread James Addison via rb-general
Hey Chris,

On Wed, Feb 15, 2023 at 7:27 PM Chris Lamb wrote:

> This change to Sphinx makes alembic reproducible:
>
>
> https://github.com/lamby/sphinx/commit/4ad7670c1df00f82e758aaa8a7b9aaea83b8eaba
>
> Does this patch work for you?
>

Yes!  Thank you - that's a much better patch than an alternative approach I
was working on that attempted to sort the string-typed results of
processing the AST.  It produces stable results for me when I vary the
order of the set-within-a-tuple's elements in the input.
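
For illustration only, here is a sketch of the general idea of describing a set nested inside a tuple in a stable way. It is not the linked patch and not Sphinx's actual code; stable_repr is a hypothetical helper that handles only tuples and sets.

```python
def stable_repr(value):
    """Describe a value so that set elements always appear in sorted order.

    Hypothetical helper: recurses into tuples and sets, and falls back to
    the built-in repr() for everything else.
    """
    if isinstance(value, set):
        if not value:
            return "set()"
        # Sort the element descriptions, not the elements themselves, so
        # that mixed-type sets do not raise TypeError.
        return "{" + ", ".join(sorted(stable_repr(v) for v in value)) + "}"
    if isinstance(value, tuple):
        items = [stable_repr(v) for v in value]
        if len(items) == 1:
            return "(" + items[0] + ",)"
        return "(" + ", ".join(items) + ")"
    return repr(value)


description = stable_repr(("x", {"b", "a"}))
```

The point is that the ordering is applied at every nesting level, so a set inside a tuple is handled the same way as a top-level set.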

I'll file a bug on sphinx's GitHub repository about the original issue
within the next few hours.


> Why it hasn't been a problem before is rather curious to me, though.
> It may be just that it hasn't come up, but it may be because this
> value ultimately comes from a Python typing annotation. Yet at that
> point in the code, there doesn't seem to be anything special
> whatsoever about this tuple & set: it's really just a regular tuple
> and set.
>

Could it be that other Sphinx documentation-generation issues tend to have
occluded this one?

I also have to admit that I still don't understand what it is that varies
(and frequently varies, apparently) across builds that exposed the problem
in the first place.  I'm not convinced that it's likely to be related to
the memory addresses of the datastructures.. I had a vague theory that
perhaps filesystem choice/layout could be a cause (perhaps a strange theory
at first: my rationale would be that it could cause the AST to read and
process files in a different order, and that that could affect the parser's
state in subtle ways.  I haven't even convinced myself of that possibility
entirely yet though).
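
One well-known source of run-to-run variation in the iteration order of a set of strings is Python's per-process string hash randomization, controlled by the PYTHONHASHSEED environment variable. Whether that is what varies in the alembic builds is not established here. The sketch below only shows how to observe the effect in isolation; the helper name set_order_for_seed is hypothetical.

```python
import os
import subprocess
import sys

# Code run in a child interpreter: print a 2-item set in iteration order.
CHILD_CODE = "print(','.join({'a', 'b'}))"


def set_order_for_seed(seed):
    """Return the iteration order seen by a fresh interpreter for a seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling set_order_for_seed with a range of seeds and comparing the results shows whether the order changes on a given interpreter. The same seed always gives the same order, which is why pinning PYTHONHASHSEED is sometimes used as a diagnostic when chasing this kind of difference.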

Thanks again,
James


alembic / sphinx puzzler

2023-02-14 Thread James Addison via rb-general
Hi folks,

I noticed what _seemed_ like a quick reproducible-build fix for alembic (a
database migration framework written in Python).  A few hours later though,
I'm still puzzled.

The problem appears in a similar pattern across various architectures in
the diffoscope results for alembic -- both amd64[1] and arm64[2], for
example.  It looks like the variations are in sphinx-generated
documentation where the ordering of collection elements -- like the items
in this attribute definition[3] -- differs in the output.

My understanding is that sphinx is using Python 3.11's built-in AST parser,
which doesn't provide parse-tree traversal order guarantees, during these
builds - and that'd seem to make sense as a cause.

Also: this might be the same issue as described in
'randomness_in_property_annotations_generated_by_sphinx'[4].

Does anyone have suggestions about how to proceed?  I'll likely take more
of a look again tomorrow.

Thanks,
James

(note: posting from my work email, because some of my work infrastructure
uses alembic, and so I think there's a clear, if small, work-related
motivation for this)

[1] -
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/alembic.html

[2] -
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/arm64/diffoscope-results/alembic.html

[3] -
https://github.com/sqlalchemy/alembic/blob/a968c9d2832173ee7d5dde50c7573f7b99424c38/alembic/ddl/impl.py#L90

[4] -
https://tests.reproducible-builds.org/debian/issues/unstable/randomness_in_property_annotations_generated_by_sphinx_issue.html


Re: buildinfo question

2022-12-15 Thread James Addison via rb-general
Ah, typical: while trying to figure out where functionality like this
could fit into Debian, I learned that it already exists there.

The 'dpkg-depcheck' and 'dpkg-genbuilddeps' utilities (both included
in the 'devscripts' package) provide this kind of functionality in
Debian.

On Wed, 14 Dec 2022 at 00:38, James Addison wrote:
>
> On Tue, 13 Dec 2022 at 18:15, Vagrant Cascadian wrote:
> >
> > It would be interesting to do something more systematic like your
> > suggestion, though I'm not aware of anything at the moment.
>
> Thanks Vagrant, that's good to know (it matches my understanding too,
> from searching around).
>
> Roughly speaking, the reason I ask is to see whether it'd be possible
> to reduce migration and maintenance burden (in terms of maintainer
> time, primarily, although perhaps arguably also compute resources) by
> providing information about no-longer-required build dependencies.
>
> I'll spend some time at the metaphorical drawing board to see whether
> something like this could make sense and how to integrate in a
> non-disruptive way if so.
>
> James


Re: buildinfo question

2022-12-14 Thread James Addison via rb-general
On Tue, 13 Dec 2022 at 18:15, Vagrant Cascadian wrote:
>
> It would be interesting to do something more systematic like your
> suggestion, though I'm not aware of anything at the moment.

Thanks Vagrant, that's good to know (it matches my understanding too,
from searching around).

Roughly speaking, the reason I ask is to see whether it'd be possible
to reduce migration and maintenance burden (in terms of maintainer
time, primarily, although perhaps arguably also compute resources) by
providing information about no-longer-required build dependencies.

I'll spend some time at the metaphorical drawing board to see whether
something like this could make sense and how to integrate in a
non-disruptive way if so.

James


buildinfo question

2022-12-13 Thread James Addison via rb-general
Hi folks,

As Debian's buildinfo[1] wiki page hints, it's difficult to determine
whether a build dependency is genuinely required at build-time,
compared to: it was required in the past, but has become dependency
cruft.

I was wondering: are there reproducible-builds efforts underway (in
Debian or other ecosystems) to determine the packages that were
involved (first-pass approximation: at least one file belonging to the
package was read from the filesystem by a child of the build process
-- anything else?) during a reproducible package build?
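
To make the first-pass approximation concrete, here is a rough sketch of the bookkeeping involved. It assumes two inputs that would have to come from elsewhere: a list of file paths read during the build, and an index mapping each path to the package that owns it. The function names and sample data are hypothetical, and this is not an existing tool.

```python
def packages_touched(accessed_paths, owner_index):
    """Return the set of packages owning at least one accessed file.

    Paths with no known owner (build tree, temporary files) are ignored.
    """
    return {owner_index[path] for path in accessed_paths if path in owner_index}


def possibly_unused(declared_build_deps, touched):
    """Return declared build dependencies from which no file was read."""
    return sorted(set(declared_build_deps) - touched)


# Hypothetical sample data.
owner_index = {
    "/usr/bin/gcc": "gcc",
    "/usr/include/zlib.h": "zlib1g-dev",
}
accessed = ["/usr/bin/gcc", "/tmp/build/main.c"]

touched = packages_touched(accessed, owner_index)
candidates = possibly_unused({"gcc", "zlib1g-dev", "sphinx"}, touched)
```

A package in the candidates list is only a hint: a dependency can matter without any of its files being read directly by the traced processes, which is part of the "anything else?" question.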

Thanks,
James

[1] - https://wiki.debian.org/ReproducibleBuilds/BuildinfoFiles