Re: git quirk: core.autocrlf

2024-04-23 Thread David A. Wheeler via rb-general



> On Apr 21, 2024, at 8:00 PM, James Addison via rb-general wrote:
> 
> ...
> That universal newline handling may cause problems in some cases if
> not handled carefully, but surprisingly -- at least to me -- 'git'
> itself also automatically converts the line-endings of files to the
> local platform's standard.

This is configurable, and I recommend turning it off. Today you typically
don't need to try to force data to the local convention.

It made sense in the past on Windows, because some tools - especially Notepad -
couldn't handle Unix line endings. This created *endless* problems.
However, in 2018 Notepad added support for Unix line endings:
https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/

Most other tools quietly accept either format.
E.g., vim prefers Unix line endings, but if a file has only MSDOS|Windows|CP/M
line endings (\r\n), it will accept them and save revisions in that format:
https://vim.fandom.com/wiki/File_format#:~:text=File%20format%20detection,-The%20'fileformats'%20option=Vim%20will%20look%20for%20both,ff'%20option%20will%20be%20dos.

I prefer creating stuff using Unix line-endings (\n), but if I clone a Windows
repo, most of the tools will quietly use that other format, and I wouldn't
normally even notice.

This kind of thing *can* create mysterious reproduction problems, so I think
it's in scope for this mailing list.
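
To make the reproduction problem concrete, here is a minimal Python sketch
(the helper names sha256_hex and to_crlf are made up for illustration, they
are not part of git or any other tool). It shows that the same visible text
with different line endings is a different byte sequence, and so has a
different cryptographic hash:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def to_crlf(data: bytes) -> bytes:
    """Convert LF line endings to CRLF, leaving existing CRLF pairs alone."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


unix_text = b"hello\nworld\n"
windows_text = to_crlf(unix_text)

# Same visible text, different bytes, therefore different hashes.
print(sha256_hex(unix_text) == sha256_hex(windows_text))  # prints False
```

If a checkout silently converts line endings, any build step that hashes or
embeds those files can diverge between platforms in just this way.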

--- David A. Wheeler



Re: Please review the draft for March's report

2024-04-10 Thread David A. Wheeler via rb-general



> On Apr 10, 2024, at 7:42 AM, kpcyrd  wrote:
> 
> On 4/10/24 12:58 PM, Chris Lamb wrote:
>>   https://reproducible-builds.org/reports/2024-03/?draft
> 
> > Reproducible builds developer kpcyrd reported that the Arch Linux
> > "minimal container userland" is now 100% reproducible after work by
> > developers dvzrv and Foxboron on the one remaining package. The post, which
> > kpcyrd suffixed with the question "now what?", continues on to outline some 
> > potential next steps, including validating whether the container image 
> > itself could be reproduced bit-for-bit. The post generated a significant 
> > number of replies.
> 
> Thanks for the kind words :) maybe it should be listed higher though, in its 
> own section, as "major accomplishment within the community"?

I agree, this one is HUGE news. There's been a lot of awesome work related to 
reproducible builds, but "minimal container userland is a 100% reproducible 
build in a real-world widely-used distro" is a big step forward and should be 
widely announced. Like press release level.

I routinely hear "reproducible builds are impractical". Yes, in some cases 
they're hard, but clearly there are cases where it's practical.

--- David A. Wheeler



Re: Arch Linux minimal container userland 100% reproducible - now what?

2024-04-04 Thread David A. Wheeler via rb-general



> On Apr 2, 2024, at 1:11 PM, John Gilmore  wrote:
> 
> For me, the distinction is that the local storage is under the direct
> control of the person trying to rebuild, while the network and the
> servers elsewhere in the network are not.  If local storage is
> unreliable, you can fix or replace it, and continue with your work.

There are obviously many advantages to local storage.

However, if you locally record cryptographic hashes, and re-download the
bits for (say) a compiler, you could still reproduce the results
*if* the information is still available where you're downloading it from
(or you can find an alternative source). The key is that "if" condition.
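
As a rough sketch of that idea in Python (fetch_verified and the "mirror"
functions are hypothetical names, and the sources are in-memory stand-ins
rather than real downloads): try each source in turn and accept only bytes
that match the locally recorded hash.

```python
import hashlib
from typing import Callable, Iterable, Optional


def fetch_verified(expected_sha256: str,
                   sources: Iterable[Callable[[], bytes]]) -> Optional[bytes]:
    """Return the first payload whose SHA-256 matches the recorded hash.

    Returns None if every source is unavailable or serves different bytes.
    """
    for fetch in sources:
        try:
            data = fetch()
        except OSError:
            continue  # source unavailable; try the next one
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
    return None


# Stand-ins for real downloads (assumptions for illustration only).
good = b"compiler bits"
recorded = hashlib.sha256(good).hexdigest()  # the locally recorded hash


def dead_mirror() -> bytes:
    raise OSError("mirror gone")


def tampered_mirror() -> bytes:
    return b"something else"


def working_mirror() -> bytes:
    return good


result = fetch_verified(recorded, [dead_mirror, tampered_mirror, working_mirror])
print(result == good)  # prints True
```

The "if" condition shows up as the None return: when no source still serves
matching bits, the recorded hash alone cannot bring them back.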

The risk of not having local copies is the risk of loss of availability.
However, many sites are fairly reliable. I'd hate to tell someone they
can't verify reproducible builds just because they don't (currently)
have a local copy of everything. Indeed, you want multiple verifications
of reproducible builds, and those verifiers will have to get their data from
somewhere.
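
One way to compare several independent verifications could look like the
Python sketch below. The function name summarize_rebuilds, the rebuilder
names, and the short hash strings are all placeholders invented for
illustration, not real data or an existing tool's interface.

```python
from collections import Counter
from typing import Dict, Tuple


def summarize_rebuilds(reports: Dict[str, str]) -> Tuple[bool, Counter]:
    """Summarize artifact hashes reported by independent rebuilders.

    reports maps a rebuilder name to the hash of the artifact it built.
    Returns (all_agree, counts), where counts tallies each distinct hash.
    all_agree is False for an empty report set, since nothing was verified.
    """
    counts = Counter(reports.values())
    return len(counts) == 1, counts


# Placeholder data: two rebuilders agree, one does not.
reports = {"rebuilder-a": "abc123", "rebuilder-b": "abc123", "rebuilder-c": "def456"}
all_agree, counts = summarize_rebuilds(reports)
print(all_agree)  # prints False
```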

It's sometimes much easier to send the source including build instructions,
information on how to download the rest, and the cryptographic hashes for
what is not bundled.

--- David A. Wheeler



Re: Arch Linux minimal container userland 100% reproducible - now what?

2024-03-20 Thread David A. Wheeler via rb-general



> On Mar 20, 2024, at 8:42 AM, kpcyrd  wrote:
> 
> hello,
> 
> in last week's email to the reproducible-builds email list[1] about 
> reproducible Arch Linux I mentioned there's only one unreproducible package 
> left in docker.io/library/archlinux.
> 
> [1]: 
> https://lists.reproducible-builds.org/pipermail/rb-general/2024-March/003291.html
> 
> Due to amazing work by dvzrv and Foxboron this package is now also 
> reproducible!

That is fantastic, congratulations!!

But you know what I'm going to ask :-). What steps are left, if any, before the 
"normal" Arch Linux packages that people install are reproducible (at least in 
core Arch Linux)? Has that milestone been achieved? Will it be achieved once 
some package updates are released? Or is there something more, and if so, what 
is it?

Sorry, it wasn't clear to me if this was some sort of special set of "test 
packages" or if they were the normal Arch Linux packages.

--- David A. Wheeler



Re: Two questions about build-path reproducibility in Debian

2024-03-13 Thread David A. Wheeler via rb-general



> On Mar 12, 2024, at 11:45 AM, Vagrant Cascadian wrote:
> 
> On 2024-03-12, Holger Levsen wrote:
>> On Mon, Mar 11, 2024 at 06:24:22PM +, James Addison via rb-general wrote:
>>> Please find below a draft of the message I'll send to each affected 
>>> bugreport.
>> 
>> looks good to me, thank you for doing this!
>> 
>>> Note: I confused myself when writing this; in fact Salsa-CI reprotest _does_
>>> continue to test build-path variance, at least until we decide otherwise.
>> 
>> this is in fact a bug and should be fixed with the next reprotest release.
> 
> That is not a reprotest bug, but an infrastructure issue for the
> debian-specific salsa-ci configuration. Reprotest is not a
> debian-specific tool.
> 
> Reprotest should continue to vary build paths by default; reprotest
> historically and currently defaults to enabling all variations and
> making an exception does not seem worth the opinionated change of
> behavior. By design, reprotest is easy to configure which variations to
> enable and disable as needed.

This makes sense. If programs can build reproducibly while varying the
build-path, reproducible builds are easier to create, and that's a good thing
even if not strictly required.
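
As a toy illustration in Python (toy_build is a hypothetical stand-in for a
compiler, not a real tool): if the build directory is embedded in the output,
two otherwise identical builds differ, while remapping the directory prefix
removes the difference. Real compilers offer options in this spirit, such as
GCC's -ffile-prefix-map, but the code below only simulates the effect.

```python
import hashlib
from typing import Optional, Tuple


def toy_build(source: str, build_dir: str,
              prefix_map: Optional[Tuple[str, str]] = None) -> bytes:
    """Pretend compiler that embeds the source path in its output,
    much as debug information often does."""
    path = f"{build_dir}/main.c"
    if prefix_map is not None:
        old, new = prefix_map
        if path.startswith(old):
            path = new + path[len(old):]
    return f"# built from {path}\n{source}".encode()


def digest(artifact: bytes) -> str:
    """Return the SHA-256 digest of an artifact as a hex string."""
    return hashlib.sha256(artifact).hexdigest()


src = "int main(void) { return 0; }\n"
dir_a, dir_b = "/tmp/build-1234", "/home/user/pkg"

# Without remapping, the build directory leaks into the output.
print(digest(toy_build(src, dir_a)) == digest(toy_build(src, dir_b)))  # prints False

# Remapping each build directory to "." removes the difference.
print(digest(toy_build(src, dir_a, (dir_a, "."))) ==
      digest(toy_build(src, dir_b, (dir_b, "."))))  # prints True
```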

--- David A. Wheeler






Re: Two questions about build-path reproducibility in Debian

2024-03-04 Thread David A. Wheeler via rb-general



> On Mar 4, 2024, at 3:37 PM, Holger Levsen  wrote:
> 
> On Mon, Mar 04, 2024 at 11:52:07AM -0800, John Gilmore wrote:
>> Why would these become "wishlist" bugs as opposed to actual reproducibility
>> bugs that deserve fixing, just because one server at Debian no longer
>> invokes this bug because it always uses the same build directory?
> 
> because it's "not one server at Debian" but what many ecosystems do: build in
> a deterministic path (eg /$pkg/$version or whatever) or record the path as
> part of the build environment, to have it deterministic as well.
> 
> in the distant past, before namespacing became popular, using a random path
> was a solution to allow parallel builds of the same software & version.
> 
> and yes, this is a shortcut and a tradeoff, similar to demanding to build
> in a certain locale. also it raises reproducibility from around 80-85% of all
> packages to >95%, IOW with this shortcut we can have meaningful
> reproducibility *many years* sooner, than without.
> 
> and I'd really rather like to see Debian 100% reproducible in 2030, than in
> 2038. and some subsets today, or much sooner.

I agree with Holger (and Vagrant).

It'd be *nice* if a build was reproducible regardless of the directory used to
build it. But today, if you're building an executable for others, it's common
to build using a container/chroot or similar that makes it easy to implement
"must compile with these paths", while *fixing* the build-path dependence is
often a lot of work.

I suggest focusing first on ensuring everyone knows what the executable files
contain. If people can add more flexibility to their build process, all the
better, but that added flexibility costs time and effort, and it is NOT as
important.

--- David A. Wheeler