Re: git quirk: core.autocrlf

2024-04-23 Thread David A. Wheeler via rb-general



> On Apr 21, 2024, at 8:00 PM, James Addison via rb-general 
>  wrote:
> 
> ...
> That universal newline handling may cause problems in some cases if
> not handled carefully, but surprisingly -- at least to me -- 'git'
> itself also automatically converts the line-endings of files to the
> local platform's standard.

This is configurable, and I recommend turning it off. Today you typically
don't need to try to force data to the local convention.

It used to make sense on Windows, because some tools (especially Notepad)
couldn't handle Unix line endings. This created *endless* problems.
However, Notepad added support for Unix line endings in 2018:
https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/

Most other tools quietly accept either format.
E.g., vim prefers Unix line endings, but if a file has only
MSDOS|Windows|CP/M line endings of \r\n, it will accept it and save
revisions in that format:
https://vim.fandom.com/wiki/File_format#:~:text=File%20format%20detection,-The%20'fileformats'%20option=Vim%20will%20look%20for%20both,ff'%20option%20will%20be%20dos.

I prefer creating stuff using Unix line-endings (\n), but if I clone a
Windows repo, most of the tools will quietly use that other format, and I
wouldn't normally even notice.

This kind of thing *can* create mysterious reproduction problems, so I
think it's in scope for this mailing list.
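
To make the reproducibility risk concrete, here is a minimal Python sketch
(not from the thread; the sample file content is made up for illustration)
showing that the same text checked out with \n versus \r\n line endings
produces different checksums. In git itself, the conversion can be turned
off with `git config --global core.autocrlf false`.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def to_crlf(data: bytes) -> bytes:
    """Convert line endings to \\r\\n without doubling existing \\r\\n."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


# Hypothetical file content, as it would look in a Unix-style checkout.
unix_text = b"#!/bin/sh\necho hello\n"
# The same content after a line-ending conversion on checkout.
windows_text = to_crlf(unix_text)

print("LF  :", sha256_hex(unix_text))
print("CRLF:", sha256_hex(windows_text))
print("identical:", sha256_hex(unix_text) == sha256_hex(windows_text))
```

Any build step that hashes or embeds such a file would see different input
on the two checkouts, even though the repository content is the same.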

--- David A. Wheeler



Tracking source code: whatsrc.org

2024-04-23 Thread kpcyrd

hello list,

I built a website and imported source code inputs from:

- Arch Linux
- Debian sid and stable-security
- Fedora rawhide
- Alpine edge

into a common database. I keep track of the tarball content, and the 
checksums both before and after compression.
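
As a rough illustration of why both checksums are useful, the following
Python sketch (an assumption about the general idea, not the actual whatsrc
code) hashes a gzip file as downloaded and again after decompression. The
payload and the two timestamps are made up; two gzip files that differ only
in header metadata get different outer checksums but the same inner one.

```python
import gzip
import hashlib


def checksums(compressed: bytes) -> dict:
    """Hash a gzip file as-is and after decompression."""
    return {
        "compressed": hashlib.sha256(compressed).hexdigest(),
        "decompressed": hashlib.sha256(gzip.decompress(compressed)).hexdigest(),
    }


# Hypothetical payload standing in for the tar archive inside a .tar.gz.
payload = b"pretend this is a tar archive"

# Same content, compressed twice with different gzip header timestamps.
first = gzip.compress(payload, mtime=1)
second = gzip.compress(payload, mtime=2)

a = checksums(first)
b = checksums(second)
print("compressed equal:  ", a["compressed"] == b["compressed"])
print("decompressed equal:", a["decompressed"] == b["decompressed"])
```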


This allows lookups like:

https://whatsrc.org/artifact/sha256:981a75f8291020d9f6632c6160ee3651f376bdf354373bea00506a220e355134

```
Build input of:

- Alpine: cmatrix 2.0-r2 
(https://github.com/abishekvashok/cmatrix/archive/v2.0.tar.gz) 
sha512:1aeecd8e8abb6f87fc54f88a8c25478f69d42d450af782e73c0fca7f051669a415c0505ca61c904f960b46bbddf98cfb3dd1f9b18917b0b39e95d8c899889530
- Arch Linux: cmatrix 2.0-3 
(https://github.com/abishekvashok/cmatrix/archive/v2.0.tar.gz) 
sha256:ad93ba39acd383696ab6a9ebbed1259ecf2d3cf9f49d6b97038c66f80749e99a
- Debian: cmatrix 2.0-6 (cmatrix_2.0.orig.tar.gz) 
sha256:ad93ba39acd383696ab6a9ebbed1259ecf2d3cf9f49d6b97038c66f80749e99a
- Fedora: cmatrix 2.0-9.fc40 (cmatrix-2.0.tar.gz) 
sha256:ad93ba39acd383696ab6a9ebbed1259ecf2d3cf9f49d6b97038c66f80749e99a

```

In this case, there's consensus between the 4 distributions about what
the source code of cmatrix 2.0 is. They may still use different build
instructions or apply patches (and definitely have different build
environments), but they seem to agree on what the upstream source code is
(despite the absence of code signing by upstream... maybe this hill was
never worth dying on in the first place).


There's a search feature you can use (prefix-based with a btree index):

https://whatsrc.org/search?q=htop

In this case we find two different tarballs for htop 3.3.0:

- https://whatsrc.org/artifact/sha256:5971ba79fcb5e5effe182362f1dc29edfe4cfccb8389a8160e161b061e7db473
- https://whatsrc.org/artifact/sha256:487fbce5bc6f92a3fa9283ea1eb5f70f85bf31fe0bbee92a692f9c3f0f96f7d4


You can diff the two using:

https://whatsrc.org/diff-sorted/sha256:5971ba79fcb5e5effe182362f1dc29edfe4cfccb8389a8160e161b061e7db473/sha256:487fbce5bc6f92a3fa9283ea1eb5f70f85bf31fe0bbee92a692f9c3f0f96f7d4

--- is Debian and Fedora, +++ is Alpine

From the diff (but also from the infobox of the second link), we can
tell this is before/after autotools pre-processing. The -sorted variant
of the diff is necessary because the pre-processed dist-tarball also has
ordering issues, making the diff very hard to read without it.
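
The effect of sorting can be sketched in Python as follows. This is only an
illustration under stated assumptions, not the whatsrc implementation: the
helper names `make_tar` and `sorted_listing` and the file contents are
hypothetical. Two tarballs with the same files in a different order differ
byte for byte, but their sorted per-file listings are identical, so a diff
of the sorted listings only shows real content changes.

```python
import hashlib
import io
import tarfile


def make_tar(files):
    """Build an uncompressed tar in memory from (name, data) pairs, in order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def sorted_listing(tar_bytes):
    """Return 'name sha256' lines for regular files, sorted by name."""
    entries = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                entries.append(f"{member.name} {hashlib.sha256(data).hexdigest()}")
    return sorted(entries)


# Hypothetical file set, written in two different orders.
files = [("configure.ac", b"AC_INIT\n"), ("main.c", b"int main(void) { return 0; }\n")]
tar_a = make_tar(files)
tar_b = make_tar(list(reversed(files)))

print("tar bytes equal:      ", tar_a == tar_b)
print("sorted listings equal:", sorted_listing(tar_a) == sorted_listing(tar_b))
```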


It's importing code from git repositories too, as seen on this page:

https://whatsrc.org/artifact/sha256:494fa0b23697967ab99faa8eb07f4e24e9f431ac7ab771cfd8f3dda068590b7b

It's using `git archive` (without compression) to generate a 
deterministic tar representation for a given git tree object. These are 
always free of ordering issues.


When looking at the PKGBUILD for this package:

https://gitlab.archlinux.org/archlinux/packaging/packages/ncmpcpp/-/blob/816dbe564554c1c4f772e84a49faf3708fa62a29/PKGBUILD

You can find this line:
```
b2sums=('babc1506eca6dc5bd48e58fabfd42502d33b506b2e600b7aa98126a6deb0d68e14dc692abb0ef5079e3ccf710648f0b82fe1b404303d932f2156104c479442ec'
```

Since both Arch Linux PKGBUILDs and whatsrc are content-addressed, you
can convert it into this link:


https://whatsrc.org/artifact/blake2b:babc1506eca6dc5bd48e58fabfd42502d33b506b2e600b7aa98126a6deb0d68e14dc692abb0ef5079e3ccf710648f0b82fe1b404303d932f2156104c479442ec
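
The conversion is mechanical, as this small Python sketch shows. The URL
pattern is taken from the links in this mail; the function names
`artifact_url` and `b2sum_hex` are hypothetical helpers, not part of any
whatsrc tooling.

```python
import hashlib

WHATSRC_ARTIFACT = "https://whatsrc.org/artifact/"


def artifact_url(algorithm: str, hex_digest: str) -> str:
    """Build a whatsrc artifact link from an algorithm name and hex digest."""
    return f"{WHATSRC_ARTIFACT}{algorithm}:{hex_digest.lower()}"


def b2sum_hex(data: bytes) -> str:
    """BLAKE2b with hashlib's default 64-byte digest (128 hex characters),
    the same length as the b2sums entry above."""
    return hashlib.blake2b(data).hexdigest()


# The b2sums value from the PKGBUILD line above.
b2 = (
    "babc1506eca6dc5bd48e58fabfd42502d33b506b2e600b7aa98126a6deb0d68e"
    "14dc692abb0ef5079e3ccf710648f0b82fe1b404303d932f2156104c479442ec"
)
print(artifact_url("blake2b", b2))
```

For a local file, `artifact_url("blake2b", b2sum_hex(data))` would give the
link to look up, assuming the file is one that whatsrc has imported.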

I'm interested in adding NixOS as a 5th distribution, but I'm not sure
how to get the relevant data. Help is welcome in
https://github.com/kpcyrd/what-the-src/issues/12. The existing rpm
tooling may also work for openSUSE, but I haven't tried yet.


The site runs in a fairly CO2-efficient way (due to my Rust proficiency); I
showed a friend what kind of specs I run this on and they were ✨stunned✨.


The purpose of this site is to give a better understanding of which line
we need to defend with regard to source code (hint: it's the source code
we ingest into reproducible builds, for the binaries we then put into
our computers).


Hopefully this helps people with reasoning about said source code.

cheers,
kpcyrd


Re: Which conferences are folks attending these days?

2024-04-23 Thread Bernhard M. Wiedemann via rb-general

On 18/04/2024 15.45, Chris Lamb wrote:


> To that end, what conferences are folks on this list still going to,
> and, hopefully, still getting something from? I mean, there must be
> some exceptions other than FOSDEM… :)


My list has become rather short:
rb conf (if within Europe)
openSUSE conf, Nuremberg
and a mini-openSUSE conf in Berlin, co-located with SUSECon