On Tue, Sep 2, 2025 at 4:37 AM Guillem Jover <[email protected]> wrote:
>
> Hi!
>
> On Thu, 2025-08-21 at 01:01:32 -0700, Fangrui Song wrote:
> > On Wed, Aug 20, 2025 at 11:38 PM Jan Beulich <[email protected]> wrote:
> > > Hasn't there been an extension to cover that for many years, using
> > > "!<arch64>\n" as file signature? I do not know, however, how well
> > > formalized that extension is, which solely differs from traditional
> > > archives by having a 20-byte size field (in place of the 10-byte one).
>
> If this variant of the format only covers the length (although that's
> pretty much what would be needed for .deb support), that seems a bit
> limiting given that at least the uid/gid and potentially the mode
> might not be big enough either.
>
> I think if this was to be considered (but where I'm tending to think
> this is really not my preferred path forward, see below) then something
> like this struct…
>
> ```
> #define AR64MAG "!<arch64>\n"
> #define SAR64MAG 10
>
> struct ar64_hdr {
>     char ar_name[16]; /* Member file name, may be '/'-terminated. */
>     char ar_time[12]; /* File seconds, ASCII decimal since Epoch. */
>     char ar_uid[10];  /* User ID, in ASCII decimal. */
>     char ar_gid[10];  /* Group ID, in ASCII decimal. */
>     char ar_mode[10]; /* File mode, in ASCII octal. */
>     char ar_size[20]; /* File size, in ASCII decimal. */
>     char ar_fmag[2];  /* File magic terminator. */
> };
> ```
>
> …might be better, but if that is not even going to be potentially
> compatible with a pre-existing format, then it might not be worth it?
> (Also going from the original 60 bytes, to this new 80 bytes seems
> like a nice round bump. :)
>
> > Is there an !<arch64>\n extension? I can't find !<arch64>\n in
> > binutils, libarchive, FreeBSD's elftoolchain, or LLVM.
> > AIX has a big archive extension that supports a larger size field, but
> > we likely don't want to use an AIX extension.
>
> I also tried a search on codesearch.debian.net and also on DuckDuckGo,
> Google and github.com, but nothing relevant seems to pop up. Checked
> file(1) and it didn't have any knowledge of that format either.
>
> > The /SYM64/ extension supports 64-bit symbol table offsets, and the
> > 10-byte decimal size field in the header could be easily expanded (for
> > the parser, bfd/archive.c:538 `scan = sscanf (hdr.ar_size, "%" SCNu64,
> > &parsed_size);` already supports larger sizes IIRC)
>
> I don't think this can currently handle anything larger than the current
> 10-byte decimal size though (~ 9536 MiB), as the sscanf ends up using
> something like "%llu" or similar? (But maybe I misunderstood your
> parenthetical comment.)
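
For reference, here is a minimal sketch (not the actual bfd/archive.c code) of reading the traditional 10-byte ar_size field with sscanf and SCNu64. The helper name parse_ar_size and the AR_SIZE_LEN macro are hypothetical. As far as I can tell, the conversion itself can yield any 64-bit value; it is the 10-character field width that caps a member at 9999999999 bytes, which is the ~9536 MiB figure quoted above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Width of ar_size in the traditional 60-byte header. */
#define AR_SIZE_LEN 10

/*
 * Hypothetical helper: parse a space-padded ASCII decimal ar_size field.
 * The field is not NUL-terminated in the header, so copy it into a
 * terminated buffer before handing it to sscanf.
 * Returns 1 on success, 0 if no number could be read.
 */
static int
parse_ar_size(const char field[AR_SIZE_LEN], uint64_t *size)
{
    char buf[AR_SIZE_LEN + 1];

    memcpy(buf, field, AR_SIZE_LEN);
    buf[AR_SIZE_LEN] = '\0';

    /* SCNu64 expands to the right length modifier for uint64_t. */
    return sscanf(buf, "%" SCNu64, size) == 1;
}
```

Calling parse_ar_size("9999999999", &v) gives the largest size the 10-byte field can express; anything bigger needs either a wider field or a different encoding.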

My point is that the reader is likely already compatible with 64-bit size
as it uses SCNu64. We just need to allow 64-bit size for the writer.
Of course large archives can only be read by newer archive readers.

ar_date/ar_uid/ar_gid are not very useful nowadays, as we prefer build
determinism.

> On Thu, 2025-08-21 at 10:41:23 +0200, Jan Beulich wrote:
> > On 21.08.2025 10:01, Fangrui Song wrote:
> > > Is there an !<arch64>\n extension?
> >
> > 15 or more years ago, when I came across this, I didn't write down its
> > origin. It may be a Windows world extension.
>
> It would be nice to know though, otherwise we might be breaking an
> existing format variant, if we ended up wanting to go into that
> direction.
>
> > > I can't find !<arch64>\n in
> > > binutils, libarchive, FreeBSD's elftoolchain, or LLVM.
> >
> > Right, that's what may need adding there. Or whatever else extension we
> > may want to use.
>
> I've been pondering about the base-256 extension vs the "!<arch64>"
> format, and I think I'm leaning towards the base-256 extension,
> because although the field parsing might be slightly more complex (but
> not too much really), it ends up being overall a way less intrusive
> modification to existing code bases, where you only need to hook into
> whatever is parsing the field, and do not need to touch much else.
> In contrast adding a new "!<arch64>" variant might imply new entire
> parsing functions, or refactoring them to support the different struct
> sizes, and also the detection of the new magic value and its length.
> It would also imply that things like file(1) would be completely
> unaware of this new format.
>
> For the base-256 extension I've implemented extraction support already
> in dpkg-deb (need creation support and testing whether it works,
> although it's based on its existing tar base-256 support :).
>
> (See for example:
> <https://git.hadrons.org/cgit/debian/dpkg/dpkg.git/commit/?h=next/libdpkg-ar-large-meta-base256>)
>
> Thanks,
> Guillem
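
To make the "only hook into whatever is parsing the field" argument concrete, here is a rough sketch of a field decoder that accepts both encodings. It assumes the ar extension mirrors GNU tar's base-256 convention, where the top bit of the first byte marks the field as a big-endian binary number; that layout is my assumption and is not taken from the dpkg commit. The name ar_field_decode is hypothetical, and negative values (tar's 0xff marker) are not handled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode a numeric ar header field of `len` bytes.
 * Assumption: base-256 fields set the top bit of the first byte and store
 * the value big-endian in the remaining bits, as GNU tar does.
 * Otherwise the field is ASCII decimal, padded with trailing spaces.
 * Returns false on malformed input or if the value does not fit in 64 bits.
 */
static bool
ar_field_decode(const char *field, size_t len, uint64_t *out)
{
    const unsigned char *p = (const unsigned char *)field;
    uint64_t value = 0;
    bool seen_digit = false;
    size_t i;

    if (len == 0)
        return false;

    if (p[0] & 0x80) {
        /* Base-256: drop the marker bit, then append the remaining bytes. */
        value = p[0] & 0x7f;
        for (i = 1; i < len; i++) {
            if (value >> 56)
                return false;   /* Next shift would overflow 64 bits. */
            value = (value << 8) | p[i];
        }
        *out = value;
        return true;
    }

    /* Traditional encoding: decimal digits followed by space padding. */
    for (i = 0; i < len && p[i] >= '0' && p[i] <= '9'; i++) {
        uint64_t digit = (uint64_t)(p[i] - '0');

        if (value > (UINT64_MAX - digit) / 10)
            return false;       /* Decimal value would overflow. */
        value = value * 10 + digit;
        seen_digit = true;
    }
    for (; i < len; i++) {
        if (p[i] != ' ')
            return false;       /* Garbage after the digits. */
    }
    if (!seen_digit)
        return false;

    *out = value;
    return true;
}
```

With this shape the header struct and the "!<arch>\n" magic stay untouched: a 10-byte base-256 size field carries 79 value bits, and old-style decimal fields keep decoding as before.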

