Re: tar vs device special files

2023-10-29 Thread Mouse
>> It appears to be specifying pax's behaviour, not tar's.  [...]
> POSIX used to specify tar, long ago, but there were (as I understand
> it) too many incompat variants, so it was dropped.

Not entirely surprising.

> You should have been expecting that as the link you were given ended
> in pax.html#some-tag-or-other

Yes, I noticed that...after the fact.

I got the file, thank you!

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: tar vs device special files

2023-10-28 Thread Robert Elz
Date: Sat, 28 Oct 2023 23:32:56 -0400 (EDT)
From: Mouse
Message-ID: <202310290332.xaa23...@stone.rodents-montreal.org>

  | It appears to be specifying pax's behaviour, not tar's.  Is tar
  | specified to use the same format by reference, or is tar not specified
  | but everyone just implements it to use pax's ustar format, or what?

POSIX used to specify tar, long ago, but there were (as I understand it)
too many incompat variants, so it was dropped.  There is no standard for
tar (which makes it, as an interchange format, essentially useless).
However, as I understand things, most modern tar implementations are
in effect a variation on pax but only support pax's ustar format
(and not the others that pax also supports).

You should have been expecting that as the link you were given ended
in pax.html#some-tag-or-other

kre

ps: if I managed to somehow spam the list with a copy of the POSIX pax
spec (in PDF format), I apologise - I intended to send it just to mouse@
but didn't delete tech-userlevel ... I tried to kill it, but the network
was faster than I am, I believe.   Hopefully some list sanity checking
will have dropped the message, or something (it has not returned here).




Re: tar vs device special files

2023-10-28 Thread Mouse
>> So there _is_ a POSIX spec for tarchives?  [...]
> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_06

I got that fetched and have been going through it.

It appears to be specifying pax's behaviour, not tar's.  Is tar
specified to use the same format by reference, or is tar not specified
but everyone just implements it to use pax's ustar format, or what?

It also seems to me that significant fractions of it are
unimplementable on NetBSD because they demand recoding to or from UTF-8
for things that NetBSD handles as octet strings, not character strings
(which therefore cannot be recoded to or from UTF-8 even in principle),
such as user names in the system user database (/etc/{master.,}passwd
for NetBSD).  Is there a canonical way of handling such things?

passwd(5) on 9.0 and on 5.2 specify that /etc/passwd contains ASCII
records, but 5.2 vipw does not complain when I put a 0xe5 octet in a
record's username and homedir fields - and it appears to work just
fine, so any such restriction is not enforced.  This means software has
to do _something_ when faced with such things.  Is there some kind of
system-wide locale setting used for non-user-specific things like
usernames, or what?  I'm moderately sure there isn't any such thing on
5.2 and earlier.
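
To make the recoding problem concrete, here is a minimal Python sketch.
The sample name and the helper names are made up for illustration, and
"surrogateescape" is Python's own convention, not something pax or
NetBSD specifies.  It shows that a name containing a bare 0xe5 octet is
not valid UTF-8, and one way software could carry such octets through a
text-based API without losing them:

```python
def is_valid_utf8(octets: bytes) -> bool:
    """Return True if the octet string can be decoded as UTF-8."""
    try:
        octets.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def octets_to_text(octets: bytes) -> str:
    """Decode as UTF-8, smuggling undecodable octets through as lone surrogates."""
    return octets.decode("utf-8", errors="surrogateescape")


def text_to_octets(text: str) -> bytes:
    """Inverse of octets_to_text(): restores the original octets exactly."""
    return text.encode("utf-8", errors="surrogateescape")


if __name__ == "__main__":
    name = b"b\xe5rd"  # hypothetical username with a bare 0xe5 octet
    print("valid UTF-8:", is_valid_utf8(name))
    print("round trip intact:", text_to_octets(octets_to_text(name)) == name)
```

This only preserves the octets; it does not make them meaningful as
characters, which is the underlying objection.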

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: tar vs device special files

2023-10-28 Thread Mouse
>> So there _is_ a POSIX spec for tarchives?  [...]
> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_06

I'll have to scare up a work machine to fetch that from, since
apparently pubs.opengroup.org is not interested in serving content over
HTTP.  But that should be doable; work these days tends to inflict
recent Linux on me, and, as unpleasant as I find that for most
purposes, it does mean things like curl with HTTPS support.

Thank you!

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: tar vs device special files

2023-10-28 Thread Joerg Sonnenberger
On Sunday, October 29, 2023 2:29:47 AM CET Mouse wrote:
> > I don't think any one else cares about pre-ustar.  Pretty much any
> > reader and writer around uses at least ustar and generally wants to
> > have extended POSIX as well when caring about large files.
> 
> So there _is_ a POSIX spec for tarchives?  Is the spec available, or is
> this yet another pay-to-play "standard"?  I've gone looking for specs
> for tar before, but each time I have, I've been unable to find anything
> that isn't behind a paywall of one sort or another (and thus a total
> nonstarter for me).

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13

Joerg




Re: tar vs device special files

2023-10-28 Thread Taylor R Campbell
> Date: Sat, 28 Oct 2023 21:29:47 -0400 (EDT)
> From: Mouse 
> 
> So there _is_ a POSIX spec for tarchives?  Is the spec available, or is
> this yet another pay-to-play "standard"?  I've gone looking for specs
> for tar before, but each time I have, I've been unable to find anything
> that isn't behind a paywall of one sort or another (and thus a total
> nonstarter for me).
> 
> Admittedly, I haven't looked recently.

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_06


Re: tar vs device special files

2023-10-28 Thread Mouse
> I don't think any one else cares about pre-ustar.  Pretty much any
> reader and writer around uses at least ustar and generally wants to
> have extended POSIX as well when caring about large files.

So there _is_ a POSIX spec for tarchives?  Is the spec available, or is
this yet another pay-to-play "standard"?  I've gone looking for specs
for tar before, but each time I have, I've been unable to find anything
that isn't behind a paywall of one sort or another (and thus a total
nonstarter for me).

Admittedly, I haven't looked recently.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: tar vs device special files

2023-10-28 Thread Mouse
>> (It doesn't help that I haven't managed to find a clear spec for tar
>> format; the closest I've found so far is a description of what pax,
>> in its (supposedly-)tar-compatible mode, is supposed to read/write.)
> All of this can be found in:
> src/external/bsd/libarchive/dist/libarchive/archive_read_support_format_tar.c

Thank you!  I'll have a look.

> If the libarchive tar doesn't see a "ustar  \0" (GNU tar) or "ustar"
> (POSIX tar) magic at 0x101 (see: tar_read_header()), it takes the file
> to be a non-POSIX old-style tar archive which (according to
> libarchive) doesn't store maj./min. nos. (see: struct
> archive_entry_header_ustar)

That is ... a significant deviation from historical practice, to the
extent that I would call it a bug in libarchive's tar support.  (I
don't think I've ever stumbled across any other tar that didn't
understand mtar's archives, though admittedly I don't pass archives
including device special files between implementations very often, so
if the incompatibility is limited to them I might well not notice.)

> Maybe your tar could supply a "ustar" magic char. seq. at 0x101 for
> libarchive.  (see: header_ustar() vs. header_old_tar())

I'll read the file you pointed at (though the path makes it sound like
a description of what libarchive chooses to do rather than anything
authoritative, though admittedly I don't know whether there _is_
anything authoritative when it comes to tar in general, as opposed to
specific tar implementations).

> Or, fix libarchive like this: [...]

If this isn't just a NetBSD oddity, I'd prefer to generate archives
that are more widely compatible.  Maybe even if it is.  Either way,
fixing libarchive is counterindicated (unless NetBSD is willing to take
up the changes, which strikes me as unlikely).

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: tar vs device special files

2023-10-28 Thread Joerg Sonnenberger
On Sunday, October 29, 2023 12:40:06 AM CEST RVP wrote:
> On Sat, 28 Oct 2023, Mouse wrote:
> > I'm having trouble seeing what's responsible, and in particular am
> > wondering whether this is my bug or /bin/tar's bug or what.  (It
> > doesn't help that I haven't managed to find a clear spec for tar
> > format; the closest I've found so far is a description of what pax, in
> > its (supposedly-)tar-compatible mode, is supposed to read/write.)
> 
> All of this can be found in:
> 
> src/external/bsd/libarchive/dist/libarchive/archive_read_support_format_tar.c

There is even a man page going over many of the variants and the details.

> Maybe your tar could supply a "ustar" magic char. seq. at 0x101 for
> libarchive. (see: header_ustar() vs. header_old_tar())

I don't think any one else cares about pre-ustar. Pretty much any reader and 
writer around uses at least ustar and generally wants to have extended POSIX 
as well when caring about large files. I see no reasons for adding random hacks 
for outdated tar programs with little real world exposure, chances are high it
is going to break something with other archives.
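
For context on the large-file point: the ustar size field is 12 bytes,
which leaves room for 11 octal digits, so plain ustar tops out just
under 8 GiB.  The pax extended header carries larger values as
"length key=value" records, where the decimal length counts the whole
record including its own digits.  A rough Python sketch of both (the
function name is illustrative, not taken from any implementation):

```python
# Largest file size expressible in the 11 octal digits of a ustar size field.
USTAR_MAX_SIZE = 8 ** 11 - 1


def pax_record(key: str, value: str) -> bytes:
    """Build one pax extended header record: b"<len> <key>=<value>\\n".

    <len> is the decimal length of the whole record, including the
    length digits themselves, so it is found by iterating to a fixed point.
    """
    body = f" {key}={value}\n".encode("utf-8")
    length = len(body)
    while True:
        total = len(body) + len(str(length))
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


if __name__ == "__main__":
    too_big = USTAR_MAX_SIZE + 1  # one byte more than ustar can describe
    print(USTAR_MAX_SIZE)
    print(pax_record("size", str(too_big)))
```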

Joerg




Re: tar vs device special files

2023-10-28 Thread RVP

On Sat, 28 Oct 2023, Mouse wrote:


> I'm having trouble seeing what's responsible, and in particular am
> wondering whether this is my bug or /bin/tar's bug or what.  (It
> doesn't help that I haven't managed to find a clear spec for tar
> format; the closest I've found so far is a description of what pax, in
> its (supposedly-)tar-compatible mode, is supposed to read/write.)



All of this can be found in:

src/external/bsd/libarchive/dist/libarchive/archive_read_support_format_tar.c

If the libarchive tar doesn't see a "ustar  \0" (GNU tar) or "ustar"
(POSIX tar) magic at 0x101 (see: tar_read_header()), it takes the
file to be a non-POSIX old-style tar archive which (according to
libarchive) doesn't store maj./min. nos. (see: struct 
archive_entry_header_ustar)
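
As an illustration of the check described above (this mirrors the
description, not libarchive's actual code; tar_flavor() and its return
strings are invented for the example), a header block can be
classified by the bytes at offset 0x101:

```python
MAGIC_OFFSET = 0x101  # 257: start of the magic field in a tar header block


def tar_flavor(header: bytes) -> str:
    """Classify a 512-byte tar header block as 'gnu', 'posix' or 'old'."""
    if len(header) < 512:
        raise ValueError("need a full 512-byte header block")
    # GNU tar writes "ustar", two spaces and a NUL across magic + version.
    if header[MAGIC_OFFSET:MAGIC_OFFSET + 8] == b"ustar  \0":
        return "gnu"
    # POSIX ustar writes "ustar", a NUL, then the version "00".
    if header[MAGIC_OFFSET:MAGIC_OFFSET + 5] == b"ustar":
        return "posix"
    # Anything else is treated as a pre-ustar (old-style) header.
    return "old"


if __name__ == "__main__":
    block = bytearray(512)
    print(tar_flavor(bytes(block)))          # no magic at all
    block[MAGIC_OFFSET:MAGIC_OFFSET + 8] = b"ustar\x0000"
    print(tar_flavor(bytes(block)))          # POSIX magic and version
```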


> The 9.1 /bin/tar tarball (hexdump -C) is
>
> 00a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 0100  00 75 73 74 61 72 00 30  30 72 6f 6f 74 00 00 00  |.ustar.00root...|
> 0110  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 0120  00 00 00 00 00 00 00 00  00 6f 70 65 72 61 74 6f  |.........operato|
> 0130  72 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |r...............|
> 0140  00 00 00 00 00 00 00 00  00 30 30 30 30 30 33 20  |.........000003 |
> 0150  00 30 30 30 30 30 33 20  00 00 00 00 00 00 00 00  |.000003 ........|
> 0160  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>
> whereas mine is
>
> 00a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 0140  00 00 00 00 00 00 00 00  00 30 30 30 30 30 33 20  |.........000003 |
> 0150  00 30 30 30 30 30 33 20  00 00 00 00 00 00 00 00  |.000003 ........|
> 0160  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>
> Except for the stuff at offsets 0x100-0x131, they look pretty close to
> identical to me (the value at 0x94 is the header checksum), and that
> stuff is, as far as I can tell, owner name strings (which I'm not
> supplying, just using the numeric uid and gid values).  But the stock
> 9.1 tar seems to be taking the 03 major and minor numbers as zero
> for reasons I don't understand, since it understands its own,
> apparently identical, major and minor numbers just fine.
>
> Any ideas?
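
The two rows that both dumps share can be decoded by hand.  A small
Python sketch, assuming the standard ustar layout (devmajor at offset
329 = 0x149 and devminor at 337 = 0x151, each 8 bytes of octal text);
parse_octal() is a hypothetical helper, not a libarchive function:

```python
DEVMAJOR_OFFSET = 329  # 0x149
DEVMINOR_OFFSET = 337  # 0x151
FIELD_LEN = 8


def parse_octal(field: bytes) -> int:
    """Parse a tar numeric field: octal digits padded with spaces and NULs."""
    digits = field.strip(b" \0")
    return int(digits.decode("ascii"), 8) if digits else 0


# Rebuild the relevant part of the header from the rows at 0x140 and 0x150.
header = bytearray(512)
header[0x140:0x150] = bytes.fromhex("00 00 00 00 00 00 00 00 00 30 30 30 30 30 33 20")
header[0x150:0x160] = bytes.fromhex("00 30 30 30 30 30 33 20 00 00 00 00 00 00 00 00")

devmajor = parse_octal(header[DEVMAJOR_OFFSET:DEVMAJOR_OFFSET + FIELD_LEN])
devminor = parse_octal(header[DEVMINOR_OFFSET:DEVMINOR_OFFSET + FIELD_LEN])

if __name__ == "__main__":
    print("major", devmajor, "minor", devminor)
```

So the device fields themselves are well-formed in both archives; the
difference has to come from elsewhere in the header.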



Maybe your tar could supply a "ustar" magic char. seq. at 0x101 for libarchive.
(see: header_ustar() vs. header_old_tar())
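
A sketch of what the first option could look like when applied to an
existing old-style header block, in Python for brevity.  The function
name is made up, and the checksum is written in the conventional
"six octal digits, NUL, space" form, which is an assumption about what
readers accept rather than a requirement quoted from a spec.  Since
the magic bytes are covered by the header checksum at 0x94, it has to
be recomputed after the change:

```python
MAGIC_OFFSET = 257    # 0x101
VERSION_OFFSET = 263
CHKSUM_OFFSET = 148   # 0x94
CHKSUM_LEN = 8


def stamp_ustar(block: bytes) -> bytes:
    """Return a copy of a 512-byte header with ustar magic and a fresh checksum."""
    if len(block) != 512:
        raise ValueError("a tar header block is exactly 512 bytes")
    hdr = bytearray(block)
    hdr[MAGIC_OFFSET:MAGIC_OFFSET + 6] = b"ustar\0"
    hdr[VERSION_OFFSET:VERSION_OFFSET + 2] = b"00"
    # The checksum is the byte sum of the header with the checksum
    # field itself counted as eight spaces.
    hdr[CHKSUM_OFFSET:CHKSUM_OFFSET + CHKSUM_LEN] = b" " * CHKSUM_LEN
    checksum = sum(hdr)
    hdr[CHKSUM_OFFSET:CHKSUM_OFFSET + CHKSUM_LEN] = b"%06o\0 " % checksum
    return bytes(hdr)


if __name__ == "__main__":
    old = bytearray(512)
    old[0:4] = b"null"      # hypothetical member name
    old[156] = ord("3")     # typeflag '3': character special
    new = stamp_ustar(bytes(old))
    print(new[MAGIC_OFFSET:MAGIC_OFFSET + 8], new[CHKSUM_OFFSET:CHKSUM_OFFSET + 8])
```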

Or, fix libarchive like this:

```
diff -urN a/src/external/bsd/libarchive/dist/libarchive/archive_read_support_format_tar.c b/src/external/bsd/libarchive/dist/libarchive/archive_read_support_format_tar.c
--- a/src/external/bsd/libarchive/dist/libarchive/archive_read_support_format_tar.c	2019-07-24 13:50:23.0 +
+++ b/src/external/bsd/libarchive/dist/libarchive/archive_read_support_format_tar.c	2023-10-28 22:10:28.778721000 +
@@ -1383,6 +1383,14 @@
 	if (err > err2)
 		err = err2;
 
+	/* Parse out device numbers only for char and block specials. */
+	if (header->typeflag[0] == '3' || header->typeflag[0] == '4') {
+		archive_entry_set_rdevmajor(entry, (dev_t)
+		    tar_atol(header->rdevmajor, sizeof(header->rdevmajor)));
+		archive_entry_set_rdevminor(entry, (dev_t)
+		    tar_atol(header->rdevminor, sizeof(header->rdevminor)));
+	}
+
 	tar->entry_padding = 0x1ff & (-tar->entry_bytes_remaining);
 	return (err);
 }
```

-RVP