Re: dkwedge: checksum before swapping?

2023-05-08 Thread Mouse
>> But that comment clearly indicates that _someone_ thought it
>> reasonable to checksum before swapping, so I can't help wondering
>> what use case that's appropriate for.
> It's a checksum over the 16bit words in native byte order.  So when
> you access the words in opposite byte order, you need to swap the
> checksum too.

Yes.  But, to me, that would mean byteswap, then checksum.

In the case at hand, the checksum is also bit-order-independent (in C
terms, it's ^, not +), which means that byteswapping is almost
irrelevant.  But there are 8-bit members involved too (p_fstype and
p_frag) and strings (d_typename and d_packname), which don't get
swapped, so swapping and checksumming do not commute.

As a toy example, consider:

struct toy {
uint16_t a;
uint16_t b;
uint8_t c;
uint8_t d;
uint16_t checksum;
};

Let's set a=0xfeed, b=0xface, c=0xf0 d=0x0d.  On a big-endian machine,
the resulting octet stream is fe ed fa ce f0 0d, checksum f4 2e.  On
little-endian, ed fe ce fa f0 0d, checksum d3 09 - not 2e f4.

> Unlike the regular disklabel code (which ignores other-endian
> disklabels) the wedge autodiscover code will accept either.

Is that actually known to work, or is it more "is intended to accept
either"?  Because it looks to me as though it will not accept labels
checksummed and written natively by the other endianness, unless the
strings and fstype/frag values happen to be such that they checksum to
a multiple of 0x0101 (which is possible but unlikely).  And the comment
indicates that someone thought about the issue and came to what looks
to me like an incorrect conclusion.

> As for padding, the structure is nowadays defined with fixed size
> types and explicit padding fields, so we may still assume that the
> compiler won't add any further padding by itself.

Currently true, though still disturbingly fragile.  ("Nowadays"?  It
was so even as far back as 1.4T.  Well, as far as the fixed-size types
thing goes, at least; there are no explicit padding fields in any
version I have, but it is (presumably carefully) defined so there is no
need for them, either.  At least assuming a power-of-two
octet-addressed machine, a char * that's no bigger than 8 bytes, and a
non-malicious compiler.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: dkwedge: checksum before swapping?

2023-05-07 Thread Michael van Elst
mo...@rodents-montreal.org (Mouse) writes:

>But that comment clearly indicates that _someone_ thought it
>reasonable to checksum before swapping, so I can't help wondering what
>use case that's appropriate for.

It's a checksum over the 16bit words in native byte order. So when
you access the words in opposite byte order, you need to swap the
checksum too.

Unlike the regular disklabel code (which ignores other-endian disklabels)
the wedge autodiscover code will accept either.

As for padding, the structure is nowadays defined with fixed size types
and explicit padding fields, so we may still assume that the compiler
won't add any further padding by itself.



dkwedge: checksum before swapping?

2023-05-07 Thread Mouse
In sys/dev/dkwedge/dkwedge_bsdlabel.c, I find (and I see more or less
the same code in 5.2 and what cvsweb.n.o shows me)

static int
validate_label(mbr_args_t *a, daddr_t label_sector, size_t label_offset)
{
...
/*
 * We have validated the partition count.  Checksum it.
 * Note that we purposefully checksum before swapping
 * the byte order.
 */
...code that does indeed checksum before swapping...
}

Does anyone know what the intent of this is?  The only reason I would
expect to see a byteswapped label is when it's written by a machine of
the other endianness, and in that case the checksum will be wrong
unless the label is swapped before checksumming (and even then only if
the compiler doesn't insert shims in the struct, not that it's likely
to given the present layout).

But that comment clearly indicates that _someone_ thought it
reasonable to checksum before swapping, so I can't help wondering what
use case that's appropriate for.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B