On Tue, Sep 14, 2021 at 10:48:44AM -0600, Theo de Raadt wrote:
> Mark Kettenis <[email protected]> wrote:
>
> > To be honest, I do think that adding __packed is a reasonable way to
> > handle protocol structs like this where performance doesn't really
> > matter. This translates into __attribute__((packed)) and both GCC and
> > LLVM started treating that in a way to signal that the data might not
> > be properly aligned and use byte access on architectures that need
> > strict alignment. This is still not explicitly documented but I don't
> > think the compiler writers can backtrack on that at this point.
>
> __packed does not mean "use bytes". It means ensure there are no
> padding bytes, and then on strict alignment systems when something is
> misaligned, the compiler can see it must use smaller loads.
>
> Unfortunately, tcpdump also parses encapsulated protocols, and some of
> the outer layers are limited to short-alignment. So if an inner layer
> has a 32-bit value inside the __packed array after packing, the compiler
> will believe it is still on a 32-bit boundary, and not use char or short
> accesses. This will fault, because tcpdump is passing a pointe to an
> object with tighter alignment.
>
> __packed is not saying "each object must be loaded as bytes", and I
> really am susprised you claim that is what happens today. That is a
> crazy expensive choice on some architectures.
>
> So I think __packed is not enough for what tcpdump is doing.
>
> Perhaps some of the compilers are being more cynical, or their costing
> of loads leads them to do smaller-storage operations... but I have a
> recollection that at least gcc on the alpha gets confused and does the
> wrong thing.
To appease the situation, I have 1. taken what Visa said about the nonce,
2. taken my old patch and changed the memcpy's to EXTRACT_64BITS() which
I had to homeroll off EXTRACT_32BITS(), and made a patch and tested it.
No bus errors again, though I don't know if it's the right approach. The
nonces in the tcpdump were sequential counting up from 1 as my wireguard
hardware was rebooting. I think I control-c'ed by 218 nonce or so.
Here is that patch:
Index: extract.h
===================================================================
RCS file: /cvs/src/usr.sbin/tcpdump/extract.h,v
retrieving revision 1.9
diff -u -p -u -r1.9 extract.h
--- extract.h 7 Oct 2007 16:41:05 -0000 1.9
+++ extract.h 14 Sep 2021 17:24:17 -0000
@@ -35,6 +35,17 @@
(u_int32_t)*((const u_int8_t *)(p) + 2) << 8 | \
(u_int32_t)*((const u_int8_t *)(p) + 3))
+#define EXTRACT_64BITS(p) \
+ ((u_int64_t)*((const u_int8_t *)(p) + 0) << 56 | \
+ (u_int64_t)*((const u_int8_t *)(p) + 1) << 48 | \
+ (u_int64_t)*((const u_int8_t *)(p) + 2) << 40 | \
+ (u_int64_t)*((const u_int8_t *)(p) + 3) << 32 | \
+ (u_int64_t)*((const u_int8_t *)(p) + 4) << 24 | \
+ (u_int64_t)*((const u_int8_t *)(p) + 5) << 16 | \
+ (u_int64_t)*((const u_int8_t *)(p) + 6) << 8 | \
+ (u_int64_t)*((const u_int8_t *)(p) + 7))
+
+
#define EXTRACT_24BITS(p) \
((u_int32_t)*((const u_int8_t *)(p) + 0) << 16 | \
(u_int32_t)*((const u_int8_t *)(p) + 1) << 8 | \
Index: print-wg.c
===================================================================
RCS file: /cvs/src/usr.sbin/tcpdump/print-wg.c,v
retrieving revision 1.6
diff -u -p -u -r1.6 print-wg.c
--- print-wg.c 14 Apr 2021 19:34:56 -0000 1.6
+++ print-wg.c 14 Sep 2021 17:24:17 -0000
@@ -103,6 +103,7 @@ wg_print(const u_char *bp, u_int length)
struct wg_cookie *cookie = (void *)bp;
struct wg_data *data = (void *)bp;
u_int caplen;
+ uint64_t nonce;
caplen = snapend - bp;
if (caplen < sizeof(type))
@@ -142,8 +143,11 @@ wg_print(const u_char *bp, u_int length)
printf("[wg] keepalive ");
if (caplen < offsetof(struct wg_data, mac))
goto trunc;
+
+ nonce = EXTRACT_64BITS(&data->nonce);
+
printf("to 0x%08x nonce %llu",
- letoh32(data->receiver), letoh64(data->nonce));
+ letoh32(data->receiver), letoh64(nonce));
break;
}
return;
[cut]
I cvs update'ed with -PAd flags to -current before making the patch this time.
Best Regards,
-peter