On Tue, Sep 14, 2021 at 10:48:44AM -0600, Theo de Raadt wrote:
> Mark Kettenis <[email protected]> wrote:
> 
> > To be honest, I do think that adding __packed is a reasonable way to
> > handle protocol structs like this where performance doesn't really
> > matter.  This translates into __attribute__((packed)) and both GCC and
> > LLVM started treating that in a way to signal that the data might not
> > be properly aligned and use byte access on architectures that need
> > strict alignment.  This is still not explicitly documented but I don't
> > think the compiler writers can backtrack on that at this point.
> 
> __packed does not mean "use bytes".  It means ensure there are no
> padding bytes, and then on strict alignment systems when something is
> misaligned, the compiler can see it must use smaller loads.
> 
> Unfortunately, tcpdump also parses encapsulated protocols, and some of
> the outer layers are limited to short-alignment.  So if an inner layer
> has a 32-bit value inside the __packed array after packing, the compiler
> will believe it is still on a 32-bit boundary, and not use char or short
> accesses.  This will fault, because tcpdump is passing a pointe to an
> object with tighter alignment.
> 
> __packed is not saying "each object must be loaded as bytes", and I
> really am susprised you claim that is what happens today.  That is a
> crazy expensive choice on some architectures.
> 
> So I think __packed is not enough for what tcpdump is doing.
> 
> Perhaps some of the compilers are being more cynical, or their costing
> of loads leads them to do smaller-storage operations... but I have a
> recollection that at least gcc on the alpha gets confused and does the
> wrong thing.

To appease the situation, I have 1. taken what Visa said about the nonce,
2. taken my old patch and changed the memcpy's to EXTRACT_64BITS() which
I had to homeroll off EXTRACT_32BITS(), and made a patch and tested it.

No bus errors again, though I don't know if it's the right approach.  The
nonces in the tcpdump were sequential counting up from 1 as my wireguard 
hardware was rebooting.  I think I control-c'ed by 218 nonce or so.

Here is that patch:


Index: extract.h
===================================================================
RCS file: /cvs/src/usr.sbin/tcpdump/extract.h,v
retrieving revision 1.9
diff -u -p -u -r1.9 extract.h
--- extract.h   7 Oct 2007 16:41:05 -0000       1.9
+++ extract.h   14 Sep 2021 17:24:17 -0000
@@ -35,6 +35,17 @@
        (u_int32_t)*((const u_int8_t *)(p) + 2) << 8 | \
        (u_int32_t)*((const u_int8_t *)(p) + 3))
 
+#define EXTRACT_64BITS(p) \
+       ((u_int64_t)*((const u_int8_t *)(p) + 0) << 56 | \
+       (u_int64_t)*((const u_int8_t *)(p) + 1) << 48 | \
+       (u_int64_t)*((const u_int8_t *)(p) + 2) << 40 | \
+       (u_int64_t)*((const u_int8_t *)(p) + 3) << 32 | \
+       (u_int64_t)*((const u_int8_t *)(p) + 4) << 24 | \
+       (u_int64_t)*((const u_int8_t *)(p) + 5) << 16 | \
+       (u_int64_t)*((const u_int8_t *)(p) + 6) << 8 | \
+       (u_int64_t)*((const u_int8_t *)(p) + 7))
+       
+
 #define EXTRACT_24BITS(p) \
        ((u_int32_t)*((const u_int8_t *)(p) + 0) << 16 | \
        (u_int32_t)*((const u_int8_t *)(p) + 1) << 8 | \
Index: print-wg.c
===================================================================
RCS file: /cvs/src/usr.sbin/tcpdump/print-wg.c,v
retrieving revision 1.6
diff -u -p -u -r1.6 print-wg.c
--- print-wg.c  14 Apr 2021 19:34:56 -0000      1.6
+++ print-wg.c  14 Sep 2021 17:24:17 -0000
@@ -103,6 +103,7 @@ wg_print(const u_char *bp, u_int length)
        struct wg_cookie        *cookie = (void *)bp;
        struct wg_data          *data = (void *)bp;
        u_int                    caplen;
+       uint64_t                nonce;
 
        caplen = snapend - bp;
        if (caplen < sizeof(type))
@@ -142,8 +143,11 @@ wg_print(const u_char *bp, u_int length)
                        printf("[wg] keepalive ");
                if (caplen < offsetof(struct wg_data, mac))
                        goto trunc;
+
+               nonce = EXTRACT_64BITS(&data->nonce);
+
                printf("to 0x%08x nonce %llu",
-                   letoh32(data->receiver), letoh64(data->nonce));
+                   letoh32(data->receiver), letoh64(nonce));
                break;
        }
        return;



[cut]

I cvs update'ed with -PAd flags to -current before making the patch this time.

Best Regards,
-peter

Reply via email to