Re: [PATCH] PF: cksum modification & refactor [3/24]

Richard Procter Tue, 01 Sep 2015 21:04:49 -0700

Hi Sasha, 

On 1/09/2015, at 12:47 AM, Alexandr Nedvedicky wrote:
> the code in patch looks good for the first glance. However it seems to me
> the newly introduced pf_cksum_fixup*() are not called yet.  Do you think you
> can reshuffle changes between your set of patches a bit, so the newly
> introduced functions will become alive (get called)?
> 
> Also I think your patch 0/24, you've sent earlier, can be fold here (setting
> pd->pcksum to point to icmp6 header chksum field).


Sure thing. I'm on a bit of a learning curve with this process. 

The attached is an amalgam of the following patches in my series:

       0) pd->pcksum = icmp6 header cksum
       3) introduce pf_cksum_fixup*() 
       4) introduce pf_change_*() 
      22) misc. transitions to pf_change_*() interface

The new ones include my notes below.

Also, I'd like to explain my last patch adding 'const' to a line deleted
by the next. My aim is to ease review by using the compiler to establish
universals where possible. That helps to eliminate programmer error due to
'not looking hard enough at the code'. I also find it easier on my mental
health :)

You won't, though, want to burn CVS commits on patches that exist only to
show my 'working', and in future I'll bundle these in one email for commitment
as one cumulative patch.

* Introduce pf_change_{8,16,16_unaligned,32}() interface
        
        - modification of checksum-covered data is 
          assignment-with-side-effects. 
        - all new functions will be used by later patches

      +ve provides type-appropriate checksum modification
      +ve will replace existing 'altered value' guards, 
          reducing code length
      -ve five new functions in total

C assignment hides behind one assignment operator the nitty gritty of differing
l-value widths. As we cannot change the language to suit our needs, we are
obliged to expose these differences in our interface.

An added wrinkle is that our side-effect, namely, modifying the checksum,
depends on the alignment of the l-value within the packet (the checksum's
summands are 16-bit aligned with respect to the packet). The interface therefore
provides _unaligned() versions parameterised by the l-value's packet alignment,
either 'hi' or 'lo'. Thankfully, these are for most protocol fields unnecessary.

Later patches will augment these functions with 'altered value' guards,
allowing us to replace, e.g.

        if (icmpid != pd->hdr.icmp->icmp_id) {
                if (pd->csum_status == PF_CSUM_UNKNOWN)
                        pf_check_proto_cksum(pd, pd->off,
                            pd->tot_len - pd->off, pd->proto,
                            pd->af);
                pd->hdr.icmp->icmp_id = icmpid;
                rewrite = 1;
        }

with
        rewrite += pf_change_16(pd, &pd->hdr.icmp->icmp_id, icmpid);

And, despite the new code, by the end of this series pf is net ~ 500 bytes 
shorter (i386).

Lastly, mikeb@ suggested the name pf_patch_{...}(), which is both shorter and
more descriptive. I like it, however we follow the preexisting 'change' 
terminology 
for consistency, also noting that 'match' and 'patch' may be, possibly, 
undesirably
similar. But if you'd rather this patch use pf_patch, I'd be happy to reissue 
it. 

* Convert miscellaneous packet modification to pf_change_*() interface

As pf_change_*() modifies the checksum, the checksum status 
needn't be computed: that is used only to decide whether we may regenerate the
checksum later on in pf_cksum() to account for the altered packet. (Other 
parts of the code do not appear to depend on these removed checks.)

Testing: Same code as in my email "[patch] cleaner checksum modification for 
pf", 
see my testing notes there.

Just to be sure/anal retentive, I have retested the more involved changes: 

        - normalize tcp->thx2 -> 0 (via patched hping3) 
        - mss clamping (both aligned and unaligned via customised OpenBSD)
and also 
        - NAT of icmpv4 ping

This was on i386; I haven't tested on an architecture with stronger alignment 
requirements but expect no problems. 

best, 
Richard. 

Index: net/pf.c
===================================================================
--- net.orig/pf.c
+++ net/pf.c
@@ -145,7 +145,10 @@ void                        pf_init_threshold(struct 
pf_thre
                            u_int32_t);
 void                    pf_add_threshold(struct pf_threshold *);
 int                     pf_check_threshold(struct pf_threshold *);
-
+void                    pf_cksum_fixup(u_int16_t *, u_int16_t, u_int16_t,
+                           u_int8_t);
+void                    pf_cksum_fixup_a(u_int16_t *, const struct pf_addr *,
+                           const struct pf_addr *, sa_family_t, u_int8_t);
 void                    pf_change_ap(struct pf_pdesc *, struct pf_addr *,
                            u_int16_t *, struct pf_addr *, u_int16_t,
                            sa_family_t);
@@ -286,6 +289,8 @@ static __inline int pf_state_compare_key
        struct pf_state_key *);
 static __inline int pf_state_compare_id(struct pf_state *,
        struct pf_state *);
+static __inline void pf_cksum_uncover(u_int16_t *, u_int16_t, u_int8_t);
+static __inline void pf_cksum_cover(u_int16_t *, u_int16_t, u_int8_t);
 
 struct pf_src_tree tree_src_tracking;
 
@@ -1651,6 +1656,184 @@ pf_addr_wrap_neq(struct pf_addr_wrap *aw
        }
 }
 
+/* This function, given arguments of one endian, is invariant over the
+ * endian of the host. Why?
+ *
+ * Define the unary transpose operator ~ on a bitstring in python slice
+ * notation as lambda m: m[X:] + m[:X] , for some constant X
+ *
+ * Th. ~ distributes over ones-complement addition, denoted by +_1, i.e.
+ *
+ *     ~m +_1 ~n  =  ~(m +_1 n)    (for all bitstrings m,n of equal length)
+ *
+ * Proof. Regard the bitstrings in m +_1 n as split at X, forming at
+ * most two 'half-adds'. Under ones-complement addition, each half-add
+ * carries to the other, so the sum of each half-add is unaffected by
+ * their relative order. Therefore:
+ *
+ *     ~m +_1 ~n
+ *   =    { half-adds invariant under transposition }
+ *     ~s
+ *   =    { substitute }
+ *     ~(m +_1 n)                   [end of proof]
+ *
+ * Th. Summing two in-memory ones-complement 16-bit variables m,n
+ * on a machine with the converse endian does not alter the result.
+ *
+ * Proof.
+ *        { converse machine endian: load/store transposes, X := 8 }
+ *     ~(~m +_1 ~n)
+ *   =    { ~ over +_1 }
+ *     ~~m +_1 ~~n
+ *   =    { ~ is an involution }
+ *      m +_1 n                     [end of proof]
+ */
+void
+pf_cksum_fixup(u_int16_t *cksum, u_int16_t was, u_int16_t now,
+    u_int8_t proto)
+{
+       u_int32_t l;
+       const int udp = proto == IPPROTO_UDP;
+
+       if (udp && *cksum == 0x0000)
+               return;
+
+       l = *cksum + was - now;
+       l = ((l >> 16) + (l & 0xffff))  &  0xffff;
+
+       if (udp && l == 0x0000)
+               l = 0xffff;
+
+        *cksum = (u_int16_t)(l);
+}
+
+/* pre: coverage(cksum) covers coverage(cksum_covered) */
+static __inline void
+pf_cksum_uncover(u_int16_t *cksum, u_int16_t cksum_covered, u_int8_t proto)
+{
+       pf_cksum_fixup(cksum, ~cksum_covered, 0x0, proto);
+}
+
+/* pre: disjoint(coverage(cksum), coverage(cksum_uncovered)) */
+static __inline void
+pf_cksum_cover(u_int16_t *cksum, u_int16_t cksum_uncovered, u_int8_t proto)
+{
+       pf_cksum_fixup(cksum, 0x0, ~cksum_uncovered, proto);
+}
+
+/* pre: changes are 16-bit aligned within the packet
+ *
+ * We emulate 16-bit ones-complement arithmetic by conserving its carries,
+ * which twos-complement otherwise discards, in the upper 16 bits of l.
+ * These accumulated carries when added to the lower 16-bits then
+ * complete the ones-complement sum.
+ *
+ * Note, the accumulator, despite l being unsigned, supports net-negative
+ * carries:
+ *
+ * Arithmetic or assignment on n unsigned bits is modulo 2^n.
+ * Def. x mod y  =  x - (x//y)*y   for integer x,y
+ *
+ * Th. (x + (y mod z)) mod z
+ *    =  { def mod }
+ *     (x + y - (y//z)*z) mod z
+ *    =  { (x + y*z) mod z = x mod z }
+ *     (x + y) mod z   (0)
+ *
+ * Now, the value of the unsigned m-bit accumulator having assigned
+ * integer x to it is (x mod 2^m). Added to the sum, we have:
+ *
+ *   (sum + (x mod 2^m)) mod 2^n
+ * =     { accumulator same width as sum; m = n }
+ *   (sum + (x mod 2^n)) mod 2^n
+ * =     { (0) }
+ *   (sum + x) mod 2^n
+ *
+ * ... and when x < 0 this equals (sum - |x|) mod 2^n
+ *
+ * The scheme is therefore correct over a range of at least plus or
+ * minus 2^16 - 1 accumulated carries, afterwhich the accumulator
+ * wraps. This far exceeds the worst case below of plus or minus 8.
+ */
+void
+pf_cksum_fixup_a(u_int16_t *cksum, const struct pf_addr *a,
+    const struct pf_addr *an, sa_family_t af, u_int8_t proto)
+{
+       u_int32_t        l;
+       const u_int16_t *n = an->addr16;
+       const u_int16_t *o = a->addr16;
+       const int        udp = proto == IPPROTO_UDP;
+
+       switch (af) {
+       case AF_INET:
+               l = *cksum + o[0] - n[0] + o[1] - n[1];
+               break;
+#ifdef INET6
+       case AF_INET6:
+               l = *cksum + o[0] - n[0] + o[1] - n[1] + o[2] - n[2] +
+                   o[3] - n[3] + o[4] - n[4] + o[5] - n[5] + o[6] -
+                   n[6] + o[7] - n[7];
+               break;
+#endif /* INET6 */
+       default:
+               unhandled_af(af);
+       }
+
+       if (udp && *cksum == 0x0000)
+               return;
+
+       l = ((l >> 16) + (l & 0xffff))  &  0xffff;
+
+       if (udp && l == 0x0000)
+               l = 0xffff;
+
+       *cksum = (u_int16_t)(l);
+}
+
+void
+pf_change_8(struct pf_pdesc *pd, u_int8_t *f, u_int8_t v, bool hi)
+{
+       u_int16_t new = htons(hi ? ( v << 8) :  v);
+       u_int16_t old = htons(hi ? (*f << 8) : *f);
+
+       pf_cksum_fixup(pd->pcksum, old, new, pd->proto);
+       *f = v;
+}
+
+/* pre: *f is 16-bit aligned within its packet */
+void
+pf_change_16(struct pf_pdesc *pd, u_int16_t *f, u_int16_t v)
+{
+       pf_cksum_fixup(pd->pcksum, *f, v, pd->proto);
+       *f = v;
+}
+
+void
+pf_change_16_unaligned(struct pf_pdesc *pd, void *f, u_int16_t v, bool hi)
+{
+       u_int8_t *fb = (u_int8_t*)f;
+       u_int8_t *vb = (u_int8_t*)&v;
+
+       if (hi && ALIGNED_POINTER(f, u_int16_t)) {
+               pf_change_16(pd, f, v); /* optimise */
+               return;
+       }
+
+       pf_change_8(pd, fb++, *vb++, hi);
+       pf_change_8(pd, fb++, *vb++,!hi);
+}
+
+/* pre: *f is 16-bit aligned within its packet */
+void
+pf_change_32(struct pf_pdesc *pd, u_int32_t *f, u_int32_t v)
+{
+       u_int16_t *pc = pd->pcksum;
+
+       pf_cksum_fixup(pc, *f / (1 << 16), v / (1 << 16), pd->proto);
+       pf_cksum_fixup(pc, *f % (1 << 16), v % (1 << 16), pd->proto);
+       *f = v;
+}
+
 void
 pf_change_ap(struct pf_pdesc *pd, struct pf_addr *a, u_int16_t *p,
     struct pf_addr *an, u_int16_t pn, sa_family_t naf)
@@ -3722,11 +3905,8 @@ pf_translate(struct pf_pdesc *pd, struct
                        u_int16_t icmpid = (icmp_dir == PF_IN) ? sport : dport;
 
                        if (icmpid != pd->hdr.icmp->icmp_id) {
-                               if (pd->csum_status == PF_CSUM_UNKNOWN)
-                                       pf_check_proto_cksum(pd, pd->off,
-                                           pd->tot_len - pd->off, pd->proto,
-                                           pd->af);
-                               pd->hdr.icmp->icmp_id = icmpid;
+                               pf_change_16(pd,
+                                   &pd->hdr.icmp->icmp_id, icmpid);
                                rewrite = 1;
                        }
                }
@@ -3758,11 +3938,8 @@ pf_translate(struct pf_pdesc *pd, struct
                        u_int16_t icmpid = (icmp_dir == PF_IN) ? sport : dport;
 
                        if (icmpid != pd->hdr.icmp6->icmp6_id) {
-                               if (pd->csum_status == PF_CSUM_UNKNOWN)
-                                       pf_check_proto_cksum(pd, pd->off,
-                                           pd->tot_len - pd->off, pd->proto,
-                                           pd->af);
-                               pd->hdr.icmp6->icmp6_id = icmpid;
+                               pf_change_16(pd,
+                                   &pd->hdr.icmp6->icmp6_id, icmpid);
                                rewrite = 1;
                        }
                }
@@ -4570,11 +4747,8 @@ pf_test_state_icmp(struct pf_pdesc *pd,
                                }
 
                                if (nk->port[iidx] !=  pd->hdr.icmp->icmp_id) {
-                                       if (pd->csum_status == PF_CSUM_UNKNOWN)
-                                               pf_check_proto_cksum(pd,
-                                                   pd->off, pd->tot_len -
-                                                   pd->off, pd->proto, pd->af);
-                                       pd->hdr.icmp->icmp_id = nk->port[iidx];
+                                       pf_change_16(pd, &pd->hdr.icmp->icmp_id,
+                                           nk->port[iidx]);
                                }
 
                                m_copyback(pd->m, pd->off, ICMP_MINLEN,
@@ -4602,12 +4776,9 @@ pf_test_state_icmp(struct pf_pdesc *pd,
                                }
 
                                if (nk->port[iidx] != pd->hdr.icmp6->icmp6_id) {
-                                       if (pd->csum_status == PF_CSUM_UNKNOWN)
-                                               pf_check_proto_cksum(pd,
-                                                   pd->off, pd->tot_len -
-                                                   pd->off, pd->proto, pd->af);
-                                       pd->hdr.icmp6->icmp6_id =
-                                           nk->port[iidx];
+                                       pf_change_16(pd,
+                                           &pd->hdr.icmp6->icmp6_id,
+                                           nk->port[iidx]);
                                }
 
                                m_copyback(pd->m, pd->off,
@@ -6229,6 +6400,7 @@ pf_setup_pdesc(struct pf_pdesc *pd, void
                        REASON_SET(reason, PFRES_SHORT);
                        return (PF_DROP);
                }
+               pd->pcksum = &pd->hdr.icmp6->icmp6_cksum;
                break;
        }
 #endif /* INET6 */
Index: net/pfvar.h
===================================================================
--- net.orig/pfvar.h
+++ net/pfvar.h
@@ -1719,6 +1719,13 @@ void     pf_addr_inc(struct pf_addr *, sa_fa
 
 void   *pf_pull_hdr(struct mbuf *, int, void *, int, u_short *, u_short *,
            sa_family_t);
+#define PF_HI (true)
+#define PF_LO (!PF_HI)
+#define PF_ALGNMNT(off) (((off) % 2) == 0 ? PF_HI : PF_LO)
+void   pf_change_8(struct pf_pdesc *, u_int8_t *, u_int8_t, bool);
+void   pf_change_16(struct pf_pdesc *, u_int16_t *, u_int16_t);
+void   pf_change_16_unaligned(struct pf_pdesc *, void *, u_int16_t, bool);
+void   pf_change_32(struct pf_pdesc *, u_int32_t *, u_int32_t);
 void   pf_change_a(struct pf_pdesc *, void *, u_int32_t);
 int    pf_check_proto_cksum(struct pf_pdesc *, int, int, u_int8_t,
            sa_family_t);
Index: net/pf_norm.c
===================================================================
--- net.orig/pf_norm.c
+++ net/pf_norm.c
@@ -845,10 +845,6 @@ pf_normalize_tcp(struct pf_pdesc *pd)
        u_int8_t         flags;
        u_int            rewrite = 0;
 
-       if (pd->csum_status == PF_CSUM_UNKNOWN)
-               pf_check_proto_cksum(pd, pd->off, pd->tot_len - pd->off,
-                   pd->proto, pd->af);
-
        flags = th->th_flags;
        if (flags & TH_SYN) {
                /* Illegal packet */
@@ -870,15 +866,18 @@ pf_normalize_tcp(struct pf_pdesc *pd)
        }
 
        /* If flags changed, or reserved data set, then adjust */
-       if (flags != th->th_flags || th->th_x2 != 0) {
-               th->th_flags = flags;
-               th->th_x2 = 0;
-               rewrite = 1;
-       }
+       if (flags != th->th_flags || th->th_x2 != 0) {
+               /* hack: set 4-bit th_x2 = 0 */
+               u_int8_t *th_off = (u_int8_t*)(&th->th_ack+1);
+               pf_change_8(pd, th_off, th->th_off << 4, PF_HI);
+
+               pf_change_8(pd, &th->th_flags, flags, PF_LO);
+               rewrite = 1;
+       }
 
        /* Remove urgent pointer, if TH_URG is not set */
        if (!(flags & TH_URG) && th->th_urp) {
-               th->th_urp = 0;
+               pf_change_16(pd, &th->th_urp, 0);
                rewrite = 1;
        }
 
@@ -1382,12 +1381,8 @@ pf_normalize_mss(struct pf_pdesc *pd, u_
        u_int16_t        mss;
        int              thoff;
        int              opt, cnt, optlen = 0;
-       u_char           opts[MAX_TCPOPTLEN];
-       u_char          *optp = opts;
-
-       if (pd->csum_status == PF_CSUM_UNKNOWN)
-               pf_check_proto_cksum(pd, pd->off, pd->tot_len - pd->off,
-                   pd->proto, pd->af);
+       u_int8_t         opts[MAX_TCPOPTLEN];
+       u_int8_t        *optp = opts;
 
        thoff = th->th_off << 2;
        cnt = thoff - sizeof(struct tcphdr);
@@ -1410,12 +1405,15 @@ pf_normalize_mss(struct pf_pdesc *pd, u_
                                break;
                }
                if (opt == TCPOPT_MAXSEG) {
-                       memcpy(&mss, (optp + 2), 2);
+                       u_int8_t *mssp = optp + 2;
+                       memcpy(&mss, mssp, sizeof(mss));
                        if (ntohs(mss) > maxmss) {
-                               mss = htons(maxmss);
+                               size_t mssoffopts = mssp - opts;
+                               pf_change_16_unaligned(pd, &mss,
+                                   htons(maxmss), PF_ALGNMNT(mssoffopts));
                                m_copyback(pd->m,
-                                   pd->off + sizeof(*th) + optp + 2 - opts,
-                                   2, &mss, M_NOWAIT);
+                                   pd->off + sizeof(*th) + mssoffopts,
+                                   sizeof(mss), &mss, M_NOWAIT);
                                pf_cksum(pd, pd->m);
                                m_copyback(pd->m, pd->off, sizeof(*th), th,
                                    M_NOWAIT);

Re: [PATCH] PF: cksum modification & refactor [3/24]

Reply via email to