Hi Sasha, On 1/09/2015, at 12:47 AM, Alexandr Nedvedicky wrote: > the code in patch looks good for the first glance. However it seems to me > the newly introduced pf_cksum_fixup*() are not called yet. Do you think you > can reshuffle changes between your set of patches a bit, so the newly > introduced functions will become alive (get called)? > > Also I think your patch 0/24, you've sent earlier, can be fold here (setting > pd->pcksum to point to icmp6 header chksum field).
Sure thing. I'm on a bit of a learning curve with this process. The attached is an amalgam of the following patches in my series: 0) pd->pcksum = icmp6 header cksum 3) introduce pf_cksum_fixup*() 4) introduce pf_change_*() 22) misc. transitions to pf_change_*() interface The new ones include my notes below. Also, I'd like to explain my last patch adding 'const' to a line deleted by the next. My aim is to ease review by using the compiler to establish universals where possible. That helps to eliminate programmer error due to 'not looking hard enough at the code'. I also find it easier on my mental health :) You won't, though, want to burn CVS commits on patches that exist only to show my 'working', and in future I'll bundle these in one email for commitment as one cumulative patch. * Introduce pf_change_{8,16,16_unaligned,32}() interface - modification of checksum-covered data is assignment-with-side-effects. - all new functions will be used by later patches +ve provides type-appropriate checksum modification +ve will replace existing 'altered value' guards, reducing code length -ve five new functions in total C assignment hides behind one assignment operator the nitty gritty of differing l-value widths. As we cannot change the language to suit our needs, we are obliged to expose these differences in our interface. An added wrinkle is that our side-effect, namely, modifying the checksum, depends on the alignment of the l-value within the packet (the checksum's summands are 16-bit aligned with respect to the packet). The interface therefore provides _unaligned() versions parameterised by the l-value's packet alignment, either 'hi' or 'lo'. Thankfully, these are for most protocol fields unnecessary. Later patches will augment these functions with 'altered value' guards, allowing us to replace, e.g. if (icmpid != pd->hdr.icmp->icmp_id) { if (pd->csum_status == PF_CSUM_UNKNOWN) pf_check_proto_cksum(pd, pd->off, pd->tot_len - pd->off, pd->proto, pd->af); pd->hdr.icmp->icmp_id = icmpid; rewrite = 1; } with rewrite += pf_change_16(pd, &pd->hdr.icmp->icmp_id, icmpid); And, despite the new code, by the end of this series pf is net ~ 500 bytes shorter (i386). Lastly, mikeb@ suggested the name pf_patch_{...}(), which is both shorter and more descriptive. I like it, however we follow the preexisting 'change' terminology for consistency, also noting that 'match' and 'patch' may be, possibly, undesirably similar. But if you'd rather this patch use pf_patch, I'd be happy to reissue it. * Convert miscellaneous packet modification to pf_change_*() interface As pf_change_*() modifies the checksum, the checksum status needn't be computed: that is used only to decide whether we may regenerate the checksum later on in pf_cksum() to account for the altered packet. (Other parts of the code do not appear to depend on these removed checks.) Testing: Same code as in my email "[patch] cleaner checksum modification for pf", see my testing notes there. Just to be sure/anal retentive, I have retested the more involved changes: - normalize tcp->thx2 -> 0 (via patched hping3) - mss clamping (both aligned and unaligned via customised OpenBSD) and also - NAT of icmpv4 ping This was on i386; I haven't tested on an architecture with stronger alignment requirements but expect no problems. best, Richard. Index: net/pf.c =================================================================== --- net.orig/pf.c +++ net/pf.c @@ -145,7 +145,10 @@ void pf_init_threshold(struct pf_thre u_int32_t); void pf_add_threshold(struct pf_threshold *); int pf_check_threshold(struct pf_threshold *); - +void pf_cksum_fixup(u_int16_t *, u_int16_t, u_int16_t, + u_int8_t); +void pf_cksum_fixup_a(u_int16_t *, const struct pf_addr *, + const struct pf_addr *, sa_family_t, u_int8_t); void pf_change_ap(struct pf_pdesc *, struct pf_addr *, u_int16_t *, struct pf_addr *, u_int16_t, sa_family_t); @@ -286,6 +289,8 @@ static __inline int pf_state_compare_key struct pf_state_key *); static __inline int pf_state_compare_id(struct pf_state *, struct pf_state *); +static __inline void pf_cksum_uncover(u_int16_t *, u_int16_t, u_int8_t); +static __inline void pf_cksum_cover(u_int16_t *, u_int16_t, u_int8_t); struct pf_src_tree tree_src_tracking; @@ -1651,6 +1656,184 @@ pf_addr_wrap_neq(struct pf_addr_wrap *aw } } +/* This function, given arguments of one endian, is invariant over the + * endian of the host. Why? + * + * Define the unary transpose operator ~ on a bitstring in python slice + * notation as lambda m: m[X:] + m[:X] , for some constant X + * + * Th. ~ distributes over ones-complement addition, denoted by +_1, i.e. + * + * ~m +_1 ~n = ~(m +_1 n) (for all bitstrings m,n of equal length) + * + * Proof. Regard the bitstrings in m +_1 n as split at X, forming at + * most two 'half-adds'. Under ones-complement addition, each half-add + * carries to the other, so the sum of each half-add is unaffected by + * their relative order. Therefore: + * + * ~m +_1 ~n + * = { half-adds invariant under transposition } + * ~s + * = { substitute } + * ~(m +_1 n) [end of proof] + * + * Th. Summing two in-memory ones-complement 16-bit variables m,n + * on a machine with the converse endian does not alter the result. + * + * Proof. + * { converse machine endian: load/store transposes, X := 8 } + * ~(~m +_1 ~n) + * = { ~ over +_1 } + * ~~m +_1 ~~n + * = { ~ is an involution } + * m +_1 n [end of proof] + */ +void +pf_cksum_fixup(u_int16_t *cksum, u_int16_t was, u_int16_t now, + u_int8_t proto) +{ + u_int32_t l; + const int udp = proto == IPPROTO_UDP; + + if (udp && *cksum == 0x0000) + return; + + l = *cksum + was - now; + l = ((l >> 16) + (l & 0xffff)) & 0xffff; + + if (udp && l == 0x0000) + l = 0xffff; + + *cksum = (u_int16_t)(l); +} + +/* pre: coverage(cksum) covers coverage(cksum_covered) */ +static __inline void +pf_cksum_uncover(u_int16_t *cksum, u_int16_t cksum_covered, u_int8_t proto) +{ + pf_cksum_fixup(cksum, ~cksum_covered, 0x0, proto); +} + +/* pre: disjoint(coverage(cksum), coverage(cksum_uncovered)) */ +static __inline void +pf_cksum_cover(u_int16_t *cksum, u_int16_t cksum_uncovered, u_int8_t proto) +{ + pf_cksum_fixup(cksum, 0x0, ~cksum_uncovered, proto); +} + +/* pre: changes are 16-bit aligned within the packet + * + * We emulate 16-bit ones-complement arithmetic by conserving its carries, + * which twos-complement otherwise discards, in the upper 16 bits of l. + * These accumulated carries when added to the lower 16-bits then + * complete the ones-complement sum. + * + * Note, the accumulator, despite l being unsigned, supports net-negative + * carries: + * + * Arithmetic or assignment on n unsigned bits is modulo 2^n. + * Def. x mod y = x - (x//y)*y for integer x,y + * + * Th. (x + (y mod z)) mod z + * = { def mod } + * (x + y - (y//z)*z) mod z + * = { (x + y*z) mod z = x mod z } + * (x + y) mod z (0) + * + * Now, the value of the unsigned m-bit accumulator having assigned + * integer x to it is (x mod 2^m). Added to the sum, we have: + * + * (sum + (x mod 2^m)) mod 2^n + * = { accumulator same width as sum; m = n } + * (sum + (x mod 2^n)) mod 2^n + * = { (0) } + * (sum + x) mod 2^n + * + * ... and when x < 0 this equals (sum - |x|) mod 2^n + * + * The scheme is therefore correct over a range of at least plus or + * minus 2^16 - 1 accumulated carries, afterwhich the accumulator + * wraps. This far exceeds the worst case below of plus or minus 8. + */ +void +pf_cksum_fixup_a(u_int16_t *cksum, const struct pf_addr *a, + const struct pf_addr *an, sa_family_t af, u_int8_t proto) +{ + u_int32_t l; + const u_int16_t *n = an->addr16; + const u_int16_t *o = a->addr16; + const int udp = proto == IPPROTO_UDP; + + switch (af) { + case AF_INET: + l = *cksum + o[0] - n[0] + o[1] - n[1]; + break; +#ifdef INET6 + case AF_INET6: + l = *cksum + o[0] - n[0] + o[1] - n[1] + o[2] - n[2] + + o[3] - n[3] + o[4] - n[4] + o[5] - n[5] + o[6] - + n[6] + o[7] - n[7]; + break; +#endif /* INET6 */ + default: + unhandled_af(af); + } + + if (udp && *cksum == 0x0000) + return; + + l = ((l >> 16) + (l & 0xffff)) & 0xffff; + + if (udp && l == 0x0000) + l = 0xffff; + + *cksum = (u_int16_t)(l); +} + +void +pf_change_8(struct pf_pdesc *pd, u_int8_t *f, u_int8_t v, bool hi) +{ + u_int16_t new = htons(hi ? ( v << 8) : v); + u_int16_t old = htons(hi ? (*f << 8) : *f); + + pf_cksum_fixup(pd->pcksum, old, new, pd->proto); + *f = v; +} + +/* pre: *f is 16-bit aligned within its packet */ +void +pf_change_16(struct pf_pdesc *pd, u_int16_t *f, u_int16_t v) +{ + pf_cksum_fixup(pd->pcksum, *f, v, pd->proto); + *f = v; +} + +void +pf_change_16_unaligned(struct pf_pdesc *pd, void *f, u_int16_t v, bool hi) +{ + u_int8_t *fb = (u_int8_t*)f; + u_int8_t *vb = (u_int8_t*)&v; + + if (hi && ALIGNED_POINTER(f, u_int16_t)) { + pf_change_16(pd, f, v); /* optimise */ + return; + } + + pf_change_8(pd, fb++, *vb++, hi); + pf_change_8(pd, fb++, *vb++,!hi); +} + +/* pre: *f is 16-bit aligned within its packet */ +void +pf_change_32(struct pf_pdesc *pd, u_int32_t *f, u_int32_t v) +{ + u_int16_t *pc = pd->pcksum; + + pf_cksum_fixup(pc, *f / (1 << 16), v / (1 << 16), pd->proto); + pf_cksum_fixup(pc, *f % (1 << 16), v % (1 << 16), pd->proto); + *f = v; +} + void pf_change_ap(struct pf_pdesc *pd, struct pf_addr *a, u_int16_t *p, struct pf_addr *an, u_int16_t pn, sa_family_t naf) @@ -3722,11 +3905,8 @@ pf_translate(struct pf_pdesc *pd, struct u_int16_t icmpid = (icmp_dir == PF_IN) ? sport : dport; if (icmpid != pd->hdr.icmp->icmp_id) { - if (pd->csum_status == PF_CSUM_UNKNOWN) - pf_check_proto_cksum(pd, pd->off, - pd->tot_len - pd->off, pd->proto, - pd->af); - pd->hdr.icmp->icmp_id = icmpid; + pf_change_16(pd, + &pd->hdr.icmp->icmp_id, icmpid); rewrite = 1; } } @@ -3758,11 +3938,8 @@ pf_translate(struct pf_pdesc *pd, struct u_int16_t icmpid = (icmp_dir == PF_IN) ? sport : dport; if (icmpid != pd->hdr.icmp6->icmp6_id) { - if (pd->csum_status == PF_CSUM_UNKNOWN) - pf_check_proto_cksum(pd, pd->off, - pd->tot_len - pd->off, pd->proto, - pd->af); - pd->hdr.icmp6->icmp6_id = icmpid; + pf_change_16(pd, + &pd->hdr.icmp6->icmp6_id, icmpid); rewrite = 1; } } @@ -4570,11 +4747,8 @@ pf_test_state_icmp(struct pf_pdesc *pd, } if (nk->port[iidx] != pd->hdr.icmp->icmp_id) { - if (pd->csum_status == PF_CSUM_UNKNOWN) - pf_check_proto_cksum(pd, - pd->off, pd->tot_len - - pd->off, pd->proto, pd->af); - pd->hdr.icmp->icmp_id = nk->port[iidx]; + pf_change_16(pd, &pd->hdr.icmp->icmp_id, + nk->port[iidx]); } m_copyback(pd->m, pd->off, ICMP_MINLEN, @@ -4602,12 +4776,9 @@ pf_test_state_icmp(struct pf_pdesc *pd, } if (nk->port[iidx] != pd->hdr.icmp6->icmp6_id) { - if (pd->csum_status == PF_CSUM_UNKNOWN) - pf_check_proto_cksum(pd, - pd->off, pd->tot_len - - pd->off, pd->proto, pd->af); - pd->hdr.icmp6->icmp6_id = - nk->port[iidx]; + pf_change_16(pd, + &pd->hdr.icmp6->icmp6_id, + nk->port[iidx]); } m_copyback(pd->m, pd->off, @@ -6229,6 +6400,7 @@ pf_setup_pdesc(struct pf_pdesc *pd, void REASON_SET(reason, PFRES_SHORT); return (PF_DROP); } + pd->pcksum = &pd->hdr.icmp6->icmp6_cksum; break; } #endif /* INET6 */ Index: net/pfvar.h =================================================================== --- net.orig/pfvar.h +++ net/pfvar.h @@ -1719,6 +1719,13 @@ void pf_addr_inc(struct pf_addr *, sa_fa void *pf_pull_hdr(struct mbuf *, int, void *, int, u_short *, u_short *, sa_family_t); +#define PF_HI (true) +#define PF_LO (!PF_HI) +#define PF_ALGNMNT(off) (((off) % 2) == 0 ? PF_HI : PF_LO) +void pf_change_8(struct pf_pdesc *, u_int8_t *, u_int8_t, bool); +void pf_change_16(struct pf_pdesc *, u_int16_t *, u_int16_t); +void pf_change_16_unaligned(struct pf_pdesc *, void *, u_int16_t, bool); +void pf_change_32(struct pf_pdesc *, u_int32_t *, u_int32_t); void pf_change_a(struct pf_pdesc *, void *, u_int32_t); int pf_check_proto_cksum(struct pf_pdesc *, int, int, u_int8_t, sa_family_t); Index: net/pf_norm.c =================================================================== --- net.orig/pf_norm.c +++ net/pf_norm.c @@ -845,10 +845,6 @@ pf_normalize_tcp(struct pf_pdesc *pd) u_int8_t flags; u_int rewrite = 0; - if (pd->csum_status == PF_CSUM_UNKNOWN) - pf_check_proto_cksum(pd, pd->off, pd->tot_len - pd->off, - pd->proto, pd->af); - flags = th->th_flags; if (flags & TH_SYN) { /* Illegal packet */ @@ -870,15 +866,18 @@ pf_normalize_tcp(struct pf_pdesc *pd) } /* If flags changed, or reserved data set, then adjust */ - if (flags != th->th_flags || th->th_x2 != 0) { - th->th_flags = flags; - th->th_x2 = 0; - rewrite = 1; - } + if (flags != th->th_flags || th->th_x2 != 0) { + /* hack: set 4-bit th_x2 = 0 */ + u_int8_t *th_off = (u_int8_t*)(&th->th_ack+1); + pf_change_8(pd, th_off, th->th_off << 4, PF_HI); + + pf_change_8(pd, &th->th_flags, flags, PF_LO); + rewrite = 1; + } /* Remove urgent pointer, if TH_URG is not set */ if (!(flags & TH_URG) && th->th_urp) { - th->th_urp = 0; + pf_change_16(pd, &th->th_urp, 0); rewrite = 1; } @@ -1382,12 +1381,8 @@ pf_normalize_mss(struct pf_pdesc *pd, u_ u_int16_t mss; int thoff; int opt, cnt, optlen = 0; - u_char opts[MAX_TCPOPTLEN]; - u_char *optp = opts; - - if (pd->csum_status == PF_CSUM_UNKNOWN) - pf_check_proto_cksum(pd, pd->off, pd->tot_len - pd->off, - pd->proto, pd->af); + u_int8_t opts[MAX_TCPOPTLEN]; + u_int8_t *optp = opts; thoff = th->th_off << 2; cnt = thoff - sizeof(struct tcphdr); @@ -1410,12 +1405,15 @@ pf_normalize_mss(struct pf_pdesc *pd, u_ break; } if (opt == TCPOPT_MAXSEG) { - memcpy(&mss, (optp + 2), 2); + u_int8_t *mssp = optp + 2; + memcpy(&mss, mssp, sizeof(mss)); if (ntohs(mss) > maxmss) { - mss = htons(maxmss); + size_t mssoffopts = mssp - opts; + pf_change_16_unaligned(pd, &mss, + htons(maxmss), PF_ALGNMNT(mssoffopts)); m_copyback(pd->m, - pd->off + sizeof(*th) + optp + 2 - opts, - 2, &mss, M_NOWAIT); + pd->off + sizeof(*th) + mssoffopts, + sizeof(mss), &mss, M_NOWAIT); pf_cksum(pd, pd->m); m_copyback(pd->m, pd->off, sizeof(*th), th, M_NOWAIT);