Hi,

I'm running FreeBSD 7.0-RELEASE with the following kernel patch applied. I know it is included in the latest patch levels that freebsd-update installs, but since I'm still getting state mismatches WITH the patch, I'm holding off on the upgrade until I know more about the nature of the problem:

*** net/pf.c    2007/09/07 21:34:10     1.558
--- net/pf.c    2007/09/18 19:45:59     1.559
*************** pf_test_state_tcp(struct pf_state **state, int directi
*** 3730,3735 ****
--- 3730,3751 ----
                        REASON_SET(reason, PFRES_SYNPROXY);
                        return (PF_SYNPROXY_DROP);
                }
+       }
+
+       if (((th->th_flags & (TH_SYN|TH_ACK)) == TH_SYN) &&
+           dst->state >= TCPS_FIN_WAIT_2 &&
+           src->state >= TCPS_FIN_WAIT_2) {
+               if (pf_status.debug >= PF_DEBUG_MISC) {
+                       printf("pf: state reuse ");
+                       pf_print_state(*state);
+                       pf_print_flags(th->th_flags);
+                       printf("\n");
+               }
+               /* XXX make sure it's the same direction ?? */
+               (*state)->src.state = (*state)->dst.state = TCPS_CLOSED;
+               pf_unlink_state(*state);
+               *state = NULL;
+               return (PF_DROP);
        }

        if (src->wscale && dst->wscale && !(th->th_flags & TH_SYN)) {


The problem I'm having is that I get intermittent "connection refused"/"operation not permitted" errors when connecting to another machine on the local network. When I run pfctl -s info I see *huge* numbers of state mismatches:

Status: Enabled for 94 days 01:27:40          Debug: Urgent

State Table                          Total             Rate
  current entries                      398
  searches                       986228319          121.4/s
  inserts                        104049508           12.8/s
  removals                       104049110           12.8/s
Counters
  match                          107482262           13.2/s
  bad-offset                             0            0.0/s
  fragment                               0            0.0/s
  short                                  0            0.0/s
  normalize                             42            0.0/s
  memory                           3125235            0.4/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                              0            0.0/s
  proto-cksum                        13919            0.0/s
  state-mismatch                   3039814            0.4/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s
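
To put those counters in perspective, here is a minimal sketch in Python that parses the output above and relates state-mismatch to the number of state inserts. The helper name parse_counters and the choice of inserts as the baseline are my own assumptions, not anything pfctl provides; it is only meant as a rough gauge.

```python
import re

# Totals copied from the pfctl -s info output above (State Table + Counters).
SAMPLE = """\
State Table                          Total             Rate
  current entries                      398
  searches                       986228319          121.4/s
  inserts                        104049508           12.8/s
  removals                       104049110           12.8/s
Counters
  match                          107482262           13.2/s
  normalize                             42            0.0/s
  memory                           3125235            0.4/s
  proto-cksum                        13919            0.0/s
  state-mismatch                   3039814            0.4/s
"""


def parse_counters(text):
    """Return {name: total} for indented lines shaped like 'name  total  rate/s'.

    Hypothetical helper: lines without a rate column (such as
    'current entries') and section headers are skipped.
    """
    counters = {}
    for line in text.splitlines():
        m = re.match(r"^\s+([a-z-]+)\s+(\d+)\s+[\d.]+/s\s*$", line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters


counters = parse_counters(SAMPLE)

# List the non-zero counters, largest first.
for name, total in sorted(counters.items(), key=lambda kv: -kv[1]):
    if total:
        print(f"{name:16s} {total:>12d}")

# Rough gauge (my assumption): mismatches per state ever inserted.
ratio = counters["state-mismatch"] / counters["inserts"]
print(f"state-mismatch per insert: {ratio:.2%}")
```

Running the same parser on output captured at intervals (for example from cron) and diffing the totals would show whether the mismatches really cluster in the short windows described below.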

This is causing serious problems at the moment. The state problems seem to occur in certain small time windows: my Nagios starts reporting connection refused/operation not permitted for every service (about 20 of them), and then I get 20 recovery messages.

The firewall rules are trivially simple; $ext_if has two IPs and $int_if has one:

interfaces = "{" $ext_if "," $int_if "}"

scrub in all
set skip on lo0
antispoof for $interfaces inet
block out log quick on $ext_if from !$ext_ip1 to any
block in quick on $ext_if from any to 255.255.255.255
block log all

pass in quick inet proto icmp all icmp-type $icmp_types

pass in quick on $int_if from $int_net to any
pass out quick on $int_if from any to $int_net

pass out on $ext_if proto tcp all
pass out on $ext_if proto { udp, icmp } all
pass in on $ext_if proto tcp from any to $ext_ip1 port $tcp_services1
pass in on $ext_if proto tcp from any to $ext_ip2 port $tcp_services2

Does anybody have any idea what's going on and where I can look? This is a production server, so it's seriously affecting the quality of the hosted services. :-(


Regards,
Sebastiaan
