On Tue, 23 Oct 2012, Marcin wrote:

> Hi,
>
> I recently upgraded to 5.1, but I was able to reproduce the issue
> described below with 4.8, 5.0 and 5.2 snapshot.
>
> After the upgrade I discovered that workstations behind the OpenBSD
> firewall experience occasional timeouts while trying to access web
> servers running IIS 6.0 on Windows 2003 Server. The firewall itself
> is not affected. The problem is rather intermittent and happens with
> 30%-50% requests. The workstations are running Windows 7, Windows XP
> and Linux.
>
> I was also able to reproduce the issue by installing Windows 2003 R2
> server in default configuration, setting up extremely basic PF rules
> to redirect port 80 and accessing the server from the Internet. I was
> unable to expose this issue in LAN, which suggests it might happen
> only on links slower than 100Mbit. However, it seems to be hardware
> independent (although all tests were run on i386 arch) as I achieve
> the same results on three different machines in three different
> geographic locations connected via independent ISPs.
>
> This is how the problem can be exposed with curl:
>
> #curl -vI http://www.startvbdotnet.com/
> * About to connect() to www.startvbdotnet.com port 80 (#0)
> * Trying 64.79.160.13... connected
>
> > HEAD / HTTP/1.1
> > User-Agent: curl/7.22.0 (i686-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> > Host: www.startvbdotnet.com
> > Accept: */*
>
> * Recv failure: Connection reset by peer
> * Closing connection #0
> curl: (56) Recv failure: Connection reset by peer
>
> I uploaded the tcpdump from machine running curl here:
> http://pastebin.com/AkqCeQwW
>
> As far as I can tell, the Win 2008 and Win 2012 are not affected.
> Also, the 4.5 seemed to be free from this problem.
>
> Thanks in advance for any suggestions / workarounds!

Unfortunately we have no idea what firewall rules you have configured.
However, I'm going to take a random guess and say that you're using a
scrub rule with 'reassemble tcp'. If this is the case you'll probably
find that some TCP connections to Windows-based servers will fail, since
they often violate RFC 1323 by using a 0-value timestamp during the
three-way handshake, then increasing it by some value between 0 and 2^31
on the first data packet.

Note the TS val fields in the first two packets from 64.79.160.13:

20:49:05.962419 IP (tos 0x0, ttl 117, id 20521, offset 0, flags [none], proto TCP (6), length 64)
    64.79.160.13.80 > 192.168.1.20.51163: Flags [S.], cksum 0xd698 (correct), seq 2659727337, ack 77418264, win 16384, options [mss 1440,nop,wscale 0,nop,nop,TS val 0 ecr 0,nop,nop,sackOK], length 0
20:49:06.146338 IP (tos 0x0, ttl 117, id 14230, offset 0, flags [none], proto TCP (6), length 292)
    64.79.160.13.80 > 192.168.1.20.51163: Flags [P.], cksum 0xd584 (correct), seq 1:241, ack 173, win 65363, options [nop,nop,TS val 2152972614 ecr 1629278667], length 240

Combined with the way that PF handles timestamp modulation (there is a
subtle and difficult to fix bug here), you can trigger the PAWS checks,
which results in packets being dropped. IIRC these drops will be logged
if you run with 'pfctl -x debug'.

Removing 'reassemble tcp' should resolve the issue.

-- 
"Reason is not automatic. Those who deny it cannot be conquered by it.
Do not count on them. Leave them alone." -- Ayn Rand
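
P.S. To make the PAWS point concrete, here is a minimal sketch of the
generic RFC 1323 timestamp comparison, applied to the two TS val fields
quoted above. This is not PF's code: the function names are made up for
illustration, the captured values are taken at face value, and PF's own
modulation and state tracking are ignored.

```python
# Illustrative sketch of the RFC 1323 PAWS comparison; NOT PF's code.
# All names here are hypothetical helpers invented for this example.

MOD = 1 << 32   # TCP timestamps are 32-bit values
HALF = 1 << 31  # window used for "older/newer" in modular arithmetic


def ts_before(a: int, b: int) -> bool:
    """True if timestamp a counts as older than b (mod 2^32)."""
    return 0 < ((b - a) % MOD) < HALF


def paws_rejects(ts_recent: int, seg_tsval: int) -> bool:
    """A segment fails PAWS if its TSval is older than the last one accepted."""
    return ts_before(seg_tsval, ts_recent)


# TS val fields from the SYN-ACK and the first data packet in the trace.
SYNACK_TSVAL = 0
DATA_TSVAL = 2152972614
```

With these values paws_rejects(SYNACK_TSVAL, DATA_TSVAL) is True: the
jump from 0 to 2152972614 is larger than 2^31, so in 32-bit modular
arithmetic the data packet's timestamp looks older than the handshake's,
not newer.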