Re: hoststated/relayd and Linux's tcp_tw_recycle option
On 2008-04-18, Denis Doroshenko [EMAIL PROTECTED] wrote:
Google quickly gives a URL, http://kbase.redhat.com/faq/FAQ_80_6180.shtm, where it is said: "It is likely an artifact of having tcp_tw_recycle and tcp_tw_reuse enabled in the sysctl settings. Work is underway at the moment to suppress these messages in further releases of Red Hat Enterprise Linux but is not a high priority because of the messages' benign nature."

Oh so clever.
hoststated/relayd and Linux's tcp_tw_recycle option
I set up hoststated earlier this week to provide load balancing and failover for a few Linux web servers. It went fairly smoothly, except that one of the Linux machines only passed the 'check http / code 200' test about 50% of the time. Just using 'check tcp' worked fine, and I saw the same results from another 4.2 box, and even from a 4.3-beta snapshot using relayd.

I could run 'curl -I http://host/' on any of the OpenBSD boxes in a loop just fine, but the moment I started hoststated/relayd, curl would start failing about 50% of the time too. The SYN packets were showing up in tcpdump on the Linux machine's interface, but the kernel would just randomly refuse to respond.

Because curl ran fine right up until hoststated was started, I assumed for the longest time that it was hoststated's fault. But after giving up trying to find a bug there, I discovered that the net.ipv4.tcp_tw_recycle=1 sysctl was enabled on the misbehaving Linux box. Apparently, this flag changes how Linux handles sockets in TIME-WAIT state (and violates the TCP specification, according to the sparse documentation), which I'm guessing doesn't play nicely with OpenBSD's sequence number randomization.

It was originally set because one of the database vendors we spoke with suggested a bunch of sysctl changes for optimization: some necessary, like fixing memory overcommit, but also bad ones like that. (Searching Google shows a lot of hits for people mindlessly suggesting enabling tcp_tw_recycle.)

Just thought I'd mention this in case it saves someone else the same frustrating experience.
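For anyone who wants to measure the failure rate without curl, here is a rough Python sketch of the same kind of loop. It only attempts the TCP handshake (no HTTP request), which should be enough here since the symptom was unanswered SYNs. The function name connect_failure_rate and its parameters are made up for illustration; the connect argument exists only so the loop can be exercised without a real network.

```python
import socket


def connect_failure_rate(host, port, attempts=20, timeout=2.0,
                         connect=socket.create_connection):
    """Try to open `attempts` TCP connections and return the fraction that failed.

    A dropped SYN shows up as a timeout, which is an OSError subclass,
    so it is counted as a failure along with refused connections.
    """
    failures = 0
    for _ in range(attempts):
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError:
            failures += 1
        else:
            conn.close()
    return failures / attempts
```

Calling connect_failure_rate("host", 80) from one of the OpenBSD boxes, once with hoststated/relayd stopped and once with it running, would give two numbers to compare. This is an assumption about how one might reproduce the report, not something that was run for the original post.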
Re: hoststated/relayd and Linux's tcp_tw_recycle option
Google quickly gives a URL, http://kbase.redhat.com/faq/FAQ_80_6180.shtm, where it is said: "It is likely an artifact of having tcp_tw_recycle and tcp_tw_reuse enabled in the sysctl settings."

On Fri, Apr 18, 2008 at 8:08 PM, Matthew Dempsky [EMAIL PROTECTED] wrote:
[full quote of the original message snipped]
Re: hoststated/relayd and Linux's tcp_tw_recycle option
On Fri, Apr 18, 2008 at 11:13 AM, Denis Doroshenko [EMAIL PROTECTED] wrote:
Google quickly gives a URL, http://kbase.redhat.com/faq/FAQ_80_6180.shtm, where it is said: "It is likely an artifact of having tcp_tw_recycle and tcp_tw_reuse enabled in the sysctl settings."

Okay? The problem I was facing is that it didn't occur to me to check for a violate_rfc_793 sysctl on one of the Linux hosts, and that it only started causing problems while hoststated/relayd was running. I wasn't the one who added tcp_tw_recycle=1 to that machine's sysctl.conf; otherwise I would have checked the Linux kernel documentation and questioned the database vendor's advice.
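As a way to catch this sort of inherited setting up front, a small Python sketch that scans sysctl.conf-style text for the two TIME-WAIT options named in this thread might look like the following. The function name enabled_time_wait_settings is hypothetical, and the parsing assumes the usual "key = value" lines with "#" or ";" comments; it reads whatever text it is given and does not query the running kernel.

```python
# The two options mentioned in this thread; extend as needed.
WATCHED = {"net.ipv4.tcp_tw_recycle", "net.ipv4.tcp_tw_reuse"}


def enabled_time_wait_settings(conf_text):
    """Return {key: value} for watched sysctls set to anything other than 0."""
    found = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # Treat the slash-separated spelling of a key the same as the dotted one.
        key = key.strip().replace("/", ".")
        value = value.strip()
        if key in WATCHED and value != "0":
            found[key] = value
    return found
```

Feeding it the contents of /etc/sysctl.conf on each Linux backend (for example via open("/etc/sysctl.conf").read()) would list any host where one of these options has been switched on.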