Re: hoststated/relayd and Linux's tcp_tw_recycle option

2008-04-19 Thread Stuart Henderson
On 2008-04-18, Denis Doroshenko [EMAIL PROTECTED] wrote:
> google quickly gives a url
>
> http://kbase.redhat.com/faq/FAQ_80_6180.shtm
>
> where it is said It is likely an artifact of having
> tcp_tw_recycle and tcp_tw_reuse enabled in the
> sysctl settings.

Work is underway at the moment to suppress these messages in further
releases of Red Hat Enterprise Linux but is not a high priority
because of the messages' benign nature.

Oh so clever.



hoststated/relayd and Linux's tcp_tw_recycle option

2008-04-18 Thread Matthew Dempsky
I set up hoststated earlier this week to provide load balancing and
failover for a few Linux web servers.  It went fairly smoothly,
except that one of the Linux machines only passed the 'check http /
code 200' test about 50% of the time.  Just using 'check tcp' worked
fine, and I saw the same results from another 4.2 box, and even from a
4.3-beta snapshot using relayd.  I could run 'curl -I http://host/' on
any of the OpenBSD boxes in a loop just fine, but the moment I started
hoststated/relayd, curl would start failing about 50% of the time too.
The SYN packets were showing up in tcpdump on the Linux machine's
interface, but the kernel would just randomly refuse to respond.
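
For anyone who wants to put a number on a failure rate like the one
described above, here is a rough Python sketch of the 'curl -I' loop.
The function names, the timeout and the attempt count are illustrative
assumptions, not anything hoststated/relayd does internally.

```python
import http.client


def head_status(host, port=80, path="/", timeout=3.0):
    """Send one HEAD request; return the HTTP status, or None on any failure."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts (e.g. an unanswered SYN)
        # and malformed responses.
        return None
    finally:
        conn.close()


def failure_rate(host, attempts=20, **kwargs):
    """Fraction of HEAD requests that did not come back with status 200."""
    failures = sum(
        1 for _ in range(attempts) if head_status(host, **kwargs) != 200
    )
    return failures / attempts
```

Calling failure_rate("host") once with the load balancer stopped and
once with it running should show whether the drops only appear while
the health checks are active.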

Because curl ran fine right up until hoststated was started, I assumed
it was hoststated's fault for the longest time.  But after giving up
trying to find a bug there, I discovered that on the misbehaving Linux
box, the net.ipv4.tcp_tw_recycle=1 sysctl was enabled.
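
A quick way to check a Linux box for this is to read the flag from
/proc/sys.  The sketch below assumes the usual mapping of sysctl names
to paths under /proc/sys; the helper names are made up for
illustration, and a flag that is missing on a given kernel is simply
skipped.

```python
from pathlib import Path


def read_sysctl(name, proc_root="/proc/sys"):
    """Return a sysctl value as a string, or None if the kernel lacks it."""
    path = Path(proc_root, *name.split("."))
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        return None


def enabled_time_wait_flags(proc_root="/proc/sys"):
    """Return the TIME-WAIT sysctls named in this thread that are non-zero."""
    names = ("net.ipv4.tcp_tw_recycle", "net.ipv4.tcp_tw_reuse")
    enabled = {}
    for name in names:
        value = read_sysctl(name, proc_root)
        if value is not None and value != "0":
            enabled[name] = value
    return enabled
```

Running enabled_time_wait_flags() on each backend and looking for a
non-empty result would have pointed at the odd machine out right away.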

Apparently, this flag changes how Linux handles sockets in TIME-WAIT
state (and violates the TCP specification, according to the sparse
documentation), which I'm guessing doesn't play nicely with OpenBSD's
sequence number randomization.  It was originally set because one of
the database vendors we spoke with suggested a bunch of sysctl changes
for optimization.  Some were necessary, like fixing memory overcommit,
but others, like this one, were bad.  (Searching Google shows a lot of
hits for people mindlessly suggesting to enable tcp_tw_recycle.)

Just thought I'd mention this in case it saves someone else the same
frustrating experience.



Re: hoststated/relayd and Linux's tcp_tw_recycle option

2008-04-18 Thread Denis Doroshenko
Google quickly gives a URL:

http://kbase.redhat.com/faq/FAQ_80_6180.shtm

where it is said: "It is likely an artifact of having
tcp_tw_recycle and tcp_tw_reuse enabled in the
sysctl settings."




Re: hoststated/relayd and Linux's tcp_tw_recycle option

2008-04-18 Thread Matthew Dempsky
On Fri, Apr 18, 2008 at 11:13 AM, Denis Doroshenko
[EMAIL PROTECTED] wrote:
> google quickly gives a url
>
> http://kbase.redhat.com/faq/FAQ_80_6180.shtm
>
> where it is said It is likely an artifact of having
> tcp_tw_recycle and tcp_tw_reuse enabled in the
> sysctl settings.

Okay?

The problem I was facing is that it didn't occur to me that I needed
to check for a violate_rfc_793 sysctl on one of the Linux hosts, and
that it only started causing problems while hoststated/relayd was
running.  I wasn't the one who added tcp_tw_recycle=1 to that
machine's sysctl.conf; if I had been, I would have checked the Linux
kernel documentation and questioned the database vendor's advice.
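
If you inherit a sysctl.conf that someone else tuned, a small audit
along these lines can flag settings worth questioning.  This is only a
sketch: it assumes plain 'key = value' lines with '#' or ';' comments,
does not handle keys written with '/' separators, and the function
name is made up for illustration.

```python
def find_sysctl_settings(conf_text, wanted):
    """Return (line number, key, value) for each wanted key set in conf_text."""
    hits = []
    for lineno, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        key = key.strip()
        if key in wanted:
            hits.append((lineno, key, value.strip()))
    return hits
```

Feeding it the contents of /etc/sysctl.conf together with
{"net.ipv4.tcp_tw_recycle"} lists every line that sets the flag.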