Re: -sf/-st & rereading configuration file

2009-12-24 Thread Willy Tarreau
On Thu, Dec 24, 2009 at 01:33:02PM +0100, XANi wrote:
> There is also a little iptables hack: if you want to be 100% sure no
> client gets rejected while you're restarting, block outgoing TCP RST
> packets to clients. Then when a TCP SYN hits the load balancer while it's
> restarting and the frontend port is closed, the client's connection won't
> get reset; TCP will just retransmit the SYN packet.

Yes. With recent kernels, another possibility might be to use the socket
match to accept incoming packets only for bound sockets. In general you
should avoid blocking outgoing RST packets for a long time because they're
quite needed to reset out of sync sessions. That's fine during the
restart, though.
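
A minimal sketch of the "block RST only during the restart" idea, in Python.
It assumes the frontend listens on port 80 and that the script runs as root;
the helper names are made up for the example. The rule is inserted just
before the restart and removed right after it, so outgoing RSTs are not
blocked for long.

```python
import subprocess
from contextlib import contextmanager


def rst_drop_rule(action, port):
    """Build an iptables command line.

    action is "-I" (insert the rule) or "-D" (delete it). The rule drops
    outgoing TCP packets that have the RST flag set and whose source port
    is the frontend port.
    """
    return ["iptables", action, "OUTPUT", "-p", "tcp",
            "--sport", str(port), "--tcp-flags", "RST", "RST", "-j", "DROP"]


@contextmanager
def rst_blocked(port, run=subprocess.check_call):
    """Block outgoing RSTs from `port` for the duration of the with-block."""
    run(rst_drop_rule("-I", port))
    try:
        yield
    finally:
        # Always remove the rule, even if the restart itself failed.
        run(rst_drop_rule("-D", port))
```

Usage would look like "with rst_blocked(80): restart_haproxy()", where
restart_haproxy is whatever you already use to start the new process.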

Willy




Re: -sf/-st & rereading configuration file

2009-12-24 Thread XANi
On Thu, 2009-12-24 at 06:29 +0100, Willy Tarreau wrote:
> On Wed, Dec 23, 2009 at 02:43:04PM -0800, Paul Hirose wrote:
> > I was asked how to get haproxy to reload its configuration file, and not 
> > disturb any existing connections.  For example, if I have two servers 
> > listed, and I want to take one out for maintenance.
> > 
> > I wasn't sure about the difference between -sf and -st, but from reading 
> > 2.4(.1), I'm guessing -sf is the better way.  It allows all existing 
> > connections to finish, then temporarily stops/pauses all services(?), 
> > rereads the configuration file, then restarts again?
> 
> No it does not work like that.
> 
> You start a new process (with -sf or -st). It reads the config and tries
> to bind as many services as it can. If some ports are busy, it then sends
> a signal to the old process asking it to temporarily release its ports so
> that the new one can bind them. This leaves a small window of a few
> milliseconds between the instant the port is unbound and it is rebound,
> where the port
> is not bound at all. But apparently people have absolutely no problem with
> that. Then, once the new process is ready, it sends one signal to the old
> one indicating to it that it can either finish what it's doing (-sf) or
> immediately stop (-st). So upon every restart, you have a fresh new process.
> Some people even use that to upgrade the binary without service disruption.
There is also a little iptables hack: if you want to be 100% sure no
client gets rejected while you're restarting, block outgoing TCP RST
packets to clients. Then when a TCP SYN hits the load balancer while it's
restarting and the frontend port is closed, the client's connection won't
get reset; TCP will just retransmit the SYN packet.

-- 
Mariusz Gronczewski (XANi) 
GnuPG: 0xEA8ACE64
http://devrandom.pl




Re: -sf/-st & rereading configuration file

2009-12-23 Thread Willy Tarreau
On Wed, Dec 23, 2009 at 02:43:04PM -0800, Paul Hirose wrote:
> I was asked how to get haproxy to reload its configuration file, and not 
> disturb any existing connections.  For example, if I have two servers 
> listed, and I want to take one out for maintenance.
> 
> I wasn't sure about the difference between -sf and -st, but from reading 
> 2.4(.1), I'm guessing -sf is the better way.  It allows all existing 
> connections to finish, then temporarily stops/pauses all services(?), 
> rereads the configuration file, then restarts again?

No it does not work like that.

You start a new process (with -sf or -st). It reads the config and tries
to bind as many services as it can. If some ports are busy, it then sends
a signal to the old process asking it to temporarily release its ports so
that the new one can bind them. This leaves a small window of a few milliseconds
between the instant the port is unbound and it is rebound, where the port
is not bound at all. But apparently people have absolutely no problem with
that. Then, once the new process is ready, it sends one signal to the old
one indicating to it that it can either finish what it's doing (-sf) or
immediately stop (-st). So upon every restart, you have a fresh new process.
Some people even use that to upgrade the binary without service disruption.
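
As a rough illustration of the sequence above, this Python sketch builds the
command line for the new process. The config and pidfile paths are
assumptions for the example, and the function names are hypothetical; -f, -p,
-sf and -st are haproxy's own options. The new process does the signalling
itself, so the caller only has to pass it the PIDs of the old one.

```python
def read_pids(pidfile_text):
    """Parse the contents of a pidfile (one PID per line) into integers."""
    return [int(token) for token in pidfile_text.split()]


def reload_command(config, pidfile, old_pids, graceful=True):
    """Build the argv that starts a new haproxy replacing the old one.

    graceful=True  -> -sf: old process finishes its existing connections.
    graceful=False -> -st: old process stops immediately.
    """
    flag = "-sf" if graceful else "-st"
    return ["haproxy", "-f", config, "-p", pidfile, flag] + [str(p) for p in old_pids]
```

For example, reload_command("/etc/haproxy/haproxy.cfg",
"/var/run/haproxy.pid", read_pids(open("/var/run/haproxy.pid").read()))
gives a list that can be handed to subprocess.check_call.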

> What if an existing connection is a very long one (say returning a large 
> amount of data from a LDAP or database query)?  It could, potentially, 
> cause haproxy to stay not-listening (the logs say Pausing proxy XX) for a 
> while.

No. It just means the old process will remain present for that long time.
This is the reason for the -st option. In general, people use -sf to be
as transparent as possible. But when there are long or permanent
connections, sometimes it's better to kill them ASAP with -st. Other
people working with terminal services (RDP, ...) reload with -sf not to
break any existing connection, then kill the old processes at night if
they still remain.
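
The "reload with -sf, then clean up at night" approach could be sketched like
this in Python. It assumes you recorded the PIDs of the old processes at
reload time (the function name and that bookkeeping are not part of haproxy),
and it sends SIGTERM, the same signal -st uses. Note that a PID may have been
reused by an unrelated process if a long time has passed, so a real script
should check what it is about to kill.

```python
import os
import signal


def terminate_leftovers(old_pids):
    """Send SIGTERM to every old process that is still around.

    Returns the list of PIDs that were actually signalled; processes that
    already finished on their own are skipped.
    """
    terminated = []
    for pid in old_pids:
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            continue  # already gone, nothing to do
        terminated.append(pid)
    return terminated
```

Run from cron at night, this leaves daytime reloads fully transparent and
only cuts the connections that have outlived the old process by hours.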

> But I'd rather not assume that's going to happen and do a -st and 
> kill all existing connections, just so I won't necessarily get stalled 
> waiting for one long connection to finish.

No, there is no such risk, because the service is taken over by the new
process.

> It appears SIGUSR1 is the clean way to gracefully shut down haproxy 
> completely.

yes, but you should not need to know the signal because it appears in
a sequence sent by the new process to the old one.

> I guess is there a way to have it reread the file, note the differences, 
> and then go from there?

no, for several reasons :

  - the process should be chrooted, so it does not have access to the
    config file anymore. This will be solved with the CLI later; we
    should be able to feed it the changes.

  - the process should have dropped its privileges, and as such will
    not be able to bind to reserved ports. This could be worked around
    using the CLI too, because we could have the new process bind the
    ports itself and offer the sockets to the old one over the unix
    socket.

Alternatively, we could also suggest that a chrooted and/or secured
process cannot reload its config and that it's a trade-off between
security and ease of use.

> For example, if the only thing I've done is 
> comment out "dontlog-normal", there's really no reason  for haproxy to 
> TTIN+USR1/TTOU anything. And if the only thing I've done is add-back a 
> previously removed server or remove a server about to go into maintenance, 
> there's really no reason again for haproxy to necessarily completely stop 
> listening on its ports, it just needs to start (or stop) sending 
> connections to the added/removed backend server.

As you see above, yes there are reasons.

I can also give you some examples of tricky changes.

Let's say you simply rename a frontend. All it will see will be that
you deleted the old frontend and created a new one. Imagine that you've
changed the sequence of servers in a backend and that you have added a
few and renamed others. It's impossible to reverse-engineer your
operations in such a case. All we'll be able to see is that the servers
list has changed. But we still need to run with the old config for some
time because some requests may be pending in the backend's queue for
instance. Same for established connections. So config reloading is a
very complex feature without a CLI because you're feeding blocks of
changes at once instead of step-by-step transformations.
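
To make the rename example concrete, here is a small Python sketch of what a
naive "note the differences" pass would see. The frontend names are invented
for the example and this is not how haproxy parses anything; it only shows
that comparing two snapshots cannot tell a rename from a delete plus an add.

```python
def diff_sections(old_names, new_names):
    """Compare two lists of section names from an old and a new config.

    Returns (removed, added). A renamed section shows up once in each list,
    with nothing linking the two entries together.
    """
    removed = sorted(set(old_names) - set(new_names))
    added = sorted(set(new_names) - set(old_names))
    return removed, added


# Frontend "www" was renamed to "web"; "static" is untouched.
removed, added = diff_sections(["www", "static"], ["web", "static"])
```

Here removed is ["www"] and added is ["web"], exactly what you would get if
one frontend had been deleted and an unrelated one created.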

And anyone who's used Alteon load balancers for a while knows what I'm
talking about. Most often changes are OK. But sometimes for obscure
reasons, some changes do not seem to have any effect, or at least they
have side effects (e.g. when you insert filters and renumber the existing
ones), but finally after a reboot the problem is solved. Cisco equipment
solved the issue by applying changes in-line and doing dependency checking,
so basically the user is forced to solve the issues first.

Regar

-sf/-st & rereading configuration file

2009-12-23 Thread Paul Hirose

I was asked how to get haproxy to reload its configuration file, and not 
disturb any existing connections.  For example, if I have two servers listed, 
and I want to take one out for maintenance.

I wasn't sure about the difference between -sf and -st, but from reading 
2.4(.1), I'm guessing -sf is the better way.  It allows all existing 
connections to finish, then temporarily stops/pauses all services(?), rereads 
the configuration file, then restarts again?

What if an existing connection is a very long one (say returning a large amount 
of data from a LDAP or database query)?  It could, potentially, cause haproxy 
to stay not-listening (the logs say Pausing proxy XX) for a while.  But I'd 
rather not assume that's going to happen and do a -st and kill all existing 
connections, just so I won't necessarily get stalled waiting for one long 
connection to finish.

It appears SIGUSR1 is the clean way to gracefully shut down haproxy completely.

I guess is there a way to have it reread the file, note the differences, and then go from 
there?  For example, if the only thing I've done is comment out 
"dontlog-normal", there's really no reason  for haproxy to TTIN+USR1/TTOU 
anything.  And if the only thing I've done is add-back a previously removed server or 
remove a server about to go into maintenance, there's really no reason again for haproxy 
to necessarily completely stop listening on its ports, it just needs to start (or stop) 
sending connections to the added/removed backend server.

I also noticed that doing the -sf appears to stop all services temporarily 
until the reload is done, etc.  If I have multiple services, and I only changed 
one of them, there's no reason for haproxy to pause all the services.

PH
--
Paul Hirose  : pthir...@ucdavis.edu : Sysadm Motto: rm -fr /MyLife