Re: [PATCH 00/26] Restart of haproxy without dropping connections

2011-03-13 Thread Willy Tarreau
Hi Simon,

On Mon, Mar 14, 2011 at 07:05:22AM +0900, Simon Horman wrote:
  But if it's in the chroot, it can't reread its configuration file.
 
 True, I have a slightly dubious way to deal with that.
 After chrooting the configuration file is opened relative
 to the chroot. So you need to do something like this.
 
 edit /etc/haproxy.cfg
 mkdir -p /a/chroot/etc
 cp /etc/haproxy.cfg /a/chroot/etc
 # chmod / chown /a/chroot/etc/haproxy.cfg as needed
 haproxy -sr $(cat /var/run/haproxy.pid) # signal restart

I must say I really don't like this way of dealing with chroots. For
me, a chroot is in an empty directory on a read-only file system,
otherwise it's useless. The simple fact that the mkdir above is
possible makes it possible for the chrooted processes to use that
directory to escape the chroot.

   But I don't think that changing things to keep it outside would
   be much trouble at all.
  
  I don't think it should be an issue either, maybe a minor change
  of the starting sequence. Also, this master will probably later
  become the one able to log to real files and to do other things that
  need to be done outside of the chroot.
 
 As it stands that should be easy enough to handle using the same technique
 that is used for the pid file.  Of course adding such log directives on
 restart would need to be prohibited.  But that probably needs to happen for
 permissions/capabilities reasons anyway.  And in any case it should also
 not be difficult.

OK.

 getppid is run once just before entering the loop.
 And the workers send kill(ppid, 0) while in the loop.
 This only occurs in workers when master_worker mode is enabled.
 
 I didn't pay particular attention to performance considerations.
 If you are worried about that then perhaps the workers could
 kill less often, perhaps using a task.

Yes I think that would be better. The task might be run once a second
and will not consume more than one syscall per second whatever the
load. I'd be much more worried about the kill(ppid, 0) when dealing with
20 incoming connections per second!

 The feature was a request to allow situations where there is more
 than one process but not detached from the terminal - master_worker
 either without daemon or with -db (patch forthcoming) to play nicely
 with a monitor such as runit[1].
 
 [1] http://smarden.org/runit/

OK, I did not imagine there was a use case for doing that.

Well, if master_worker mode with -db doesn't daemonize it, that's already OK then.
   
   Understood. I'll verify that something reasonable occurs on -db.
  
  OK fine.
  
  I'll try to find some time ASAP to review your work, it will be easier
  to discuss it and you won't have to explain things to me that I should have
  figured out by myself if I had tested it :-)
 
 I don't mind explaining things. But I do look forward to a review.

You'll get it, now that I have released 1.5-dev4, it'll be easier to
stay focused on that. I'll probably try that tomorrow if time permits.

Thanks!
Willy




Re: [PATCH 00/26] Restart of haproxy without dropping connections

2011-03-13 Thread Simon Horman
On Sun, Mar 13, 2011 at 11:16:33PM +0100, Willy Tarreau wrote:
 Hi Simon,
 
 On Mon, Mar 14, 2011 at 07:05:22AM +0900, Simon Horman wrote:
   But if it's in the chroot, it can't reread its configuration file.
  
  True, I have a slightly dubious way to deal with that.
  After chrooting the configuration file is opened relative
  to the chroot. So you need to do something like this.
  
  edit /etc/haproxy.cfg
  mkdir -p /a/chroot/etc
  cp /etc/haproxy.cfg /a/chroot/etc
  # chmod / chown /a/chroot/etc/haproxy.cfg as needed
  haproxy -sr $(cat /var/run/haproxy.pid) # signal restart
 
 I must say I really don't like this way of dealing with chroots. For
 me, a chroot is in an empty directory on a read-only file system,
 otherwise it's useless. The simple fact that the mkdir above is
 possible makes it possible for the chrooted processes to use that
 directory to escape the chroot.

Our image of chroots is slightly different, though I do take your point.

Unfortunately I'm not sure of a good alternative to re-opening the file.
Keeping the fd open would be way too fragile as editors often
create a new file rather than editing in-place.

I guess we could pipe the configuration over a socket or something,
and we could also use the same socket as the error-condition detector
that you were discussing in an earlier email.

But I don't think that changing things to keep it outside would
be much trouble at all.
   
   I don't think it should be an issue either, maybe a minor change
   of the starting sequence. Also, this master will probably later
   become the one able to log to real files and to do other things that
   need to be done outside of the chroot.
  
  As it stands that should be easy enough to handle using the same technique
  that is used for the pid file.  Of course adding such log directives on
  restart would need to be prohibited.  But that probably needs to happen for
  permissions/capabilities reasons anyway.  And in any case it should also
  not be difficult.
 
 OK.
 
  getppid is run once just before entering the loop.
  And the workers send kill(ppid, 0) while in the loop.
  This only occurs in workers when master_worker mode is enabled.
  
  I didn't pay particular attention to performance considerations.
  If you are worried about that then perhaps the workers could
  kill less often, perhaps using a task.
 
 Yes I think that would be better. The task might be run once a second
 and will not consume more than one syscall per second whatever the
 load. I'd be much more worried about the kill(ppid, 0) when dealing with
 20 incoming connections per second!

Sure, I'll move it into a worker.

  The feature was a request to allow situations where there is more
  than one process but not detached from the terminal - master_worker
  either without daemon or with -db (patch forthcoming) to play nicely
  with a monitor such as runit[1].
  
  [1] http://smarden.org/runit/
 
 OK, I did not imagine there was a use case for doing that.

It wasn't immediately obvious to me either, but I think it does make sense.

 Well, if master_worker mode with -db doesn't daemonize it, that's already OK then.

Understood. I'll verify that something reasonable occurs on -db.
   
   OK fine.
   
   I'll try to find some time ASAP to review your work, it will be easier
   to discuss it and you won't have to explain things to me that I should have
   figured out by myself if I had tested it :-)
  
  I don't mind explaining things. But I do look forward to a review.
 
 You'll get it, now that I have released 1.5-dev4, it'll be easier to
 stay focused on that. I'll probably try that tomorrow if time permits.

Great.

I live in Tokyo. Thankfully I am ok. But as a result of the recent
earthquakes in Japan I am likely to experience rolling blackouts over the
next few weeks starting tomorrow. So I may be a little unresponsive from
time to time. Your understanding is appreciated.





Re: [PATCH 00/26] Restart of haproxy without dropping connections

2011-03-13 Thread Willy Tarreau
On Mon, Mar 14, 2011 at 07:26:30AM +0900, Simon Horman wrote:
 Unfortunately I'm not sure of a good alternative to re-opening the file.
 Keeping the fd open would be way too fragile as editors often
 create a new file rather than editing in-place.

Indeed, keeping the fd open would not work. Many scripts do a cp
or mv of the new file over the old one.
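The failure mode can be demonstrated with a small check (a hypothetical helper, not haproxy code): an mv/rename over the path gives it a new inode, so a long-held fd keeps referring to the old, replaced file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if the open fd still refers to the file currently at
 * path. After a rename over the path, the device/inode pair no
 * longer matches and the held fd is stale. */
static int fd_matches_path(int fd, const char *path)
{
    struct stat fs, ps;

    if (fstat(fd, &fs) != 0 || stat(path, &ps) != 0)
        return 0;
    return fs.st_dev == ps.st_dev && fs.st_ino == ps.st_ino;
}
```

Note that a cp over the old file keeps the inode but still rewrites the content underneath the fd's offset, so neither replacement style is safe for a held fd.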

 I guess we could pipe the configuration over a socket or something,
 and we could also use the same socket as the error-condition detector
 that you were discussing in an earlier email.

Using a socket would be very nice indeed, because the new process would
be able to pass the config it has on its command line. But that requires
basic socket handling from the new process before we ever start the
polling loop, and for a long time I've thought this required some code
duplication. I may be wrong though.

In the mean time, I really think that keeping the master out of the
chroot is the way to go. In my opinion it makes a lot of sense : the
master is the one which owns rights on ports, config, etc... and which
can fork slave processes deprived of their permissions. Seen that way,
there's really nothing that justifies jailing the master.
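The split Willy argues for could be sketched as below; the jail path argument, the uid/gid 65534 ("nobody" by convention), and the helper names are assumptions, not haproxy code, and the privileged steps are skipped when not running as root:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The master keeps ports, config and permissions; each worker is
 * forked, then jailed and deprived of privileges before serving. */
static pid_t spawn_worker(const char *jail, void (*run)(void))
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (geteuid() == 0) {
            /* only possible as root; the master itself stays out */
            if (chroot(jail) != 0 || chdir("/") != 0)
                _exit(1);
            /* drop group first, then user (65534 = nobody) */
            if (setgid(65534) != 0 || setuid(65534) != 0)
                _exit(1);
        }
        run();       /* worker event loop would go here */
        _exit(0);
    }
    return pid;      /* master keeps the pid for its pidfile */
}

static void worker_main(void)
{
    /* placeholder for the worker's polling loop */
}
```

The master never enters the jail, so it can always reread its config and fork fresh workers.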

 I live in Tokyo. Thankfully I am ok. But as a result of the recent
 earthquakes in Japan I am likely to experience rolling blackouts over the
 next few weeks starting tomorrow. So I may be a little unresponsive from
 time to time. Your understanding is appreciated.

OK, thanks for the info, I did not know you were there. I certainly
understand the difficulties you might face, given the terrible images
I've seen on the net! I wish you and your relatives good luck!

Best regards,
Willy




Re: [PATCH 00/26] Restart of haproxy without dropping connections

2011-03-13 Thread Simon Horman
On Mon, Mar 14, 2011 at 12:37:17AM +0100, Willy Tarreau wrote:
 On Mon, Mar 14, 2011 at 07:26:30AM +0900, Simon Horman wrote:
  Unfortunately I'm not sure of a good alternative to re-opening the file.
  Keeping the fd open would be way too fragile as editors often
  create a new file rather than editing in-place.
 
 Indeed, keeping the fd open would not work. Many scripts do a cp
 or mv of the new file over the old one.
 
  I guess we could pipe the configuration over a socket or something,
  and we could also use the same socket as the error-condition detector
  that you were discussing in an earlier email.
 
 Using a socket would be very nice indeed, because the new process would
 be able to pass the config it has on its command line. But that requires
 basic socket handling from the new process before we ever start the
 polling loop, and for a long time I've thought this required some code
 duplication. I may be wrong though.
 
 In the mean time, I really think that keeping the master out of the
 chroot is the way to go. In my opinion it makes a lot of sense : the
 master is the one which owns rights on ports, config, etc... and which
 can fork slave processes deprived of their permissions. Seen that way,
 there's really nothing that justifies jailing the master.

Yes, I think that is an approach worth investigating.
I'll see about implementing it.



Re: [PATCH 00/26] Restart of haproxy without dropping connections

2011-03-10 Thread Simon Horman
On Thu, Mar 10, 2011 at 08:46:20AM +0100, Willy Tarreau wrote:
 On Thu, Mar 10, 2011 at 03:52:58PM +0900, Simon Horman wrote:
   Wow, I'm impressed, you managed to do that really quickly !
  
  Thanks. It's taken quite a bit of effort to get this far.
 
 I have no doubt about that :-)
 
   In the mean time, I have a few questions which come to mind :
 - does the socket cache consider all of the bind parameters ?
   (eg: mss, interface, transparent, ...)
  
  Yes, it attempts to take them into account, though in a very naive way.
  If they all match then the socket cache entry can be re-used.
  Otherwise the entry is invalidated.
 
 OK.
 
 - what happens if the new config file uses some conflicting bind
   entries ? Eg: old config used to listen on 192.168.1.1:80 and
   the new one uses 0.0.0.0:80 ? Or even :80 for the old one (IPv4)
   and :::80 for the new one (IPv6) ?
  
  I think there will be a problem. Let's talk about what would be a sensible
  thing to do.
 
 In my opinion we should define how to fail a restart. Right now with
 -sf/-st, the new process is able to detect the failure to bind and
 exits with an error which can be detected by management scripts and
 reported to the admin, who must immediately take action.

I think that -sr currently behaves that way. I'll check.

 I think that a failure to restart smoothly means that the admin will
 have to take the decision to restart harder (-sf/-st). Sometimes the
 scripts will do that by themselves as an automatic fallback. So that
 makes me believe that all we want is the master process to refuse to
 change anything in case an anomaly is detected, and to indicate its
 refusal.

Right. The main thing that I was worrying about is that by the time the
master process detects an anomaly, it has probably already changed things.
So it will either need to know how to roll back, or detect problems before
making changes to itself.

That said, there are probably only a few changes to the master that
actually matter. For example, if it has loaded a configuration file
with a bogus proxy port in it, then it doesn't need to roll that back,
it just needs to tell the old workers to keep going - they still have
the old config.

So I think that the main problem that I have is knowing what the
master would need to roll back. Loggers spring to mind. I'm sure
there are a few other things.

Note, chroot, daemon and master_worker currently can't be changed on
restart, so we don't need to worry about them.

 We cannot make it send a signal to the calling process
 because it does not know its pid, and even if it knew it, it would
 probably not have enough permissions to send it anyway. But we should
 find something the master process can act on that the calling one can
 detect (eg: among the horrible ideas, if it's not chrooted, it can
 change its pid and update the pidfile, but that's ugly).

The master process already has a facility to rewrite the pid file
even if it is chrooted (it keeps the fd open) so that could work.
The pidfile currently does change on restart - all workers' pids are
updated, so polling the pid file for something new and valid, or for an
error status, could work.

As an aside, having the worker pids in the pid file isn't strictly
necessary. I only added them for consistency with the existing usage
of the pidfile. But the master could actually just write the master pid
and close the fd.

 - does the master send a signal to all children asking them to unbind
   (as we did with -sf) ?
  
  Yes. It sends them a SIGUSR1.
  
  It does not send them a SIGTTOU, because although that works fine,
  I am not entirely sure how to sanely unwind the master at that point.
 
 I think it's OK that way because the master still owns the sockets, so
 even in case of a late failure, it can restart new processes anyway.
 
 - do the debug modes (-d/-db) disable the master_worker mode ?
  
  No. I can make that so if you like.
 
 Yes that's the idea. Many users abuse -d/-db (including me) and it's
 important that they don't have to touch their config for this.

In the case of -db, master_worker mode doesn't imply background (daemon) mode,
so there shouldn't be any changes there by adding my current patches.
Do you want me to disable it anyway?

   In fact I'm interested in any corner cases we should be aware of so
   that we can clearly document them and indicate how to handle them
   (eg: fall back to -st if it's not possible to rebind, etc...)
  
  I don't know of any offhand. But one of my reasons for posting
  the code was to get more eyes on it so we can find such cases.
 
 Fine. Any failure to restart with the expected config typically is what
 I'd call a corner case. That's why I'd like us to find a way to fail
 safely so that proper action can be taken externally.
 
  Re-initialising the configuration was somewhat non-trivial, and I am
  sure that I have missed a few things.
 
 I'm certain that some parts were a real nightmare!

Re: [PATCH 00/26] Restart of haproxy without dropping connections

2011-03-10 Thread Simon Horman
On Thu, Mar 10, 2011 at 09:40:44PM +0100, Willy Tarreau wrote:
 On Thu, Mar 10, 2011 at 05:40:19PM +0900, Simon Horman wrote:
   I think that a failure to restart smoothly means that the admin will
   have to take the decision to restart harder (-sf/-st). Sometimes the
   scripts will do that by themselves as an automatic fallback. So that
   makes me believe that all we want is the master process to refuse to
   change anything in case an anomaly is detected, and to indicate its
   refusal.
  
  Right. The main thing that I was worrying about is that by the time the
  master process detects an anomaly, it has probably already changed things.
  So it will either need to know how to roll back, or detect problems before
  making changes to itself.
 
 Good point.
 
  That said, there are probably only a few changes to the master that
  actually matter. For example, if it has loaded a configuration file
  with a bogus proxy port in it, then it doesn't need to roll that back,
  it just needs to tell the old workers to keep going - they still have
  the old config.
 
 I agree. The master should only have to take care of keeping the bound
 sockets cache. It does not need to even keep a copy of the config at
 all provided it knows how the sockets were bound.

As it stands the master needs the entire configuration as it actually
starts all the proxies and then forks the workers. This is in order to
allow the master to know what sockets to bind and unbind.  But other than
that it doesn't care about most of the configuration.

  So I think that the main problem that I have is knowing what the
  master would need to roll back. Loggers spring to mind. I'm sure
  there are a few other things.
 
 Rolling back an unused config should not be too much work once we
 move all the config behind a pointer. Some little work was started
 on that a long time ago, you'll notice an unused parent member in
 the proxy struct, to attach it to a configuration instance.
 
 The idea would be that we could have all the global config and all
 proxies in the same struct behind a pointer, and instantiate a new
 one when reloading the config, then once everything's OK, we can
 switch the configs and fork new processes. The socket cache just
 has to be global. And even if we add new listening sockets to this
 cache when trying to start a new config, it's not dramatic.

Yes, I think that would work. But I wonder if we need to go that far,
as it's probably OK for the master to have an invalid configuration so long
as we are careful about the (few?) configuration parameters that
both affect the master and can be changed on restart.

The socket_cache currently is global.
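Willy's config-behind-a-pointer idea might be sketched as follows; `struct config` and `switch_config` are stand-ins, not haproxy's actual structures:

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for "all the global config and all proxies in the same
 * struct behind a pointer". */
struct config {
    int maxconn;
    /* struct proxy *proxies; ... */
};

static struct config *cur_config;

/* Build the candidate config aside; swap only if it parsed and
 * bound successfully, so a failed reload leaves the old config
 * (and the old workers) untouched. */
static int switch_config(struct config *candidate)
{
    struct config *old;

    if (candidate == NULL)
        return -1;          /* reload failed: keep everything */
    old = cur_config;
    cur_config = candidate; /* "switch the configs" */
    free(old);              /* then fork new workers from it */
    return 0;
}
```

The global socket cache stays outside this struct, so stray listeners created while trying a new config survive the rollback, which matches Willy's "not dramatic" remark.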

  Note, chroot, daemon and master_worker currently can't be changed on
  restart, so we don't need to worry about them.
 
 That makes sense ;-)
 
  The master process already has a facility to rewrite the pid file
  even if it is chrooted (it keeps the fd open) so that could work.
  The pidfile currently does change on restart - all workers' pids are
  updated, so polling the pid file for something new and valid, or for an
  error status, could work.
 
 That makes me think that we should most likely keep the master out
 of the chroot anyway, since it does not participate in network
 communications, it's not exposed.

I don't have any strong opinions on that either way.
As it is, having it in the chroot isn't a major burden.
But I don't think that changing things to keep it outside would
be much trouble at all.

  As an aside, having the worker pids in the pid file isn't strictly
  necessary.
 
 Yes it is, because if the master dies, you have no way to tell which
 children were related to the dead service and must be killed to achieve
 a sane restart.

True. But the children will detect that the master has died and exit
of their own accord.

  I only added them for consistency with the existing usage
  of the pidfile. But the master could actually just write the master pid
  and close the fd.
 (...)
   - do the debug modes (-d/-db) disable the master_worker mode ?

No. I can make that so if you like.
   
   Yes that's the idea. Many users abuse -d/-db (including me) and it's
   important that they don't have to touch their config for this.
  
  In the case of -db, master_worker mode doesn't imply background (daemon) mode,
  so there shouldn't be any changes there by adding my current patches.
  Do you want me to disable it anyway?
 
 Well, if master_worker mode with -db doesn't daemonize it, that's already
 OK then.

Understood. I'll verify that something reasonable occurs on -db.




Re: [PATCH 00/26] Restart of haproxy without dropping connections

2011-03-09 Thread Simon Horman
On Thu, Mar 10, 2011 at 01:32:42PM +0900, Simon Horman wrote:
 Hi,
 
 this patch series attempts to allow haproxy to be restarted - perhaps
 reconfigured would be a better term - without refusing or dropping
 connections.
 
 The model that I am using is that of a master process,
 which really does nothing but hold bound file descriptors,
 and workers - these should be very similar to what haproxy
 processes have looked like up to now.
 
 This feature can be enabled using the -M command line option
 or the master_worker configuration file option. It may be
 used in conjunction with daemon.
 
 In order to effect a restart a SIGUSR2 should be sent to the master
 process. The master process is guaranteed to be the first pid listed
 in the pid file. It will also be the parent process, and can thus
 be found using ps. It should also be noted in the logs.
 
 SIGUSR2 is ignored if master/worker mode is not enabled.
 SIGUSR2 is ignored by workers.

The patches are available in git
git://github.com/horms/haproxy.git master



Re: [PATCH 00/26] Restart of haproxy without dropping connections

2011-03-09 Thread Willy Tarreau
Hi Simon,

On Thu, Mar 10, 2011 at 01:32:42PM +0900, Simon Horman wrote:
 Hi,
 
 this patch series attempts to allow haproxy to be restarted - perhaps
 reconfigured would be a better term - without refusing or dropping
 connections.

Wow, I'm impressed, you managed to do that really quickly !

You sent that at the right moment, because I have almost settled down
on the internal changes needed to work on server-side keep-alive, so
I should issue -dev4 once I'm fine with those changes. Then we'll be
able to start by integrating your changes.

In the mean time, I have a few questions which come to mind :
  - does the socket cache consider all of the bind parameters ?
(eg: mss, interface, transparent, ...)

  - what happens if the new config file uses some conflicting bind
entries ? Eg: old config used to listen on 192.168.1.1:80 and
the new one uses 0.0.0.0:80 ? Or even :80 for the old one (IPv4)
and :::80 for the new one (IPv6) ?

  - does the master send a signal to all children asking them to unbind
(as we did with -sf) ?

  - do the debug modes (-d/-db) disable the master_worker mode ?

In fact I'm interested in any corner cases we should be aware of so
that we can clearly document them and indicate how to handle them
(eg: fall back to -st if it's not possible to rebind, etc...)

Thanks !
Willy




Re: [PATCH 00/26] Restart of haproxy without dropping connections

2011-03-09 Thread Simon Horman
Hi Willy,

On Thu, Mar 10, 2011 at 07:29:11AM +0100, Willy Tarreau wrote:
 Hi Simon,
 
 On Thu, Mar 10, 2011 at 01:32:42PM +0900, Simon Horman wrote:
  Hi,
  
  this patch series attempts to allow haproxy to be restarted - perhaps
  reconfigured would be a better term - without refusing or dropping
  connections.
 
 Wow, I'm impressed, you managed to do that really quickly !

Thanks. It's taken quite a bit of effort to get this far.

 You sent that at the right moment, because I have almost settled down
 on the internal changes needed to work on server-side keep-alive, so
 I should issue -dev4 once I'm fine with those changes. Then we'll be
 able to start by integrating your changes.

Great.

 In the mean time, I have a few questions which come to mind :
   - does the socket cache consider all of the bind parameters ?
 (eg: mss, interface, transparent, ...)

Yes, it attempts to take them into account, though in a very naive way.
If they all match then the socket cache entry can be re-used.
Otherwise the entry is invalidated.

Thanks for your earlier advice on this, else I would probably have
written code that didn't take these parameters into account at all.
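The naive all-or-nothing match could look like this; the struct and its fields are illustrative, not haproxy's real listener layout:

```c
#include <assert.h>
#include <string.h>

/* Illustrative subset of the bind parameters mentioned in the
 * thread (mss, interface, transparent). */
struct bind_params {
    int mss;
    int transparent;
    char iface[16];
};

/* Naive cache policy: reuse the cached fd only when every
 * parameter matches exactly; otherwise invalidate the entry and
 * bind a fresh socket. */
static int bind_params_match(const struct bind_params *a,
                             const struct bind_params *b)
{
    return a->mss == b->mss &&
           a->transparent == b->transparent &&
           strcmp(a->iface, b->iface) == 0;
}
```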

   - what happens if the new config file uses some conflicting bind
 entries ? Eg: old config used to listen on 192.168.1.1:80 and
 the new one uses 0.0.0.0:80 ? Or even :80 for the old one (IPv4)
 and :::80 for the new one (IPv6) ?

I think there will be a problem. Let's talk about what would be a sensible
thing to do.

   - does the master send a signal to all children asking them to unbind
 (as we did with -sf) ?

Yes. It sends them a SIGUSR1.

It does not send them a SIGTTOU, because although that works fine,
I am not entirely sure how to sanely unwind the master at that point.

   - do the debug modes (-d/-db) disable the master_worker mode ?

No. I can make that so if you like.

 In fact I'm interested in any corner cases we should be aware of so
 that we can clearly document them and indicate how to handle them
 (eg: fall back to -st if it's not possible to rebind, etc...)

I don't know of any offhand. But one of my reasons for posting
the code was to get more eyes on it so we can find such cases.

Re-initialising the configuration was somewhat non-trivial, and I am
sure that I have missed a few things.

 Thanks !
 Willy
 
 



Re: [PATCH 00/26] Restart of haproxy without dropping connections

2011-03-09 Thread Willy Tarreau
On Thu, Mar 10, 2011 at 03:52:58PM +0900, Simon Horman wrote:
  Wow, I'm impressed, you managed to do that really quickly !
 
 Thanks. It's taken quite a bit of effort to get this far.

I have no doubt about that :-)

  In the mean time, I have a few questions which come to mind :
- does the socket cache consider all of the bind parameters ?
  (eg: mss, interface, transparent, ...)
 
 Yes, it attempts to take them into account, though in a very naive way.
 If they all match then the socket cache entry can be re-used.
 Otherwise the entry is invalidated.

OK.

- what happens if the new config file uses some conflicting bind
  entries ? Eg: old config used to listen on 192.168.1.1:80 and
  the new one uses 0.0.0.0:80 ? Or even :80 for the old one (IPv4)
  and :::80 for the new one (IPv6) ?
 
 I think there will be a problem. Let's talk about what would be a sensible
 thing to do.

In my opinion we should define how to fail a restart. Right now with
-sf/-st, the new process is able to detect the failure to bind and
exits with an error which can be detected by management scripts and
reported to the admin, who must immediately take action.

I think that a failure to restart smoothly means that the admin will
have to take the decision to restart harder (-sf/-st). Sometimes the
scripts will do that by themselves as an automatic fallback. So that
makes me believe that all we want is the master process to refuse to
change anything in case an anomaly is detected, and to indicate its
refusal.  We cannot make it send a signal to the calling process
because it does not know its pid, and even if it knew it, it would
probably not have enough permissions to send it anyway. But we should
find something the master process can act on that the calling one can
detect (eg: among the horrible ideas, if it's not chrooted, it can
change its pid and update the pidfile, but that's ugly).

- does the master send a signal to all children asking them to unbind
  (as we did with -sf) ?
 
 Yes. It sends them a SIGUSR1.
 
 It does not send them a SIGTTOU, because although that works fine,
 I am not entirely sure how to sanely unwind the master at that point.

I think it's OK that way because the master still owns the sockets, so
even in case of a late failure, it can restart new processes anyway.

- do the debug modes (-d/-db) disable the master_worker mode ?
 
 No. I can make that so if you like.

Yes that's the idea. Many users abuse -d/-db (including me) and it's
important that they don't have to touch their config for this.

  In fact I'm interested in any corner cases we should be aware of so
  that we can clearly document them and indicate how to handle them
  (eg: fall back to -st if it's not possible to rebind, etc...)
 
 I don't know of any offhand. But one of my reasons for posting
 the code was to get more eyes on it so we can find such cases.

Fine. Any failure to restart with the expected config typically is what
I'd call a corner case. That's why I'd like us to find a way to fail
safely so that proper action can be taken externally.

 Re-initialising the configuration was somewhat non-trivial, and I am
 sure that I have missed a few things.

I'm certain that some parts were a real nightmare! I'll take time to
review all your patches because yes, it's possible that you fell into some
traps or had to take some decisions that might have other impacts. That's
also why I'd like to get your code quickly merged, it's the best way to
find what might have been overlooked.

Best regards,
Willy