Hi Willy,

On Fri, Aug 1, 2014 at 10:49 AM, Willy Tarreau <w...@1wt.eu> wrote:
> Hi Stefan,
>
> On Thu, Jul 24, 2014 at 03:32:30PM +0200, Stefan Majer wrote:
> > Hi Willy,
> >
> > coming back to this old thread.
> > We still have the problem that, from time to time, after doing a
> >
> > # service haproxy reload
> >
> > which actually does
> >
> > # /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid -sf <pid>
> >
> > the old process persists and we end up having more than one haproxy process.
> > The old process will never end (even after days) until we forcefully kill it.
> >
> > We issue a haproxy reload every time a configuration change happens, which
> > can be quite often, say 50 - 100 times a day.
> >
> > To nail this problem down, we are finally able to reproduce this behavior easily!
> >
> > We do the following on a recent Ubuntu, CentOS, RHEL, whatever. We
> > installed haproxy-1.5.2 and 1.4.25, same effect.
> > We reload haproxy in parallel by executing:
> >
> > # service haproxy reload & service haproxy reload &
> >
> > Repeat this a few times (5-10 times) and you will see:
> >
> > # ps -ef | grep haproxy
> > haproxy 3855 1 0 12:34 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 3797
> > haproxy 3950 1 0 12:35 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 3932
> >
> > I know it is not recommended to reload an already reloading process, but I
> > want to make clear that this is a potential source of confusion.
> > I don't know if it is possible to check whether a reload is already in
> > progress and return silently?
>
> But you realize that this is completely expected? You're asking multiple
> processes in parallel to signal the same old one that it must be leaving
> and then to all start in parallel.

Of course, we are aware that we need to prevent parallel reloads by serializing configuration generation and the reload of the haproxy process.
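Something along these lines in the reload wrapper should be enough to serialize them; just a minimal sketch, assuming flock(1) from util-linux is available (the lock file path is only an example):

  (
    # take an exclusive lock on fd 200; concurrent callers block here
    flock -x 200
    # regenerate /etc/haproxy/haproxy.cfg here if needed, then reload
    service haproxy reload
  ) 200>/var/lock/haproxy-reload.lock

That way only one reload runs at a time and the others simply wait for the lock instead of racing each other.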
> There would be a solution to avoid this, it consists in disabling the
> SO_REUSEPORT option on the listening sockets, so that only one process
> gets the listening ports and the other ones fail and leave. The problem
> is that it would make the reloads more noticeable because you'd get a
> short period of time with no port bound.
>
> We could also think about grabbing a lock on the pid file, but that would
> make life harder for people working with minimal environments where locks
> are not implemented. Also it would require keeping the lock on the file
> for all the process' life, which is not really nice either. Additionally,
> not everyone uses pidfiles anyway...

One solution might be to have one master process which handles all the configuration parsing and child management, and the children are only restarted once the master tells them to. This is the way nginx works. What do you think?

> How large is your configuration? With small configs, haproxy can start in
> a few milliseconds. Here on my laptop, a small 20 lines config takes 2 ms
> to start, and a huge one (300000 backends) takes 3 seconds, so that's 10
> microseconds per backend. I really doubt that even an excited user could
> manage to cause conflicts during a startup, especially when you restart
> it 100 times a day at most :-/

Our configuration is not that big, we currently have ~200 frontends and ~1000 backends configured.
But as I understand it, the old process does not die before it has handed over all its sockets to the new daemon, and this may take some time if some of those sockets are still serving long-running sessions. So the overall reload may take a few seconds or possibly minutes.

> Regards,
> Willy

Since we know how to handle this situation, we have a very stable and performant load balancer again, thanks for that!

Greetings
Stefan

--
Stefan Majer