On Thu, May 24, 2018 at 11:00:29PM +0200, William Dauchy wrote:
> On Thu, May 24, 2018 at 12:01:38PM +0200, William Lallemand wrote:
> > I managed to reproduce something similar with the 1.8.8 version. It looks
> > like leaving a socat connected to the socket helps.
> >
> > I'm looking into the code to see what's happening.
> 
> Indeed, after some more hours, I got the same issue on v1.8.8. However, it
> seems to be easier to reproduce in v1.8.9, but I might be wrong.
> So now I bet on either a thread issue or bind with reuseport.
> I'll try to do some more tests.
> 
> Best,

Hi,

I don't think I reproduced the same problem, so I have a few questions for
you :-)

Are the problematic workers leaving when you reload a second time?

Did you try to kill -USR1 the worker? It should exit with "Former worker $PID
exited with code 0" on stderr.

If not, could you check the Sig* lines in /proc/$PID/status for this worker?
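
In case it helps to read those masks, here is a rough sketch of how they can be decoded. It is not taken from haproxy; the helper names are made up for illustration, and it assumes the usual Linux layout of /proc/$PID/status, where each Sig* line is a hex bitmask and signal N corresponds to bit N-1 (SIGUSR1 is 10 on x86 Linux).

```python
import signal
import sys

# Sig* fields found in /proc/<pid>/status on Linux.
SIG_FIELDS = ("SigPnd", "ShdPnd", "SigBlk", "SigIgn", "SigCgt")


def parse_sig_masks(status_text):
    """Return {field: integer mask} for the Sig* lines of a status dump."""
    masks = {}
    for line in status_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name in SIG_FIELDS:
            masks[name] = int(value.strip(), 16)
    return masks


def signal_state(status_text, signum=signal.SIGUSR1):
    """Return {field: True/False} telling whether signum is set in each mask."""
    bit = 1 << (int(signum) - 1)  # signal N is stored in bit N-1
    masks = parse_sig_masks(status_text)
    return {name: bool(mask & bit) for name, mask in masks.items()}


# Hypothetical command-line use: python3 sigcheck.py <worker pid>
if __name__ == "__main__" and len(sys.argv) == 2:
    with open("/proc/%s/status" % sys.argv[1]) as status_file:
        for field, is_set in signal_state(status_file.read()).items():
            print("%s: SIGUSR1 %s" % (field, "set" if is_set else "not set"))
```

If SigBlk shows SIGUSR1 as set, the worker has it blocked; if SigCgt shows it as not set, the worker has no handler installed for it.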

Do you know how long haproxy takes to load its configuration, and do you
think you tried a reload before it finished parsing and loading the config?
Type=notify in your systemd unit file should help in this case. If I remember
correctly, it checks that the service is 'ready' before trying to reload.

I suspect the SIGUSR1 signal is not received by the worker, but I'm not sure
whether the master didn't send it or the worker blocked it.

Thanks!

-- 
William Lallemand
