On Thu, Mar 10, 2011 at 05:40:19PM +0900, Simon Horman wrote:
> > I think that a failure to restart smoothly means that the admin will
> > have to take the decision to restart harder (-sf/-st). Sometimes the
> > scripts will do that by themselves as an automatic fallback. So that
> > makes me believe that all we want is for the master process to refuse to
> > change anything in case an anomaly is detected, and to indicate its
> > refusal.
> 
> Right. The main thing that I was worrying about is that by the time the
> master process detects an anomaly, it has probably already changed things.
> So it will either need to know how to roll back, or detect problems before
> making changes to itself.

Good point.

> That said, there are probably only a few changes to the master that
> actually matter. For example, if it has loaded a configuration file
> with a bogus proxy port in it, then it doesn't need to roll that back,
> it just needs to tell the old workers to keep going - they still have
> the old config.

I agree. The master should only have to take care of keeping the bound
sockets cache. It does not even need to keep a copy of the config at
all, provided it knows how the sockets were bound.
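
To make the bound sockets cache concrete, here is a minimal sketch. All
names (struct sock_cache, sock_cache_find, sock_cache_add) are hypothetical
and not taken from the haproxy sources, and the key is assumed to be just an
address string plus a port. The master would look an address up before
binding and reuse the cached fd on a hit.

```c
#include <assert.h>
#include <string.h>

#define SOCK_CACHE_MAX 64

/* One already-bound listening socket, remembered by how it was bound. */
struct bound_sock {
    char addr[64];   /* textual address, e.g. "0.0.0.0" */
    int port;
    int fd;
};

/* Hypothetical global cache kept by the master across reloads. */
struct sock_cache {
    struct bound_sock e[SOCK_CACHE_MAX];
    int n;
};

/* Returns the cached fd for addr:port, or -1 if it was never bound. */
static int sock_cache_find(const struct sock_cache *c, const char *addr, int port)
{
    for (int i = 0; i < c->n; i++)
        if (c->e[i].port == port && strcmp(c->e[i].addr, addr) == 0)
            return c->e[i].fd;
    return -1;
}

/* Records a freshly bound fd. Returns 0 on success, -1 if the cache is
 * full, the address is too long, or addr:port is already cached. */
static int sock_cache_add(struct sock_cache *c, const char *addr, int port, int fd)
{
    assert(fd >= 0);   /* only successfully bound sockets belong here */
    if (c->n >= SOCK_CACHE_MAX || strlen(addr) >= sizeof(c->e[0].addr))
        return -1;
    if (sock_cache_find(c, addr, port) >= 0)
        return -1;
    strcpy(c->e[c->n].addr, addr);   /* length checked above */
    c->e[c->n].port = port;
    c->e[c->n].fd = fd;
    c->n++;
    return 0;
}
```

With something like this, a failed reload leaves nothing to undo on the
socket side: entries added while trying the new config simply stay in the
cache.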

> So I think that the main problem that I have is knowing what the
> master would need to roll back. Loggers spring to mind. I'm sure
> there are a few other things.

Rolling back an unused config should not be too much work once we
move all the config behind a pointer. A little work was started
on that a long time ago; you'll notice an unused "parent" member in
the proxy struct, meant to attach it to a configuration instance.

The idea would be that we could have all the global config and all
proxies in the same struct behind a pointer, and instantiate a new
one when reloading the config. Then, once everything's OK, we can
switch the configs and fork new processes. The socket cache just
has to be global. And even if we add new listening sockets to this
cache when trying to start a new config, it's not a big deal.
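
Roughly, the switch could look like the sketch below. The struct layout and
the names (struct config, cur_cfg, config_build, config_reload) are made up
for illustration, and the "parser" is reduced to a port range check. The
point is only that the running instance is never touched unless the new one
was built completely.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_PROXIES 8

/* Hypothetical config instance: global settings and proxies in one struct. */
struct config {
    int nb_proxies;
    int proxy_port[MAX_PROXIES];
};

/* The single pointer through which the running config is reached. */
static struct config *cur_cfg;

/* Stand-in for the parser: builds a new instance from a list of ports.
 * Returns NULL on allocation failure or if any port is out of range. */
static struct config *config_build(const int *ports, int n)
{
    struct config *c;
    int i;

    if (n < 0 || n > MAX_PROXIES)
        return NULL;
    c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    for (i = 0; i < n; i++) {
        if (ports[i] < 1 || ports[i] > 65535) {
            free(c);   /* bogus port: drop the unused instance */
            return NULL;
        }
        c->proxy_port[i] = ports[i];
    }
    c->nb_proxies = n;
    return c;
}

/* Returns 0 and switches configs on success. Returns -1 and leaves
 * cur_cfg untouched on failure, so old workers can keep going. */
static int config_reload(const int *ports, int n)
{
    struct config *new_cfg = config_build(ports, n);
    struct config *old;

    if (!new_cfg)
        return -1;
    old = cur_cfg;
    assert(new_cfg != old);   /* a fresh instance never aliases the running one */
    cur_cfg = new_cfg;        /* the switch itself is one pointer assignment */
    free(old);
    return 0;
}
```

Forking the new processes would come right after a successful
config_reload(); on -1 the master only has to report its refusal.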

> Note, chroot, daemon and master_worker currently can't be changed on
> restart, so we don't need to worry about them.

That makes sense ;-)

> The master process already has a facility to rewrite the pid file
> even if it is chrooted (it keeps the fd open) so that could work.
> The pidfile currently does change on restart - all workers' pids are
> updated, so polling the pid file for something new and valid or an error
> status could work.

That makes me think that we should most likely keep the master out
of the chroot anyway, since it does not take part in network
communication and so is not exposed.

> As an aside, having the worker pids in the pid file isn't strictly
> necessary.

Yes it is, because if the master dies, you have no way to tell which
children were related to the dead service and must be killed to achieve
a sane restart.
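
As an illustration of why the full list matters, here is a small sketch of
what a restart script or a new master could do with the file contents. It
assumes one decimal pid per whitespace-separated token, and pidfile_parse
is a hypothetical helper, not an existing function. Every pid it returns is
a candidate to signal when the master is gone.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parses the text of a pid file into pids[]. Returns the number of pids
 * read, or -1 if a token is not a positive number or more than max pids
 * are present. */
static int pidfile_parse(const char *text, long *pids, int max)
{
    int n = 0;

    assert(text && pids);
    while (*text) {
        char *end;
        long v;

        /* skip separators between pids */
        while (*text == ' ' || *text == '\t' || *text == '\r' || *text == '\n')
            text++;
        if (!*text)
            break;
        errno = 0;
        v = strtol(text, &end, 10);
        if (end == text || errno || v <= 0)
            return -1;   /* garbage in the file: refuse to guess */
        if (n >= max)
            return -1;
        pids[n++] = v;
        text = end;
    }
    return n;
}
```

If only the master pid were written, the same parse would return a single
entry and the orphaned workers could not be found this way.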

> I only added them for consistency with the existing usage
> of the pidfile. But the master could actually just write the master pid
> and close the fd.
(...)
> > > >   - do the debug modes (-d/-db) disable the master_worker mode ?
> > > 
> > > No. I can make that so if you like.
> > 
> > Yes that's the idea. Many users abuse -d/-db (including me) and it's
> > important that they don't have to touch their config for this.
> 
> In the case of -db, master_worker mode doesn't imply background (daemon) mode,
> so there shouldn't be any changes there by adding my current patches.
> Do you want me to disable it anyway?

Well, if master_worker mode with -db doesn't daemonize it, that's already
OK then.

Cheers,
Willy

