On Tue, Mar 05, 2002 at 05:03:16PM -0500, Jeff Trawick wrote:
> > >   axe the entry on graceful restart problems with worker
> > >   
> > >   I was too stupid to read the code to determine that the accept mutex
> > >   failure log messages were harmless and not indicative of a real problem.
> > >   
> > >   I'll try to understand the conditions where I'm seeing connections
> > >   dropped.
> > 
> > I suspect these are one and the same problem. 
> 
> What do you mean by "these"?

I was referring to the dropped connections and the errors on mutex acquire.

> >                              When we get a failure
> > while trying to acquire a mutex it probably means that the mutex was
> > already destroyed. Is it possible that we also destroyed the fdqueue
> > while there were connections waiting to be picked up by a worker?
> 
> worker threads exit as soon as workers_may_exit is set...  I don't see
> any logic to make sure we don't lose any accepted connections (stuff
> in the queue)
> 
> so yes, it looks normal to destroy worker_queue without looking to see
> if any accepted connections are in the queue
> 
> Once the listener thread realizes that we're terminating and it will
> no longer call accept, it needs some way to trigger an error on the
> queue so that once the last connection is dequeued by a worker thread
> subsequent pop operations fail in a way that worker threads know they
> should exit.  And instead of exiting as soon as workers_may_exit is
> set, worker threads should exit once they get the magic queue-is-dead
> error from a pop operation.

I would agree that this is a limitation of the current fdqueue logic
and that it is a bug. I had some code that did something similar to
what you were talking about, but it was written for a different worker
implementation variant. If I get some time I'll try to take a look.
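
For illustration, here is a minimal sketch of the queue behaviour
proposed above, written against plain pthreads. It is not the existing
fdqueue code: every name in it (sketch_queue, sketch_queue_term,
SKETCH_EOF, the fixed capacity) is hypothetical, and it assumes a
single listener that stops pushing once it has called the terminate
function. The point is only that pop keeps returning queued
connections after termination and reports end-of-queue once the queue
is empty.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical names throughout; not the real fdqueue API. */
#define SKETCH_QUEUE_CAP 16
#define SKETCH_OK   0
#define SKETCH_EOF  1   /* queue was terminated and is now drained */
#define SKETCH_FULL 2   /* no room for another connection */

typedef struct {
    int fds[SKETCH_QUEUE_CAP];   /* ring buffer of accepted descriptors */
    int head;                    /* index of the oldest entry */
    int count;                   /* number of queued entries */
    int terminated;              /* set once the listener stops accepting */
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
} sketch_queue;

void sketch_queue_init(sketch_queue *q)
{
    q->head = 0;
    q->count = 0;
    q->terminated = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Called by the listener for each accepted connection. */
int sketch_queue_push(sketch_queue *q, int fd)
{
    int rv = SKETCH_OK;

    pthread_mutex_lock(&q->lock);
    assert(!q->terminated);      /* assumption: no push after terminate */
    if (q->count == SKETCH_QUEUE_CAP) {
        rv = SKETCH_FULL;
    }
    else {
        q->fds[(q->head + q->count) % SKETCH_QUEUE_CAP] = fd;
        q->count++;
        pthread_cond_signal(&q->not_empty);
    }
    pthread_mutex_unlock(&q->lock);
    return rv;
}

/* Called by the listener once it will no longer call accept. */
void sketch_queue_term(sketch_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->terminated = 1;
    /* Wake every worker blocked in pop so it can notice termination. */
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Called by workers. Queued connections are still handed out after
 * termination; SKETCH_EOF is returned only when nothing is left. */
int sketch_queue_pop(sketch_queue *q, int *fd)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->terminated) {
        pthread_cond_wait(&q->not_empty, &q->lock);
    }
    if (q->count == 0) {
        pthread_mutex_unlock(&q->lock);
        return SKETCH_EOF;
    }
    *fd = q->fds[q->head];
    q->head = (q->head + 1) % SKETCH_QUEUE_CAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return SKETCH_OK;
}
```

With something along these lines, a worker would loop on the pop call
and exit when it sees SKETCH_EOF, rather than exiting as soon as
workers_may_exit is set, and the queue would only be destroyed after
all workers have been joined.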

-aaron
