On Fri, Jul 13, 2007 at 07:44:08PM -0700, Paul B. Henson wrote:
> On Wed, 11 Jul 2007, Robert Felber wrote:
>
> > This could happen if all policyd-weight processes are hogged up. Should
> > be logged with "MAX_PROX NN reached". How many policyd-weight childs do
> > you have at such moments?
>
> There are no instances of that message in my logs. I currently have the
> maximum number of processes set to 100.
>
>
> > Alternatively, what is your kernel setting for somaxconn? If it is 128
> > then you should increase it to 1024 or some higher value (this is a
> > general recommendation for any server). This isn't being logged by
> > policyd-weight, as this cannot be detected by polw.
>
> somaxconn is currently the default, which I believe is 128.

You should really increase this. I will update the setup howto as well.
The default of 128 has caused many problems in the past.

> I currently have the maximum number of postfix smtp processes set to 300,
> so the theory here is that all 100 policyd-weight processes are busy, 128
> postfix processes are attempting to connect and sitting in the listen
> queue, and then the 129th+ processes get connection timed out?

Yes, because the policyd-weight children are all in an "accept" state. If
the kernel doesn't provide a socket descriptor due to somaxconn issues,
policyd-weight returns to accept() on its listen socket. At some point
postfix will time out.

> But that
> doesn't make sense, because shouldn't policyd-weight log a notification
> when it tried to start the 101st process which would have exceeded the
> maximum?

Yes. How many policyd-weight instances are up at this time?

> The only way the queue backlog should exceed 128 is if that many
> connections are made without policyd-weight doing an accept?

Or not being able to do a sane accept().

-- 
Robert Felber (PGP: 896CF30B)
Munich, Germany

____________________________________________________________
Policyd-weight Mailinglist - http://www.policyd-weight.org/
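
For anyone who wants to check the numbers from this thread, here is a minimal sketch in Python. It is not policyd-weight code. It assumes a Linux host, where the limit is readable from /proc/sys/net/core/somaxconn and the kernel silently caps the backlog passed to listen() at that value. The function names and the requested backlog of 1024 are made up for illustration, and the queue model is deliberately simplified (busy workers plus one full listen queue).

```python
from pathlib import Path
from typing import Optional

# Assumption: Linux. Other systems expose this limit differently.
SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def read_somaxconn(path: Path = SOMAXCONN_PATH) -> Optional[int]:
    """Return the kernel's somaxconn value, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def effective_backlog(requested: int, somaxconn: int) -> int:
    """Backlog the kernel actually grants: the request capped at somaxconn."""
    return min(requested, somaxconn)


def unqueued_clients(clients: int, workers: int, backlog: int) -> int:
    """Clients neither served by a worker nor waiting in the listen queue.

    Simplified model: every worker is busy and the queue holds `backlog`
    pending connections; everything beyond that is left to time out.
    """
    return max(0, clients - workers - backlog)


if __name__ == "__main__":
    print("somaxconn on this host:", read_somaxconn())
    # Numbers from this thread: 300 postfix processes, 100 policyd-weight
    # children. The requested backlog of 1024 is a hypothetical value.
    for limit in (128, 1024):
        backlog = effective_backlog(1024, limit)
        left_over = unqueued_clients(300, 100, backlog)
        print(f"somaxconn={limit}: backlog={backlog}, unqueued={left_over}")
```

Under this model, a somaxconn of 128 leaves 72 of the 300 postfix processes with nowhere to wait once all 100 children are busy, while a somaxconn of 1024 leaves none.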