Re: [Dovecot] Server power loss and Dovecot is already running with PID xxx

2008-08-04 Thread Pekka Savola

On Mon, 4 Aug 2008, Timo Sirainen wrote:
>> It doesn't seem that the current logic is working; there is no
>> program with the PID that's in master.pid, and dovecot (1.0.7 + RHEL
>> patches) refuses to start.
>>
>> root: /root$ /sbin/service dovecot start
>> Starting Dovecot Imap: Error: Dovecot is already running with PID 2746
>> (read from /var/run/dovecot/master.pid)
>> Fatal: Invalid configuration in /etc/dovecot.conf
>>   [FAILED]
>> root: /root$ more /var/run/dovecot/master.pid
>> 2746
>> root: /root$ ps auxw | grep 2746
>> root 31714  0.0  0.1   4116   584 pts/1    R+   20:19   0:00 grep 2746
>
> SELinux perhaps? It checks this by kill()ing the process and seeing if it
> returns ESRCH. If not, it assumes the process exists. If you have SELinux,
> perhaps it always returns EPERM to the call..


'getenforce' says disabled, so no.  This is pretty strange -- I looked
at the code, basically duplicated the logic there, and could not
reproduce the problem with a smaller piece of code.  It doesn't
always appear, either -- I killed dovecot with the KILL signal
(leaving the PID file behind), and after that it started up without
problems.  Unless you have other ideas about what to look for, I guess
this will remain a mystery..
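
For reference, the check described above (send signal 0, treat only ESRCH as "not running") can be sketched roughly as follows. This is a minimal Python sketch under my own assumptions, not Dovecot's actual C code; the name pid_exists is made up for illustration.

```python
import errno
import os


def pid_exists(pid: int) -> bool:
    """Return True unless kill(pid, 0) fails with ESRCH.

    Hypothetical helper that mirrors the logic described in the thread;
    it is not taken from the Dovecot source.
    """
    if pid <= 0:
        # kill() gives pid <= 0 a special meaning (process groups), so
        # refuse it rather than signal something unintended.
        raise ValueError("pid must be positive")
    try:
        os.kill(pid, 0)  # signal 0: permission/existence check only
    except OSError as exc:
        # Only ESRCH ("no such process") counts as "not running".
        # Any other error, EPERM included, is treated as "exists".
        return exc.errno != errno.ESRCH
    return True
```

With this logic, a stale PID in master.pid blocks startup whenever any process at all has that number, or whenever kill() fails for a reason other than ESRCH.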


--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings


Re: [Dovecot] Server power loss and Dovecot is already running with PID xxx

2008-08-04 Thread Dan Horák

Pekka Savola wrote on Mon, 04 Aug 2008 at 12:40 +0300:
> On Mon, 4 Aug 2008, Timo Sirainen wrote:
> >> It doesn't seem that the current logic is working; there is no
> >> program with the PID that's in master.pid, and dovecot (1.0.7 + RHEL
> >> patches) refuses to start.
> >>
> >> root: /root$ /sbin/service dovecot start
> >> Starting Dovecot Imap: Error: Dovecot is already running with PID 2746
> >> (read from /var/run/dovecot/master.pid)
> >> Fatal: Invalid configuration in /etc/dovecot.conf
> >>   [FAILED]
> >> root: /root$ more /var/run/dovecot/master.pid
> >> 2746
> >> root: /root$ ps auxw | grep 2746
> >> root 31714  0.0  0.1   4116   584 pts/1    R+   20:19   0:00 grep 2746
> >
> > SELinux perhaps? It checks this by kill()ing the process and seeing if it
> > returns ESRCH. If not, it assumes the process exists. If you have SELinux,
> > perhaps it always returns EPERM to the call..
>
> 'getenforce' says disabled, so no.  This is pretty strange -- I looked
> at the code and basically duplicated the logic there and could not
> reproduce this problem with a smaller piece of code.  And it doesn't
> seem to appear always in any case -- I killed dovecot with KILL signal
> (leaving the PID file behind), and after that it started up without
> problems.  Unless you have other ideas what to look for, I guess this
> will remain a mystery..


The init script installed for dovecot in RHEL is not so perfect; try
using the one from Fedora
(http://cvs.fedoraproject.org/viewcvs/rpms/dovecot/devel/dovecot.init?rev=1.6view=auto).
A new init script will be added in RHEL 5.3.


Dan

-- 
Fedora and Red Hat package maintainer



Re: [Dovecot] Server power loss and Dovecot is already running with PID xxx

2008-08-03 Thread Timo Sirainen

On Aug 2, 2008, at 8:22 PM, Pekka Savola wrote:

> It doesn't seem that the current logic is working; there is no
> program with the PID that's in master.pid, and dovecot (1.0.7 + RHEL
> patches) refuses to start.
>
> root: /root$ /sbin/service dovecot start
> Starting Dovecot Imap: Error: Dovecot is already running with PID
> 2746 (read from /var/run/dovecot/master.pid)
> Fatal: Invalid configuration in /etc/dovecot.conf
>   [FAILED]
> root: /root$ more /var/run/dovecot/master.pid
> 2746
> root: /root$ ps auxw | grep 2746
> root 31714  0.0  0.1   4116   584 pts/1    R+   20:19   0:00 grep 2746

SELinux perhaps? It checks this by kill()ing the process and seeing if
it returns ESRCH. If not, it assumes the process exists. If you have
SELinux, perhaps it always returns EPERM to the call..






Re: [Dovecot] Server power loss and Dovecot is already running with PID xxx

2008-08-02 Thread Pekka Savola

On Tue, 1 Jul 2008, Timo Sirainen wrote:
>> $ /sbin/service dovecot start
>> Starting Dovecot Imap: Error: Dovecot is already running with PID 10825
>> (read from /var/run/dovecot/master.pid)
>> Fatal: Invalid configuration in /etc/dovecot.conf
>> [FAILED]
>> (Note: there is nothing wrong in the configuration file so the error
>> message is somewhat misleading.)
>
> Yes, it's a bit misleading. But I don't think I'll bother fixing it
> before rewriting the master/config handling for v2.0.
>
>> Is this already a known problem?
>> Should the start-up logic be made more robust (e.g. check whether a
>> process corresponding to the PID actually exists)?
>
> It already checks if the PID exists, but it doesn't check what that
> process is (and I don't think there is a portable way to do it anyway).
> I don't think it's too much to ask to delete the master.pid if in rare
> situations it fails to start due to a PID conflict.


Getting back to this after another power loss.

It doesn't seem that the current logic is working; there is no
program with the PID that's in master.pid, and dovecot (1.0.7 + RHEL
patches) refuses to start.


root: /root$ /sbin/service dovecot start
Starting Dovecot Imap: Error: Dovecot is already running with PID 2746
(read from /var/run/dovecot/master.pid)
Fatal: Invalid configuration in /etc/dovecot.conf
   [FAILED]
root: /root$ more /var/run/dovecot/master.pid
2746
root: /root$ ps auxw | grep 2746
root 31714  0.0  0.1   4116   584 pts/1    R+   20:19   0:00 grep 2746


--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings


[Dovecot] Server power loss and Dovecot is already running with PID xxx

2008-07-01 Thread Pekka Savola

Hi,

I'm running Dovecot 1.0.7 (with various patches) on CentOS 5.2.

The server has suffered a couple of power loss events.  Dovecot is run 
as a standalone server.


The problem is that dovecot refuses to start up at boot because the 
PID file from before the power loss is left behind. The message is as 
follows:


$ /sbin/service dovecot start
Starting Dovecot Imap: Error: Dovecot is already running with PID 10825
(read from /var/run/dovecot/master.pid)
Fatal: Invalid configuration in /etc/dovecot.conf
   [FAILED]
(Note: there is nothing wrong in the configuration file so the error 
message is somewhat misleading.)


I looked at the release notes of 1.0.xx releases and they didn't 
mention this.


Is this already a known problem?
Should the start-up logic be made more robust (e.g. check whether a 
process corresponding to the PID actually exists)?


--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings


Re: [Dovecot] Server power loss and Dovecot is already running with PID xxx

2008-07-01 Thread Timo Sirainen

On Tue, 2008-07-01 at 00:14 +0300, Pekka Savola wrote:
> $ /sbin/service dovecot start
> Starting Dovecot Imap: Error: Dovecot is already running with PID 10825
> (read from /var/run/dovecot/master.pid)
> Fatal: Invalid configuration in /etc/dovecot.conf
> [FAILED]
> (Note: there is nothing wrong in the configuration file so the error
> message is somewhat misleading.)

Yes, it's a bit misleading. But I don't think I'll bother fixing it
before rewriting the master/config handling for v2.0.

> Is this already a known problem?
> Should the start-up logic be made more robust (e.g. check whether a
> process corresponding to the PID actually exists)?

It already checks if the PID exists, but it doesn't check what that
process is (and I don't think there is a portable way to do it anyway).
I don't think it's too much to ask to delete the master.pid if in rare
situations it fails to start due to a PID conflict.
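
To illustrate the portability point: checking what the process is can be done on Linux by reading /proc/<pid>/cmdline, but that relies on procfs and won't work as-is elsewhere. A minimal Python sketch, not something Dovecot does; the function name and the proc_root parameter (there only so the check can be exercised against a fake directory) are assumptions of mine.

```python
import os


def pid_runs_program(pid: int, program: str, proc_root: str = "/proc") -> bool:
    """Linux-only: True if argv[0] of `pid` has the basename `program`.

    Hypothetical helper; relies on procfs, so it is not portable.
    """
    cmdline_path = os.path.join(proc_root, str(pid), "cmdline")
    try:
        with open(cmdline_path, "rb") as f:
            # cmdline holds the argument vector separated by NUL bytes.
            argv = f.read().split(b"\0")
    except OSError:
        # No such process, no permission, or no procfs at all.
        return False
    argv0 = argv[0].decode(errors="replace")
    return os.path.basename(argv0) == program
```

A start-up check could then treat master.pid as stale when the PID exists but pid_runs_program(pid, "dovecot") is false. Note that a process can rewrite its own argv, so this is a heuristic at best.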




Re: [Dovecot] Server power loss and Dovecot is already running with PID xxx

2008-07-01 Thread Sean Kamath


On Jul 1, 2008, at 12:51 AM, Timo Sirainen wrote:

>> Is this already a known problem?
>> Should the start-up logic be made more robust (e.g. check whether a
>> process corresponding to the PID actually exists)?
>
> It already checks if the PID exists, but it doesn't check what that
> process is (and I don't think there is a portable way to do it anyway).
> I don't think it's too much to ask to delete the master.pid if in rare
> situations it fails to start due to a PID conflict.


This is a pet peeve of mine for many services started at boot time.
Since the ordering of service startup is usually fairly static, a
*LOT* of the time process IDs are nearly identical from one boot to the
next.  Depending on which way they drift, the PID recorded in the stale
file may already be in use by another process.  This drove me NUTS with
Sun's LDAP server.


Many recent OSes are now using memory-based filesystems for /var/run,  
or otherwise clear out /var/run at boot time.  But if a process stores  
its PID somewhere else, you're SOL (much like Sun One Directory Server  
does).


The problem with having to remove a master.pid file on boot is that  
you might have a BUNCH of clients or customers that are using your  
system, and you're either asleep at 3am when the server kicked over,  
or in another state.  It's not a problem if you have staff watching  
machines reboot. ;-)


Sorry, had to kibitz.

Sean

PS I often add a 'rm $PID' line in the init.d script, and let a
server die because it couldn't bind to the port.  That doesn't work
with everything, though.
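
A more cautious variant of the 'rm $PID' idea is to remove the file only when nothing currently has that PID. A minimal sketch in Python, assuming the PID file holds a single decimal PID (as master.pid does in the output quoted in this thread); remove_stale_pidfile is a hypothetical helper, not part of Dovecot or of any init script.

```python
import errno
import os


def remove_stale_pidfile(path: str) -> bool:
    """Delete `path` if the PID it names is not running.

    Returns True if the file was removed, False if it was left alone
    (file missing, or some process currently has that PID).
    """
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except FileNotFoundError:
        return False  # nothing to clean up
    except ValueError:
        pid = None  # unparsable contents: treat the file as stale

    # A non-positive PID is never valid here, so it is also treated as stale.
    if pid is not None and pid > 0:
        try:
            os.kill(pid, 0)  # signal 0: existence check only
        except OSError as exc:
            if exc.errno != errno.ESRCH:
                return False  # e.g. EPERM: a process exists, keep the file
        else:
            return False  # a process with this PID exists, keep the file

    os.unlink(path)
    return True
```

This does not help when an unrelated process has picked up the old PID after a reboot, which is the case described above; there the unconditional rm, or clearing /var/run at boot, is still what gets the service started.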