Re: postfix starting bug
On Tuesday, 21 March 2006 12:47, Torsten Homeyer wrote:
> Ralf S. Engelschall wrote:
>> On Tue, Mar 21, 2006, Bernhard Reiter wrote:
>>> [...]
>>> Thus I am suggesting the following patch (only minimally tested).
>>> [...]
>>> +{ ! ps -elf|grep postfix/m[a]ster >/dev/null 2>&1 && \
>>> +  /kolab/libexec/postfix/master -t >/dev/null 2>&1 ; } || \
>>> +  postfix_active=yes
>>> [...]
>>
>> ps -elf is not portable enough. Better would be something like a
>> kill -0 `cat master.pid`.

Ralf,

Talking about shell portability, you probably know a lot more than I do.
I have looked up ps in POSIX, and -elf are POSIX options.

BTW: Do you have a reference where such shell portability issues are
written down?

> And there may be more than one OpenPKG instance with a running Postfix
> on a given system.

Yes! It is a lot better to look precisely for the one postfix process we
want; Ralf's suggestion does this. Thanks!

______________________________________________________________________
The OpenPKG Project                                    www.openpkg.org
User Communication List                      openpkg-users@openpkg.org
Re: postfix starting bug
On Wed, Mar 22, 2006, Bernhard Reiter wrote:
> [...]
> Talking about shell portability, you probably know a lot more than I do.
> I have looked up ps in POSIX, and -elf are POSIX options.

<grin> Yes, I know. They can even be _used_ without errors on a BSD
system. But unfortunately (for historical reasons) they do something
different on a BSD system. Hence they cannot really be used in a
portable way, at least not if one wants to achieve the same results.
The only (ugly) workaround, if one really wants to look at the process
list in a package, would be to use <prefix>/lib/openpkg/rpmtool signal.
There I've added a small abstraction for most of the platforms. But it
is best to avoid looking at the process list completely :-(

> BTW: Do you have a reference where such shell portability issues are
> written down?

No, I have no good reference at hand. What I know is partly in my head
and partly coded down in my GNU Portable Shell Tool (shtool). Whenever
I need a portable-enough shell construct, I look into my own code of
GNU shtool: everything I used there is portable enough, as GNU shtool
has been run for many years on nearly all Unix platforms without
problems. Some of the constructs in GNU shtool are horrible and totally
inelegant, but they work on all known Unix platforms...

>> And there may be more than one OpenPKG instance with a running Postfix
>> on a given system.
>
> Yes! It is a lot better to look precisely for the one postfix process we
> want; Ralf's suggestion does this.

Nevertheless, we have to live with the possibility that our kill -0
test still yields an incorrect result, because a different process
could now own the PID. The chance is acceptably small under regular
circumstances (not too much time between the run-time of the original
process and the kill -0 test). But it is the best we can currently do
in a portable way...

Ralf S. Engelschall
www.engelschall.com
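The kill -0 test on a PID file discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the OpenPKG or Postfix code: the function name pidfile_alive is hypothetical, and the PID file path is passed in by the caller. The PID-reuse caveat mentioned above applies unchanged, and kill -0 also fails when the process exists but belongs to another user.

```shell
#!/bin/sh
# pidfile_alive FILE (hypothetical helper, not part of OpenPKG or Postfix)
# Returns 0 if FILE holds a PID that can currently be signalled, 1 otherwise.
# Plain POSIX sh has no local variables, so pid and rest are globals.
pidfile_alive() {
    [ -r "$1" ] || return 1            # no readable PID file: not running
    pid=
    # read splits on whitespace, so padding around the number is dropped;
    # "|| :" tolerates a file without a trailing newline.
    read -r pid rest < "$1" || :
    case $pid in
        ''|0|*[!0-9]*) return 1 ;;     # empty, zero or non-numeric content
    esac
    # Signal 0 delivers nothing; it only asks whether we may signal the PID.
    kill -0 "$pid" 2>/dev/null
}
```

A caller would pass the PID file of the one instance it cares about, for example `pidfile_alive /kolab/var/postfix/pid/master.pid && postfix_active=yes`, where the exact path is an assumption based on the %{l_prefix}/var/postfix/pid/master.pid location mentioned in this thread.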
Re: postfix starting bug
Torsten Homeyer writes:
> Ralf S. Engelschall wrote:
>> On Tue, Mar 21, 2006, Bernhard Reiter wrote:
>>> [...]
>>> Thus I am suggesting the following patch (only minimally tested).
>>> [...]
>>> +{ ! ps -elf|grep postfix/m[a]ster >/dev/null 2>&1 && \
>>> +  /kolab/libexec/postfix/master -t >/dev/null 2>&1 ; } || \
>>> +  postfix_active=yes
>>> [...]
>>
>> ps -elf is not portable enough. Better would be something like a
>> kill -0 `cat master.pid`.
>
> And there may be more than one OpenPKG instance with a running Postfix
> on a given system.
>
> Just my two cents

Or someone may be using the native postfix and not want to start or use
the OpenPKG version[1]. If you are going to test for the postfix process
in the process list, please check that it is the one you expect (the
OpenPKG postfix master), not just any postfix master process, before you
decide to kill it.

Simon

[1] I use my own Postfix RPMs on RHEL with OpenPKG
    amavis/clamav/spamassassin, as they are updated more frequently than
    the vendor's own packages.
Re: postfix starting bug (was: Not getting error msgs.)
On Monday, 20 March 2006 18:41, Bill Campbell wrote:
> On Mon, Mar 20, 2006, Bernhard Reiter wrote:
>> On Monday, 20 March 2006 13:32, Bernhard Reiter wrote:
>>> Using OpenPKG 2.5
>>> b) postfix
>>> /kolab/sbin/postfix start failed, because of an old master.pid
>>> (without process). The message of postfix-script did not end up in
>>> the logs and openpkg did not fail.
>>
>> This was more complicated than I had thought.
>> On a start attempt, OpenPKG will do
>>
>>     rcService postfix enable yes || exit 0
>>     rcService postfix active yes && exit 0
>>     /kolab/sbin/postfix start
>>
>> The "rcService postfix active yes" line will call rc.kolab status,
>> which in turn calls master -t to test whether postfix is running or
>> not. For whatever reason, master -t reported that postfix was still
>> active; this was probably the stale lock. So the OpenPKG script simply
>> did not try to restart postfix, and gave no error message.
>>
>> Note that /kolab/sbin/postfix start would end up running
>> /kolab/sbin/postfix-script, which would have done the same test, but
>> would have issued an error message.
>>
>> The main problem with this approach is that a non-zero return code of
>> master -t cannot be used as a test that postfix is active. It means:
>> postfix is active or the application is broken. In the special
>> situation described, OpenPKG behaves badly.
>>
>> There does not seem to be an easy remedy. I did not discover a test
>> for whether postfix is actually running, so the line cannot easily be
>> exchanged. Maybe looking for master in the process table in addition
>> to master -t would make this test more robust.
>
> Perhaps it would be better to use the standard tests on
> %{l_prefix}/var/postfix/pid/master.pid instead of the one calling
> master -t?

I have looked at the code for master -t: it first does an existence
test for master.pid and then tries to get the lock for it. This is the
right test in principle, so doing it manually instead of calling
master -t will not improve the situation, I think.

Having master -t exiting 1 and no master process in the process list
beforehand would at least exclude other error conditions.

> I know there was a problem with using the postgresql native tests
> because they didn't detect a stale pid file.
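The start sequence quoted above exits silently when the active test reports a false positive. As a rough sketch of the same gate that at least says why it skipped the start: is_enabled, is_active and the ENABLED and ACTIVE variables are hypothetical stand-ins for illustration, not OpenPKG's rcService, and the real start command is replaced by an echo.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real enable/active checks.
is_enabled() { [ "$ENABLED" = yes ]; }
is_active()  { [ "$ACTIVE" = yes ]; }

# Same order of tests as the quoted sequence, but each early exit
# reports its reason instead of returning without any output.
start_service() {
    is_enabled || { echo "skip: service disabled"; return 0; }
    is_active  && { echo "skip: service reported as active"; return 0; }
    echo "starting service"   # the real script would run the start command here
}
```

With a message like this in the log, a stale lock that makes the active test misreport would at least leave a trace of why no start was attempted.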
Re: postfix starting bug (was: Not getting error msgs.)
On Monday, 20 March 2006 18:52, Bernhard Reiter wrote:
> On Monday, 20 March 2006 18:41, Bill Campbell wrote:
>> On Mon, Mar 20, 2006, Bernhard Reiter wrote:
>>> [...]
>>> Maybe looking for master in the process table in addition to
>>> master -t would make this test more robust.
>>
>> Perhaps it would be better to use the standard tests on
>> %{l_prefix}/var/postfix/pid/master.pid instead of the one calling
>> master -t?
>
> I have looked at the code for master -t: it first does an existence
> test for master.pid and then tries to get the lock for it. This is the
> right test in principle, so doing it manually instead of calling
> master -t will not improve the situation, I think.
>
> Having master -t exiting 1 and no master process in the process list
> beforehand would at least exclude other error conditions.

Thus I am suggesting the following patch (only minimally tested).

--- rc.postfix.org 2006-03-21 08:41:50.775905000 +0100
+++ rc.postfix 2006-03-21 08:40:27.519905000 +0100
@@ -19,7 +19,9 @@
 postfix_usable=no
 postfix_active=no
 /kolab/sbin/postfix check >/dev/null 2>&1 && postfix_usable=yes
-/kolab/libexec/postfix/master -t >/dev/null 2>&1 || postfix_active=yes
+{ ! ps -elf|grep postfix/m[a]ster >/dev/null 2>&1 && \
+  /kolab/libexec/postfix/master -t >/dev/null 2>&1 ; } || \
+  postfix_active=yes
 echo "postfix_enable=\"$postfix_enable\""
 echo "postfix_usable=\"$postfix_usable\""
 echo "postfix_active=\"$postfix_active\""