On 4/15/12 4:42 PM, Ed W wrote:
> On 15/04/2012 23:53, Tom Eastep wrote:
>>> I'm seeing a regression in stale lock handling. If there is a stale
>>> lock at boot it seems to deadlock forever (which is inconvenient...).
>>> If I start it via the command line it seems to time out after some large
>>> number of seconds and continue. Old behaviour (4.5.1.1) was to somehow
>>> immediately burst the lock if it was stale.
>> Lock handling hasn't changed in years; so what you are seeing must be a
>> side effect of something else.
>
> Hmm, it's possible. I'm just debugging another problem where ipset
> takes many seconds to run if reverse dns isn't available (eg
> iptables -P OUTPUT DROP), eg this takes some 10s of seconds in this
> state... (the change was I tried to lock down iptables at boot about the
> same time I updated shorewall, durr)
That's what Shorewall-init is for.
> ipset create cp1 bitmap:ip,mac range 192.168.111.0/24
>
>
>
>> What are your settings for:
>>
>> MUTEX_TIMEOUT
> 60
So it takes 60 seconds to time out a stale lock.
>
>
>> SUBSYSLOCK
> /var/lock/subsys/shorewall
That file really isn't a lock file; it simply exists while Shorewall is
started and is removed when Shorewall is stopped.
>
> However, the message I get says something about "stale lock on
> /var/lib/shorewall/lock", so I think it's something different?
>
>
>> What are the contents of your shorewallrc file (normally
>> /usr/share/shorewall/shorewallrc)?
>
> HOST=linux
> PREFIX=/usr
> SHAREDIR=/usr/share
> LIBEXECDIR=${PREFIX}/share
> PERLLIBDIR=${PREFIX}/share/shorewall
> CONFDIR=/etc
> SBINDIR=/sbin
> MANDIR=/usr/share/man
> INITDIR=etc/init.d
> INITSOURCE=init.sh
> INITFILE=$PRODUCT
> AUXINITSOURCE=
> AUXINITFILE=
> SYSTEMD=
> SYSCONFFILE=
> SYSCONFDIR=
> ANNOTATED=
> VARDIR=/var/lib
>
> Can you confirm this looks sensible? (Gentoo based system, setting
> host=linux to build).
Looks reasonable.
>
> However, I'm sure you made a change for me some few versions back where
> the lock file handling got smarter, I had assumed you checked for a pid
> listed by the lock file? What I'm seeing now (but perhaps it's the same
> for 4.5.1.1) is that lock timeout is quite some time (presume 60 seconds...)
That's your MUTEX_TIMEOUT setting, yes.
>
> I *think* however, I need to do some more testing. I believe that what
> I might be seeing is problems due to the ipset timeouts, this has
> triggered some reboots to gain control and that in turn may have caused
> me to see some lock timeouts. Let me just check that chain of logic.
> However, in any case, would it not be possible to check if there even is
> a PID with the number shown in the lock file and bail out immediately if
> not? This is a common algorithm (although I will concede it can get
> racy in corner cases)
That code is in place -- see mutex_on() in lib.base.
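
For illustration only, here is a minimal sketch of the kind of check being
discussed: break the lock immediately when the PID recorded in it no longer
exists, and otherwise wait up to a timeout. This is not the actual
mutex_on() code from lib.base; the function names acquire_lock and
release_lock, the arguments, and the messages are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers, NOT Shorewall's mutex_on()/mutex_off().
# acquire_lock LOCKFILE [TIMEOUT_SECONDS]   (timeout defaults to 60)
acquire_lock() {
    lockfile=$1
    timeout=${2:-60}
    waited=0

    while :; do
        # With noclobber (set -C) the redirection fails if the file already
        # exists, so creating the lock and recording our PID is one step.
        if ( set -C; echo "$$" > "$lockfile" ) 2>/dev/null; then
            return 0
        fi

        owner=$(cat "$lockfile" 2>/dev/null)

        # kill -0 sends no signal; it only tests whether the PID exists.
        # It also fails for a live process we are not allowed to signal,
        # so this assumes all lock users run as the same user (e.g. root).
        if [ -n "$owner" ] && ! kill -0 "$owner" 2>/dev/null; then
            echo "Removing stale lock $lockfile (pid $owner)" >&2
            rm -f "$lockfile" || return 1
            continue
        fi

        # The owner is alive (or the file is unreadable): wait, but give up
        # once the timeout has elapsed.
        if [ "$waited" -ge "$timeout" ]; then
            echo "Timed out waiting for $lockfile" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

release_lock() {
    rm -f "$1"
}
```

With a check like this, a lock left behind by a process that died before a
reboot is broken on the first pass, and the full timeout is only paid when
the recorded PID is still alive. As noted above, it can still get racy in
corner cases, for example when two processes both decide the lock is stale
at the same moment, or when the dead owner's PID has been reused.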
-Tom
--
Tom Eastep \ When I die, I want to go like my Grandfather who
Shoreline, \ died peacefully in his sleep. Not screaming like
Washington, USA \ all of the passengers in his car
http://shorewall.net \________________________________________________
_______________________________________________
Shorewall-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/shorewall-users
