On Fri, 07 Nov 2008 06:14:14 -0600 Victor Lowther <[EMAIL PROTECTED]> wrote:
> On Tue, 2008-11-04 at 21:55 -0600, Robby Workman wrote:
> > I've gotten a report of pm-utils failing to work suddenly after
> > working successfully in the past, and through use of PM_DEBUG, I've
> > traced it to the presence of a stale
> > /var/run/pm-utils/locks/pm-suspend.lock.
> > Looking at /usr/lib/pm-utils/functions, I don't see anything
> > obviously wrong with locking functions, and of course it wouldn't
> > be prudent to trap abnormal exits with lock removal, as that would
> > paper over what's likely a real problem somewhere.
>
> Actually, pm-action removes locks via the shell trap mechanism -- the
> pm-suspend lock should be removed no matter how the script exits. The
> only exceptions are if pm-action is kill -9'ed, or if the system is
> restarted instead of resuming. We don't do anything about the kill -9
> case, and on the reboot case there we rely on the FHS spec that says
> the distro should clean out /var/run on reboot
> (http://www.pathname.com/fhs/pub/fhs-2.3.html#VARRUNRUNTIMEVARIABLEDATA).
> Does Slack do that?

Not completely, no. We have several things that use subdirectories
of /var/run, and at least one of them doesn't create its subdirectory
on its own if it doesn't already exist, so it doesn't start (this is
HAL, btw). I've been meaning to see about getting that addressed
upstream (and yeah, I know we could work around it in the init script),
but I keep forgetting :/

I know some distributions put /var/run on a tmpfs, which is probably a
decent approach, but I don't think we'll go that route. Anyway, I'm
looking into some better cruft cleanup in our rc.S (runlevel 1) script.

> > Unfortunately, neither I nor the reporter have been able to
> > reproduce it since that one instance, so I'm at a loss on how to
> > further troubleshoot this. Has anyone else had this happen, and if
> > so, did you figure out what was causing it?
>
> It used to happen to me all the time while writing the locking code.
> I haven't seen it happen since we released 1.1.0, though. :)

Interestingly enough, it *just* happened to me again today, and the
cause is indeed a failed resume. I did a suspend to disk last night
before going to bed, and when I powered up this morning, I got a fresh
instance of the OS. I have no idea why that happened, as s2disk has
always worked flawlessly here, but I think this is the first time I've
actually done it since I've been running 2.6.27.4. No time to try
debugging right now though, so don't put any brain cycles into it -
I'll work on that later :-)

-RW
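
For anyone following along, here is a rough sketch of the two pieces discussed above: a lock that is removed by a shell trap however the script exits, and a boot-time sweep for locks left behind by a reboot or a failed resume. This is not the actual pm-utils or rc.S code. The function names (take_lock, release_lock, run_locked, clean_stale_locks) and the default LOCKDIR are made up for illustration, and the lock is a directory purely because mkdir is atomic.

```shell
#!/bin/sh
# Illustrative sketch only; NOT the real pm-utils locking code.
# LOCKDIR and all function names below are hypothetical.
LOCKDIR="${LOCKDIR:-${TMPDIR:-/tmp}/pm-sketch-locks}"

# Take a lock by creating a directory. mkdir either creates it or
# fails, so two callers cannot both succeed.
take_lock() {
    mkdir -p "$LOCKDIR" || return 1
    mkdir "$LOCKDIR/$1.lock" 2>/dev/null
}

release_lock() {
    rmdir "$LOCKDIR/$1.lock" 2>/dev/null
}

# Run a command while holding the named lock. The EXIT trap releases
# the lock on normal exit and on HUP/INT/TERM. It cannot run after
# kill -9 or a reboot, which is how a stale lock gets left behind.
run_locked() {
    name="$1"; shift
    (
        take_lock "$name" || {
            echo "lock '$name' is held or stale" >&2
            exit 1
        }
        trap 'release_lock "$name"' EXIT
        trap 'exit 1' HUP INT TERM
        "$@"
        exit $?
    )
}

# Boot-time cleanup, the sort of thing an rc script could do when
# /var/run is not on a tmpfs: nothing can legitimately hold a lock
# this early, so anything found here is stale.
clean_stale_locks() {
    rm -rf "${LOCKDIR:?}"/*.lock
}
```

Something like "run_locked pm-suspend some_command" would then refuse to run while a lock exists, and a single clean_stale_locks call early in boot would clear whatever a failed resume left behind.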
_______________________________________________
Pm-utils mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/pm-utils
