Issue #5139 has been updated by Andrew Forgue.

I spent a bit of time looking into this today.  The catalog-locking code that 
does not seem to be working is essentially:

    if lockfile.lock
      begin
        run_catalog
      ensure
        lockfile.unlock
      end
    end
    
Here lock basically creates the lock file, yields, and then removes the lock 
file.  Removing the lockfile is handled in an ensure.  For some reason either 
(a) the ensure is not getting called, or (b) create_lock is returning false 
when it shouldn't, leaving an empty puppetdlock behind.  I've seen the 
puppetdlock file left behind after normal catalog runs as well, so I'm leaning 
toward (b).  However, looking at Puppet::Util::Pidlock, I think the only way 
lock can return false is a race condition in lock() between two puppet agents, 
but I can't see it.

I wrote some code that splits Puppet::Util::Pidlock into Puppet::Util::Filelock 
and Pidlock.  The difference is that Filelocks are anonymous (no PID), so the 
mere existence of the file counts as locked.  I also changed Pidlock to treat 
a lock file with a stale PID *or an empty lock file* as stale and remove it 
(so it isn't a lock).  I don't know if this is the proper way to handle it, so 
if anyone has better ideas, please fix it :).
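
To illustrate the split described above, here is a rough sketch.  It is not the 
actual patch: the class names SketchFilelock and SketchPidlock and all method 
names are made up for illustration, and the stale-lock cleanup is still racy if 
two agents start at the same moment.

```ruby
# Hypothetical sketch; names are illustrative and not Puppet's real classes.
class SketchFilelock
  def initialize(path)
    @path = path
  end

  # Anonymous lock: the existence of the file alone means "locked".
  def locked?
    File.exist?(@path)
  end

  # Returns true if we created the lock, false if it already existed.
  def lock
    # File::EXCL makes the open fail if the file is already there.
    File.open(@path, File::WRONLY | File::CREAT | File::EXCL) { |f| write_contents(f) }
    true
  rescue Errno::EEXIST
    false
  end

  def unlock
    File.delete(@path)
    true
  rescue Errno::ENOENT
    false
  end

  private

  # Anonymous locks carry no contents.
  def write_contents(_file); end
end

class SketchPidlock < SketchFilelock
  # A PID lock only counts if it names a live process.
  def locked?
    clear_if_stale
    super
  end

  def lock
    clear_if_stale
    super
  end

  private

  def write_contents(file)
    file.write(Process.pid.to_s)
  end

  # Remove the lock file if it is empty or names a process that is gone.
  def clear_if_stale
    return unless File.exist?(@path)
    pid = File.read(@path).strip
    File.delete(@path) if pid.empty? || !process_running?(pid.to_i)
  rescue Errno::ENOENT
    # The lock disappeared between the check and the read; nothing to clear.
  end

  def process_running?(pid)
    return false unless pid > 0
    Process.kill(0, pid) # signal 0 only tests whether the process exists
    true
  rescue Errno::ESRCH
    false
  rescue Errno::EPERM
    true # the process exists but belongs to another user
  end
end
```

With this shape, an empty puppetdlock left by a crash would block a 
SketchFilelock but would be cleared by the next SketchPidlock#lock.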

This is costing us about two servers a day, and each time we have to go in and 
manually run puppet agent --enable.  The administrator-defined lock and the 
puppet internal lock should really be separate, IMHO.
----------------------------------------
Bug #5139: puppetdlock file can be empty
https://projects.puppetlabs.com/issues/5139

Author: Alan Barrett
Status: Accepted
Priority: High
Assignee: 
Category: 
Target version: 2.6.x
Affected Puppet version: 
Keywords: mcollective
Branch: 


There seems to be something wrong with the way the $statedir/puppetdlock file 
is created.  Under normal circumstances, when puppetd is running, the file 
contains the PID of the running puppetd process.  For example:

<pre>
$ cat /var/puppet/state/puppetdlock
26898 [no newline at end of file]
</pre>

If puppetd crashes or is killed, then the file may be empty:

<pre>
$ cat /var/puppet/state/puppetdlock
[empty]
$ ls -l /var/puppet/state/puppetdlock
-rw-r--r--   1 root     root           0 Oct 28 14:32 /var/puppet/state/puppetdlock
</pre>

This causes future puppetd runs to fail with "notice: Run of Puppet 
configuration client already in progress; skipping".

I suspect that the problem is that the lock file is created in such a way that 
a crash between creating the file and writing to the file leaves the file 
empty.  A safe technique is to write to a temporary file and then rename the 
temporary file, so that the actual lock file either does not exist, or exists 
with correct contents, but never exists with partial contents.
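
As a rough sketch of that technique (not Puppet's code; the helper name 
write_lock_atomically and the temporary-file naming are assumptions), the PID 
can be written to a temporary file first and only then published under the lock 
name.  This sketch publishes with File.link rather than a plain rename, because 
rename would silently replace a lock that already exists, while link fails with 
EEXIST.

```ruby
# Hypothetical helper, not part of Puppet: the lock name only ever refers
# to a file that already contains the complete PID.
def write_lock_atomically(path, pid = Process.pid)
  tmp = "#{path}.#{pid}.tmp" # assumed naming scheme for the temporary file
  File.open(tmp, File::WRONLY | File::CREAT | File::TRUNC, 0o644) do |f|
    f.write(pid.to_s)
    f.fsync # push the contents to disk before publishing the name
  end
  # Hard-linking fails with EEXIST if the lock file is already present.
  File.link(tmp, path)
  true
rescue Errno::EEXIST
  false
ensure
  # The temporary name is no longer needed whether or not we got the lock.
  File.delete(tmp) if tmp && File.exist?(tmp)
end
```

A caller would treat a false return as "someone else holds the lock".  A crash 
before the link step leaves only a stray temporary file, never an empty 
puppetdlock.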

Also, the code that complains about "client already in progress" could be 
smarter; it should read the PID from the lock file and verify that the process 
is actually running.

If you need sample code, then see NetBSD's shlock(1) utility 
(http://cvsweb.netbsd.org/bsdweb.cgi/usr.bin/shlock/), which is derived from 
the code that was in HoneyDanBer UUCP.

