Issue #2211 has been updated by John Florian.

Great news!  I now know how you can reproduce this and exactly what the problem 
is.  Please disregard the previous patch.

The fault actually lies with Facter::Util::Resolution, in the "value" function, 
when a sub-process times out.  The code tries to reap the zombie sub-process 
that timed out, but it does not restrict itself to its own child (see 
facter/util/resolution.rb:128 and the Process.waitall).  What I see is 
effectively a race condition between the puppet thread that is looking for a 
package provider (spawns a thread for 'rpm --version') and this zombie reaper 
in facter (spawns a thread to Process.waitall).  When facter launches a 
sub-process for 'host #{hostname}' _and_ the network interface is down, the 
host command takes about 10 seconds to time out.  Facter sees this is taking 
too long and goes to reap it as a zombie.  Unfortunately, Process.waitall also 
happens to reap the sub-process for 'rpm --version' and thus steals that exit 
code away from puppet, and BOOM, the whole thing goes ugly quickly after that.
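
The stolen exit status can be shown in isolation.  The following is only a 
minimal sketch, not Facter or puppet code: the method name is hypothetical, the 
two forked children merely stand in for 'host' and 'rpm --version', and the 
steps run sequentially in one thread so the outcome is deterministic rather 
than a race.  It assumes a Unix-like system where fork is available.

```ruby
# Sketch: an indiscriminate Process.waitall consumes the exit status of a
# child that some other piece of code is still planning to wait for.
def demonstrate_stolen_status
  # Stand-in for facter's slow 'host' lookup.
  slow_pid = fork { sleep 0.2; exit! 0 }
  # Stand-in for puppet's 'rpm --version' probe.
  rpm_pid = fork { exit! 3 }

  # The reaper collects *every* child of this process, not just slow_pid.
  reaped = Process.waitall.map { |pid, _status| pid }

  # The code that launched rpm_pid now tries to collect its exit status.
  stolen = begin
    Process.waitpid(rpm_pid)
    false
  rescue Errno::ECHILD
    true # the status was already consumed by waitall
  end

  { reaped: reaped.sort, expected: [slow_pid, rpm_pid].sort, stolen: stolen }
end
```

Once waitall has run, the later waitpid on the 'rpm' stand-in raises 
Errno::ECHILD, because the kernel has no status left to hand out for that PID.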

Please let me know if you need more info on how to reproduce this, but I 
suspect you should have no difficulty at this point.  It looks like the proper 
solution might involve having Facter::Util::Resolution.exec() also return the 
child PID so that the zombie reaper can wait for that PID specifically rather 
than all PIDs.
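
A rough sketch of that idea follows.  It is not a patch against Facter: 
run_with_timeout is a hypothetical helper, it uses Process.spawn (which 
assumes a newer Ruby than the one this bug was found on), and the TERM signal 
on timeout is my own choice.  The point is only that the PID is kept and the 
reaper waits on that PID alone.

```ruby
require 'timeout'

# Hypothetical helper, not Facter's actual API: run cmd (an argv array) and
# return its stdout, or nil if it takes longer than limit seconds.
def run_with_timeout(cmd, limit)
  reader, writer = IO.pipe
  pid = Process.spawn(*cmd, out: writer) # keep the PID of our own child
  writer.close

  begin
    Timeout.timeout(limit) do
      output = reader.read
      Process.waitpid(pid)
      output
    end
  rescue Timeout::Error
    begin
      Process.kill('TERM', pid)
      Process.waitpid(pid) # reap this PID only; other children are untouched
    rescue Errno::ESRCH, Errno::ECHILD
      # The child had already exited and been reaped.
    end
    nil
  ensure
    reader.close
  end
end
```

With this shape, a timeout in one command leaves the exit status of any other 
child (such as the 'rpm --version' probe) available to whoever started it.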
----------------------------------------
Bug #2211: puppet won't install packages if network interface does not have an 
IP address bound
http://projects.reductivelabs.com/issues/2211

Author: John Florian
Status: Accepted
Priority: High
Assigned to: Luke Kanies
Category: 
Target version: 0.25.0
Complexity: Unknown
Affected version: 0.24.8
Keywords: 


It is no longer possible to have puppet install packages via yum/rpm if the 
network interface is not bound to an IP address.  Our use case requires using 
puppet in the non-daemon mode and this is possible for us because the system 
will have all necessary manifests and other necessary files locally.  This 
worked just fine with 0.24.6 on Fedora 10, but began failing upon the upgrade 
to 0.24.8.

See the attachments for failure messages and a code diff that seems to have 
introduced the regression.  If I revert this one change, things work nicely 
once again.  It would look like a very simple fix if it weren't for the 
ominous-looking comment in the code. :-)


