[Puppet Users] Re: puppetd hangs, dies - case 2

2010-03-11 Thread Gerhard Rieger
Hi

On Mar 11, 12:02 am, Peter Meier peter.me...@immerda.ch wrote:
  2) puppetd had successfully connected to puppetmasterd before. When on
  a following scheduled connection attempt puppetmasterd cannot be
  reached (host down or network broken), puppetd terminates silently
  instead of retrying later.

 This might be the cause of: http://projects.reductivelabs.com/issues/2888
 If you can provide any further details, or even reproduce this problem
 with --trace --debug, it would be very helpful.

This one is the easiest to reproduce.
The platform is Debian Lenny; puppetd is 0.25.1-2~bpo50+1_all.deb (from
backports).

On the puppetd host stop the daemon, then run:

puppetd --no-daemonize -v -d --trace

On the same host, add a filter rule that drops the reply packets. This
simulates a network or master failure, because puppetd gets no
immediate error (EPERM or the like) and simply sees no replies.

iptables -t filter -A INPUT -p tcp --sport 8140 --source <puppetmaster ip-addr> -j DROP

wait...

For me, the last messages are:

notice: Finished catalog run in 1.11 seconds
debug: Loaded state in 0.01 seconds
debug: Using cached certificate for ca
debug: Using cached certificate for whiskey.domain.com
debug: Using cached certificate_revocation_list for ca
debug: Format s not supported for Puppet::Resource::Catalog; has not implemented method 'from_s'
/usr/lib/ruby/1.8/timeout.rb:60:in `open': execution expired (Timeout::Error)
from /usr/lib/ruby/1.8/net/http.rb:560:in `connect'
from /usr/lib/ruby/1.8/net/http.rb:560:in `connect'
from /usr/lib/ruby/1.8/net/http.rb:553:in `do_start'
from /usr/lib/ruby/1.8/net/http.rb:542:in `start'
from /usr/lib/ruby/1.8/net/http.rb:1035:in `request'
from /usr/lib/ruby/1.8/net/http.rb:772:in `get'
from /usr/lib/ruby/1.8/puppet/indirector/rest.rb:69:in `find'
from /usr/lib/ruby/1.8/puppet/indirector/indirection.rb:198:in `find'
from /usr/lib/ruby/1.8/puppet/indirector.rb:51:in `find'
from /usr/lib/ruby/1.8/puppet/configurer.rb:94:in `retrieve_catalog'
from /usr/lib/ruby/1.8/puppet/util.rb:417:in `thinmark'
from /usr/lib/ruby/1.8/benchmark.rb:308:in `realtime'
from /usr/lib/ruby/1.8/puppet/util.rb:416:in `thinmark'
from /usr/lib/ruby/1.8/puppet/configurer.rb:93:in `retrieve_catalog'
from /usr/lib/ruby/1.8/puppet/configurer.rb:145:in `run'
from /usr/lib/ruby/1.8/puppet/agent.rb:53:in `run'
from /usr/lib/ruby/1.8/puppet/agent/locker.rb:21:in `lock'
from /usr/lib/ruby/1.8/puppet/agent.rb:53:in `run'
from /usr/lib/ruby/1.8/sync.rb:230:in `synchronize'
from /usr/lib/ruby/1.8/puppet/agent.rb:53:in `run'
from /usr/lib/ruby/1.8/puppet/agent.rb:130:in `with_client'
from /usr/lib/ruby/1.8/puppet/agent.rb:51:in `run'
from /usr/lib/ruby/1.8/puppet/agent.rb:104:in `start'
from /usr/lib/ruby/1.8/puppet/external/event-loop/signal-system.rb:97:in `call'
from /usr/lib/ruby/1.8/puppet/external/event-loop/signal-system.rb:97:in `__signal__'
from /usr/lib/ruby/1.8/puppet/external/event-loop/signal-system.rb:97:in `each'
from /usr/lib/ruby/1.8/puppet/external/event-loop/signal-system.rb:97:in `__signal__'
from (eval):2:in `signal'
from /usr/lib/ruby/1.8/puppet/external/event-loop/event-loop.rb:321:in `sound_alarm'
from /usr/lib/ruby/1.8/puppet/external/event-loop/event-loop.rb:130:in `select'
from /usr/lib/ruby/1.8/puppet/external/event-loop/event-loop.rb:130:in `each'
from /usr/lib/ruby/1.8/puppet/external/event-loop/event-loop.rb:130:in `select'
from /usr/lib/ruby/1.8/puppet/external/event-loop/event-loop.rb:116:in `iterate'
from /usr/lib/ruby/1.8/puppet/external/event-loop/event-loop.rb:107:in `run'
from /usr/lib/ruby/1.8/puppet/daemon.rb:130:in `start'
from /usr/lib/ruby/1.8/puppet/application/puppetd.rb:116:in `main'
from /usr/lib/ruby/1.8/puppet/application.rb:226:in `send'
from /usr/lib/ruby/1.8/puppet/application.rb:226:in `run_command'
from /usr/lib/ruby/1.8/puppet/application.rb:217:in `run'
from /usr/lib/ruby/1.8/puppet/application.rb:306:in `exit_on_fail'
from /usr/lib/ruby/1.8/puppet/application.rb:217:in `run'
from /usr/sbin/puppetd:159

No lock file problems occur.
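
The trace shows Timeout::Error travelling from Net::HTTP's connect up
through the event loop to /usr/sbin/puppetd, so nothing on that path
rescues it. Below is a minimal sketch of the "retry later" behaviour one
would expect instead. The helper with_retry and its parameters are
hypothetical and not part of Puppet. As far as I know, on Ruby 1.8
Timeout::Error is not a StandardError subclass, so a bare rescue would not
catch it; the sketch therefore names the class explicitly.

```ruby
require 'timeout'

# Hypothetical helper, not part of Puppet: run the block and, if it times
# out, wait and try again instead of letting the exception end the process.
def with_retry(attempts, delay)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue Timeout::Error => e
    # Named explicitly so the rescue does not depend on where
    # Timeout::Error sits in the exception hierarchy.
    raise if tries >= attempts
    warn "attempt #{tries} timed out (#{e.message}); retrying in #{delay}s"
    sleep delay
    retry
  end
end

# Simulated catalog fetch that times out twice before succeeding.
result = with_retry(3, 0) do |n|
  raise Timeout::Error, 'execution expired' if n < 3
  "catalog fetched on attempt #{n}"
end
puts result
```

A real daemon would use a delay on the order of its run interval rather
than 0, and would probably log the failure and wait for the next scheduled
run instead of re-raising after the last attempt.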

Don't forget to remove the filter:
iptables -t filter -D INPUT -p tcp --sport 8140 --source <puppetmaster ip-addr> -j DROP

-gr

-- 
You received this message because you are subscribed to the Google Groups 
Puppet Users group.
To post to this group, send email to puppet-us...@googlegroups.com.
To unsubscribe from this group, send email to 
puppet-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/puppet-users?hl=en.



[Puppet Users] Re: puppetd hangs, dies - case 1

2010-03-11 Thread Gerhard Rieger
Hi,

On Mar 11, 12:02 am, Peter Meier peter.me...@immerda.ch wrote:
  1) When puppetd starts for the first time and cannot reach
  puppetmasterd (due to routing or firewall problem), it hangs and
  cannot be stopped with SIGTERM (that is used by /etc/init.d/puppet
  stop and restart)

 Might be related to 3) ?

I found out more, but it is not yet fully reproducible on arbitrary
hosts.
It appears to be related to DNS domain detection: when puppetd
completely fails to determine the domain name of the local host AND no
server = ... directive is specified in the [puppetd] section of
puppet.conf, then puppetd does not terminate on SIGTERM.

puppetd (or maybe ruby) tries at least the following to find the
domain name:

* calls program dnsdomainname
* DNS-resolves the local hostname if it is not already fully qualified

When it runs into the problem it prints the following two error lines:

dnsdomainname: Unknown host
dnsdomainname: Unknown host

To reproduce, remove all domain and search lines from
/etc/resolv.conf; make sure that hostname returns the short name; check
that dnsdomainname fails with "Unknown host" and that nslookup shortname
does not resolve. There seems to be another contributing factor that I
have not yet found (it is not nscd).
Then stop puppetd and run it in the foreground.
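
The first two preconditions above can be checked mechanically. This is a
minimal sketch; the helper names resolv_domains and short_hostname? are
hypothetical, and the sample text stands in for /etc/resolv.conf so that
the example does not depend on the local machine.

```ruby
# Hypothetical helpers for checking the reproduction preconditions.

# Return the names configured on "domain" and "search" lines of
# resolv.conf-style text.
def resolv_domains(text)
  text.lines.map(&:strip)
      .select { |line| line =~ /\A(domain|search)\s+\S/ }
      .flat_map { |line| line.split[1..-1] }
end

# A hostname counts as short when it contains no dot.
def short_hostname?(name)
  !name.include?('.')
end

sample = <<CONF
# example resolv.conf with no domain or search lines
nameserver 192.0.2.1
CONF

puts resolv_domains(sample).empty?
puts short_hostname?('wodka1')
```

On a real host one would pass File.read('/etc/resolv.conf') and
Socket.gethostname (from the socket library) instead of the sample values.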

Trace output:

# puppetd --no-daemonize -d -v --trace
dnsdomainname: Unknown host
dnsdomainname: Unknown host
debug: Failed to load library 'selinux' for feature 'selinux'
debug: Puppet::Type::User::ProviderPw: file pw does not exist
debug: Puppet::Type::User::ProviderDirectoryservice: file /usr/bin/dscl does not exist
debug: Puppet::Type::User::ProviderUser_role_add: file roledel does not exist
debug: Failed to load library 'ldap' for feature 'ldap'
debug: Puppet::Type::User::ProviderLdap: feature ldap is missing
debug: /File[/var/lib/puppet/lib]: Autorequiring File[/var/lib/puppet]
debug: /File[/var/lib/puppet/ssl/public_keys]: Autorequiring File[/var/lib/puppet/ssl]
debug: /File[/var/lib/puppet/ssl/private_keys/wodka1.pem]: Autorequiring File[/var/lib/puppet/ssl/private_keys]
debug: /File[/var/lib/puppet/state]: Autorequiring File[/var/lib/puppet]
debug: /File[/var/lib/puppet/client_yaml]: Autorequiring File[/var/lib/puppet]
debug: /File[/var/lib/puppet/ssl/certs]: Autorequiring File[/var/lib/puppet/ssl]
debug: /File[/var/run/puppet/puppetd.pid]: Autorequiring File[/var/run/puppet]
debug: /File[/var/lib/puppet/ssl/private_keys]: Autorequiring File[/var/lib/puppet/ssl]
debug: /File[/var/lib/puppet/ssl]: Autorequiring File[/var/lib/puppet]
debug: /File[/var/lib/puppet/ssl/public_keys/wodka1.pem]: Autorequiring File[/var/lib/puppet/ssl/public_keys]
debug: /File[/var/lib/puppet/ssl/private]: Autorequiring File[/var/lib/puppet/ssl]
debug: /File[/var/lib/puppet/state/graphs]: Autorequiring File[/var/lib/puppet/state]
debug: /File[/var/lib/puppet/ssl/certificate_requests]: Autorequiring File[/var/lib/puppet/ssl]
debug: /File[/var/lib/puppet/facts]: Autorequiring File[/var/lib/puppet]
debug: /File[/etc/puppet/puppet.conf]: Autorequiring File[/etc/puppet]
debug: /File[/var/lib/puppet/clientbucket]: Autorequiring File[/var/lib/puppet]
debug: Finishing transaction -609097158 with 0 changes
err: Could not request certificate: getaddrinfo: Name or service not known
err: Could not request certificate: getaddrinfo: Name or service not known
err: Could not request certificate: getaddrinfo: Name or service not known
err: Could not request certificate: getaddrinfo: Name or service not known


The last line is repeated every few minutes. SIGTERM does not terminate
puppetd.

-gr




[Puppet Users] puppetd hangs, dies

2010-03-10 Thread Gerhard Rieger
Hi,

I experienced some problems with puppetd when it cannot reach
puppetmasterd. They apply to 0.25.1 on Debian; I could not find these
issues in the changelog or the bug tracker, so here I go:

1) When puppetd starts for the first time and cannot reach
puppetmasterd (due to routing or firewall problem), it hangs and
cannot be stopped with SIGTERM (that is used by /etc/init.d/puppet
stop and restart)

2) puppetd had successfully connected to puppetmasterd before. When on
a following scheduled connection attempt puppetmasterd cannot be
reached (host down or network broken), puppetd terminates silently
instead of retrying later.

3) When puppetd has actually established a connection to puppetmasterd
and the network breaks, puppetd hangs until the server responds again.
When a stateful firewall between these hosts drops the packets of the
established connection after some time, puppetd hangs forever instead
of closing the connection after a timeout and retrying later.
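
For reference on case 3: Ruby's Net::HTTP lets a client bound both the
connect wait and each read wait, after which a timeout exception is raised
instead of blocking indefinitely. This is a minimal sketch; the host name
and the timeout values are hypothetical, and whether puppetd sets these
values is not something confirmed here.

```ruby
require 'net/http'

# Hypothetical host name and timeout values, chosen only for illustration.
http = Net::HTTP.new('puppetmaster.example.com', 8140)
http.open_timeout = 10   # seconds to wait for the TCP connection
http.read_timeout = 60   # seconds to wait for each read from the socket

# No connection is opened until a request is made or start is called.
puts http.started?
```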

Are these already known/fixed?

-gr
