On Tue, Feb 16, 2010 at 12:23 PM, Nigel Kersten <[email protected]> wrote:
> On Sat, Feb 13, 2010 at 6:13 PM, Joshua Anderson
> <[email protected]> wrote:
>> I'm afraid that I couldn't reproduce this on a Debian VM with Kai's example.
>
> Joshua, I was just having issues reproducing it as well on a 4 core system.
>
> As soon as I ran 3 instances of:
>
> while : ; do openssl speed; done
>
> to peg 3 of the cores, I could reproduce the same case as Kai initially
> posted.
>
> exec { "TEST-EXEC":
>   cwd       => "/tmp/",
>   command   => "/usr/bin/touch /tmp/7777 >/tmp/123 2>&1",
>   timeout   => 5,
>   logoutput => on_failure
> }
>
> puppet -v ~/test_exec.pp
> err: //Exec[TEST-EXEC]/returns: change from notrun to 0 failed:
> Command exceeded timeout at /root/test_exec.pp:6
Aha. Cc'ing puppet-dev, as they may have suggestions for the best way forward.
So this isn't a Puppet bug at all.
It looks like a bug in the Ruby Timeout module that is triggered
when most of your cores are busy.
I can reliably reproduce it by starting (n-1) instances of openssl speed,
where n is the number of cores, and then running this script, which
wraps a command in the Timeout module.
#!/usr/bin/ruby1.8
#
%x{/usr/bin/touch /tmp/7777}
puts "executed without timeout ok"
puts "executing with timeout"
require 'timeout'
status = Timeout::timeout(5) {
  %x{/usr/bin/touch /tmp/7777}
}
puts "executed with timeout ok"
which will produce something like:
r...@testhost:~# ps auxww|grep [o]penssl
root 22337 99.6 0.0 14616 2028 pts/6 R 15:04 2:52 openssl speed
root 22338 99.9 0.0 14616 2028 pts/6 R 15:04 2:49 openssl speed
root 22339 100 0.0 14616 2024 pts/6 R 15:04 2:49 openssl speed
r...@testhost:~# ~/tickle_ruby.rb
executed without timeout ok
executing with timeout
/usr/lib/ruby/1.8/timeout.rb:60: execution expired (Timeout::Error)
from /root/tickle_ruby.rb:11
r...@testhost:~# killall openssl
[1] Terminated openssl speed &>/dev/null
[2]- Terminated openssl speed &>/dev/null
[3]+ Terminated openssl speed &>/dev/null
r...@testhost:~# ~/tickle_ruby.rb
executed without timeout ok
executing with timeout
executed with timeout ok
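To put a number on how often this happens, something like the following could be run with and without the openssl load. It is only a sketch: count_timeouts is a made-up helper name, and the 20 attempts and 5 second limit are arbitrary choices, not values taken from any bug report.

```ruby
#!/usr/bin/env ruby
# Hypothetical helper (not part of the original report): run a trivial
# command under Timeout several times and count how many attempts expire.
require 'timeout'

# Returns the number of attempts that raised Timeout::Error.
def count_timeouts(attempts, limit, command)
  failures = 0
  attempts.times do
    begin
      # Same pattern as the script above: a short external command
      # wrapped in a Timeout block.
      Timeout.timeout(limit) { system(command) }
    rescue Timeout::Error
      failures += 1
    end
  end
  failures
end

if __FILE__ == $0
  failures = count_timeouts(20, 5, "/usr/bin/touch /tmp/7777")
  puts "#{failures} of 20 attempts timed out"
end
```

Any nonzero count for a command as cheap as touch would point at the timeout machinery rather than at the command itself.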
This appears to affect every Ruby 1.8.7 p249 build I've tried
except the MacPorts one, which seems to carry a number of
patches for these issues.
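As a possible stopgap for scripts that only need to bound one external command, the deadline can be enforced by polling the child process rather than going through Timeout. This is a sketch of a different technique, not what Puppet does: run_with_deadline is a hypothetical helper, the 0.1 second poll interval is an arbitrary choice, and it is untested against the affected Ruby builds.

```ruby
# Hypothetical workaround sketch: enforce a deadline by polling the child
# with a non-blocking waitpid instead of relying on the Timeout module.
#
# Returns the command's exit status, or nil if it was killed at the deadline.
def run_with_deadline(command, limit, poll_interval = 0.1)
  pid = fork { exec(command) }
  deadline = Time.now + limit
  loop do
    # With WNOHANG, waitpid2 returns nil while the child is still running.
    finished, status = Process.waitpid2(pid, Process::WNOHANG)
    return status.exitstatus if finished
    if Time.now >= deadline
      Process.kill("KILL", pid)
      Process.waitpid(pid) # reap the killed child so no zombie is left
      return nil
    end
    sleep poll_interval
  end
end

if __FILE__ == $0
  result = run_with_deadline("/usr/bin/touch /tmp/7777", 5)
  puts result.nil? ? "killed at deadline" : "exit status #{result}"
end
```

Unlike Timeout, this kills the child when the deadline passes, so a command that really is stuck does not linger in the background.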
>
>
>>
>> Here's my attempt:
>>
>> j...@debian:~$ uname -a
>> Linux debian 2.6.18.8-x86_64-linode10 #1 SMP Tue Nov 10 16:29:17 UTC 2009
>> x86_64 GNU/Linux
>> j...@debian:~$ ruby -v
>> ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]
>> j...@debian:~$ puppet --version
>> 0.25.4
>> j...@debian:~$ puppet --debug --trace test.pp
>> Could not retrieve virtual: Permission denied - /proc/xen/capabilities
>> Could not retrieve virtual: Permission denied - /proc/xen/capabilities
>> Could not retrieve virtual: Permission denied - /proc/xen/capabilities
>> Could not retrieve virtual: Permission denied - /proc/xen/capabilities
>> debug: Creating default schedules
>> debug: Failed to load library 'selinux' for feature 'selinux'
>> debug: Failed to load library 'ldap' for feature 'ldap'
>> debug: /File[/home/josh/.puppet/ssl]: Autorequiring File[/home/josh/.puppet]
>> debug: /File[/home/josh/.puppet/var/client_yaml]: Autorequiring
>> File[/home/josh/.puppet/var]
>> debug: /File[/home/josh/.puppet/ssl/certificate_requests]: Autorequiring
>> File[/home/josh/.puppet/ssl]
>> debug: /File[/home/josh/.puppet/var/log]: Autorequiring
>> File[/home/josh/.puppet/var]
>> debug: /File[/home/josh/.puppet/var/lib]: Autorequiring
>> File[/home/josh/.puppet/var]
>> debug: /File[/home/josh/.puppet/var/state]: Autorequiring
>> File[/home/josh/.puppet/var]
>> debug: /File[/home/josh/.puppet/var/clientbucket]: Autorequiring
>> File[/home/josh/.puppet/var]
>> debug: /File[/home/josh/.puppet/ssl/private_keys]: Autorequiring
>> File[/home/josh/.puppet/ssl]
>> debug: /File[/home/josh/.puppet/ssl/certs]: Autorequiring
>> File[/home/josh/.puppet/ssl]
>> debug: /File[/home/josh/.puppet/var]: Autorequiring File[/home/josh/.puppet]
>> debug: /File[/home/josh/.puppet/ssl/private]: Autorequiring
>> File[/home/josh/.puppet/ssl]
>> debug: /File[/home/josh/.puppet/ssl/public_keys]: Autorequiring
>> File[/home/josh/.puppet/ssl]
>> debug: /File[/home/josh/.puppet/var/state/graphs]: Autorequiring
>> File[/home/josh/.puppet/var/state]
>> debug: /File[/home/josh/.puppet/var/facts]: Autorequiring
>> File[/home/josh/.puppet/var]
>> debug: /File[/home/josh/.puppet/var/run]: Autorequiring
>> File[/home/josh/.puppet/var]
>> debug: Finishing transaction 23715921915640 with 0 changes
>> info: Applying configuration version '1266113402'
>> debug: //testmodule/Exec[TEST-EXEC]: Changing returns
>> debug: //testmodule/Exec[TEST-EXEC]: 1 change(s)
>> debug: //testmodule/Exec[TEST-EXEC]: Executing '/usr/bin/touch /tmp/7777 >/tmp/123 2>&1'
>> debug: Executing '/usr/bin/touch /tmp/7777 >/tmp/123 2>&1'
>> notice: //testmodule/Exec[TEST-EXEC]/returns: executed successfully
>> debug: Finishing transaction 23715922698720 with 1 changes
>> j...@debian:~$
>>
>> -Josh
>>
>>
>> On Feb 13, 2010, at 9:49 AM, Nigel Kersten wrote:
>>
>>> Note too that the same bug should be affecting Debian testing and
>>> unstable if the Ruby 1.8.7 p249 package is the problem.
>>>
>>> Surely we have some people running Debian testing on the list? Seeing
>>> any weird timeouts with execs?
>>>
>>>
>>>
>>> On Fri, Feb 12, 2010 at 11:57 AM, Joel Ebel <[email protected]> wrote:
>>>> Kai, and anyone else experiencing this problem, please go vote, and
>>>> optionally chime in with any details you can provide on:
>>>> https://bugs.launchpad.net/ubuntu/+source/ruby1.8/+bug/520715
>>>>
>>>> Thanks,
>>>> Joel
>>>>
>>>> On Feb 11, 3:06 pm, Joel Ebel <[email protected]> wrote:
>>>>> I've reported this bug to Ubuntu. The solution is to rebuild ruby1.8
>>>>> without pthreads, unless ruby fixes the bug upstream which causes the
>>>>> hang.
>>>>>
>>>>> https://bugs.launchpad.net/ubuntu/+source/ruby1.8/+bug/520715
>>>>>
>>>>> Joel
>>>>>
>>>>> On Feb 10, 2:42 pm, Nigel Kersten <[email protected]> wrote:
>>>>>
>>>>>
>>>>>
>>>>>> On Wed, Feb 10, 2010 at 11:48 AM, Nigel Kersten <[email protected]>
>>>>>> wrote:
>>>>>>> On Tue, Feb 9, 2010 at 5:06 AM, kai.steverding
>>>>>>> <[email protected]> wrote:
>>>>>>>> I installed ruby on the above server and tried a simple exec
>>>>>>>> test:
>>>>>
>>>>>>>> class testmodule {
>>>>>>>>   exec { "TEST-EXEC":
>>>>>>>>     cwd       => "/tmp/",
>>>>>>>>     command   => "/usr/bin/touch /tmp/7777 >/tmp/123 2>&1",
>>>>>>>>     timeout   => 5,
>>>>>>>>     logoutput => on_failure
>>>>>>>>   }
>>>>>>>> }
>>>>>
>>>>>>>> This simple thing gets the following output from
>>>>>>>> "puppet --debug --test":
>>>>>
>>>>>>>> debug: Loaded state in 0.00 seconds
>>>>>>>> info: Applying configuration version '1265719507'
>>>>>>>> debug: //testmodule/Exec[TEST-EXEC]: Changing returns
>>>>>>>> debug: //testmodule/Exec[TEST-EXEC]: 1 change(s)
>>>>>>>> debug: //testmodule/Exec[TEST-EXEC]: Executing '/usr/bin/touch /tmp/7777'
>>>>>>>> debug: Executing '/usr/bin/touch /tmp/7777'
>>>>>>>> err: //testmodule/Exec[TEST-EXEC]/returns: change from notrun to 0 failed: Command exceeded timeout at /etc/puppet/modules/testmodule/manifests/init.pp:6
>>>>>>>> debug: Finishing transaction 69914685668640 with 1 changes
>>>>>>>> debug: Storing state
>>>>>>>> debug: Stored state in 0.01 seconds
>>>>>>>> debug: Format pson not supported for Puppet::Transaction::Report; has
>>>>>>>> not implemented method 'from_pson'
>>>>>>>> debug: Format s not supported for Puppet::Transaction::Report; has not
>>>>>>>> implemented method 'from_s'
>>>>>
>>>>>>>> What can I do? Did I make a mistake, or is exec broken?
>>>>>
>>>>>>> Kai, something is definitely broken in Lucid.
>>>>>
>>>>>>> We're seeing all sorts of process exec issues.
>>>>>
>>>>>>> Have you nailed this down at all?
>>>>>
>>>>>> So Kai, we've been doing some experimenting here today, and have
>>>>>> reproduced these hangs in all the Debian Ruby1.8 packages back to
>>>>>> 1.8.7.174-2.
>>>>>
>>>>>> We've been unable to reproduce it on 1.8.7.174-1, though.
>>>>>
>>>>>> From the changelog I'm wondering if the first entry under 174-2 is
>>>>>> responsible. Note this was later removed after upstream integrated it.
>>>>>
>>>>>> ruby1.8 (1.8.7.174-2) unstable; urgency=medium
>>>>>
>>>>>> [ akira yamada ]
>>>>>> * Added debian/patches/090811_thread_and_select.dpatch: threads may
>>>>>>   hangup when IO.select called from two or more threads.
>>>>>> * Added debian/patches/090812_finalizer_at_exit.dpatch: finalizers
>>>>>>   should be run at exit (Closes: #534241)
>>>>>> * Added debian/patches/090812_class_clone_segv.dpatch: avoid segv
>>>>>>   when an object cloned. (Closes: #533329)
>>>>>> * Added debian/patches/090812_eval_long_exp_segv.dpatch: fix segv
>>>>>>   when eval a long expression. (Closes: #510561)
>>>>>> * Added debian/patches/090812_openssl_x509_warning.dpatch: suppress
>>>>>>   warning from OpenSSL::X509::ExtensionFactory. (Closes: #489443)
>>>>>
>>>>>> [ Lucas Nussbaum ]
>>>>>> * Removed Fumitoshi UKAI <[email protected]> from Uploaders. Thanks a
>>>>>> lot for the past help! Closes: #541037
>>>>>
>>>>>> [ Daigo Moriwaki ]
>>>>>> * debian/fixshebang.sh: skip non-text files, which works around
>>>>>>   hanging of sed on scanning gif images.
>>>>>> * Bumped up Standards-Version to 3.8.2.
>>>>>
>>>>>> --
>>>>>> nigel
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups
>>>> "Puppet Users" group.
>>>> To post to this group, send email to [email protected].
>>>> To unsubscribe from this group, send email to
>>>> [email protected].
>>>> For more options, visit this group at
>>>> http://groups.google.com/group/puppet-users?hl=en.
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> nigel
>>>
>>
>
>
>
> --
> nigel
>
--
nigel
--
You received this message because you are subscribed to the Google Groups
"Puppet Developers" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/puppet-dev?hl=en.