On Mon, Jan 1, 2018 at 10:08 PM, Matt Wise <w...@wiredgeek.net> wrote:

> We're still tuning, but I ended up dropping our Puppet Server JRuby instance
> count down to 2, and I have the -Xmx setting set to 4GB(!!). I think that
> we have a few libraries loaded that are causing some major bloat, but we
> haven't had time to track that down yet.
>
> The big concern I have is not the crashing of the servers... we can handle
> that. The main issue is that the Puppet agents seem to get into a hung
> state and never recover. That's not a behavior we ever saw on the older
> Puppet 3.x clients.
>
> On Mon, Jan 1, 2018 at 9:50 PM, John Gelnaw <jgel...@gmail.com> wrote:
>
>> On Monday, January 1, 2018 at 5:52:10 PM UTC-5, Matt Wise wrote:
>>>
>>> *Puppet Agent: 5.3.2*
>>> *Puppet Server: 5.1.4 - Packaged in Docker, running on Amazon ECS*
>>>
>>
>> I'm running a docker-compose based Puppet setup, and had the same
>> problem. The short version of the fix was to increase the Java heap size
>> available to the JRuby instances in puppetserver.
>>
>> Using the docker-compose.yml, I added:
>>
>>     environment:
>>       - PUPPETSERVER_JAVA_ARGS=-Xmx1024m
>>
>> to the puppet stanza, which gets passed to the puppetserver init script.
>>
>> We also increased the number of JRuby instances to 7, but that might be
>> overkill (roughly 200-250 nodes).  That also means 8 gigs of memory on the
>> docker host.
>>
>> The agents would eventually time out, but I seem to recall it was on the
>> order of hours for the timeout.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Puppet Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to puppet-users+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/puppet-users/20b2d83e-7752-4f87-995f-3ec2fcde5368%40googlegroups.com.
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>

In Puppet 4 we added settings for configuring the HTTP connect and read
timeouts independently (http_connect_timeout and http_read_timeout)[1].
Previously both were controlled by configtimeout. The default read timeout
is unlimited, so the hung agent may be stuck in a socket read. You might
want to strace the stuck agent to see what it's doing.
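
As a rough sketch of what bounding those timeouts could look like on the
agent, assuming the two settings are http_connect_timeout and
http_read_timeout and that your puppet.conf has an [agent] section. The
values are illustrative placeholders, not recommendations:

```ini
# puppet.conf on the agent (placeholder values; tune for your site)
[agent]
# Give up if a connection to the server cannot be established in this time.
http_connect_timeout = 2m
# Give up if the server stops sending data mid-response.
# Leaving this unset keeps the unlimited default described above.
http_read_timeout = 10m
```

With a finite read timeout, an agent waiting on a server that has stopped
responding should fail the run and try again at its next interval instead
of blocking indefinitely.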

In our upcoming 4.10.x/5.3.x releases, we've added a watchdog to kill a
stuck run[2].
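
Once you're on a release that includes it, enabling the watchdog might look
like the sketch below. I'm assuming the setting is named runtimeout and
takes a duration; the one-hour value is only an example, so check the
configuration reference for your exact version before relying on it:

```ini
# puppet.conf on the agent (assumes a release that ships the watchdog)
[agent]
# Abort any agent run that takes longer than this.
# Assumption: 0 means no limit; 1h here is an arbitrary example value.
runtimeout = 1h
```

Pick a limit comfortably above your slowest normal run, otherwise the
watchdog will kill healthy runs too.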

Josh

[1] https://tickets.puppetlabs.com/browse/PUP-3666
[2] https://tickets.puppetlabs.com/browse/PUP-7517

-- 
Josh Cooper | Software Engineer
j...@puppet.com | @coopjn
