On Mon, Feb 11, 2019 at 5:42 AM Mike Sharpton <sharpt...@gmail.com> wrote:

> Hey all,
>
> We have recently upgraded our environment from Puppetserver 4.2.2 to
> Puppetserver 6.0.2.  We are running a mix of Puppet 4 and Puppet 6 agents
> until we can get them all upgraded to 6.  We have around 6000 nodes, and we
> had 4 Puppetservers, but we added two more due to capacity issues with
> Puppet 6.  The load is MUCH higher with Puppet 6.  To the question, I am
> seeing longer and longer agent run times after about two days of the
> services running.  The only error in the logs that seems to have any
> relation to this is this string.
>
> 2019-02-11T04:32:28.409-06:00 ERROR [qtp1148783071-4075] [p.r.core]
> Internal Server Error: java.io.IOException:
> java.util.concurrent.TimeoutException: Idle timeout expired: 30001/30000 ms
>
>
> After I restart the puppetserver service, this goes away for about two
> days.  I think Puppetserver is dying a slow death under this load (load
> average of around 5-6).  We are running Puppetserver on VMs that are
> 10X8GB and using 6 JRuby workers per Puppetserver and a 4GB heap.  I have
> not seen any OOM exceptions and the process never crashes.  Has anyone else
> seen anything like this?  I did some Googling and didn't find a ton of
> relevant stuff.  Perhaps we need to upgrade to the latest version to see if
> this helps?  Even more capacity?  Seems silly.  Thanks in advance!
>

Off the top of my head:
1. Have you tried lowering the ratio of JRuby workers to JVM heap? (I would
try 1G of heap per worker, i.e. 4 workers on your 4GB heap, to see whether
it really is worker performance.)
2. That error is most likely from Jetty (it can be tuned with
idle-timeout-milliseconds[1]). Are agent runs failing with a 500 from the
server when that happens? Are clients failing to post their facts or
reports in a timely manner? Is Puppet Server failing its connections to
PuppetDB?
3. Are you managing any other server settings? A low
max-requests-per-instance is problematic for newer servers: they compile
and optimize the Ruby code each worker loads more aggressively, so with
short worker lifetimes the server does a lot of work only to throw it away
and start over, and that can cause much more load.
4. Which version of Java are you using, and do you have any custom Java
tuning that may not work well with newer servers? Server 5+ only has support
for Java 8 and will use more non-heap memory/code cache for the new
optimizations mentioned above.
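
As a rough sketch of the settings behind items 1-3 (the file paths assume a
stock package install, and the values are illustrative starting points for a
4GB heap, not tested recommendations):

```
# /etc/puppetlabs/puppetserver/conf.d/puppetserver.conf (assumed default path)
jruby-puppet: {
    # Item 1: 4 workers on a 4GB heap gives roughly 1G of heap per worker.
    max-active-instances: 4
    # Item 3: 0 means workers are never recycled, so the compiled/optimized
    # Ruby code is not thrown away.
    max-requests-per-instance: 0
}

# /etc/puppetlabs/puppetserver/conf.d/webserver.conf (assumed default path)
# Add the key to the existing webserver section; other keys are omitted here.
webserver: {
    # Item 2: Jetty idle timeout. The log above shows 30000 ms in effect;
    # 60000 is an arbitrary example value.
    idle-timeout-milliseconds: 60000
}
```

Raising the idle timeout would only hide slow requests, so I would treat it
as a diagnostic aid rather than a fix.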

HTH,
Justin


[1] https://github.com/puppetlabs/trapperkeeper-webserver-jetty9/blob/master/doc/jetty-config.md#idle-timeout-milliseconds


> Mike
>
> --
> You received this message because you are subscribed to the Google Groups
> "Puppet Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to puppet-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/puppet-users/197c0ad5-83c0-4562-833b-82028f0e3e9c%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>
