[JIRA] [ssh-slaves] (JENKINS-21771) Apparent increase in thread usage for slave nodes
Greg horvath updated JENKINS-21771
Apparent increase in thread usage for slave nodes

Change By: Greg horvath (11/Feb/14 8:07 PM)

Description:

After upgrading to Jenkins v1.549, ssh-slaves v1.6, git plugin v2.0.1, git-client plugin v1.6.2, we began observing many errors and job failures when running builds on our slave nodes. The errors were typically of the form:

{code}
Caused by: java.lang.OutOfMemoryError: unable to create new native thread
{code}

In general, we saw these errors mainly during two stages of the build process: during SCM step (git checkout), and during transfer of a file specified as a job parameter to the slave node. Some examples of these errors are below; however, it is worth noting that we did observe errors at other phases of the build as well (i.e. was not limited to these two phases).

{code}
00:00:45.338 ERROR: Workspace has a .git repository, but it appears to be corrupt.
00:00:45.972 FATAL: java.io.IOException: Remote call on bob0024 failed
00:00:45.972 hudson.remoting.RemotingSystemException: java.io.IOException: Remote call on bob0024 failed
00:00:45.973     at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:183)
00:00:45.973     at $Proxy65.hasGitRepo(Unknown Source)
00:00:45.973     at org.jenkinsci.plugins.gitclient.RemoteGitImpl.hasGitRepo(RemoteGitImpl.java:250)
00:00:45.973     at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:824)
00:00:45.973     at hudson.plugins.git.GitSCM.checkout(GitSCM.java:872)
00:00:45.973     at org.jenkinsci.plugins.multiplescms.MultiSCM.checkout(MultiSCM.java:118)
00:00:45.973     at hudson.model.AbstractProject.checkout(AbstractProject.java:1411)
00:00:45.973     at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:651)
00:00:45.973     at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
00:00:45.973     at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:560)
00:00:45.973     at hudson.model.Run.execute(Run.java:1670)
00:00:45.973     at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
00:00:45.973     at hudson.model.ResourceController.execute(ResourceController.java:88)
00:00:45.973     at hudson.model.Executor.run(Executor.java:231)
00:00:45.973 Caused by: java.io.IOException: Remote call on bob0024 failed
00:00:45.973     at hudson.remoting.Channel.call(Channel.java:731)
00:00:45.973     at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:167)
00:00:45.973     ... 13 more
00:00:45.973 Caused by: java.lang.OutOfMemoryError: unable to create new native thread
00:00:45.973     at java.lang.Thread.start0(Native Method)
00:00:45.973     at java.lang.Thread.start(Thread.java:597)
00:00:45.973     at hudson.remoting.AtmostOneThreadExecutor.execute(AtmostOneThreadExecutor.java:88)
00:00:45.973     at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:78)
00:00:45.973     at hudson.remoting.JarCacheSupport.resolve(JarCacheSupport.java:59)
00:00:45.973     at hudson.remoting.ResourceImageBoth.initiateJarRetrieval(ResourceImageBoth.java:40)
00:00:45.973     at hudson.remoting.ResourceImageBoth.resolve(ResourceImageBoth.java:22)
00:00:45.973     at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:233)
00:00:45.973     at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
00:00:45.973     at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
00:00:45.973     at hudson.util.StreamTaskListener._error(StreamTaskListener.java:132)
00:00:45.973     at hudson.util.StreamTaskListener.error(StreamTaskListener.java:141)
00:00:45.973     at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.hasGitRepo(CliGitAPIImpl.java:126)
00:00:45.973     at hudson.plugins.git.GitAPI.hasGitRepo(GitAPI.java:186)
00:00:45.973     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
00:00:45.973     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
00:00:45.973     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
00:00:45.973     at java.lang.reflect.Method.invoke(Method.java:597)
00:00:45.973     at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:299)
00:00:45.973     at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:280)
00:00:45.973     at h
[JIRA] [ssh-slaves] (JENKINS-21771) Apparent increase in thread usage for slave nodes
Greg horvath created JENKINS-21771
Apparent increase in thread usage for slave nodes

Issue Type: Bug
Affects Versions: current
Assignee: Kohsuke Kawaguchi
Components: ssh-slaves
Created: 11/Feb/14 8:02 PM

Description:

After upgrading to Jenkins v1.549, ssh-slaves v1.6, git plugin v2.0.1, git-client plugin v1.6.2, we began observing many errors and job failures when running builds on our slave nodes. The errors were typically of the form:

In general, we saw these errors mainly during two stages of the build process: during SCM step (git checkout), and during transfer of a file specified as a job parameter to the slave node. Some examples of these errors are below; however, it is worth noting that we did observe errors at other phases of the build as well (i.e. was not limited to these two phases).

These errors were not dependably reproducible, and were nearly impossible to duplicate in isolation (i.e. could not reproduce when just running jobs on a single slave node).

After some time debugging and ruling out some things, we determined that perhaps the user nproc limit (ulimit -u) might be too low, based on the errors that we were seeing and what we were able to rule out after debugging. We increased this limit on our slave nodes and...things went back to normal.

None of our build and test jobs changed across the update, meaning that any changes we observed were introduced as a result of the upgrade. My question is then: what has changed recently with slave management that has increased the number of per node threads required? We don't yet have solid watermark data on how high it was actually going, but it was high enough to hit the ceiling for system default (1024).

Environment: CentOS 6
Project: Jenkins
Labels: slave jenkins performance
Priority: Major
Reporter: Greg horvath
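The report mentions that no solid watermark data on thread counts was available yet. As a rough illustration of how such numbers could be gathered, here is a minimal sketch that uses the standard `java.lang.management.ThreadMXBean` to print the live, daemon, and peak thread counts of the JVM it runs in. The class name `ThreadWatermark` and the `snapshot()` helper are hypothetical and are not part of Jenkins or the ssh-slaves plugin; the sampling count and interval are arbitrary assumptions.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration only; not part of Jenkins.
public class ThreadWatermark {

    // Returns the thread counts of the current JVM as a single line of text.
    static String snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return String.format("live=%d daemon=%d peak=%d",
                threads.getThreadCount(),        // threads alive right now
                threads.getDaemonThreadCount(),  // of which daemon threads
                threads.getPeakThreadCount());   // highest live count since JVM start (or last reset)
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples; the count and interval are arbitrary.
        for (int i = 0; i < 3; i++) {
            System.out.println(snapshot());
            Thread.sleep(200);
        }
    }
}
```

Note that this only sees threads of the one JVM it runs in. The nproc limit applies to all processes and threads owned by the user on the node, so a figure from a single JVM should be read as a lower bound rather than the full picture.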