Thanks Oleg for the analysis. I'm not sure what's causing that particular classloading. I'm running this on my local Jenkins instance, so nothing else is happening in the background besides the pipeline that triggers 100 parallel stages.
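For reference, the job is essentially just 100 parallel stages running on the cloud agents. A minimal sketch of its shape (the node label and the stage body are placeholders, not my real pipeline):

    // Hypothetical scripted Jenkinsfile reproducing the shape of the load.
    def branches = [:]
    for (int i = 0; i < 100; i++) {
        def name = "stage-${i}"       // per-iteration copy so each closure captures its own value
        branches[name] = {
            node('gce') {             // placeholder label for the compute-engine cloud agents
                stage(name) {
                    sh 'true'         // trivial body; the load comes from provisioning and connecting
                }
            }
        }
    }
    parallel branches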
It turns out the aws-java-sdk plugin is installed in my local Jenkins instance. These are the JAR files bigger than 5 MB in the Docker container:

    root@5075844d2393:/var/jenkins_home# find . -name '*.jar' | xargs ls -ltrh | cut -d' ' -f5- | grep "[0-9]M" | sort -n
    5.3M Jun 27  2019 ./plugins/dtkit-api/WEB-INF/lib/Saxon-HE-9.9.1-2.jar
    5.4M Jan  6 10:05 ./plugins/aws-java-sdk/WEB-INF/lib/aws-java-sdk-ec2-1.11.700.jar
    6.8M Feb 10 09:05 ./war/WEB-INF/lib/groovy-all-2.4.12.jar
    6.8M May 30  2018 ./plugins/ghprb/WEB-INF/lib/groovy-all-2.4.11.jar
    6.8M Oct 22  2018 ./plugins/ws-cleanup/WEB-INF/lib/groovy-all-2.4.12.jar
     11M Feb 10 09:05 ./war/WEB-INF/lib/jenkins-core-2.209.jar
     13M Jun 27  2019 ./plugins/dtkit-api/WEB-INF/lib/icu4j-63.1.jar
     35M Jan  6 10:06 ./plugins/aws-java-sdk/WEB-INF/lib/aws-java-sdk-models-1.11.700.jar

The only file I know is explicitly copied when a cloud worker is created is remoting.jar; I don't see any other explicit references in the google-compute-engine plugin.

So, any ideas on how I can dig a bit deeper? Is there any specific configuration (feature flags or something similar) that could help me tweak remoting to reduce the classloading saturation?
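In case it's useful, this is how I'm trying to confirm what actually gets shipped to each agent: a script console sketch that sums the JARs in remoting's cache on every connected agent. It assumes the default cache location under the agent user's home directory, ~/.jenkins/cache/jars; an agent started with a custom cache directory (remoting's -jar-cache launch option, if I read the launcher code right) would need a different path.

    import hudson.FilePath
    import jenkins.model.Jenkins
    import jenkins.security.MasterToSlaveCallable

    // Runs on the agent side and returns the agent user's home directory.
    class HomeDir extends MasterToSlaveCallable<String, RuntimeException> {
        String call() { System.getProperty('user.home') }
    }

    Jenkins.get().computers.each { c ->
        def ch = c.channel
        // Skip offline agents and the built-in node (its channel is a LocalChannel).
        if (!(ch instanceof hudson.remoting.Channel)) return
        def cache = new FilePath(ch, "${ch.call(new HomeDir())}/.jenkins/cache/jars")
        long bytes = 0
        if (cache.exists()) {
            cache.list('**/*.jar').each { bytes += it.length() }
        }
        println String.format('%-30s %8.1f MB cached', c.name, bytes / (1024 * 1024))
    }

If the per-agent totals line up with the big JARs listed above, that would at least confirm they are being transferred over the channel.

Thanks so much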
On Monday, 10 February 2020 16:24:20 UTC, Oleg Nenashev wrote:
>
> From what I see, many channels are blocked by classloading, e.g. the one below. Classloading might be a long operation, especially when massive classloading saturates your master's network. Just check whether you send big JARs over the network, e.g. the AWS SDK (~100 MB) or similarly XXXL ones.
>
> Computer.threadPoolForRemoting [#484] for obs11-linux-pzz7g1 id=23
> Stack Trace is:
> java.lang.Thread.State: RUNNABLE
> at java.net.SocketOutputStream.socketWrite0(Native Method)
> at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
> at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
> at com.trilead.ssh2.crypto.cipher.CipherOutputStream.internal_write(CipherOutputStream.java:52)
> at com.trilead.ssh2.crypto.cipher.CipherOutputStream.writeBlock(CipherOutputStream.java:101)
> at com.trilead.ssh2.crypto.cipher.CipherOutputStream.write(CipherOutputStream.java:118)
> at com.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:179)
> at com.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:107)
> at com.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:690)
> - locked <0x00000006d864a0e8> (a java.lang.Object)
> at com.trilead.ssh2.channel.ChannelManager.sendData(ChannelManager.java:429)
> - locked <0x00000006d9603858> (a java.lang.Object)
> at com.trilead.ssh2.channel.ChannelOutputStream.write(ChannelOutputStream.java:63)
> at hudson.remoting.ChunkedOutputStream.sendFrame(ChunkedOutputStream.java:90)
> at hudson.remoting.ChunkedOutputStream.drain(ChunkedOutputStream.java:85)
> at hudson.remoting.ChunkedOutputStream.write(ChunkedOutputStream.java:54)
> at java.io.OutputStream.write(OutputStream.java:75)
> at hudson.remoting.ChunkedCommandTransport.writeBlock(ChunkedCommandTransport.java:45)
> at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.write(AbstractSynchronousByteArrayCommandTransport.java:46)
> at hudson.remoting.Channel.send(Channel.java:721)
> - locked <0x00000006d9602dd8> (a hudson.remoting.Channel)
> at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:143)
> - locked <0x00000007b7efa6d0> (a hudson.remoting.ProxyOutputStream)
> at hudson.remoting.RemoteOutputStream.write(RemoteOutputStream.java:108)
> at hudson.remoting.Util.copy(Util.java:56)
> at hudson.remoting.JarLoaderImpl.writeJarTo(JarLoaderImpl.java:56)
> at sun.reflect.GeneratedMethodAccessor513.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:929)
> at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:903)
> at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:855)
> at hudson.remoting.UserRequest.perform(UserRequest.java:211)
> at hudson.remoting.UserRequest.perform(UserRequest.java:54)
> at hudson.remoting.Request$2.run(Request.java:369)
> at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
> at org.jenkinsci.remoting.CallableDecorator.call(CallableDecorator.java:18)
> at hudson.remoting.CallableDecoratorList$1.call(CallableDecoratorList.java:21)
> at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
> at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
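One more data point: to see what each channel is busy with while this happens, I also dump remoting's own channel diagnostics from the script console. A sketch, assuming Channel.dumpDiagnostics is available in the remoting version bundled with this core:

    import jenkins.model.Jenkins

    def sw = new StringWriter()
    def pw = new PrintWriter(sw)
    Jenkins.get().computers.each { c ->
        def ch = c.channel
        if (ch instanceof hudson.remoting.Channel) {   // skips the built-in node's LocalChannel
            pw.println("=== ${c.name} ===")
            ch.dumpDiagnostics(pw)   // pending commands and transport state for this channel
        }
    }
    pw.flush()
    println sw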
>
> On Monday, February 10, 2020 at 2:16:42 PM UTC+1, Victor Martinez wrote:
>>
>> I'm testing the google-compute-engine-plugin and it seems there are a bunch of blocked threads related to the following hudson.remoting methods:
>>
>> - hudson.remoting.Request.call(Request.java:154)
>> - hudson.remoting.Channel.send(Channel.java:716)
>>
>> See some of the stack traces:
>>
>> Computer.threadPoolForRemoting [#414]
>> priority:5 - threadId:0x00007f848800c000 - nativeId:0x1c73 - nativeId (decimal):7283 - state:BLOCKED
>> stackTrace:
>> java.lang.Thread.State: BLOCKED (on object monitor)
>> at hudson.remoting.Request.call(Request.java:154)
>> - waiting to lock <0x00000007b6b39490> (a hudson.remoting.Channel)
>> at hudson.remoting.Channel.call(Channel.java:954)
>> at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:624)
>> at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:418)
>> at com.google.jenkins.plugins.computeengine.ComputeEngineComputerLauncher.launch(ComputeEngineComputerLauncher.java:318)
>> at com.google.jenkins.plugins.computeengine.ComputeEngineComputerLauncher.launch(ComputeEngineComputerLauncher.java:224)
>> at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:290)
>> at hudson.slaves.SlaveComputer$$Lambda$378/348978491.call(Unknown Source)
>> at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
>> at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>> at java.lang.Thread.run(Thread.java:748)
>>
>> Computer.threadPoolForRemoting [#347] for obs11-linux-r6rpli id=25
>> priority:5 - threadId:0x00007f849427b000 - nativeId:0x1a93 - nativeId (decimal):6803 - state:BLOCKED
>> stackTrace:
>> java.lang.Thread.State: BLOCKED (on object monitor)
>> at hudson.remoting.Channel.send(Channel.java:716)
>> - waiting to lock <0x00000006d82e4570> (a hudson.remoting.Channel)
>> at hudson.remoting.Request$2.run(Request.java:382)
>> at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
>> at org.jenkinsci.remoting.CallableDecorator.call(CallableDecorator.java:18)
>> at hudson.remoting.CallableDecoratorList$1.call(CallableDecoratorList.java:21)
>> at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
>> at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>> at java.lang.Thread.run(Thread.java:748)
>>
>> *Environment:*
>>
>> - Docker Jenkins instance
>> - Jenkins version 2.209
>> - trilead-api 1.0.5
>> - Testing PR: https://github.com/jenkinsci/google-compute-engine-plugin/pull/183
>> - Reported as an issue in the Google Compute Engine plugin: https://github.com/jenkinsci/google-compute-engine-plugin/pull/184
>>
>> I created a few thread dumps and uploaded them to https://fastthread.io/:
>>
>> - https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjAvMDIvMTAvLS1vdXRwdXQtMi5sb2ctLTEyLTEwLTU5OzstLW91dHB1dC0xLmxvZy0tMTItMTAtNTk7Oy0tb3V0cHV0LmxvZy0tMTItMTAtNTk=
>>
>> *Questions:*
>>
>> - Have you seen this particular behaviour?
>> - Any clues how I can debug it further?
>> - Is this expected behaviour in remoting itself, or a kind of bug?
>>
>> Thanks
