Thanks everyone, it's working now (see below for details). kuisathaverat, 
these agents have 96GB total RAM. Thanks for the explanation. Our builds 
are very RAM intensive, and I had mistakenly assumed that the builds ran 
inside the remoting java process. Sounds like you're saying that in this 
case there's no reason to give the agent JVM so much RAM. The Cloudbees JVM 
Best Practices page 
<https://docs.cloudbees.com/docs/admin-resources/latest/jvm-troubleshooting/#recommended-options>
indicates that the default min/max heap sizes are 1/64 and 1/4 of physical 
RAM respectively, but that both cap out at 1GB. So, before I set these 
options, my agents should effectively have been using 1GB/1GB for min/max. 
As for the other options I'm setting on the agents, they are the same 
options recommended by the page linked above (which I'm also using on the 
master/controller). Do these not apply to agents as well as 
masters/controllers?
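
For reference, here is a small sketch of the arithmetic behind those default fractions on a 96GB machine. The function name `heap_defaults_mb` is made up for illustration, and the fractions are simply the ones quoted from the page above. The values a given JVM actually selects may differ, so the commented command at the end is the more reliable check.

```shell
#!/bin/sh
# Hypothetical helper: print the initial and maximum heap sizes (in MB)
# implied by the 1/64 and 1/4 fractions, before any cap is applied.
heap_defaults_mb() {
    total_mb=$1
    echo "$((total_mb / 64)) $((total_mb / 4))"
}

# 96GB expressed in MB
heap_defaults_mb $((96 * 1024))   # prints "1536 24576"

# To see what the JVM on the agent really selected (requires java on PATH):
#   java -XX:+PrintFlagsFinal -version | grep -E 'InitialHeapSize|MaxHeapSize'
```

If the agent JVM is meant to stay small, as suggested later in the thread, explicit `-Xms`/`-Xmx` values in the JVM Options field would override these defaults either way.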

Also, on the agent machine, my <JENKINS_HOME>/support/all*.log and 
<JENKINS_HOME>/remoting/logs/* files are still empty. Any suggestions on 
how to get more logging on the agents?

I didn't have GC or other logging enabled, so I'm still not sure what the 
catastrophic problem was. It might not be a Java problem at all, since 
syslog shows no problems with the jenkins remoting process. These are 
VMware machines, and they just stop themselves, so it seems like a kernel 
panic or something similar. I have them auto-restarting now, and the 
problem seems intermittent.
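
One way to narrow this down might be to look at the kernel messages from the boot before the crash. The sketch below assumes a systemd machine with persistent journald storage; the helper name `kernel_trouble` and the list of patterns are my own choices, not anything from Jenkins or VMware. A hard VM stop may leave nothing in the log at all, so an empty result does not rule out a kernel problem.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and keep only the lines that
# mention the OOM killer or a kernel panic.
kernel_trouble() {
    grep -i -E 'out of memory|oom-killer|oom_reaper|kernel panic'
}

# Typical use after the machine has come back up
# (-k: kernel messages only, -b -1: the previous boot):
#   journalctl -k -b -1 | kernel_trouble
```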

I think jvmOptions is working as expected now. Earlier I had only rebooted 
the agents and restarted the jenkins service on the master/controller 
machine, without rebooting the machine itself. So apparently the change I 
made required a reboot of the master/controller. Now, signing into the 
agent and looking at the java process for jenkins remoting, I can see that 
all the specified args are there:

```
jenkins@jenkins-testing-agent-1:~$ ps aux | grep java
jenkins 2733 5.1 70.4 73509096 69794284 ? Ssl 11:19 0:26 java -Dhudson.slaves.WorkspaceList=- 
-Dorg.apache.commons.jelly.tags.fmt.timeZone=America/Vancouver -Xmx64g 
-Xms64g -XX:+AlwaysPreTouch -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/home/jenkins/.jenkins/support -XX:+UseG1GC 
-XX:+UseStringDeduplication -XX:+ParallelRefProcEnabled 
-XX:+DisableExplicitGC -XX:+UnlockDiagnosticVMOptions 
-XX:+UnlockExperimentalVMOptions -verbose:gc 
-Xlog:gc:/home/jenkins/.jenkins/support/gc-%t.log -XX:+PrintGC 
-XX:+PrintGCDetails -XX:ErrorFile=/hs_err_%p.log -XX:+LogVMOutput 
-XX:LogFile=/home/jenkins/.jenkins/support/jvm.log -jar remoting.jar 
-workDir /home/jenkins/.jenkins -jar-cache 
/home/jenkins/.jenkins/remoting/jarCache
```
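
To pull just the heap flags out of that long command line, something like the following sketch could be used. The helper name `heap_flags` is hypothetical; `ps -o args=` and `pgrep -f` are the standard procps tools. As an aside, the RSS column above (69794284 KB, roughly 66.6GB) looks consistent with `-Xms64g` plus `-XX:+AlwaysPreTouch` committing the whole heap at startup, though I have not confirmed that on this machine.

```shell
#!/bin/sh
# Hypothetical helper: read a command line on stdin, split it into one
# argument per line, and keep only the -Xms / -Xmx heap-size flags.
heap_flags() {
    tr -s ' ' '\n' | grep -E '^-Xm[sx]'
}

# Against the running agent (pgrep -f matches on the full command line):
#   ps -o args= -p "$(pgrep -f remoting.jar | head -n 1)" | heap_flags
```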

I am also now seeing garbage collection logs in support/ as configured:

```
jenkins@jenkins-testing-agent-1:~$ ls -la .jenkins/support/
total 32
drwxr-xr-x 2 jenkins jenkins 4096 Sep 23 11:20 .
drwxrwxr-x 6 jenkins jenkins 4096 Sep 16 00:27 ..
-rw-r--r-- 1 jenkins jenkins    0 Sep 22 11:01 all_2020-09-22_18.01.37.log
-rw-r--r-- 1 jenkins jenkins    0 Sep 22 11:03 all_2020-09-22_18.03.01.log
-rw-r--r-- 1 jenkins jenkins    0 Sep 22 13:04 all_2020-09-22_20.04.15.log
-rw-r--r-- 1 jenkins jenkins    0 Sep 22 15:17 all_2020-09-22_22.17.09.log
-rw-r--r-- 1 jenkins jenkins    0 Sep 22 15:32 all_2020-09-22_22.32.14.log
-rw-r--r-- 1 jenkins jenkins    0 Sep 22 15:56 all_2020-09-22_22.56.18.log
-rw-r--r-- 1 jenkins jenkins 1078 Sep 23 11:18 all_2020-09-23_18.04.43.log
-rw-r--r-- 1 jenkins jenkins    0 Sep 23 11:20 all_2020-09-23_18.20.07.log
-rw-r--r-- 1 jenkins jenkins  194 Sep 23 11:04 gc-2020-09-23_11-04-04.log
-rw-r--r-- 1 jenkins jenkins  194 Sep 23 11:04 gc-2020-09-23_11-04-24.log
-rw-r--r-- 1 jenkins jenkins  194 Sep 23 11:19 gc-2020-09-23_11-19-32.log
-rw-r--r-- 1 jenkins jenkins  546 Sep 23 11:22 gc-2020-09-23_11-19-50.log
-rw-r--r-- 1 jenkins jenkins 4096 Sep 23 11:20 jvm.log
```
On Wednesday, September 23, 2020 at 10:36:20 AM UTC-7 naresh....@gmail.com 
wrote:

> I think to have those updated settings applied correctly we need to 
> disconnect and launch those agents again instead of just bringing those 
> offline and online, just checking to make sure that we are not missing 
> anything there. 
>
> On Wednesday, September 23, 2020 at 12:01:46 PM UTC-5 kuisat...@gmail.com 
> wrote:
>
>> How much memory do those agents have? You set "-Xmx64g -Xms64g" for the 
>> remoting process (not for builds), so your agent has to have more than 
>> 64GB of RAM to run any build on it. You grab 64GB only for the remoting 
>> process, and this RAM should be enough to run your builds. The remoting 
>> agent usually does not need more than 256-512MB, which keeps the rest of 
>> your agent's memory for builds. Agents rarely need JVM options to tune 
>> memory; the default configuration is enough. The only case where I would 
>> recommend passing JVM options is to limit the memory of the agent process.
>>
>> The jvmOptions field should work, as it is covered by unit tests; if it 
>> does not, that is an issue. Which versions of Jenkins and the SSH Build 
>> Agents plugin do you use?
>>
>> On Wednesday, September 23, 2020 at 1:21:28 UTC+2, 
>> timb...@gmail.com wrote:
>>
>>> I'm using ssh-slaves-plugin 
>>> <https://github.com/jenkinsci/ssh-slaves-plugin> to configure and 
>>> launch 2 ssh agents, and I've specified several java options in these 
>>> agents' config (see photo and text list below), but when these agents are 
>>> launched, the agents' log still shows empty jvmOptions in the ssh launcher 
>>> call. Agent Log excerpt:
>>>
>>> SSHLauncher{host='jenkins-testing-agent-1', port=22, 
>>> credentialsId='jenkins_user_on_linux_agent', *jvmOptions=''*, 
>>> javaPath='', prefixStartSlaveCmd='', suffixStartSlaveCmd='', 
>>> launchTimeoutSeconds=30, maxNumRetries=20, retryWaitTime=10, 
>>> sshHostKeyVerificationStrategy=hudson.plugins.sshslaves.verifiers.NonVerifyingKeyVerificationStrategy, 
>>> tcpNoDelay=true, trackCredentials=true} 
>>> [09/22/20 15:56:12] [SSH] Opening SSH connection to 
>>> jenkins-testing-agent-1:22. 
>>> [09/22/20 15:56:16] [SSH] WARNING: SSH Host Keys are not being verified. 
>>> Man-in-the-middle attacks may be possible against this connection. 
>>> [09/22/20 15:56:16] [SSH] Authentication successful. 
>>> [09/22/20 15:56:16] [SSH] The remote user's environment is: 
>>> BASH=/usr/bin/bash
>>> .
>>> .
>>> .
>>> [SSH] java -version returned 11.0.8. 
>>> [09/22/20 15:56:16] [SSH] Starting sftp client.
>>> [09/22/20 15:56:16] [SSH] Copying latest remoting.jar...
>>> Source agent hash is 0146753DA5ED62106734D59722B1FA2C. Installed agent hash is 0146753DA5ED62106734D59722B1FA2C
>>> Verified agent jar. No update is necessary.
>>> Expanded the channel window size to 4MB
>>> [09/22/20 15:56:16] [SSH] Starting agent process: cd "/home/jenkins/.jenkins" && java -jar remoting.jar -workDir /home/jenkins/.jenkins -jar-cache /home/jenkins/.jenkins/remoting/jarCache
>>> Sep 22, 2020 3:56:17 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
>>> INFO: Using /home/jenkins/.jenkins/remoting as a remoting work directory
>>> Sep 22, 2020 3:56:17 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
>>> INFO: Both error and output logs will be printed to /home/jenkins/.jenkins/remoting
>>> <===[JENKINS REMOTING CAPACITY]===>channel started
>>> Remoting version: 4.2
>>> This is a Unix agent
>>> WARNING: An illegal reflective access operation has occurred
>>> WARNING: Illegal reflective access by jenkins.slaves.StandardOutputSwapper$ChannelSwapper to constructor java.io.FileDescriptor(int)
>>> WARNING: Please consider reporting this to the maintainers of jenkins.slaves.StandardOutputSwapper$ChannelSwapper
>>> WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
>>> WARNING: All illegal access operations will be denied in a future release
>>> Evacuated stdout
>>> Agent successfully connected and online
>>>
>>>
>>> [image: jenkins-ssh-agent-config.PNG]
>>>
>>> This is the full text in the "JVM Options" field for 
>>> jenkins-testing-agent-1 and 2:
>>>
>>> -Dhudson.slaves.WorkspaceList=- 
>>> -Dorg.apache.commons.jelly.tags.fmt.timeZone=America/Vancouver -Xmx64g 
>>> -Xms64g -XX:+AlwaysPreTouch -XX:+HeapDumpOnOutOfMemoryError 
>>> -XX:HeapDumpPath=/home/jenkins/.jenkins/support -XX:+UseG1GC 
>>> -XX:+UseStringDeduplication -XX:+ParallelRefProcEnabled 
>>> -XX:+DisableExplicitGC -XX:+UnlockDiagnosticVMOptions 
>>> -XX:+UnlockExperimentalVMOptions -verbose:gc 
>>> -Xlog:gc:/home/jenkins/.jenkins/support/gc-%t.log -XX:+PrintGC 
>>> -XX:+PrintGCDetails -XX:ErrorFile=/hs_err_%p.log -XX:+LogVMOutput 
>>> -XX:LogFile=/home/jenkins/.jenkins/support/jvm.log
>>>
>>> I am having intermittent catastrophic failures of these agent machines 
>>> during builds and am trying to properly configure java settings per 
>>> Cloudbees best practices, but I cannot seem to get off the ground here. 
>>> Another problem in my agents that's probably related is that the agent-side 
>>> (remoting) logs are all zero bytes.
>>>
>>> Thanks for your help.
>>>
>>
