Github user rhtyd commented on the issue:
https://github.com/apache/cloudstack/pull/1549
@swill may I have the mgmt server and agent logs from when the failures were
observed? This is to make sure the issue is not specific to your environment.
I'll also need the JRE version in use (OpenJDK or Oracle JDK, and which version
specifically). If possible, can you also take a heap dump of both the mgmt
server and the agent and share them with me (run
jmap -dump:file=heap.bin <pid of the java process>, then gzip and scp the
resulting bin file and share it somewhere)?
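If it helps when collecting the JRE details, here is a minimal sketch that prints the vendor, VM name and version from the standard system properties. The class name JreInfo is made up for illustration. It only describes the JVM that runs it, so it would need to be launched with the same java binary as the mgmt server or agent.

```java
public class JreInfo {
    // Builds a one-line description from standard JVM system properties.
    public static String describe() {
        return System.getProperty("java.vendor") + " / "
                + System.getProperty("java.vm.name") + " / "
                + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        // Prints the details of the JVM running this class, not of any other process.
        System.out.println(describe());
    }
}
```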
"NIO SSL agent not connecting. when I telnet to 8250, the agent immediately
came up without me having to restart it." -- this is something I've fixed in
latest master (using a timeout on selectors). Can you ask them whether they
are using latest master?
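To illustrate the general idea of a timeout on selectors, here is a minimal sketch using plain java.nio. It is not the actual CloudStack change. The class and method names (SelectorTimeoutSketch, pollOnce) and the 200 ms value are assumptions made for the example. Selector.select() with no argument blocks until a channel is ready or the selector is woken up, whereas select(timeout) returns after the timeout even when nothing is ready, so the loop gets a chance to run again.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

public class SelectorTimeoutSketch {
    // Waits at most timeoutMillis (must be > 0) and returns the number of ready keys.
    public static int pollOnce(Selector selector, long timeoutMillis) throws IOException {
        // select(long) returns once the timeout elapses, even if no channel is ready;
        // select() without arguments would block until an event or wakeup().
        return selector.select(timeoutMillis);
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            // Bind to an ephemeral local port; nobody connects in this demo.
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            long start = System.nanoTime();
            int ready = pollOnce(selector, 200); // hypothetical timeout value
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("ready=" + ready + " elapsedMs=" + elapsedMs);
        }
    }
}
```

With no incoming connection, the call comes back with zero ready keys after roughly the timeout instead of waiting indefinitely.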
We've seen this fix deployed in a very large environment with 1000s of
hosts and I've not heard of any problems from them. We've not had any issues
reported on the MLs so far either; I would appreciate it if the people who are
experiencing issues could share them on public channels. Thanks.