[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352490#comment-15352490
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:
--------------------------------------------

Github user rhtyd commented on the issue:

    https://github.com/apache/cloudstack/pull/1549
  
    @swill may I have the mgmt server and agent logs from when the failures 
were intercepted? This is to make sure the issue is not specific to your 
environment. I'll also need the JRE in use (OpenJDK or Oracle JDK, and which 
version specifically). If possible, can you also take a heap dump of both the 
mgmt server and the agent? Run jmap -dump:file=heap.bin <pid of the java 
process>, then gzip and scp the resulting bin file and share it somewhere I can 
fetch it.
    
    "NIO SSL agent not connecting. when I telnet to 8250, the agent immediately 
came up without me having to restart it." -- this is something I've fixed in 
latest master (using a timeout on selectors). Can you ask them whether they are 
using latest master?
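
For readers unfamiliar with the selector timeout mentioned above, here is a minimal Java sketch of the general idea. It is not the code from the pull request: the class name, the helper `runLoop`, and the timeout value are all hypothetical. It only shows that `Selector.select(long)` returns after the timeout even when no channel is ready, so an IO loop built on it wakes periodically instead of waiting for an external event such as a new connection.

```java
import java.io.IOException;
import java.nio.channels.Selector;

public class SelectorTimeoutSketch {
    // Hypothetical helper: runs a bare select loop for a fixed number of
    // iterations and returns how many times select() returned.
    static int runLoop(long timeoutMillis, int iterations) throws IOException {
        int wakeups = 0;
        try (Selector selector = Selector.open()) {
            for (int i = 0; i < iterations; i++) {
                // Unlike select() with no argument, this returns after at most
                // timeoutMillis even if no registered channel becomes ready.
                selector.select(timeoutMillis);
                wakeups++;
                // A real loop would process the selected keys here.
                selector.selectedKeys().clear();
            }
        }
        return wakeups;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("wakeups: " + runLoop(50, 3));
    }
}
```

No channels are registered in this sketch, so every return from `select` is caused by the timeout alone.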
    
    We've seen this fix deployed in a very large environment with 1000s of 
hosts, and I've not heard of any issues from them. Nothing has been reported on 
the MLs so far either, so I would appreciate it if people who are experiencing 
issues could share them on public channels. Thanks.


> CloudStack Server degrades when a lot of connections on port 8250
> -----------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9348
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>            Reporter: Rohit Yadav
>            Assignee: Rohit Yadav
>             Fix For: 4.9.0
>
>
> An intermittent issue was found in a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept(), which does a blocking SSL handshake. 
> A good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads or a thread pool. This way the main IO loop won't be blocked 
> and can continue to serve other agents/clients.
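
The approach proposed in the description can be sketched roughly as follows. This is an illustration under stated assumptions, not the CloudStack implementation: `HandshakeOffloadSketch`, `slowHandshake`, and `serve` are hypothetical names, the pool size is arbitrary, and the SSL handshake is replaced by a 100 ms sleep so the example stays self-contained. The accept loop only accepts the connection and hands the expensive step to a worker pool, so it is free to accept the next agent straight away.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class HandshakeOffloadSketch {
    // Stand-in for an expensive, blocking SSL handshake (assumption: a sleep).
    static void slowHandshake(SocketChannel client) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Accepts `expected` local connections, offloading each handshake to a
    // worker pool. Returns the number of handshakes that completed.
    static int serve(int expected) throws Exception {
        ExecutorService handshakePool = Executors.newFixedThreadPool(4);
        AtomicInteger completed = new AtomicInteger();
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

            // Simulated agents connecting to the server.
            SocketChannel[] agents = new SocketChannel[expected];
            for (int i = 0; i < expected; i++) {
                agents[i] = SocketChannel.open(new InetSocketAddress("127.0.0.1", port));
            }

            int accepted = 0;
            while (accepted < expected) {
                selector.select(500); // bounded wait, never blocks forever
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (!key.isAcceptable()) continue;
                    SocketChannel client;
                    // Non-blocking accept returns null once the queue is drained.
                    while ((client = server.accept()) != null) {
                        accepted++;
                        final SocketChannel c = client;
                        // The IO loop does not wait for the handshake.
                        handshakePool.submit(() -> {
                            slowHandshake(c);
                            completed.incrementAndGet();
                            try { c.close(); } catch (IOException ignored) { }
                        });
                    }
                }
            }
            for (SocketChannel agent : agents) agent.close();
        } finally {
            handshakePool.shutdown();
            handshakePool.awaitTermination(10, TimeUnit.SECONDS);
        }
        return completed.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("handshakes completed: " + serve(8));
    }
}
```

With the handshake done inline in the accept loop, eight agents would be served one after another. Here the loop accepts all of them immediately and the pool works through the handshakes four at a time.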



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
