[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013507#comment-16013507 ]

ASF subversion and git services commented on CLOUDSTACK-9348:

Commit 02311c8bbe85210fae047ca57ff8322096fb0edf in cloudstack's branch refs/heads/4.9 from [~marcaurele]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=02311c8 ]

Activate NioTest following changes in CLOUDSTACK-9348 PR #1549

The first PR #1493 re-enabled the NioTest, but the new PR #1549 did not.

Signed-off-by: Marc-Aurèle Brothier

> CloudStack Server degrades when a lot of connections on port 8250
> -----------------------------------------------------------------
>
>                Key: CLOUDSTACK-9348
>                URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
>            Project: CloudStack
>         Issue Type: Bug
>     Security Level: Public (Anyone can view this level - this is the default.)
>           Reporter: Rohit Yadav
>           Assignee: Rohit Yadav
>            Fix For: 4.9.0
>
> An intermittent issue was found with a large CloudStack deployment, where servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept(), which does a blocking SSL handshake. A good fix would be to make this non-blocking and handle expensive tasks in a separate thread pool. This way the main IO loop won't be blocked and can continue to serve other agents/clients.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
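The fix proposed in the description above (a non-blocking accept loop that hands the expensive SSL handshake to a worker pool) can be sketched roughly as follows. This is a minimal illustration under that idea only, not CloudStack's actual NioConnection code; the class name, `pollOnce`, and the `doSslHandshake` placeholder are all hypothetical:

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: the IO loop only accepts sockets; the expensive SSL
// handshake runs on a worker pool so the accept path never stalls the loop.
public class NonBlockingAcceptSketch {
    private final ExecutorService handshakeWorkers = Executors.newFixedThreadPool(4);

    // One iteration of the IO loop: wait (bounded), hand off any accepted
    // sockets to the pool, and return how many connections were accepted.
    public int pollOnce(Selector selector, ServerSocketChannel server) throws IOException {
        int accepted = 0;
        selector.select(1000); // bounded wait so the loop can also notice other work
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            if (key.isAcceptable()) {
                SocketChannel client = server.accept();
                if (client != null) {
                    client.configureBlocking(false);
                    // Do NOT handshake on this thread: push it to the pool so
                    // the IO loop stays free to serve other agents/clients.
                    handshakeWorkers.submit(() -> doSslHandshake(client));
                    accepted++;
                }
            }
        }
        return accepted;
    }

    private void doSslHandshake(SocketChannel client) {
        // Placeholder for the SSLEngine-driven handshake, and for registering
        // the channel for OP_READ once the handshake completes.
    }

    public void shutdown() {
        handshakeWorkers.shutdown();
    }
}
```

The key property is that a slow or stalled handshake only ties up a pool thread, while the selector thread keeps accepting and serving other agents.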
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013511#comment-16013511 ]

ASF subversion and git services commented on CLOUDSTACK-9348:

Commit a933f8d96c5c754f56c97530e58aee5d0e17d979 in cloudstack's branch refs/heads/4.9 from [~rajanik]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=a933f8d ]

Merge pull request #2027 from exoscale/niotest

CLOUDSTACK-9918: Activate NioTest following changes in CLOUDSTACK-9348 PR #1549
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013592#comment-16013592 ]

ASF subversion and git services commented on CLOUDSTACK-9348:

Commit a933f8d96c5c754f56c97530e58aee5d0e17d979 in cloudstack's branch refs/heads/master from [~rajanik]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=a933f8d ]

Merge pull request #2027 from exoscale/niotest

CLOUDSTACK-9918: Activate NioTest following changes in CLOUDSTACK-9348 PR #1549
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013610#comment-16013610 ]

ASF subversion and git services commented on CLOUDSTACK-9348:

Commit 8b3cadb55eefeacd310f97aefbb91276e4ee8b43 in cloudstack's branch refs/heads/master from [~rajanik]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=8b3cadb ]

Merge release branch 4.9 to master

* 4.9:
  - Do not set gateway to 0.0.0.0 for windows clients
  - CLOUDSTACK-9904: Fix log4j to have @AGENTLOG@ replaced
  - Ignore bogus default gateway: when a shared network is secondary, the default gateway gets overwritten by a bogus one; dnsmasq does the right thing and replaces it with its own default, which is not good for us, so check for '0.0.0.0'
  - Activate NioTest following changes in CLOUDSTACK-9348 PR #1549
  - CLOUDSTACK-9828: GetDomRVersionCommand fails to get the correct version as output; the fix returns the output as a single command instead of appending output from two commands
  - CLOUDSTACK-3223: Exception observed while creating CPVM in VMware setup with DVS
  - CLOUDSTACK-9787: Fix wrong return value in NetUtils.isNetworkAWithinNetworkB
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275621#comment-16275621 ]

ASF subversion and git services commented on CLOUDSTACK-9348:

Commit e6f32c233e179da5e99318aba8e48146e0ff70c3 in cloudstack's branch refs/heads/debian9-systemvmtemplate from [~rohit.ya...@shapeblue.com]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=e6f32c2 ]

CLOUDSTACK-9348: Improve Nio SSH handshake buffers

Use a holder class to pass buffers; fixes a potential leak.

Signed-off-by: Rohit Yadav
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16284057#comment-16284057 ]

ASF subversion and git services commented on CLOUDSTACK-9348:

Commit 315cbab08bad56595cfd0766a91da0e447dcf132 in cloudstack's branch refs/heads/debian9-systemvmtemplate from [~rohit.ya...@shapeblue.com]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=315cbab ]

CLOUDSTACK-9348: Improve Nio SSH handshake buffers

Use a holder class to pass buffers; fixes a potential leak.

Signed-off-by: Rohit Yadav
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351857#comment-15351857 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

GitHub user swill commented on the issue: https://github.com/apache/cloudstack/pull/1549

I think I am going to revert this PR. We are still having intermittent issues with it, and I am not confident running a production environment with this in place at this time, so I don't think I can justify leaving it in without us doing some more testing to figure out what is going on. I have had a few reports like this as people test 4.9, which are concerning:

> NIO SSL agent not connecting. When I telnet to 8250, the agent immediately came up without me having to restart it.

I am also still periodically getting the `addHost` issue we thought we had resolved previously. After I revert this, can you create a new PR with this same code so we can start getting more concrete testing on it and start consolidating some logs when it misbehaves?
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352490#comment-15352490 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

GitHub user rhtyd commented on the issue: https://github.com/apache/cloudstack/pull/1549

@swill may I have the mgmt server and agent logs from when the failures were intercepted? This is to make sure it's not an issue specific to your environment. I'll also need the JRE version in use (openjdk or oraclejdk, and which versions specifically?). If possible, can you also take a heap dump and share it with me (run jmap -dump:file=heap.bin , gzip and scp this bin file, and please share it somewhere, for both mgmt server and agent).

"NIO SSL agent not connecting. When I telnet to 8250, the agent immediately came up without me having to restart it." -- this is something I've fixed in latest master (using a timeout on selectors); can you ask them if they are using latest master? We've seen this fix deployed in a very large environment with 1000s of hosts, and I've not heard anything from them. We've not gotten any reports on the MLs so far; I would appreciate it if those who are experiencing issues could share them on public channels. Thanks.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352522#comment-15352522 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

GitHub user rhtyd commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1549#discussion_r68709839

Diff: utils/src/main/java/com/cloud/utils/nio/NioConnection.java

```
@@ -125,7 +125,7 @@ public boolean isStartup() {
     public Boolean call() throws NioConnectionException {
         while (_isRunning) {
             try {
-                _selector.select();
+                _selector.select(1000);
```

@swill this change ^^ means the selector loop never blocks indefinitely, only for at most 1 second. This ensures that reconnections are fast. If anyone experiences the behaviour that telnetting to port 8250 causes the agent to reconnect, then either (1) they are not using latest master, or (2) they did not wait for up to one second (i.e. hit an edge case), in which case we can lower this number to 100-500 milliseconds at the expense of increased CPU usage.
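The distinction discussed above can be demonstrated in isolation: `Selector.select()` with no argument blocks until an event arrives or the selector is woken up, while `select(1000)` returns after at most roughly one second even when nothing happens, letting the loop re-check for pending work such as reconnections. A small sketch (the class and method names are illustrative, not CloudStack code):

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Demonstrates the bounded-wait behaviour of Selector.select(timeout): with no
// registered channels and no wakeup, select(timeout) still returns (with zero
// ready keys) after roughly the timeout, instead of blocking forever the way
// a bare select() would.
public class SelectorTimeoutDemo {
    // Runs one timed select and returns the elapsed time in milliseconds.
    public static long timedSelect(long timeoutMillis) throws IOException {
        try (Selector selector = Selector.open()) {
            long start = System.nanoTime();
            int ready = selector.select(timeoutMillis); // 0 here: nothing is registered
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            assert ready == 0;
            return elapsedMillis;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("select(1000) returned after ~" + timedSelect(1000) + " ms");
    }
}
```

This is also why the 1000ms value is a latency/CPU trade-off: a smaller timeout makes the loop notice pending work sooner but wakes the thread more often.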
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353107#comment-15353107 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

GitHub user swill commented on the issue: https://github.com/apache/cloudstack/pull/1549

This was shared in the "4.9/master Testing Coordination" thread. Simon Weller and his team at ENA have run into this on the latest master in both of their hardware labs while testing master for me in preparation for the 4.9 RC. I am still getting the `addHost` issue periodically, which showed up a lot when this had bigger issues. I am concerned about this in production, to be honest. How long has this been running in production at your client?
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353375#comment-15353375 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

GitHub user rhtyd commented on the issue: https://github.com/apache/cloudstack/pull/1549

@swill close to two months now. Can you add the mgmt server logs, heap dump, and any other dumps which can help me fix the issue? @swill @kiwiflyer can you please open a JIRA issue where you can put these details; without the necessary information we cannot find and fix the issue, or conclude that it was caused by something else. I've built the latest master repository here: http://packages.shapeblue.com/cloudstack/custom/testing

I'll continue this discussion on the ML thread.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358686#comment-15358686 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

GitHub user rhtyd opened a pull request: https://github.com/apache/cloudstack/pull/1601

CLOUDSTACK-9348: Reduce Nio selector wait time

This reduces the Nio loop selector wait time so that the selector checks frequently (as often as every 100ms) and handles any pending connections/tasks. This makes reconnections very quick at the expense of some CPU usage.

/cc @swill @kiwiflyer guys, can you please apply this fix in your env and test whether you're still able to produce any Nio-related error between mgmt server(s) and KVM agent(s) not being able to connect quickly? Please also watch out for any increased CPU usage (there should not be any significant change), in which case we may increase the timeout from 100ms to 200-400ms.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shapeblue/cloudstack nio-aggressive-selector

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cloudstack/pull/1601.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1601

commit 0381b7ea185ef753873594216a67b8d376e3d658
Author: Rohit Yadav
Date: 2016-07-01T09:02:58Z

    CLOUDSTACK-9348: Reduce Nio selector wait time

    This reduces the Nio loop selector wait time so that the selector checks frequently (as often as every 100ms) and handles any pending connections/tasks. This makes reconnections very quick at the expense of some CPU usage.

    Signed-off-by: Rohit Yadav
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360992#comment-15360992 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

GitHub user glennwagner commented on the issue: https://github.com/apache/cloudstack/pull/1601

LGTM. Testing 4.9 master with PR 1601:

1. Original results without the PR: KVM hosts failing to add; the errors in the logs were NIO connection errors.
2. After the PR was applied, all KVM hosts (Ubuntu and CentOS) add correctly and the KVM agents checked in with an UP status. System VMs deployed with no errors.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15361048#comment-15361048 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

GitHub user rhtyd commented on the issue: https://github.com/apache/cloudstack/pull/1601

@glennwagner can you also comment on whether you saw any unusual CPU usage? Thanks for testing this.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370170#comment-15370170 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

GitHub user rhtyd reopened a pull request: https://github.com/apache/cloudstack/pull/1601

CLOUDSTACK-9348: Reduce Nio selector wait time
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370169#comment-15370169 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

GitHub user rhtyd closed the pull request at: https://github.com/apache/cloudstack/pull/1601
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379255#comment-15379255 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

GitHub user rhtyd commented on the issue: https://github.com/apache/cloudstack/pull/1601

@swill before you cut the next RC, please include this, as @glennwagner and @PaulAngus found a blocker around the addHost API without this fix. Thanks.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381697#comment-15381697 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the issue: https://github.com/apache/cloudstack/pull/1601

### CI RESULTS

```
Tests Run: 79
Skipped: 0
Failed: 0
Errors: 3
Duration: 7h 54m 18s
```

**Summary of the problem(s):**

```
ERROR: test suite for
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 209, in run
    self.setUp()
  File "/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 292, in setUp
    self.setupContext(ancestor)
  File "/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 315, in setupContext
    try_run(context, names)
  File "/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/util.py", line 471, in try_run
    return func()
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_internal_lb.py", line 296, in setUpClass
    cls.template.download(cls.apiclient)
  File "/usr/lib/python2.7/site-packages/marvin/lib/base.py", line 1350, in download
    elif 'Downloaded' in template.status:
TypeError: argument of type 'NoneType' is not iterable
------------------------------------------------------------
Additional details in: /tmp/MarvinLogs/test_network_DJJ0HC/results.txt
```

```
ERROR: test suite for
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 209, in run
    self.setUp()
  File "/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 292, in setUp
    self.setupContext(ancestor)
  File "/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 315, in setupContext
    try_run(context, names)
  File "/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/util.py", line 471, in try_run
    return func()
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_vpc_vpn.py", line 293, in setUpClass
    cls.template.download(cls.apiclient)
  File "/usr/lib/python2.7/site-packages/marvin/lib/base.py", line 1350, in download
    elif 'Downloaded' in template.status:
TypeError: argument of type 'NoneType' is not iterable
------------------------------------------------------------
Additional details in: /tmp/MarvinLogs/test_network_DJJ0HC/results.txt
```

```
ERROR: test suite for
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 209, in run
    self.setUp()
  File "/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 292, in setUp
    self.setupContext(ancestor)
  File "/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 315, in setupContext
    try_run(context, names)
  File "/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/util.py", line 471, in try_run
    return func()
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_vpc_vpn.py", line 472, in setUpClass
    cls.template.download(cls.apiclient)
  File "/usr/lib/python2.7/site-packages/marvin/lib/base.py", line 1350, in download
    elif 'Downloaded' in template.status:
TypeError: argument of type 'NoneType' is not iterable
------------------------------------------------------------
Additional details in: /tmp/MarvinLogs/test_network_DJJ0HC/results.txt
```

**Associated Uploads**

**`/tmp/MarvinLogs/DeployDataCenter__Jul_15_2016_19_39_28_2AHFM8:`**
* [dc_entries.obj](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1601/tmp/MarvinLogs/DeployDataCenter__Jul_15_2016_19_39_28_2AHFM8/dc_entries.obj)
* [failed_plus_exceptions.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1601/tmp/MarvinLogs/DeployDataCenter__Jul_15_2016_19_39_28_2AHFM8/failed_plus_exceptions.txt)
* [runinfo.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1601/tmp/MarvinLogs/DeployDataCenter__Jul_15_2016_19_39_28_2AHFM8/runinfo.txt)

**`/tmp/MarvinLogs/test_network_DJJ0HC:`**
* [failed_plus_exceptions.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1601/tmp/MarvinLogs/test_network_DJJ0HC/failed_plus_exceptions.txt)
* [re
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381700#comment-15381700 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the issue: https://github.com/apache/cloudstack/pull/1601

These errors are common in this specific environment type (hypervisors nested 3 layers deep). They do not show up when only two hypervisors are nested... This is ready...
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381944#comment-15381944 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rhtyd commented on the issue: https://github.com/apache/cloudstack/pull/1601

Thanks @swill
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382757#comment-15382757 ] ASF subversion and git services commented on CLOUDSTACK-9348:

Commit ea48e95bdd1641c752eb573fe448aac6478cecd1 in cloudstack's branch refs/heads/master from [~williamstev...@gmail.com]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=ea48e95 ]

Merge pull request #1601 from shapeblue/nio-aggressive-selector

CLOUDSTACK-9348: Reduce Nio selector wait time

This reduces the Nio loop selector wait time so that the selector checks frequently (as often as every 100ms per iteration) and handles any pending connections/tasks. This makes reconnections very quick at the expense of some CPU usage.

/cc @swill @kiwiflyer: can you please apply this fix in your env and test whether you're still able to reproduce any Nio-related error between mgmt server(s) and KVM agent(s) not being able to connect quickly. Please also watch out for any increased CPU usage (there should not be any significant change), in which case we may increase the timeout from 100ms to 200-400ms.

* pr/1601: CLOUDSTACK-9348: Reduce Nio selector wait time

Signed-off-by: Will Stevens
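The effect of the shorter selector wait can be seen with a bare `java.nio` Selector: `select(100)` returns after at most roughly 100 ms even when no channel is ready, so the loop gets frequent opportunities to service queued connect/disconnect work. This is an illustrative sketch of that mechanism, not the patched CloudStack loop itself:

```java
import java.io.IOException;
import java.nio.channels.Selector;

public class SelectorWaitSketch {
    static int wakeups;

    public static void main(String[] args) throws IOException {
        wakeups = 0;
        Selector selector = Selector.open();
        long deadline = System.nanoTime() + 550_000_000L; // run for ~550 ms
        while (System.nanoTime() < deadline) {
            // With no registered channels this simply times out after
            // at most ~100 ms, returning control to the loop; pending
            // connection/task work could be handled here on each pass.
            selector.select(100);
            wakeups++;
        }
        selector.close();
        System.out.println("selector woke up " + wakeups + " times in ~550 ms");
    }
}
```

With a longer timeout the loop would sit blocked in select() between events, which is why reconnections were slow before this change; the cost of the short timeout is the small amount of CPU spent on the extra wakeups.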
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382763#comment-15382763 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user asfgit closed the pull request at: https://github.com/apache/cloudstack/pull/1601
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241017#comment-15241017 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user bhaisaab commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-209896417

- Tested against KVM, mgmt server - KVM links and clustered management server
- NioTest modified to have multiple clients against a server instance with just one worker, and 10 malicious clients (they simply do a secure connect to the server and don't do anything else) trying to connect to the server per valid client
- Ran Marvin smoke tests successfully against KVM
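The "malicious client" idea above, opening a TCP connection and then staying silent so that a blocking handshake would hang the IO loop, can be reproduced with plain NIO channels. This is a standalone sketch of the trick, not the actual NioTest code; the class name and counts are illustrative.

```java
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.List;

public class MaliciousClientSketch {
    static int connectedCount;

    public static void main(String[] args) throws Exception {
        connectedCount = 0;
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        // "Malicious" clients: complete the TCP connect, then send nothing.
        // Against the old blocking accept(), each one would stall the main
        // IO loop waiting for an SSL handshake that never arrives.
        List<SocketChannel> idlers = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            SocketChannel c = SocketChannel.open();
            c.connect(new InetSocketAddress("127.0.0.1", port));
            idlers.add(c);
        }
        for (int i = 0; i < 10; i++) {
            server.accept(); // TCP accept succeeds; the handshake would hang
            connectedCount++;
        }
        for (SocketChannel c : idlers) {
            c.close();
        }
        server.close();
        System.out.println("idle connections accepted: " + connectedCount);
    }
}
```

With the fix, such stalled handshakes only tie up worker-pool threads, so the valid clients in the test can still complete their exchanges.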
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241740#comment-15241740 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user bhaisaab commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-210103368

I've created two commits to show: (1) a test to prove the denial-of-service behavior due to blocking the main IO loop, (2) the fix (as mentioned earlier, the long-term fix would require migration to a better framework).
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241906#comment-15241906 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user GabrielBrascher commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59790778

--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
@@ -19,146 +19,198 @@
 package com.cloud.utils.testcase;

-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;

-/**
- *
- *
- *
- *
- */
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+public class NioTest {

-public class NioTest extends TestCase {
+    private static final Logger LOGGER = Logger.getLogger(NioTest.class);

-    private static final Logger s_logger = Logger.getLogger(NioTest.class);
+    final private int totalTestCount = 10;
+    private int completedTestCount = 0;

-    private NioServer _server;
-    private NioClient _client;
+    private NioServer server;
+    private List clients = new ArrayList<>();
+    private List maliciousClients = new ArrayList<>();

-    private Link _clientLink;
+    private ExecutorService clientExecutor = Executors.newFixedThreadPool(totalTestCount, new NamedThreadFactory("NioClientHandler"));;
+    private ExecutorService maliciousExecutor = Executors.newFixedThreadPool(5*totalTestCount, new NamedThreadFactory("MaliciousNioClientHandler"));;

-    private int _testCount;
-    private int _completedCount;
+    private Random randomGenerator = new Random();
+    private byte[] testBytes;

     private boolean isTestsDone() {
         boolean result;
         synchronized (this) {
-            result = _testCount == _completedCount;
+            result = totalTestCount == completedTestCount;
         }
         return result;
     }

-    private void getOneMoreTest() {
-        synchronized (this) {
-            _testCount++;
-        }
-    }
-
     private void oneMoreTestDone() {
         synchronized (this) {
-            _completedCount++;
+            completedTestCount++;
         }
     }

-    @Override
+    @Before
     public void setUp() {
-        s_logger.info("Test");
+        LOGGER.info("Setting up Benchmark Test");

-        _testCount = 0;
-        _completedCount = 0;
-
-        _server = new NioServer("NioTestServer", , 5, new NioTestServer());
-        try {
-            _server.start();
-        } catch (final NioConnectionException e) {
-            fail(e.getMessage());
-        }
+        completedTestCount = 0;
+        testBytes = new byte[100];
+        randomGenerator.nextBytes(testBytes);

-        _client = new NioClient("NioTestServer", "127.0.0.1", , 5, new NioTestClient());
+        // Server configured with one worker
+        server = new NioServer("NioTestServer", , 1, new NioTestServer());
         try {
-            _client.start();
+            server.start();
         } catch (final NioConnectionException e) {
-            fail(e.getMessage());
+            Assert.fail(e.getMessage());
         }

-        while (_clientLink == null) {
-            try {
-                s_logger.debug("Link is not up! Waiting ...");
-                Thread.sleep(1000);
-            } catch (final InterruptedException e) {
-                // TODO Auto-generated catch block
-                e.printStackTrace();
+        // 5 malicious clients per valid client
+        for (int i = 0; i < totalTestCount; i++) {
+
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241916#comment-15241916 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user bhaisaab commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59791280
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241929#comment-15241929 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59792529
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241953#comment-15241953 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user bhaisaab commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59795416
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242006#comment-15242006 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on a diff in the pull request:
https://github.com/apache/cloudstack/pull/1493#discussion_r59799743
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243107#comment-15243107 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user GabrielBrascher commented on a diff in the pull request:
https://github.com/apache/cloudstack/pull/1493#discussion_r59892756
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243120#comment-15243120 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user GabrielBrascher commented on a diff in the pull request:
https://github.com/apache/cloudstack/pull/1493#discussion_r59894203
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243165#comment-15243165 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user jburwell commented on a diff in the pull request:
https://github.com/apache/cloudstack/pull/1493#discussion_r59900210

--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---
@@ -453,115 +449,192 @@ public static SSLContext initSSLContext(boolean isClient) throws GeneralSecurity
         return sslContext;
     }

-    public static void doHandshake(SocketChannel ch, SSLEngine sslEngine, boolean isClient) throws IOException {
-        if (s_logger.isTraceEnabled()) {
-            s_logger.trace("SSL: begin Handshake, isClient: " + isClient);
+    public static ByteBuffer enlargeBuffer(ByteBuffer buffer, final int sessionProposedCapacity) {
+        if (buffer == null || sessionProposedCapacity < 0) {
+            return buffer;
         }
-
-        SSLEngineResult engResult;
-        SSLSession sslSession = sslEngine.getSession();
-        HandshakeStatus hsStatus;
-        ByteBuffer in_pkgBuf = ByteBuffer.allocate(sslSession.getPacketBufferSize() + 40);
-        ByteBuffer in_appBuf = ByteBuffer.allocate(sslSession.getApplicationBufferSize() + 40);
-        ByteBuffer out_pkgBuf = ByteBuffer.allocate(sslSession.getPacketBufferSize() + 40);
-        ByteBuffer out_appBuf = ByteBuffer.allocate(sslSession.getApplicationBufferSize() + 40);
-        int count;
-        ch.socket().setSoTimeout(60 * 1000);
-        InputStream inStream = ch.socket().getInputStream();
-        // Use readCh to make sure the timeout on reading is working
-        ReadableByteChannel readCh = Channels.newChannel(inStream);
-
-        if (isClient) {
-            hsStatus = SSLEngineResult.HandshakeStatus.NEED_WRAP;
+        if (sessionProposedCapacity > buffer.capacity()) {
+            buffer = ByteBuffer.allocate(sessionProposedCapacity);
         } else {
-            hsStatus = SSLEngineResult.HandshakeStatus.NEED_UNWRAP;
+            buffer = ByteBuffer.allocate(buffer.capacity() * 2);
         }
+        return buffer;
+    }

-        while (hsStatus != SSLEngineResult.HandshakeStatus.FINISHED) {
-            if (s_logger.isTraceEnabled()) {
-                s_logger.trace("SSL: Handshake status " + hsStatus);
+    public static ByteBuffer handleBufferUnderflow(final SSLEngine engine, ByteBuffer buffer) {
+        if (engine == null || buffer == null) {
+            return buffer;
+        }
+        if (buffer.position() < buffer.limit()) {
+            return buffer;
+        }
+        ByteBuffer replaceBuffer = enlargeBuffer(buffer, engine.getSession().getPacketBufferSize());
+        buffer.flip();
+        replaceBuffer.put(buffer);
+        return replaceBuffer;
+    }
+
+    private static boolean doHandshakeUnwrap(final SocketChannel socketChannel, final SSLEngine sslEngine,
+            ByteBuffer peerAppData, ByteBuffer peerNetData, final int appBufferSize) throws IOException {
+        if (socketChannel == null || sslEngine == null || peerAppData == null || peerNetData == null || appBufferSize < 0) {
+            return false;
+        }
+        if (socketChannel.read(peerNetData) < 0) {
+            if (sslEngine.isInboundDone() && sslEngine.isOutboundDone()) {
+                return false;
             }
-            engResult = null;
-            if (hsStatus == SSLEngineResult.HandshakeStatus.NEED_WRAP) {
-                out_pkgBuf.clear();
-                out_appBuf.clear();
-                out_appBuf.put("Hello".getBytes());
-                engResult = sslEngine.wrap(out_appBuf, out_pkgBuf);
-                out_pkgBuf.flip();
-                int remain = out_pkgBuf.limit();
-                while (remain != 0) {
-                    remain -= ch.write(out_pkgBuf);
-                    if (remain < 0) {
-                        throw new IOException("Too much bytes sent?");
-                    }
-                }
-            } else if (hsStatus == SSLEngineResult.HandshakeStatus.NEED_UNWRAP) {
-                in_appBuf.clear();
-                // One packet may contained multiply operation
-                if (in_pkgBuf.position() == 0 || !in_pkgBuf.hasRemaining()) {
-                    in_pkgBuf.clear();
-                    count = 0;
-                    try {
-                        count = readCh.read(in_pkgBuf);
-                    } catch (SocketTimeoutException ex) {
-                        if (s_logger
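The `enlargeBuffer`/`handleBufferUnderflow` pair introduced in the diff is plain `ByteBuffer` arithmetic and can be exercised outside the SSL engine. Below is a standalone sketch of the same logic; the method bodies follow the diff, except that `handleBufferUnderflow` takes the packet-buffer size directly instead of reading it from an `SSLEngine`, so it runs without a TLS session (that signature change is mine, not CloudStack's):

```java
import java.nio.ByteBuffer;

public class BufferGrowthDemo {

    // Same logic as Link.enlargeBuffer: grow to the session's proposed
    // capacity if it is larger, otherwise double the current capacity.
    static ByteBuffer enlargeBuffer(ByteBuffer buffer, final int sessionProposedCapacity) {
        if (buffer == null || sessionProposedCapacity < 0) {
            return buffer;
        }
        if (sessionProposedCapacity > buffer.capacity()) {
            return ByteBuffer.allocate(sessionProposedCapacity);
        }
        return ByteBuffer.allocate(buffer.capacity() * 2);
    }

    // Same logic as Link.handleBufferUnderflow: if the buffer still has room,
    // the underflow just means more bytes are needed on the next read; if it
    // is full, copy its contents into an enlarged replacement buffer.
    static ByteBuffer handleBufferUnderflow(final int packetBufferSize, ByteBuffer buffer) {
        if (buffer == null) {
            return buffer;
        }
        if (buffer.position() < buffer.limit()) {
            return buffer;
        }
        ByteBuffer replaceBuffer = enlargeBuffer(buffer, packetBufferSize);
        buffer.flip();
        replaceBuffer.put(buffer);
        return replaceBuffer;
    }

    public static void main(String[] args) {
        ByteBuffer small = ByteBuffer.allocate(8);
        // Proposed capacity larger than current: allocate exactly that much.
        System.out.println(enlargeBuffer(small, 16).capacity()); // 16
        // Proposed capacity not larger: double the current capacity.
        System.out.println(enlargeBuffer(small, 4).capacity());  // 16

        ByteBuffer full = ByteBuffer.allocate(4);
        full.put(new byte[] {1, 2, 3, 4}); // position == limit, i.e. full
        ByteBuffer grown = handleBufferUnderflow(16, full);
        System.out.println(grown.capacity() + " " + grown.position()); // 16 4
    }
}
```

The design point the diff makes: the old `doHandshake` pre-allocated every buffer at `getPacketBufferSize() + 40` up front, while the new code starts small and grows only when the engine reports overflow or underflow.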
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243167#comment-15243167 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user jburwell commented on a diff in the pull request:
https://github.com/apache/cloudstack/pull/1493#discussion_r59900296
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243166#comment-15243166 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user jburwell commented on a diff in the pull request:
https://github.com/apache/cloudstack/pull/1493#discussion_r59900221
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243293#comment-15243293 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rafaelweingartner commented on a diff in the pull request:
https://github.com/apache/cloudstack/pull/1493#discussion_r59912439
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243296#comment-15243296 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rafaelweingartner commented on a diff in the pull request:
https://github.com/apache/cloudstack/pull/1493#discussion_r59912710

--- Diff: utils/src/test/java/com/cloud/utils/backoff/impl/ConstantTimeBackoffTest.java ---
@@ -94,7 +94,7 @@ public void wakeupNotExisting() {
     @Test
     public void wakeupExisting() throws InterruptedException {
         final ConstantTimeBackoff backoff = new ConstantTimeBackoff();
-        backoff.setTimeToWait(10);
+        backoff.setTimeToWait(1000);
--- End diff --

Is it 1000 seconds or milliseconds? Does it need to be that high?

> CloudStack Server degrades when a lot of connections on port 8250
> -----------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9348
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>            Reporter: Rohit Yadav
>            Assignee: Rohit Yadav
>             Fix For: 4.9.0
>
> An intermittent issue was found with a large CloudStack deployment, where servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does a blocking SSL handshake. A good fix would be to make this non-blocking and handle expensive tasks in separate threads/pool. This way the main IO loop won't be blocked and can continue to serve other agents/clients.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
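The shape of the fix the issue description proposes, keep accept() cheap and push the expensive SSL handshake onto a worker pool so the selector loop never stalls, can be sketched as follows. This is a simplified illustration of the idea, not CloudStack's actual NioConnection code; the class name, pool size, and the stubbed-out handshake body are all mine:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class NonBlockingAcceptSketch {

    // Handshakes run here, off the IO thread.
    private final ExecutorService handshakeWorkers = Executors.newFixedThreadPool(4);

    // Main IO loop: only cheap, non-blocking work happens on this thread.
    void serve(int port) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(port));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (!Thread.currentThread().isInterrupted()) {
            selector.select(250);
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isValid() && key.isAcceptable()) {
                    SocketChannel channel = server.accept(); // non-blocking: may be null
                    if (channel != null) {
                        channel.configureBlocking(false);
                        // A slow or malicious peer now only ties up a worker
                        // thread, never the selector loop itself.
                        handshakeWorkers.submit(() -> doHandshake(channel));
                    }
                }
            }
            selector.selectedKeys().clear();
        }
    }

    // Placeholder for the SSLEngine wrap/unwrap handshake state machine;
    // on success the channel would be registered for OP_READ.
    private void doHandshake(SocketChannel channel) {
    }
}
```

Contrast with the pre-fix code, where the handshake ran inline in the accept path: one peer that never completed its handshake blocked every other agent trying to connect on port 8250.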
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243304#comment-15243304 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user jburwell commented on a diff in the pull request:
https://github.com/apache/cloudstack/pull/1493#discussion_r59913051
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243306#comment-15243306 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rafaelweingartner commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59913073

--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
@@ -19,146 +19,198 @@
 package com.cloud.utils.testcase;

-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;

-/**
- *
- *
- *
- *
- */
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+public class NioTest {

-public class NioTest extends TestCase {
+    private static final Logger LOGGER = Logger.getLogger(NioTest.class);

-    private static final Logger s_logger = Logger.getLogger(NioTest.class);
+    final private int totalTestCount = 10;
+    private int completedTestCount = 0;

-    private NioServer _server;
-    private NioClient _client;
+    private NioServer server;
+    private List<NioClient> clients = new ArrayList<>();
+    private List<NioClient> maliciousClients = new ArrayList<>();

-    private Link _clientLink;
+    private ExecutorService clientExecutor = Executors.newFixedThreadPool(totalTestCount, new NamedThreadFactory("NioClientHandler"));;
+    private ExecutorService maliciousExecutor = Executors.newFixedThreadPool(5*totalTestCount, new NamedThreadFactory("MaliciousNioClientHandler"));;

-    private int _testCount;
-    private int _completedCount;
+    private Random randomGenerator = new Random();
+    private byte[] testBytes;

     private boolean isTestsDone() {
         boolean result;
         synchronized (this) {
-            result = _testCount == _completedCount;
+            result = totalTestCount == completedTestCount;
         }
         return result;
     }

-    private void getOneMoreTest() {
-        synchronized (this) {
-            _testCount++;
-        }
-    }
-
     private void oneMoreTestDone() {
         synchronized (this) {
-            _completedCount++;
+            completedTestCount++;
         }
     }

-    @Override
+    @Before
     public void setUp() {
-        s_logger.info("Test");
+        LOGGER.info("Setting up Benchmark Test");

-        _testCount = 0;
-        _completedCount = 0;
-
-        _server = new NioServer("NioTestServer", , 5, new NioTestServer());
-        try {
-            _server.start();
-        } catch (final NioConnectionException e) {
-            fail(e.getMessage());
-        }
+        completedTestCount = 0;
+        testBytes = new byte[100];
+        randomGenerator.nextBytes(testBytes);

-        _client = new NioClient("NioTestServer", "127.0.0.1", , 5, new NioTestClient());
+        // Server configured with one worker
+        server = new NioServer("NioTestServer", 0, 1, new NioTestServer());
         try {
-            _client.start();
+            server.start();
         } catch (final NioConnectionException e) {
-            fail(e.getMessage());
+            Assert.fail(e.getMessage());
         }

-        while (_clientLink == null) {
-            try {
-                s_logger.debug("Link is not up! Waiting ...");
-                Thread.sleep(1000);
-            } catch (final InterruptedException e) {
-                // TODO Auto-generated catch block
-                e.printStackTrace();
+        // 5 malicious clients per valid client
+        for (int i = 0; i < totalTestCount; i++) {
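The reworked test drives its valid and malicious clients through fixed-size thread pools created with CloudStack's NamedThreadFactory, so worker threads carry recognizable names in thread dumps. A minimal stand-in for that factory (the real one lives in com.cloud.utils.concurrency and may differ in detail) behaves like this:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedPoolDemo {

    // Simplified stand-in for com.cloud.utils.concurrency.NamedThreadFactory:
    // each created thread gets the given prefix plus a running sequence number.
    static class NamedThreadFactory implements ThreadFactory {
        private final String prefix;
        private final AtomicInteger seq = new AtomicInteger(1);

        NamedThreadFactory(final String prefix) {
            this.prefix = prefix;
        }

        @Override
        public Thread newThread(final Runnable r) {
            return new Thread(r, prefix + "-" + seq.getAndIncrement());
        }
    }

    // Submit one task and report the name of the worker thread that ran it.
    public static String firstWorkerName(final String prefix) throws Exception {
        final ExecutorService pool = Executors.newFixedThreadPool(2, new NamedThreadFactory(prefix));
        try {
            return pool.submit(() -> Thread.currentThread().getName()).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstWorkerName("NioClientHandler")); // NioClientHandler-1
    }
}
```

Named pools make it easy to distinguish the "NioClientHandler" workers from the five-times-larger "MaliciousNioClientHandler" pool while the test hammers the server's single IO worker.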
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243310#comment-15243310 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user bhaisaab commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59913436

--- Diff: utils/src/test/java/com/cloud/utils/backoff/impl/ConstantTimeBackoffTest.java ---
@@ -94,7 +94,7 @@ public void wakeupNotExisting() {
     @Test
     public void wakeupExisting() throws InterruptedException {
         final ConstantTimeBackoff backoff = new ConstantTimeBackoff();
-        backoff.setTimeToWait(10);
+        backoff.setTimeToWait(1000);
--- End diff --

I was trying to diagnose why this test was failing on Travis, so added a large value. Removed now.
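The review exchange above concerns ConstantTimeBackoff's timeToWait: with a very small interval (10 ms) the wakeupExisting test races the waiter against the wakeup call, which is presumably why it was flaky on Travis. A simplified wait/notify sketch of a constant-time backoff (the real CloudStack class has a different API surface; this only illustrates the timing contract) looks like:

```java
public class ConstantBackoffDemo {

    private final long timeToWaitMillis;
    private final Object lock = new Object();
    private boolean woken;

    ConstantBackoffDemo(final long timeToWaitMillis) {
        this.timeToWaitMillis = timeToWaitMillis;
    }

    // Block for the configured interval, returning early only if wakeup()
    // is called. The loop guards against spurious wakeups.
    void waitBeforeRetry() throws InterruptedException {
        synchronized (lock) {
            woken = false;
            final long deadlineNanos = System.nanoTime() + timeToWaitMillis * 1_000_000L;
            long remainingMillis = timeToWaitMillis;
            while (!woken && remainingMillis > 0) {
                lock.wait(remainingMillis);
                remainingMillis = (deadlineNanos - System.nanoTime()) / 1_000_000;
            }
        }
    }

    // Called from another thread to cut the backoff short.
    void wakeup() {
        synchronized (lock) {
            woken = true;
            lock.notifyAll();
        }
    }

    // Measure how long waitBeforeRetry() blocks when nobody calls wakeup().
    public static long elapsedWithoutWakeup(final long waitMillis) throws InterruptedException {
        final ConstantBackoffDemo backoff = new ConstantBackoffDemo(waitMillis);
        final long start = System.nanoTime();
        backoff.waitBeforeRetry();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        // With no wakeup, the backoff blocks for roughly the full interval.
        System.out.println(elapsedWithoutWakeup(50) >= 40);
    }
}
```

A 1000 ms interval gives the test's wakeup thread ample time to fire while the waiter is actually inside wait(), whereas 10 ms leaves the outcome timing-dependent.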
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243315#comment-15243315 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rafaelweingartner commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59913697

--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java --- (quotes the same @@ -453,115 +449,192 @@ hunk as the first review comment above; duplicate text omitted)
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243314#comment-15243314 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user bhaisaab commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59913672

--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java --- (quotes the same @@ -19,146 +19,198 @@ hunk as the earlier NioTest.java review comment; duplicate text omitted)
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243323#comment-15243323 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user bhaisaab commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59914101

--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java --- (quotes the same @@ -453,115 +449,192 @@ hunk as the first review comment above; duplicate text omitted)
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243331#comment-15243331 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user bhaisaab commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59915144

--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java --- (quotes the same @@ -19,146 +19,198 @@ hunk as the earlier NioTest.java review comment; duplicate text omitted)
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1524#comment-1524 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user bhaisaab commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59915248

--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java --- (quotes the same @@ -453,115 +449,192 @@ hunk as the first review comment above; duplicate text omitted)
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243335#comment-15243335 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rafaelweingartner commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59915315

--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java --- (quotes the same @@ -19,146 +19,198 @@ hunk as the earlier NioTest.java review comment; duplicate text omitted)
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243337#comment-15243337 ] ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user bhaisaab commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59915385

--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java --- (quotes the same @@ -453,115 +449,192 @@ hunk as the first review comment above; duplicate text omitted)
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243354#comment-15243354 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user bhaisaab commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59917981

--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
```
@@ -19,146 +19,198 @@ package com.cloud.utils.testcase;

-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;

-/**
- *
- *
- *
- *
- */
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+public class NioTest {

-public class NioTest extends TestCase {
+private static final Logger LOGGER = Logger.getLogger(NioTest.class);

-private static final Logger s_logger = Logger.getLogger(NioTest.class);
+final private int totalTestCount = 10;
+private int completedTestCount = 0;

-private NioServer _server;
-private NioClient _client;
+private NioServer server;
+private List clients = new ArrayList<>();
+private List maliciousClients = new ArrayList<>();

-private Link _clientLink;
+private ExecutorService clientExecutor = Executors.newFixedThreadPool(totalTestCount, new NamedThreadFactory("NioClientHandler"));;
+private ExecutorService maliciousExecutor = Executors.newFixedThreadPool(5*totalTestCount, new NamedThreadFactory("MaliciousNioClientHandler"));;

-private int _testCount;
-private int _completedCount;
+private Random randomGenerator = new Random();
+private byte[] testBytes;

 private boolean isTestsDone() {
 boolean result;
 synchronized (this) {
-result = _testCount == _completedCount;
+result = totalTestCount == completedTestCount;
 }
 return result;
 }

-private void getOneMoreTest() {
-synchronized (this) {
-_testCount++;
-}
-}
-
 private void oneMoreTestDone() {
 synchronized (this) {
-_completedCount++;
+completedTestCount++;
 }
 }

-@Override
+@Before
 public void setUp() {
-s_logger.info("Test");
+LOGGER.info("Setting up Benchmark Test");

-_testCount = 0;
-_completedCount = 0;
-
-_server = new NioServer("NioTestServer", , 5, new NioTestServer());
-try {
-_server.start();
-} catch (final NioConnectionException e) {
-fail(e.getMessage());
-}
+completedTestCount = 0;
+testBytes = new byte[100];
+randomGenerator.nextBytes(testBytes);

-_client = new NioClient("NioTestServer", "127.0.0.1", , 5, new NioTestClient());
+// Server configured with one worker
+server = new NioServer("NioTestServer", 0, 1, new NioTestServer());
 try {
-_client.start();
+server.start();
 } catch (final NioConnectionException e) {
-fail(e.getMessage());
+Assert.fail(e.getMessage());
 }

-while (_clientLink == null) {
-try {
-s_logger.debug("Link is not up! Waiting ...");
-Thread.sleep(1000);
-} catch (final InterruptedException e) {
-// TODO Auto-generated catch block
-e.printStackTrace();
+// 5 malicious clients per valid client
+for (int i = 0; i < totalTestCount; i++) {
+f
```
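The `isTestsDone`/`oneMoreTestDone` pair in the test above is a hand-rolled completion counter guarded by `synchronized`. A minimal standalone equivalent (the class and method names here are ours, for illustration only):

```java
public class CompletionCounter {
    private final int total;
    private int completed = 0;

    public CompletionCounter(final int total) {
        this.total = total;
    }

    // Called by each client task once its send/receive round-trip finishes.
    public synchronized void oneMoreDone() {
        completed++;
    }

    // Polled by the test's main thread to decide when to stop waiting.
    public synchronized boolean isDone() {
        return completed == total;
    }

    public static void main(String[] args) {
        final CompletionCounter counter = new CompletionCounter(2);
        System.out.println(counter.isDone()); // false
        counter.oneMoreDone();
        counter.oneMoreDone();
        System.out.println(counter.isDone()); // true
    }
}
```

A `java.util.concurrent.atomic.AtomicInteger` would achieve the same without explicit locking; the patch simply keeps the original synchronized-block style of the old test.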
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243359#comment-15243359 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user bhaisaab commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-210576373 Thanks all for the review, I've updated the commits; please re-review and advise on any other outstanding issues. Thanks again.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243365#comment-15243365 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user rafaelweingartner commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59918834

--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
```
@@ -19,146 +19,215 @@ package com.cloud.utils.testcase;

-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;

-/**
- *
- *
- *
- *
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+/* NioTest
```
--- End diff --

If you are going to use some kind of documenting, I believe the javadoc style would be more appropriate.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243383#comment-15243383 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user rafaelweingartner commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r59920010 --- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java --- (the quoted diff excerpt is identical to the one in the earlier review comment on this file)
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245292#comment-15245292 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user bhaisaab commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-211261793 @swill I've fixed the outstanding issues, can you run your CI on this and help merge? thanks
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249385#comment-15249385 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user bhaisaab commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-212284556 @jburwell fixed use of the test timeout within the @Test annotation
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253546#comment-15253546 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user bhaisaab commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-213327217 @jburwell @GabrielBrascher @rafaelweingartner @swill if you're done with your review, please give an LGTM or share what else should be fixed. Thanks.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261540#comment-15261540 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-215309861 Failed to build.
```
--- T E S T S ---
Running com.cloud.utils.testcase.NioTest
2016-04-28 06:23:24,581 INFO [utils.testcase.NioTest] (main:) Setting up Benchmark Test
2016-04-28 06:23:24,879 INFO [utils.nio.NioServer] (main:) NioConnection started and listening on /0:0:0:0:0:0:0:0:58798
2016-04-28 06:23:24,886 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-1:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,886 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-2:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,887 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-4:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,890 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-5:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,891 INFO [utils.nio.NioClient] (NioClientHandler-1:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,892 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-6:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,892 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-3:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,893 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-7:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,894 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-8:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,895 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-10:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,924 DEBUG [utils.crypt.EncryptionSecretKeyChecker] (pool-1-thread-1:) Encryption Type: null
2016-04-28 06:23:24,928 INFO [utils.nio.NioClient] (NioClientHandler-2:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,933 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-11:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,933 WARN [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-04-28 06:23:24,939 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-13:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,941 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-14:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,944 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-12:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,944 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-15:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,944 INFO [utils.nio.NioClient] (NioClientHandler-3:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,945 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-16:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,946 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-9:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,946 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-17:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,946 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-18:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,947 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-19:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,947 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-20:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,948 INFO [utils.nio.NioClient] (NioClientHandler-4:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,949 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-21:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,949 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-22:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,949 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-23:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,977 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-25:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,977 INFO [utils.testcase.NioTest] (MaliciousNioClientHandler-24:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,981 INFO [utils.nio.NioClient] (NioClientHandler-5:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,996 DEBUG [utils.testcase.NioTest] (Thread-0:) 0/5 tests done. Waiting for completion
2016-04-28 06:23:25,103 WARN [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-04-28 06:23:25,161 WARN [utils.nio.Link] (pool-1-th
```
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261545#comment-15261545 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-215310531 BTW, I built with `-T 2C`, if that is relevant to help you understand why it failed...
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261739#comment-15261739 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user rhtyd commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-215350128 @swill there are in total 25 malicious clients, each of which can block one of the (max.) 5 server worker threads for 60s; so in the worst case we should have waited at least 25*60/5 = 300 seconds. I've fixed the test with the maximum possible timeout value; previously the value was chosen for an average case.
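The 300-second figure above follows from the worst case in which every malicious client holds a worker for the full 60 s blocking window and only 5 connections are served in parallel. A back-of-the-envelope sketch of that calculation (the class and method names are ours, for illustration):

```java
public class TimeoutEstimate {

    // Worst case: clients are served in batches of `workers`, and each
    // batch can block for the full per-connection timeout.
    static int worstCaseSeconds(final int maliciousClients, final int blockSeconds, final int workers) {
        return maliciousClients * blockSeconds / workers;
    }

    public static void main(String[] args) {
        // 25 malicious clients * 60 s / 5 workers = 300 s
        System.out.println(worstCaseSeconds(25, 60, 5));
    }
}
```

Any `@Test(timeout = ...)` value below this bound can fail intermittently on a loaded build machine, which matches the CI failure reported earlier in the thread.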
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263857#comment-15263857 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user rhtyd commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-215675646 @swill can you try again with your CI? @agneya2001 @jburwell @wido @kiwiflyer @nvazquez @DaanHoogland and others - please review and share your LGTM, thanks
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264140#comment-15264140 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user kiwiflyer commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-215741459 @rhtyd - We'll pull this in for functional testing.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264342#comment-15264342 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user rhtyd commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-215815079 thanks @kiwiflyer
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264355#comment-15264355 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user jburwell commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-215818023 LGTM for code review
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266558#comment-15266558 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user rhtyd commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-216228508 This PR is ready for merge, /cc @swill tag:mergeready
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266953#comment-15266953 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the pull request:
https://github.com/apache/cloudstack/pull/1493#issuecomment-216289074

@kiwiflyer do you have test results on this one? Thanks...
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267000#comment-15267000 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user kiwiflyer commented on the pull request:
https://github.com/apache/cloudstack/pull/1493#issuecomment-216296997

@swill I'm a bit behind. I'm building this now.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267007#comment-15267007 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the pull request:
https://github.com/apache/cloudstack/pull/1493#issuecomment-216298669

No worries. Thanks... I am also a bit behind. I apparently have to just assume I won't get any work done on Mondays. :P
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267575#comment-15267575 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user kiwiflyer commented on the pull request:
https://github.com/apache/cloudstack/pull/1493#issuecomment-216375175

I pulled this into a hardware lab on 4.8.1. I set up a number of fake clients and hammered 8250. Prior to the patch, the agents end up in a disconnected state after a few minutes. I applied the patch and my little DoS test is unable to affect the connectivity between the management server and the agents. I also tested some provisioning activities and made sure the agent survived taking the management server down and then bringing it back up. LGTM
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267589#comment-15267589 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the pull request:
https://github.com/apache/cloudstack/pull/1493#issuecomment-216378613

Thank you @kiwiflyer. 👍 @rhtyd can you force push this PR again to try to get Jenkins green? Thanks... Otherwise this one is ready...
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267620#comment-15267620 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user serverchief commented on the pull request:
https://github.com/apache/cloudstack/pull/1493#issuecomment-216382896

@kiwiflyer, my testing with this patch: if you have at least several hundred KVM nodes connected to 2 MS via VIP and take 1 MS down, you will notice that the KVM agents shift to the second MS in a matter of seconds, with no noise. Without this patch, depending on the scale, it may take up to 10 minutes to reconnect all hosts, along with lots of noise about hosts being down!
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267963#comment-15267963 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rhtyd commented on the pull request:
https://github.com/apache/cloudstack/pull/1493#issuecomment-216421909

@swill force pushed; the Jenkins server is not reliable. As long as Travis is green we are alright; the only additional check Jenkins does is the RAT check, which I think Travis can do as well.

Thanks @serverchief for sharing your experience with this fix.

tag:mergeready
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268478#comment-15268478 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rhtyd commented on the pull request:
https://github.com/apache/cloudstack/pull/1493#issuecomment-216489209

@swill all green now
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268965#comment-15268965 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the pull request:
https://github.com/apache/cloudstack/pull/1493#issuecomment-216581383

Perfect, this one is queued up to be merged... Thanks...
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270725#comment-15270725 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user asfgit closed the pull request at:
https://github.com/apache/cloudstack/pull/1493
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270721#comment-15270721 ]

ASF subversion and git services commented on CLOUDSTACK-9348:

Commit ba77a692391856df468a141f98687ec71373a3d3 in cloudstack's branch refs/heads/master from [~rohit.ya...@shapeblue.com]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=ba77a69 ]

CLOUDSTACK-9348: Use non-blocking SSL handshake

- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Replaces blocking SSL handshake code with non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness due to an aggressive/malicious client
- Uses separate executor services for handling SSL handshakes

Signed-off-by: Rohit Yadav
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270723#comment-15270723 ]

ASF subversion and git services commented on CLOUDSTACK-9348:

Commit 7ce0e10fbcd949375e43535aae168421ecdaa562 in cloudstack's branch refs/heads/master from [~williamstev...@gmail.com]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=7ce0e10 ]

Merge pull request #1493 from shapeblue/nio-fix

CLOUDSTACK-9348: Use non-blocking SSL handshake in NioConnection/Link

- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Replaces blocking SSL handshake code with non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness due to an aggressive/malicious client
- Uses separate executor services for handling connect/accept events

The changes are covered by the NioTest, so I did not write a new test; advise how we can improve this. Further, I tried to invest time in writing a benchmark test to reproduce a degraded server, but could not write it deterministically (it sometimes fails and sometimes passes). Review, CI testing and feedback requested.

/cc @swill @jburwell @DaanHoogland @wido @remibergsma @rafaelweingartner @GabrielBrascher

* pr/1493:
  CLOUDSTACK-9348: Use non-blocking SSL handshake
  CLOUDSTACK-9348: Unit test to demonstrate denial of service attack

Signed-off-by: Will Stevens
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270720#comment-15270720 ]

ASF subversion and git services commented on CLOUDSTACK-9348:

Commit 0154da6417ab1dc0fa9719df4543e72ca5f2c178 in cloudstack's branch refs/heads/master from [~rohit.ya...@shapeblue.com]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=0154da6 ]

CLOUDSTACK-9348: Unit test to demonstrate denial of service attack

The NioConnection uses blocking handlers for various events such as connect, accept, read, write. In case a client connects to NioServer (used by the agent manager to service agents on port 8250) but fails to participate in the SSL handshake or just sits idle, this would block the main IO/selector loop in NioConnection. Such a client could be either malicious or aggressive. This unit test demonstrates such a malicious client that can perform a denial-of-service attack on NioServer, blocking it from serving any other client.

Signed-off-by: Rohit Yadav
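The attack scenario this unit test targets (a client that connects and then never speaks) can be reproduced against any selector-driven accept loop. The following standalone sketch is not the NioTest code itself; it uses raw `java.nio` channels and a made-up `IdleClientDemo` class to show that a non-blocking accept loop keeps accepting new connections even while idle clients hang:

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class IdleClientDemo {
    public static void main(String[] args) throws Exception {
        // Non-blocking server socket registered for accept events only.
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0));
        server.configureBlocking(false);
        Selector selector = Selector.open();
        server.register(selector, SelectionKey.OP_ACCEPT);
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        List<SocketChannel> channels = new ArrayList<>();
        // Three "malicious" clients: they connect and then stay silent forever.
        for (int i = 0; i < 3; i++) {
            channels.add(SocketChannel.open(new InetSocketAddress("127.0.0.1", port)));
        }
        // One well-behaved client that actually sends a payload.
        SocketChannel good = SocketChannel.open(new InetSocketAddress("127.0.0.1", port));
        good.write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));
        channels.add(good);

        // The accept loop drains all pending connections without ever waiting
        // for any single client to speak.
        int accepted = 0;
        long deadline = System.currentTimeMillis() + 2000;
        while (accepted < 4 && System.currentTimeMillis() < deadline) {
            selector.select(100);
            selector.selectedKeys().clear();
            SocketChannel c;
            while ((c = server.accept()) != null) { // returns null when backlog is empty
                c.configureBlocking(false); // a blocking read/handshake here would stall the loop
                accepted++;
                channels.add(c);
            }
        }
        if (accepted != 4) {
            throw new AssertionError("expected 4 accepted connections, got " + accepted);
        }
        System.out.println("accepted all 4 connections despite 3 idle clients");
        for (SocketChannel ch : channels) ch.close();
        selector.close();
        server.close();
    }
}
```

With a blocking read or handshake in the accept path, the first idle client would stall the loop and the remaining connections would sit in the backlog, which is the behaviour the test described above was written to catch.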
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272518#comment-15272518 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the pull request:
https://github.com/apache/cloudstack/pull/1493#issuecomment-217188135

@rhtyd I am still having problems with the tests in this PR, but now it is in master. This is causing builds to fail...

```
testConnection(com.cloud.utils.testcase.NioTest)  Time elapsed: 300.073 sec  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 30 milliseconds
	at java.lang.Thread.sleep(Native Method)
	at com.cloud.utils.testcase.NioTest.testConnection(NioTest.java:146)
```
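One plausible reading of the "timed out after 30 milliseconds" message: JUnit 4's `@Test(timeout = ...)` parameter is specified in milliseconds, so a constant intended as seconds must be multiplied out explicitly. The sketch below illustrates the same unit mismatch using `java.util.concurrent` instead of JUnit so it runs standalone; the class name `TimeoutUnits` is made up and this is not the NioTest code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutUnits {
    // Runs a 100 ms task under the given time budget; returns true if it finished.
    static boolean runWithTimeout(long timeout, TimeUnit unit) throws Exception {
        ExecutorService ex = Executors.newSingleThreadExecutor();
        Future<?> f = ex.submit(() -> {
            try { Thread.sleep(100); } catch (InterruptedException ignored) { }
        });
        try {
            f.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            return false;
        } finally {
            ex.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // The same numeric value "30" behaves completely differently
        // depending on the unit it is interpreted in.
        boolean ms = runWithTimeout(30, TimeUnit.MILLISECONDS); // 30 ms budget: times out
        boolean sec = runWithTimeout(30, TimeUnit.SECONDS);     // 30 s budget: finishes
        if (ms || !sec) {
            throw new AssertionError("unexpected timeout behaviour");
        }
        System.out.println("30 ms budget timed out; 30 s budget passed");
    }
}
```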
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272521#comment-15272521 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the pull request:
https://github.com/apache/cloudstack/pull/1493#issuecomment-217188537

Is there a reason we need to spend 5 minutes waiting for this test every build?
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272538#comment-15272538 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r62207167

--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---

```diff
@@ -19,146 +19,208 @@
 package com.cloud.utils.testcase;
 
-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
 
 /**
- *
- *
- *
- *
+ * NioTest demonstrates that NioServer can function without getting its main IO
+ * loop blocked when an aggressive or malicious client connects to the server but
+ * fails to participate in the SSL handshake. In this test, we run a bunch of
+ * clients that send a known payload to the server, while multiple malicious
+ * clients also try to connect and hang.
+ * A malicious client could cause denial of service if the server's main IO loop
+ * along with the SSL handshake were blocking. A passing test shows that NioServer
+ * can still function under connection load and that the main IO loop along with
+ * the SSL handshake is non-blocking with some internal timeout mechanism.
  */
-public class NioTest extends TestCase {
+public class NioTest {
+
+    private static final Logger LOGGER = Logger.getLogger(NioTest.class);
+
+    // Test should fail in due time instead of looping forever
+    private static final int TESTTIMEOUT = 30;
 
-    private static final Logger s_logger = Logger.getLogger(NioTest.class);
+    final private int totalTestCount = 5;
+    private int completedTestCount = 0;
 
-    private NioServer _server;
-    private NioClient _client;
+    private NioServer server;
+    private List clients = new ArrayList<>();
+    private List maliciousClients = new ArrayList<>();
 
-    private Link _clientLink;
+    private ExecutorService clientExecutor = Executors.newFixedThreadPool(totalTestCount, new NamedThreadFactory("NioClientHandler"));
+    private ExecutorService maliciousExecutor = Executors.newFixedThreadPool(5 * totalTestCount, new NamedThreadFactory("MaliciousNioClientHandler"));
 
-    private int _testCount;
-    private int _completedCount;
+    private Random randomGenerator = new Random();
+    private byte[] testBytes;
 
     private boolean isTestsDone() {
         boolean result;
         synchronized (this) {
-            result = _testCount == _completedCount;
+            result = totalTestCount == completedTestCount;
```

--- End diff --

Isn't this wrong? Shouldn't it be:

```
result = (totalTestCount - 1) == completedTestCount;
```

You are only launching `totalTestCount` tests, `0` to `totalTestCount - 1`. `completedTestCount` is also `0`-based, so when they all complete it should max out at `totalTestCount - 1`. Can you clarify?

> CloudStack Server degrades when a lot of connections on port 8250
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public (Anyone can view this level - this is the default.)
> Reporter: Rohit Yadav
> Assignee: Rohit Yadav
> Fix For: 4.9.0
>
> An intermittent issue was found with a large CloudStack deployment, where
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does a blocking SSL handshake. A
> good fix would be to make this non-blocking and handle expensive tasks in
> separate threads/pool. This way the main IO loop won't be blocked and can
> continue to serve other agents/clients.
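The fix the issue description proposes -- keep accept() on the selector thread and run the expensive SSL handshake elsewhere -- can be sketched roughly as below. This is a hypothetical illustration, not CloudStack's actual NioConnection API; the class and method names (`HandshakeOffload`, `offload`, `await`) are made up for the sketch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: the selector thread hands the expensive per-connection
// work (e.g. the SSL handshake) to a bounded worker pool and returns
// immediately, so one slow or malicious client cannot stall the accept() loop.
public class HandshakeOffload {
    private static final ExecutorService WORKERS =
            Executors.newFixedThreadPool(4, r -> {
                Thread t = new Thread(r, "handshake-worker");
                t.setDaemon(true); // don't keep the JVM alive for stuck handshakes
                return t;
            });

    // Called from the main IO loop on OP_ACCEPT; returns without blocking.
    public static Future<Boolean> offload(Callable<Boolean> handshake) {
        return WORKERS.submit(handshake);
    }

    // Bounded wait for whoever consumes the result: a handshake that does not
    // finish within the timeout is cancelled and treated as failed.
    public static boolean await(Future<Boolean> result, long timeoutSeconds) {
        try {
            return result.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (Exception e) {
            result.cancel(true);
            return false;
        }
    }
}
```

The key property is that only the worker blocks on a misbehaving client; the selector thread keeps servicing other agents.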
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272539#comment-15272539 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r62207228

--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
(quoting the same isTestsDone() hunk as the first comment)
--- End diff --

@rhtyd ^
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272663#comment-15272663 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rhtyd commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1493#discussion_r62219725

--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
(quoting the same isTestsDone() hunk as the first comment)
--- End diff --

@swill I'll try to reproduce and fix with a patch to reduce the numbers. Test counts 0 to len-1 still make a total of `len` counts, so this is correct. Consider: 0 to 4 is `0, 1, 2, 3, 4` --> that is 5 runs/rounds/counts.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272691#comment-15272691 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-217220334

@rhtyd but `totalTestCount = 5` and I don't think that `completedTestCount` will ever be larger than `4`, so I don't know how that check could be right...

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272724#comment-15272724 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

GitHub user rhtyd opened a pull request: https://github.com/apache/cloudstack/pull/1534

CLOUDSTACK-9348: Optimize NioTest and NioConnection main loop

- Reduces SSL handshake timeout to 15s; previously this was only 10s in commit debfcdef788ce0d51be06db0ef10f6815f9b563b
- Adds an aggressive explicit wakeup to save the Nio main IO loop/handler from getting blocked
- Fixes NioTest to fail/succeed in about 60s; previously this was 300s
- Due to aggressive wakeup usage, NioTest should complete in less than 5s on most systems. On virtualized environments this may increase slightly due to thread and CPU burst/scheduling delays.

/cc @swill please review and merge. Sorry about the previous values; they were not optimized for virtualized environments. The aggressive selector.wakeup will ensure the main IO loop does not get blocked even by malicious users, for any timeout (SSL handshake etc.).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shapeblue/cloudstack niotest-fix

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/cloudstack/pull/1534.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1534

commit ea22869593f68a3a34b12aeb23c2bb6c34efd365
Author: Rohit Yadav
Date: 2016-05-05T17:49:33Z

    CLOUDSTACK-9348: Optimize NioTest and NioConnection main loop
    (commit message repeats the change list above)

    Signed-off-by: Rohit Yadav
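The "aggressive explicit wakeup" the PR describes refers to Selector.wakeup(), which makes a blocked select() call in another thread return immediately. A rough illustration of the pattern (a hypothetical helper, not the actual NioConnection code; the demo uses an in-process Pipe instead of a real socket):

```java
import java.io.IOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical sketch of the wakeup pattern: a worker thread that finishes a
// handshake registers the channel for reads and then wakes the selector, so
// the main IO loop's select() returns right away instead of sleeping until
// its next timeout.
public class WakeupSketch {
    public static SelectionKey enableReads(Selector selector, SelectableChannel ch)
            throws IOException {
        ch.configureBlocking(false);
        SelectionKey key = ch.register(selector, SelectionKey.OP_READ);
        selector.wakeup(); // unblocks a concurrent select() in the main loop
        return key;
    }

    // Self-contained demo: register one readable channel and confirm the key
    // is valid and present in the selector's key set.
    public static boolean demo() {
        try (Selector selector = Selector.open();
             Pipe.SourceChannel source = Pipe.open().source()) {
            SelectionKey key = enableReads(selector, source);
            return key.isValid() && selector.keys().size() == 1;
        } catch (IOException e) {
            return false;
        }
    }
}
```

Note that wakeup() is also what makes cross-thread register() safe in practice: register() can stall while another thread is inside select(), and waking the selector bounds that stall.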
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272725#comment-15272725 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rhtyd commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-217226167

@swill I've run my tests; please review and merge this -- https://github.com/apache/cloudstack/pull/1534
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272727#comment-15272727 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rhtyd commented on the pull request: https://github.com/apache/cloudstack/pull/1534#issuecomment-217226501

@DaanHoogland @jburwell @wido @swill and others -- please review. This mainly fixes the NioTest which was failing, so if it's okay and works for Travis and Will's CI, let's merge this. Thanks.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272718#comment-15272718 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rhtyd commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-217225573

@swill I'm pushing a fix for you. The initial value is 0; as clients send data it's incremented by 1. At the end it's expected that the total amount of data sent matches the data received by the server. If the test count is 5, then the completed test count is also 5, as the loop runs 5 clients with indexes/ids 0, 1, 2, 3, 4 <- count the number of clients created.
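The counting argument above can be checked with a minimal sketch (a hypothetical helper, not NioTest's actual code): client ids are zero-based, but each completion still increments the counter by one, so five clients drive the counter to exactly five.

```java
// Minimal sketch of the completion-count argument: ids run 0..total-1, yet
// each completion adds 1, so the counter reaches total (not total - 1).
public class CompletionCount {
    private final int total;
    private int completed = 0;

    CompletionCount(int total) { this.total = total; }

    synchronized void markDone() { completed++; }

    synchronized boolean allDone() { return completed == total; }

    // Simulate 'total' clients, indexed 0..total-1, each finishing once.
    public static boolean demo(int total) {
        CompletionCount counter = new CompletionCount(total);
        for (int id = 0; id < total; id++) { // ids 0, 1, ..., total-1
            counter.markDone();              // ...but total increments overall
        }
        return counter.allDone(); // 5 clients -> completed == 5, not 4
    }
}
```

This is why `totalTestCount == completedTestCount` is the right termination check even though the highest client id is `totalTestCount - 1`.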
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272732#comment-15272732 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-217227191

In my last run, not a single test passed in the time frame; something is wrong. Previously it was failing at 4/5, but this time it timed out without a single test passing of the 5...

```
2016-05-05 19:46:17,659 DEBUG [utils.testcase.NioTest] (Time-limited test:) 0/5 tests done. Waiting for completion
2016-05-05 19:46:18,660 DEBUG [utils.testcase.NioTest] (Time-limited test:) 0/5 tests done. Waiting for completion
2016-05-05 19:46:19,660 DEBUG [utils.testcase.NioTest] (Time-limited test:) 0/5 tests done. Waiting for completion
2016-05-05 19:46:20,367 INFO [utils.testcase.NioTest] (main:) Clients stopped.
2016-05-05 19:46:20,367 INFO [utils.testcase.NioTest] (main:) Server stopped.

Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 300.095 sec <<< FAILURE! - in com.cloud.utils.testcase.NioTest
testConnection(com.cloud.utils.testcase.NioTest)  Time elapsed: 300.095 sec  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 30 milliseconds
	at java.lang.Thread.sleep(Native Method)
	at com.cloud.utils.testcase.NioTest.testConnection(NioTest.java:146)
```
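The "0/5 tests done. Waiting for completion" lines in the log come from a polling wait, and the TestTimedOutException shows JUnit's timeout cutting it off. A generic sketch of such a bounded wait (a hypothetical helper, not the test's actual loop):

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch: poll a condition until it holds or a deadline passes,
// so a stuck test fails in bounded time instead of looping forever.
public class BoundedWait {
    public static boolean waitFor(BooleanSupplier done, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!done.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // give up; the caller reports a timeout failure
            }
            try {
                Thread.sleep(10); // avoid busy-spinning between checks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A JUnit-level `@Test(timeout = ...)` is still worth keeping as a backstop, since it also catches hangs inside the condition itself.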
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272733#comment-15272733 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-217227342

Thanks, will review...
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272918#comment-15272918 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-217252920

@rhtyd I have some bad news on this PR. I have been having issues in CI ever since this got merged into master. When the tests don't run (and fail, which causes the CI run to fail), then the DeployDatacenter script will fail. It looks like this code is treating the hosts as malicious clients. We get a handshake and then things fail. We basically get a `Failed to add host` error. I can get you more details if you need. I will test the #1534 PR to see if that fixes things, but I am a bit concerned about this PR right now...
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273025#comment-15273025 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user rhtyd commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-217270975

@swill sure, thanks. Please try with PR #1534. If you still hit the issue, revert the commit locally first, run against your environment, and confirm that your environment works without the Nio fix (make sure both the mgmt server and the KVM agent have both the PR fixes, or, in case you revert, make sure to rebuild the mgmt server and the KVM agent with the reverted commits), in which case I'll try to reproduce and fix the addHost error.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273037#comment-15273037 ]

ASF GitHub Bot commented on CLOUDSTACK-9348:

Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-217272766

I confirmed that reverting this PR locally does fix my DeployDatacenter issues. I did an initial test with #1534 and it did get past the DeployDatacenter phase and started testing, but it did not run the Nio tests (apparently it only runs that test sometimes?). I stopped that run, cleaned everything up, and am running the CI against #1534 again to see if I can get it to run the tests (and pass) and also come back with a clean CI run. I will update that PR with the status later tonight. Thanks for looking into this quickly to unblock our ability to do CI. 👍
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273081#comment-15273081 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1534#issuecomment-217278185

Well, the test runs a lot faster/cleaner now. 👍

```
Running com.cloud.utils.testcase.NioTest
2016-05-05 22:53:54,828 INFO  [utils.testcase.NioTest] (main:) Setting up Benchmark Test
2016-05-05 22:53:54,861 INFO  [utils.nio.NioServer] (main:) NioConnection started and listening on /0:0:0:0:0:0:0:0:41317
2016-05-05 22:53:54,874 INFO  [utils.testcase.NioTest] (MaliciousNioClientHandler-1:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,874 INFO  [utils.testcase.NioTest] (MaliciousNioClientHandler-3:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,874 INFO  [utils.testcase.NioTest] (MaliciousNioClientHandler-4:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,882 DEBUG [utils.testcase.NioTest] (Time-limited test:) 0/4 tests done. Waiting for completion
2016-05-05 22:53:54,875 INFO  [utils.testcase.NioTest] (MaliciousNioClientHandler-2:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,885 INFO  [utils.nio.NioClient] (NioClientHandler-4:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,884 INFO  [utils.nio.NioClient] (NioClientHandler-3:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,878 INFO  [utils.nio.NioClient] (NioClientHandler-2:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,877 INFO  [utils.nio.NioClient] (NioClientHandler-1:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,899 DEBUG [utils.crypt.EncryptionSecretKeyChecker] (pool-1-thread-1:) Encryption Type: null
2016-05-05 22:53:54,902 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,039 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,045 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,054 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,112 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,119 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,126 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,145 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,886 DEBUG [utils.testcase.NioTest] (Time-limited test:) 0/4 tests done. Waiting for completion
2016-05-05 22:53:56,152 INFO  [utils.nio.NioClient] (NioClientHandler-3:) SSL: Handshake done
2016-05-05 22:53:56,152 INFO  [utils.nio.NioClient] (NioClientHandler-3:) Connected to 127.0.0.1:41317
2016-05-05 22:53:56,198 INFO  [utils.testcase.NioTest] (NioTestClient-2-Handler-1:) Client: Received CONNECT task
2016-05-05 22:53:56,258 INFO  [utils.testcase.NioTest] (NioTestClient-2-Handler-1:) Sending data to server
2016-05-05 22:53:56,236 INFO  [utils.nio.NioClient] (NioClientHandler-1:) SSL: Handshake done
2016-05-05 22:53:56,259 INFO  [utils.nio.NioClient] (NioClientHandler-1:) Connected to 127.0.0.1:41317
2016-05-05 22:53:56,232 INFO  [utils.nio.NioClient] (NioClientHandler-4:) SSL: Handshake done
2016-05-05 22:53:56,260 INFO  [utils.nio.NioClient] (NioClientHandler-4:) Connected to 127.0.0.1:41317
2016-05-05 22:53:56,225 INFO  [utils.nio.NioClient] (NioClientHandler-2:) SSL: Handshake done
2016-05-05 22:53:56,260 INFO  [utils.nio.NioClient] (NioClientHandler-2:) Connected to 127.0.0.1:41317
2016-05-05 22:53:56,260 INFO  [utils.testcase.NioTest] (NioTestClient-0-Handler-1:) Client: Received CONNECT task
2016-05-05 22:53:56,285 INFO  [utils.testcase.NioTest] (NioTestClient-0-Handler-1:) Sending data to server
2016-05-05 22:53:56,285 INFO  [utils.testcase.NioTest] (NioTestClient-3-Handler-1:) Client: Received CONNECT task
2016-05-05 22:53:56,331 INFO  [utils.testcase.NioTest] (NioTestClient-3-Handler-1:) Sending data to server
2016-05-05 22:53:56,284 INFO  [utils.testcase.NioTest] (NioTestClient-1-Handler-1:) Client: Received CONNECT task
2016-05-05 22:53:56,368 INFO  [utils.testcase.NioTest] (NioTestClient-1-Handler-1:) Sending data to server
2016-05-05 22:53:56,286 INFO  [utils.testcase.NioTes
```
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273133#comment-15273133 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user rhtyd commented on the pull request: https://github.com/apache/cloudstack/pull/1493#issuecomment-217286830

@swill the NioTest is a unit test and only runs during the build. If you hit any issues, feel free to revert the commit, with some details on how I can reproduce your issues and fix them. Thanks.
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273138#comment-15273138 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user rhtyd commented on the pull request: https://github.com/apache/cloudstack/pull/1534#issuecomment-217287468

@swill thanks for sharing. I made NioConnection's main IO loop more aggressive and reduced the SSL handshake timeout to 15s (it was originally 10s; over the last year, since we did not know the root cause or the details, I had increased it to 60s in the Link class instead of fixing the core issue). This sort of optimization helps CloudStack re-connect and handle clients quickly, so that even thousands of malicious clients won't be able to block the main IO loop from handling other requests.
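The fix described in this comment and in the original bug report boils down to one idea: the selector thread should never run the expensive SSL handshake itself; the handshake goes to a worker pool with a bounded wait (15s in this PR). The sketch below is illustrative only and not CloudStack's actual NioConnection code; the class and method names (`HandshakeOffload`, `doHandshake`) are invented for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: accept() only registers the channel, while the
// expensive SSL handshake runs on a pool with a bounded timeout, so the
// selector loop is never blocked by a slow or malicious client.
public class HandshakeOffload {
    private static final ExecutorService sslPool = Executors.newFixedThreadPool(4);
    private static final long HANDSHAKE_TIMEOUT_MS = 15_000; // 15s, the value chosen in the PR

    // Stands in for the real SSLEngine handshake work.
    static boolean doHandshake() {
        return true;
    }

    public static void main(String[] args) throws Exception {
        Future<Boolean> result = sslPool.submit(HandshakeOffload::doHandshake);
        try {
            // A real IO loop would not block on get(); it would re-register the
            // channel from a completion callback. get() here just shows the bound.
            boolean ok = result.get(HANDSHAKE_TIMEOUT_MS, TimeUnit.MILLISECONDS);
            System.out.println("handshake ok: " + ok);
        } catch (TimeoutException e) {
            result.cancel(true); // drop clients that stall the handshake
            System.out.println("handshake timed out");
        } finally {
            sslPool.shutdown();
        }
    }
}
```

The key property is that a client which never completes its handshake only ever costs one pool thread for at most the timeout, instead of freezing the accept loop for everyone.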
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273922#comment-15273922 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user rhtyd commented on the pull request: https://github.com/apache/cloudstack/pull/1534#issuecomment-217413262

@swill this PR is ready for a CI test run and merge, thanks
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274027#comment-15274027 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user jburwell commented on the pull request: https://github.com/apache/cloudstack/pull/1534#issuecomment-217442413

LGTM based on code review
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274498#comment-15274498 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1534#issuecomment-217519767

### CI RESULTS

```
Tests Run: 85
Skipped: 0
Failed: 2
Errors: 0
Duration: 4h 31m 04s
```

**Summary of the problem(s):**

```
FAIL: Test redundant router internals
--
Traceback (most recent call last):
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_routers_network_ops.py", line 290, in test_01_RVR_Network_FW_PF_SSH_default_routes_egress_true
    "Attempt to retrieve google.com index page should be successful!"
AssertionError: Attempt to retrieve google.com index page should be successful!
--
Additional details in: /tmp/MarvinLogs/test_network_C0JJZR/results.txt
```

```
FAIL: test_02_vpc_privategw_static_routes (integration.smoke.test_privategw_acl.TestPrivateGwACL)
--
Traceback (most recent call last):
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_privategw_acl.py", line 253, in test_02_vpc_privategw_static_routes
    self.performVPCTests(vpc_off)
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_privategw_acl.py", line 304, in performVPCTests
    privateGw_1 = self.createPvtGw(vpc_1, "10.0.3.100", "10.0.3.101", acl1.id, vlan_1)
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_privategw_acl.py", line 472, in createPvtGw
    self.fail("Failed to create Private Gateway ==> %s" % e)
AssertionError: Failed to create Private Gateway ==> Execute cmd: createprivategateway failed, due to: errorCode: 431, errorText:Network with vlan vlan://100 already exists in zone 1
--
Additional details in: /tmp/MarvinLogs/test_network_C0JJZR/results.txt
```

**Associated Uploads**

**`/tmp/MarvinLogs/DeployDataCenter__May_06_2016_15_29_07_FHJDSL:`**
* [dc_entries.obj](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/DeployDataCenter__May_06_2016_15_29_07_FHJDSL/dc_entries.obj)
* [failed_plus_exceptions.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/DeployDataCenter__May_06_2016_15_29_07_FHJDSL/failed_plus_exceptions.txt)
* [runinfo.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/DeployDataCenter__May_06_2016_15_29_07_FHJDSL/runinfo.txt)

**`/tmp/MarvinLogs/test_network_C0JJZR:`**
* [failed_plus_exceptions.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/test_network_C0JJZR/failed_plus_exceptions.txt)
* [results.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/test_network_C0JJZR/results.txt)
* [runinfo.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/test_network_C0JJZR/runinfo.txt)

**`/tmp/MarvinLogs/test_vpc_routers_7C30C4:`**
* [failed_plus_exceptions.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/test_vpc_routers_7C30C4/failed_plus_exceptions.txt)
* [results.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/test_vpc_routers_7C30C4/results.txt)
* [runinfo.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/test_vpc_routers_7C30C4/runinfo.txt)

Uploads will be available until `2016-07-06 02:00:00 +0200 CEST`

*Comment created by [`upr comment`](https://github.com/cloudops/upr).*
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274509#comment-15274509 ] ASF GitHub Bot commented on CLOUDSTACK-9348: Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1534#issuecomment-217521378

I am not concerned about the two failures. One happens randomly in my environment and one is a cleanup issue between test runs which is not related to this PR. Since `master` is currently broken due to some issues with #1493, I am going to merge this right away...
[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250
[ https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274517#comment-15274517 ] ASF subversion and git services commented on CLOUDSTACK-9348: - Commit 9f970f28b18534dffe33196ead60ea861f501fa9 in cloudstack's branch refs/heads/master from [~williamstev...@gmail.com] [ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=9f970f2 ]

Merge pull request #1534 from shapeblue/niotest-fix

CLOUDSTACK-9348: Optimize NioTest and NioConnection main loop
- Reduces SSL handshake timeout to 15s; previously this was only 10s in commit debfcdef788ce0d51be06db0ef10f6815f9b563b
- Adds an aggressive explicit wakeup to save the Nio main IO loop/handler from getting blocked
- Fixes NioTest to fail/succeed in about 60s; previously this was 300s
- Due to the aggressive wakeup usage, NioTest should complete in less than 5s on most systems. In virtualized environments this may increase slightly due to thread and CPU burst/scheduling delays.

/cc @swill please review and merge. Sorry about the previous values; they were not optimized for virtualized environments. The aggressive selector.wakeup will ensure the main IO loop does not get blocked, even by malicious users, for the duration of any timeout (SSL handshake etc.).

* pr/1534: CLOUDSTACK-9348: Optimize NioTest and NioConnection main loop

Signed-off-by: Will Stevens
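The "aggressive explicit wakeup" in the merge commit above refers to `java.nio.channels.Selector.wakeup()`: a thread that finishes background work can force a `select(timeout)` blocked on the main IO loop to return immediately instead of sleeping out its timeout. A minimal standalone sketch of that mechanism (the `WakeupSketch` name and 5s timeout are invented for illustration; this is not the CloudStack code):

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Demonstrates Selector.wakeup(): a worker thread wakes the main loop's
// blocking select() so it can promptly process newly completed work.
public class WakeupSketch {
    public static void main(String[] args) throws IOException, InterruptedException {
        Selector selector = Selector.open();

        Thread worker = new Thread(() -> {
            // e.g. an SSL handshake just finished on a pool thread
            selector.wakeup(); // forces the select() below to return
        });

        long start = System.nanoTime();
        worker.start();
        selector.select(5_000); // would block up to 5s without the wakeup
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("select returned after ~" + elapsedMs + " ms");
        worker.join();
        selector.close();
    }
}
```

Note that `wakeup()` is safe to call from any thread, and if it races ahead of the `select()` call the next `select()` still returns immediately, which is why frequent wakeups cannot strand the loop.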