[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2017-05-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013507#comment-16013507
 ] 

ASF subversion and git services commented on CLOUDSTACK-9348:
-

Commit 02311c8bbe85210fae047ca57ff8322096fb0edf in cloudstack's branch 
refs/heads/4.9 from [~marcaurele]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=02311c8 ]

Activate NioTest following changes in CLOUDSTACK-9348 PR #1549

The first PR #1493 re-enabled the NioTest but not the new PR #1549.

Signed-off-by: Marc-Aurèle Brothier 


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.
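A minimal sketch of the non-blocking approach described above, with hypothetical class and method names (this is not the actual NioConnection code): the I/O thread only accepts the socket and hands the expensive SSL handshake off to a worker pool, so the selector loop is never blocked by a slow or malicious client.

```
import java.io.IOException;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class NonBlockingAcceptSketch {
    // worker pool for expensive per-connection work (size is illustrative)
    private final ExecutorService handshakePool = Executors.newFixedThreadPool(4);

    // called from the selector loop when the server channel is acceptable
    void onAcceptable(final ServerSocketChannel serverChannel) throws IOException {
        final SocketChannel socket = serverChannel.accept(); // cheap in non-blocking mode
        if (socket == null) {
            return; // nothing pending
        }
        socket.configureBlocking(false);
        // the SSL handshake and channel registration happen off the I/O thread
        handshakePool.submit(() -> doSslHandshakeAndRegister(socket));
    }

    private void doSslHandshakeAndRegister(final SocketChannel socket) {
        // hypothetical: drive the SSLEngine handshake, then register the channel
        // with the selector for OP_READ once the handshake completes
    }
}
```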



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2017-05-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013511#comment-16013511
 ] 

ASF subversion and git services commented on CLOUDSTACK-9348:
-

Commit a933f8d96c5c754f56c97530e58aee5d0e17d979 in cloudstack's branch 
refs/heads/4.9 from [~rajanik]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=a933f8d ]

Merge pull request #2027 from exoscale/niotest

CLOUDSTACK-9918: Activate NioTest following changes in CLOUDSTACK-9348 PR #1549

> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2017-05-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013592#comment-16013592
 ] 

ASF subversion and git services commented on CLOUDSTACK-9348:
-

Commit a933f8d96c5c754f56c97530e58aee5d0e17d979 in cloudstack's branch 
refs/heads/master from [~rajanik]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=a933f8d ]

Merge pull request #2027 from exoscale/niotest

CLOUDSTACK-9918: Activate NioTest following changes in CLOUDSTACK-9348 PR #1549

> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2017-05-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013610#comment-16013610
 ] 

ASF subversion and git services commented on CLOUDSTACK-9348:
-

Commit 8b3cadb55eefeacd310f97aefbb91276e4ee8b43 in cloudstack's branch 
refs/heads/master from [~rajanik]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=8b3cadb ]

Merge release branch 4.9 to master

* 4.9:
  Do not set gateway to 0.0.0.0 for windows clients
  CLOUDSTACK-9904: Fix log4j to have @AGENTLOG@ replaced
  ignore bogus default gateway: when a shared network is secondary, the default 
gateway gets overwritten by a bogus one; dnsmasq does the right thing and 
replaces it with its own default, which is not good for us, so check for 
'0.0.0.0'
  Activate NioTest following changes in CLOUDSTACK-9348 PR #1549
  CLOUDSTACK-9828: GetDomRVersionCommand fails to get the correct version as 
output. Fix tries to return the output as a single command, instead of appending 
output from two commands
  CLOUDSTACK-3223 Exception observed while creating CPVM in VMware Setup with 
DVS
  CLOUDSTACK-9787: Fix wrong return value in NetUtils.isNetworkAWithinNetworkB


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2017-12-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275621#comment-16275621
 ] 

ASF subversion and git services commented on CLOUDSTACK-9348:
-

Commit e6f32c233e179da5e99318aba8e48146e0ff70c3 in cloudstack's branch 
refs/heads/debian9-systemvmtemplate from [~rohit.ya...@shapeblue.com]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=e6f32c2 ]

CLOUDSTACK-9348: Improve Nio SSH handshake buffers

Use a holder class to pass buffers, fixes potential leak.

Signed-off-by: Rohit Yadav 
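The "holder class" idea in the commit above can be illustrated with a small, assumed sketch (the names are illustrative, not the actual CloudStack classes): the SSL engine's handshake buffers are allocated once per connection and passed around as one object, instead of being re-created on every handshake step.

```
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngine;

// groups the per-connection handshake buffers so they are created once
// and handed around together
final class HandshakeHolder {
    final ByteBuffer appData; // plaintext side
    final ByteBuffer netData; // encrypted (network) side

    HandshakeHolder(final SSLEngine engine) {
        this.appData = ByteBuffer.allocate(engine.getSession().getApplicationBufferSize());
        this.netData = ByteBuffer.allocate(engine.getSession().getPacketBufferSize());
    }
}
```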


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2017-12-08 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16284057#comment-16284057
 ] 

ASF subversion and git services commented on CLOUDSTACK-9348:
-

Commit 315cbab08bad56595cfd0766a91da0e447dcf132 in cloudstack's branch 
refs/heads/debian9-systemvmtemplate from [~rohit.ya...@shapeblue.com]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=315cbab ]

CLOUDSTACK-9348: Improve Nio SSH handshake buffers

Use a holder class to pass buffers, fixes potential leak.

Signed-off-by: Rohit Yadav 


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-06-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351857#comment-15351857
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the issue:

https://github.com/apache/cloudstack/pull/1549
  
I think I am going to revert this PR.  We are having intermittent issues 
with it still and I am not confident running a production environment with this 
in place at this time, so I don't think I can justify leaving it in without us 
doing some more testing to figure out what is going on.

I have had a few reports like this as people test 4.9, which are concerning:
> NIO SSL agent not connecting. when I telnet to 8250, the agent 
immediately came up without me having to restart it.

I am also still periodically getting the `addHost` issue we thought we had 
resolved previously.

After I revert this, can you create a new PR with the same code so we can 
start getting more concrete testing on it and start consolidating some logs 
when it misbehaves?


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-06-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352490#comment-15352490
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the issue:

https://github.com/apache/cloudstack/pull/1549
  
@swill may I have the mgmt server and agent logs from when the failures were 
observed? This is to make sure it's not an environment-specific issue. I'll also 
need the JRE version in use (openjdk or oraclejdk, and which version 
specifically?). If possible, can you also take a heap dump and share it with me 
(run jmap -dump:file=heap.bin , gzip and scp the bin 
file, and please share it somewhere for both the mgmt server and the agent).

"NIO SSL agent not connecting. when I telnet to 8250, the agent immediately 
came up without me having to restart it." -- this is something I've fixed in the 
latest master (using a timeout on selectors); can you ask them whether they are 
using the latest master?

We've seen this fix deployed in a very large environment with 1000s of 
hosts and I've not heard anything from them. We've not gotten any reports on the 
MLs so far; I would appreciate it if those who are experiencing issues could 
share them on public channels. Thanks.


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-06-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352522#comment-15352522
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1549#discussion_r68709839
  
--- Diff: utils/src/main/java/com/cloud/utils/nio/NioConnection.java ---
@@ -125,7 +125,7 @@ public boolean isStartup() {
 public Boolean call() throws NioConnectionException {
 while (_isRunning) {
 try {
-_selector.select();
+_selector.select(1000);
--- End diff --

@swill this change ^^ means the selector loop never blocks indefinitely, but 
for at most 1 second. This ensures that reconnections are fast. If anyone still 
experiences the behaviour where telnetting to port 8250 causes the agent to 
reconnect, then either (1) they are not using the latest master, or (2) they did 
not wait for up to one second (i.e. hit an edge case), in which case we can 
lower this number to 100-500 milliseconds at the expense of increased CPU usage.
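For illustration, a selector loop with a bounded select() timeout might look like the sketch below (assumed names, not the actual NioConnection code). The point is that the loop wakes up at least once per second even when no I/O events arrive, so newly registered channels and reconnections are picked up quickly.

```
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

final class SelectorLoopSketch implements Runnable {
    private final Selector selector;
    private volatile boolean running = true;

    SelectorLoopSketch(final Selector selector) {
        this.selector = selector;
    }

    @Override
    public void run() {
        while (running) {
            try {
                // block for at most 1000 ms instead of indefinitely
                if (selector.select(1000) == 0) {
                    continue; // timed out: loop again and pick up new registrations
                }
                for (final SelectionKey key : selector.selectedKeys()) {
                    // dispatch accept/read/write handling for each ready key here
                }
                selector.selectedKeys().clear();
            } catch (final IOException e) {
                // log the error and keep the loop alive
            }
        }
    }
}
```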


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-06-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353107#comment-15353107
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the issue:

https://github.com/apache/cloudstack/pull/1549
  
This was shared in the "4.9/master Testing Coordination" thread.  Simon 
Weller and his team at ENA have run into this on the latest master in both of 
their hardware labs while testing master for me in preparation for the 4.9 RC.

I am still periodically getting the `addHost` issue, which showed up a lot 
when this change had bigger problems.

To be honest, I am concerned about this in production.  How long has this 
been running in production at your client?


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-06-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353375#comment-15353375
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the issue:

https://github.com/apache/cloudstack/pull/1549
  
@swill close to two months now. Can you add the mgmt server logs, heap dump 
and any other dumps that can help me fix the issue?
@swill @kiwiflyer can you please open a JIRA issue where you can put these 
details? Without the necessary information we cannot find and fix the issue, or 
conclude that it was caused by something else. I've built the latest master 
repository here: http://packages.shapeblue.com/cloudstack/custom/testing
I'll continue this discussion on the ML thread.


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-07-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358686#comment-15358686
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


GitHub user rhtyd opened a pull request:

https://github.com/apache/cloudstack/pull/1601

CLOUDSTACK-9348: Reduce Nio selector wait time

This reduces the Nio loop selector wait time so the selector checks
frequently (at most 100ms per iteration) and handles any pending
connections/tasks. This makes reconnections very quick at the expense of
some CPU usage.

/cc @swill @kiwiflyer can you please apply this fix in your env and test 
whether you're still able to reproduce any Nio-related errors between mgmt 
server(s) and KVM agent(s), i.e. agents not being able to connect quickly. 
Please also watch out for any increased CPU usage (there should not be any 
significant change), in which case we may increase the timeout from 100ms to 
200-400ms.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shapeblue/cloudstack nio-aggressive-selector

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/cloudstack/pull/1601.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1601


commit 0381b7ea185ef753873594216a67b8d376e3d658
Author: Rohit Yadav 
Date:   2016-07-01T09:02:58Z

CLOUDSTACK-9348: Reduce Nio selector wait time

This reduces the Nio loop selector wait time so the selector checks
frequently (at most 100ms per iteration) and handles any pending
connections/tasks. This makes reconnections very quick at the expense of
some CPU usage.

Signed-off-by: Rohit Yadav 




> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-07-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360992#comment-15360992
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user glennwagner commented on the issue:

https://github.com/apache/cloudstack/pull/1601
  
LGTM

Testing 4.9 master with PR 1601:

1. Original results without the PR: KVM hosts failed to add; the errors in the 
logs were NIO connection errors.

2. After the PR was applied, all KVM hosts (Ubuntu and CentOS) added correctly 
and the KVM agents checked in with an UP status.

System VMs deployed with no errors.




> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-07-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15361048#comment-15361048
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the issue:

https://github.com/apache/cloudstack/pull/1601
  
@glennwagner can you also comment on whether you saw any unusual CPU usage? 
Thanks for testing this.


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-07-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370170#comment-15370170
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


GitHub user rhtyd reopened a pull request:

https://github.com/apache/cloudstack/pull/1601

CLOUDSTACK-9348: Reduce Nio selector wait time

This reduces the Nio loop selector wait time so the selector checks
frequently (at most 100ms per iteration) and handles any pending
connections/tasks. This makes reconnections very quick at the expense of
some CPU usage.

/cc @swill @kiwiflyer can you please apply this fix in your env and test 
whether you're still able to reproduce any Nio-related errors between mgmt 
server(s) and KVM agent(s), i.e. agents not being able to connect quickly. 
Please also watch out for any increased CPU usage (there should not be any 
significant change), in which case we may increase the timeout from 100ms to 
200-400ms.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shapeblue/cloudstack nio-aggressive-selector

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/cloudstack/pull/1601.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1601


commit 0381b7ea185ef753873594216a67b8d376e3d658
Author: Rohit Yadav 
Date:   2016-07-01T09:02:58Z

CLOUDSTACK-9348: Reduce Nio selector wait time

This reduces the Nio loop selector wait time so the selector checks
frequently (at most 100ms per iteration) and handles any pending
connections/tasks. This makes reconnections very quick at the expense of
some CPU usage.

Signed-off-by: Rohit Yadav 




> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-07-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370169#comment-15370169
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd closed the pull request at:

https://github.com/apache/cloudstack/pull/1601


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-07-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379255#comment-15379255
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the issue:

https://github.com/apache/cloudstack/pull/1601
  
@swill before you cut the next RC, please include this, as @glennwagner and 
@PaulAngus found a blocker around the addHost API without this fix. Thanks.


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-07-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381697#comment-15381697
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the issue:

https://github.com/apache/cloudstack/pull/1601
  


### CI RESULTS

```
Tests Run: 79
  Skipped: 0
   Failed: 0
   Errors: 3
 Duration: 7h 54m 18s
```

**Summary of the problem(s):**
```
ERROR: test suite for 
--
Traceback (most recent call last):
  File 
"/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 
209, in run
self.setUp()
  File 
"/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 
292, in setUp
self.setupContext(ancestor)
  File 
"/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 
315, in setupContext
try_run(context, names)
  File 
"/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/util.py", line 471, 
in try_run
return func()
  File 
"/data/git/cs1/cloudstack/test/integration/smoke/test_internal_lb.py", line 
296, in setUpClass
cls.template.download(cls.apiclient)
  File "/usr/lib/python2.7/site-packages/marvin/lib/base.py", line 1350, in 
download
elif 'Downloaded' in template.status:
TypeError: argument of type 'NoneType' is not iterable
--
Additional details in: /tmp/MarvinLogs/test_network_DJJ0HC/results.txt
```

```
ERROR: test suite for 
--
Traceback (most recent call last):
  File 
"/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 
209, in run
self.setUp()
  File 
"/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 
292, in setUp
self.setupContext(ancestor)
  File 
"/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 
315, in setupContext
try_run(context, names)
  File 
"/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/util.py", line 471, 
in try_run
return func()
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_vpc_vpn.py", 
line 293, in setUpClass
cls.template.download(cls.apiclient)
  File "/usr/lib/python2.7/site-packages/marvin/lib/base.py", line 1350, in 
download
elif 'Downloaded' in template.status:
TypeError: argument of type 'NoneType' is not iterable
--
Additional details in: /tmp/MarvinLogs/test_network_DJJ0HC/results.txt
```

```
ERROR: test suite for 
--
Traceback (most recent call last):
  File 
"/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 
209, in run
self.setUp()
  File 
"/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 
292, in setUp
self.setupContext(ancestor)
  File 
"/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/suite.py", line 
315, in setupContext
try_run(context, names)
  File 
"/usr/lib/python2.7/site-packages/nose-1.3.7-py2.7.egg/nose/util.py", line 471, 
in try_run
return func()
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_vpc_vpn.py", 
line 472, in setUpClass
cls.template.download(cls.apiclient)
  File "/usr/lib/python2.7/site-packages/marvin/lib/base.py", line 1350, in 
download
elif 'Downloaded' in template.status:
TypeError: argument of type 'NoneType' is not iterable
--
Additional details in: /tmp/MarvinLogs/test_network_DJJ0HC/results.txt
```



**Associated Uploads**

**`/tmp/MarvinLogs/DeployDataCenter__Jul_15_2016_19_39_28_2AHFM8:`**
* 
[dc_entries.obj](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1601/tmp/MarvinLogs/DeployDataCenter__Jul_15_2016_19_39_28_2AHFM8/dc_entries.obj)
* 
[failed_plus_exceptions.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1601/tmp/MarvinLogs/DeployDataCenter__Jul_15_2016_19_39_28_2AHFM8/failed_plus_exceptions.txt)
* 
[runinfo.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1601/tmp/MarvinLogs/DeployDataCenter__Jul_15_2016_19_39_28_2AHFM8/runinfo.txt)

**`/tmp/MarvinLogs/test_network_DJJ0HC:`**
* 
[failed_plus_exceptions.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1601/tmp/MarvinLogs/test_network_DJJ0HC/failed_plus_exceptions.txt)
* 
[re

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-07-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381700#comment-15381700
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the issue:

https://github.com/apache/cloudstack/pull/1601
  
These errors are common in this specific environment type (hypervisors 
nested 3 layers deep).  They do not show up when only two hypervisors are 
nested...  This is ready...


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-07-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381944#comment-15381944
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the issue:

https://github.com/apache/cloudstack/pull/1601
  
Thanks @swill 


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-07-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382757#comment-15382757
 ] 

ASF subversion and git services commented on CLOUDSTACK-9348:
-

Commit ea48e95bdd1641c752eb573fe448aac6478cecd1 in cloudstack's branch 
refs/heads/master from [~williamstev...@gmail.com]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=ea48e95 ]

Merge pull request #1601 from shapeblue/nio-aggressive-selector

CLOUDSTACK-9348: Reduce Nio selector wait time

This reduces the Nio loop selector wait time so the selector checks
frequently (at most 100ms per iteration) and handles any pending
connections/tasks. This makes reconnections very quick at the expense of
some CPU usage.

/cc @swill @kiwiflyer can you please apply this fix in your env and test 
whether you're still able to reproduce any Nio-related errors between mgmt 
server(s) and KVM agent(s), i.e. agents not being able to connect quickly. 
Please also watch out for any increased CPU usage (there should not be any 
significant change), in which case we may increase the timeout from 100ms to 
200-400ms.

* pr/1601:
  CLOUDSTACK-9348: Reduce Nio selector wait time

Signed-off-by: Will Stevens 


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-07-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382758#comment-15382758
 ] 

ASF subversion and git services commented on CLOUDSTACK-9348:
-

Commit ea48e95bdd1641c752eb573fe448aac6478cecd1 in cloudstack's branch 
refs/heads/master from [~williamstev...@gmail.com]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=ea48e95 ]

Merge pull request #1601 from shapeblue/nio-aggressive-selector

CLOUDSTACK-9348: Reduce Nio selector wait time

This reduces the Nio loop selector wait time so the selector checks
frequently (at most 100ms per iteration) and handles any pending
connections/tasks. This makes reconnections very quick at the expense of
some CPU usage.

/cc @swill @kiwiflyer can you please apply this fix in your env and test 
whether you're still able to reproduce any Nio-related errors between mgmt 
server(s) and KVM agent(s), i.e. agents not being able to connect quickly. 
Please also watch out for any increased CPU usage (there should not be any 
significant change), in which case we may increase the timeout from 100ms to 
200-400ms.

* pr/1601:
  CLOUDSTACK-9348: Reduce Nio selector wait time

Signed-off-by: Will Stevens 


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-07-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382763#comment-15382763
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user asfgit closed the pull request at:

https://github.com/apache/cloudstack/pull/1601


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241017#comment-15241017
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-209896417
  
- Tested against KVM: mgmt server - KVM links and a clustered management 
server
- NioTest modified to run multiple clients against a server instance with 
just one worker, plus 10 malicious clients per valid client trying to connect 
to the server (they simply do a secure connect to the server and don't do 
anything else)
- Ran Marvin smoke tests successfully against KVM


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241740#comment-15241740
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-210103368
  
I've created two commits: (1) a test to prove the denial-of-service behavior 
caused by blocking the main IO loop, and (2) the fix (as mentioned earlier, a 
long-term fix would require migration to a better framework).
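The kind of "malicious" client used to demonstrate the denial of service can be sketched as follows (assumed names, not the committed NioTest code): it completes the TCP connect to the agent port but never performs the SSL handshake, which is enough to stall a server whose main I/O loop runs the handshake synchronously.

```
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;

final class IdleClientSketch implements Runnable {
    private final String host;
    private final int port;

    IdleClientSketch(final String host, final int port) {
        this.host = host;
        this.port = port;
    }

    @Override
    public void run() {
        // connect, then sit idle: no SSL handshake, no data
        try (SocketChannel channel = SocketChannel.open(new InetSocketAddress(host, port))) {
            Thread.sleep(60_000L); // hold the connection open for a minute
        } catch (final IOException ignored) {
            // connection refused/reset: nothing to do in this sketch
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```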


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241906#comment-15241906
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user GabrielBrascher commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59790778
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
@@ -19,146 +19,198 @@
 
 package com.cloud.utils.testcase;
 
-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
 
-/**
- *
- *
- *
- *
- */
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+public class NioTest {
 
-public class NioTest extends TestCase {
+private static final Logger LOGGER = Logger.getLogger(NioTest.class);
 
-private static final Logger s_logger = Logger.getLogger(NioTest.class);
+final private int totalTestCount = 10;
+private int completedTestCount = 0;
 
-private NioServer _server;
-private NioClient _client;
+private NioServer server;
+private List clients = new ArrayList<>();
+private List maliciousClients = new ArrayList<>();
 
-private Link _clientLink;
+private ExecutorService clientExecutor = 
Executors.newFixedThreadPool(totalTestCount, new 
NamedThreadFactory("NioClientHandler"));;
+private ExecutorService maliciousExecutor = 
Executors.newFixedThreadPool(5*totalTestCount, new 
NamedThreadFactory("MaliciousNioClientHandler"));;
 
-private int _testCount;
-private int _completedCount;
+private Random randomGenerator = new Random();
+private byte[] testBytes;
 
 private boolean isTestsDone() {
 boolean result;
 synchronized (this) {
-result = _testCount == _completedCount;
+result = totalTestCount == completedTestCount;
 }
 return result;
 }
 
-private void getOneMoreTest() {
-synchronized (this) {
-_testCount++;
-}
-}
-
 private void oneMoreTestDone() {
 synchronized (this) {
-_completedCount++;
+completedTestCount++;
 }
 }
 
-@Override
+@Before
 public void setUp() {
-s_logger.info("Test");
+LOGGER.info("Setting up Benchmark Test");
 
-_testCount = 0;
-_completedCount = 0;
-
-_server = new NioServer("NioTestServer", , 5, new 
NioTestServer());
-try {
-_server.start();
-} catch (final NioConnectionException e) {
-fail(e.getMessage());
-}
+completedTestCount = 0;
+testBytes = new byte[100];
+randomGenerator.nextBytes(testBytes);
 
-_client = new NioClient("NioTestServer", "127.0.0.1", , 5, new 
NioTestClient());
+// Server configured with one worker
+server = new NioServer("NioTestServer", , 1, new 
NioTestServer());
 try {
-_client.start();
+server.start();
 } catch (final NioConnectionException e) {
-fail(e.getMessage());
+Assert.fail(e.getMessage());
 }
 
-while (_clientLink == null) {
-try {
-s_logger.debug("Link is not up! Waiting ...");
-Thread.sleep(1000);
-} catch (final InterruptedException e) {
-// TODO Auto-generated catch block
-e.printStackTrace();
+// 5 malicious clients per valid client
+for (int i = 0; i < totalTestCount; i++) {
+  

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241916#comment-15241916
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59791280
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
@@ -19,146 +19,198 @@
 
 package com.cloud.utils.testcase;
 
-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
 
-/**
- *
- *
- *
- *
- */
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+public class NioTest {
 
-public class NioTest extends TestCase {
+private static final Logger LOGGER = Logger.getLogger(NioTest.class);
 
-private static final Logger s_logger = Logger.getLogger(NioTest.class);
+final private int totalTestCount = 10;
+private int completedTestCount = 0;
 
-private NioServer _server;
-private NioClient _client;
+private NioServer server;
+private List clients = new ArrayList<>();
+private List maliciousClients = new ArrayList<>();
 
-private Link _clientLink;
+private ExecutorService clientExecutor = 
Executors.newFixedThreadPool(totalTestCount, new 
NamedThreadFactory("NioClientHandler"));;
+private ExecutorService maliciousExecutor = 
Executors.newFixedThreadPool(5*totalTestCount, new 
NamedThreadFactory("MaliciousNioClientHandler"));;
 
-private int _testCount;
-private int _completedCount;
+private Random randomGenerator = new Random();
+private byte[] testBytes;
 
 private boolean isTestsDone() {
 boolean result;
 synchronized (this) {
-result = _testCount == _completedCount;
+result = totalTestCount == completedTestCount;
 }
 return result;
 }
 
-private void getOneMoreTest() {
-synchronized (this) {
-_testCount++;
-}
-}
-
 private void oneMoreTestDone() {
 synchronized (this) {
-_completedCount++;
+completedTestCount++;
 }
 }
 
-@Override
+@Before
 public void setUp() {
-s_logger.info("Test");
+LOGGER.info("Setting up Benchmark Test");
 
-_testCount = 0;
-_completedCount = 0;
-
-_server = new NioServer("NioTestServer", , 5, new 
NioTestServer());
-try {
-_server.start();
-} catch (final NioConnectionException e) {
-fail(e.getMessage());
-}
+completedTestCount = 0;
+testBytes = new byte[100];
+randomGenerator.nextBytes(testBytes);
 
-_client = new NioClient("NioTestServer", "127.0.0.1", , 5, new 
NioTestClient());
+// Server configured with one worker
+server = new NioServer("NioTestServer", , 1, new 
NioTestServer());
 try {
-_client.start();
+server.start();
 } catch (final NioConnectionException e) {
-fail(e.getMessage());
+Assert.fail(e.getMessage());
 }
 
-while (_clientLink == null) {
-try {
-s_logger.debug("Link is not up! Waiting ...");
-Thread.sleep(1000);
-} catch (final InterruptedException e) {
-// TODO Auto-generated catch block
-e.printStackTrace();
+// 5 malicious clients per valid client
+for (int i = 0; i < totalTestCount; i++) {
+ 
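The test above advances completedTestCount inside a synchronized block and has isTestsDone() poll it until it matches totalTestCount. As a design note only, the same wait-for-all-clients contract can be expressed with a java.util.concurrent.CountDownLatch, which blocks instead of polling; the CompletionTracker class below is a hypothetical illustration of that alternative, not the pattern used in this PR.

    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.TimeUnit;

    // Hypothetical alternative to the synchronized counter: each finished client task
    // counts down once, and the test thread waits with a timeout instead of polling.
    public class CompletionTracker {
        private final CountDownLatch latch;

        public CompletionTracker(final int totalTestCount) {
            this.latch = new CountDownLatch(totalTestCount);
        }

        public void oneMoreTestDone() {
            latch.countDown();
        }

        public boolean awaitAllTests(final long timeout, final TimeUnit unit) throws InterruptedException {
            return latch.await(timeout, unit);
        }
    }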

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241929#comment-15241929
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59792529
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241953#comment-15241953
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59795416
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242006#comment-15242006
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59799743
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243107#comment-15243107
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user GabrielBrascher commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59892756
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243120#comment-15243120
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user GabrielBrascher commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59894203
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243165#comment-15243165
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user jburwell commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59900210
  
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---
@@ -453,115 +449,192 @@ public static SSLContext initSSLContext(boolean 
isClient) throws GeneralSecurity
 return sslContext;
 }
 
-public static void doHandshake(SocketChannel ch, SSLEngine sslEngine, 
boolean isClient) throws IOException {
-if (s_logger.isTraceEnabled()) {
-s_logger.trace("SSL: begin Handshake, isClient: " + isClient);
+public static ByteBuffer enlargeBuffer(ByteBuffer buffer, final int 
sessionProposedCapacity) {
+if (buffer == null || sessionProposedCapacity < 0) {
+return buffer;
 }
-
-SSLEngineResult engResult;
-SSLSession sslSession = sslEngine.getSession();
-HandshakeStatus hsStatus;
-ByteBuffer in_pkgBuf = 
ByteBuffer.allocate(sslSession.getPacketBufferSize() + 40);
-ByteBuffer in_appBuf = 
ByteBuffer.allocate(sslSession.getApplicationBufferSize() + 40);
-ByteBuffer out_pkgBuf = 
ByteBuffer.allocate(sslSession.getPacketBufferSize() + 40);
-ByteBuffer out_appBuf = 
ByteBuffer.allocate(sslSession.getApplicationBufferSize() + 40);
-int count;
-ch.socket().setSoTimeout(60 * 1000);
-InputStream inStream = ch.socket().getInputStream();
-// Use readCh to make sure the timeout on reading is working
-ReadableByteChannel readCh = Channels.newChannel(inStream);
-
-if (isClient) {
-hsStatus = SSLEngineResult.HandshakeStatus.NEED_WRAP;
+if (sessionProposedCapacity > buffer.capacity()) {
+buffer = ByteBuffer.allocate(sessionProposedCapacity);
 } else {
-hsStatus = SSLEngineResult.HandshakeStatus.NEED_UNWRAP;
+buffer = ByteBuffer.allocate(buffer.capacity() * 2);
 }
+return buffer;
+}
 
-while (hsStatus != SSLEngineResult.HandshakeStatus.FINISHED) {
-if (s_logger.isTraceEnabled()) {
-s_logger.trace("SSL: Handshake status " + hsStatus);
+public static ByteBuffer handleBufferUnderflow(final SSLEngine engine, 
ByteBuffer buffer) {
+if (engine == null || buffer == null) {
+return buffer;
+}
+if (buffer.position() < buffer.limit()) {
+return buffer;
+}
+ByteBuffer replaceBuffer = enlargeBuffer(buffer, 
engine.getSession().getPacketBufferSize());
+buffer.flip();
+replaceBuffer.put(buffer);
+return replaceBuffer;
+}
+
+private static boolean doHandshakeUnwrap(final SocketChannel 
socketChannel, final SSLEngine sslEngine,
+ ByteBuffer peerAppData, 
ByteBuffer peerNetData, final int appBufferSize) throws IOException {
+if (socketChannel == null || sslEngine == null || peerAppData == 
null || peerNetData == null || appBufferSize < 0) {
+return false;
+}
+if (socketChannel.read(peerNetData) < 0) {
+if (sslEngine.isInboundDone() && sslEngine.isOutboundDone()) {
+return false;
 }
-engResult = null;
-if (hsStatus == SSLEngineResult.HandshakeStatus.NEED_WRAP) {
-out_pkgBuf.clear();
-out_appBuf.clear();
-out_appBuf.put("Hello".getBytes());
-engResult = sslEngine.wrap(out_appBuf, out_pkgBuf);
-out_pkgBuf.flip();
-int remain = out_pkgBuf.limit();
-while (remain != 0) {
-remain -= ch.write(out_pkgBuf);
-if (remain < 0) {
-throw new IOException("Too much bytes sent?");
-}
-}
-} else if (hsStatus == 
SSLEngineResult.HandshakeStatus.NEED_UNWRAP) {
-in_appBuf.clear();
-// One packet may contained multiply operation
-if (in_pkgBuf.position() == 0 || 
!in_pkgBuf.hasRemaining()) {
-in_pkgBuf.clear();
-count = 0;
-try {
-count = readCh.read(in_pkgBuf);
-} catch (SocketTimeoutException ex) {
-if (s_logger
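The excerpt above introduces enlargeBuffer(), which grows a buffer either to the SSL session's proposed capacity or to double its current size, handleBufferUnderflow(), which swaps in a larger network buffer once the current one has no room left, and a doHandshakeUnwrap() helper that reads ciphertext from the SocketChannel and gives up once both sides of the SSLEngine are done. For readers unfamiliar with the JDK's non-blocking handshake pattern, the sketch below shows a generic NEED_UNWRAP step wired to those two public helpers; the HandshakeUnwrapSketch class and its control flow are assumptions for illustration, not the actual Link.java implementation.

    import javax.net.ssl.SSLEngine;
    import javax.net.ssl.SSLEngineResult;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;

    import com.cloud.utils.nio.Link;

    // Sketch only: one NEED_UNWRAP step of a non-blocking SSL handshake.
    final class HandshakeUnwrapSketch {
        private final SSLEngine engine;
        private ByteBuffer peerNetData;
        private ByteBuffer peerAppData;

        HandshakeUnwrapSketch(final SSLEngine engine) {
            this.engine = engine;
            this.peerNetData = ByteBuffer.allocate(engine.getSession().getPacketBufferSize());
            this.peerAppData = ByteBuffer.allocate(engine.getSession().getApplicationBufferSize());
        }

        SSLEngineResult.HandshakeStatus unwrapStep(final SocketChannel channel) throws IOException {
            // Pull whatever ciphertext is available; -1 means the peer closed the connection.
            if (channel.read(peerNetData) < 0) {
                engine.closeInbound();
                return engine.getHandshakeStatus();
            }
            peerNetData.flip();                     // switch to read mode for unwrap()
            final SSLEngineResult result = engine.unwrap(peerNetData, peerAppData);
            peerNetData.compact();                  // keep any bytes unwrap() did not consume

            switch (result.getStatus()) {
            case BUFFER_OVERFLOW:
                // Plaintext did not fit: adopt the session's proposed size or double the capacity.
                peerAppData = Link.enlargeBuffer(peerAppData, engine.getSession().getApplicationBufferSize());
                break;
            case BUFFER_UNDERFLOW:
                // Not enough ciphertext yet: keep the buffer if it still has room, otherwise grow it.
                peerNetData = Link.handleBufferUnderflow(engine, peerNetData);
                break;
            default:
                break;
            }
            return result.getHandshakeStatus();     // the caller loops until FINISHED
        }
    }

enlargeBuffer() is similarly useful on the wrap side when the outgoing network buffer is too small for the encrypted handshake data.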

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243167#comment-15243167
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user jburwell commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59900296
  
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243166#comment-15243166
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user jburwell commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59900221
  
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243293#comment-15243293
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rafaelweingartner commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59912439
  
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243296#comment-15243296
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rafaelweingartner commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59912710
  
--- Diff: 
utils/src/test/java/com/cloud/utils/backoff/impl/ConstantTimeBackoffTest.java 
---
@@ -94,7 +94,7 @@ public void wakeupNotExisting() {
 @Test
 public void wakeupExisting() throws InterruptedException {
 final ConstantTimeBackoff backoff = new ConstantTimeBackoff();
-backoff.setTimeToWait(10);
+backoff.setTimeToWait(1000);
--- End diff --

is it 1000 seconds or milliseconds?
Does it need to be that high?


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
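The question above concerns the unit and magnitude passed to setTimeToWait() in the wakeupExisting test. The underlying constraint is a race: the waiting thread must still be blocked when the test fires the wakeup, so the configured wait has to comfortably exceed scheduling jitter on a loaded CI worker; too small a value lets the wait expire on its own before the wakeup arrives, and the assertion then fails intermittently. The sketch below illustrates that race with plain JDK primitives (Thread.sleep plus interrupt); it is a generic illustration, not the ConstantTimeBackoff API, and it does not assert which unit setTimeToWait() actually uses.

    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.TimeUnit;

    // Generic illustration: the waiter must still be blocked when the wakeup lands.
    public class WakeupRaceSketch {
        public static void main(final String[] args) throws InterruptedException {
            final CountDownLatch aboutToWait = new CountDownLatch(1);
            final Thread waiter = new Thread(() -> {
                aboutToWait.countDown();            // signal "about to start waiting"
                try {
                    Thread.sleep(1000);             // the backoff-style wait, in milliseconds here
                } catch (final InterruptedException expected) {
                    System.out.println("woken up early, as the test expects");
                }
            });
            waiter.start();
            aboutToWait.await(5, TimeUnit.SECONDS); // let the waiter reach its wait
            waiter.interrupt();                     // the "wakeup"
            waiter.join();
            // With a 10 ms wait the sleep can finish before interrupt() is delivered, so the
            // "woken up early" branch is never taken; 1000 ms leaves a comfortable margin.
        }
    }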


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243304#comment-15243304
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user jburwell commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59913051
  
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243306#comment-15243306
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rafaelweingartner commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59913073
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243310#comment-15243310
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59913436
  
--- Diff: 
utils/src/test/java/com/cloud/utils/backoff/impl/ConstantTimeBackoffTest.java 
---
@@ -94,7 +94,7 @@ public void wakeupNotExisting() {
 @Test
 public void wakeupExisting() throws InterruptedException {
 final ConstantTimeBackoff backoff = new ConstantTimeBackoff();
-backoff.setTimeToWait(10);
+backoff.setTimeToWait(1000);
--- End diff --

I was trying to diagnose why this test was failing on Travis, so I added a large value. Removed now.


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243315#comment-15243315
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rafaelweingartner commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59913697
  
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243314#comment-15243314
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59913672
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243323#comment-15243323
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59914101
  
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---
@@ -453,115 +449,192 @@ public static SSLContext initSSLContext(boolean 
isClient) throws GeneralSecurity
 return sslContext;
 }
 
-public static void doHandshake(SocketChannel ch, SSLEngine sslEngine, 
boolean isClient) throws IOException {
-if (s_logger.isTraceEnabled()) {
-s_logger.trace("SSL: begin Handshake, isClient: " + isClient);
+public static ByteBuffer enlargeBuffer(ByteBuffer buffer, final int 
sessionProposedCapacity) {
+if (buffer == null || sessionProposedCapacity < 0) {
+return buffer;
 }
-
-SSLEngineResult engResult;
-SSLSession sslSession = sslEngine.getSession();
-HandshakeStatus hsStatus;
-ByteBuffer in_pkgBuf = 
ByteBuffer.allocate(sslSession.getPacketBufferSize() + 40);
-ByteBuffer in_appBuf = 
ByteBuffer.allocate(sslSession.getApplicationBufferSize() + 40);
-ByteBuffer out_pkgBuf = 
ByteBuffer.allocate(sslSession.getPacketBufferSize() + 40);
-ByteBuffer out_appBuf = 
ByteBuffer.allocate(sslSession.getApplicationBufferSize() + 40);
-int count;
-ch.socket().setSoTimeout(60 * 1000);
-InputStream inStream = ch.socket().getInputStream();
-// Use readCh to make sure the timeout on reading is working
-ReadableByteChannel readCh = Channels.newChannel(inStream);
-
-if (isClient) {
-hsStatus = SSLEngineResult.HandshakeStatus.NEED_WRAP;
+if (sessionProposedCapacity > buffer.capacity()) {
+buffer = ByteBuffer.allocate(sessionProposedCapacity);
 } else {
-hsStatus = SSLEngineResult.HandshakeStatus.NEED_UNWRAP;
+buffer = ByteBuffer.allocate(buffer.capacity() * 2);
 }
+return buffer;
+}
 
-while (hsStatus != SSLEngineResult.HandshakeStatus.FINISHED) {
-if (s_logger.isTraceEnabled()) {
-s_logger.trace("SSL: Handshake status " + hsStatus);
+public static ByteBuffer handleBufferUnderflow(final SSLEngine engine, 
ByteBuffer buffer) {
+if (engine == null || buffer == null) {
+return buffer;
+}
+if (buffer.position() < buffer.limit()) {
+return buffer;
+}
+ByteBuffer replaceBuffer = enlargeBuffer(buffer, 
engine.getSession().getPacketBufferSize());
+buffer.flip();
+replaceBuffer.put(buffer);
+return replaceBuffer;
+}
+
+private static boolean doHandshakeUnwrap(final SocketChannel 
socketChannel, final SSLEngine sslEngine,
+ ByteBuffer peerAppData, 
ByteBuffer peerNetData, final int appBufferSize) throws IOException {
+if (socketChannel == null || sslEngine == null || peerAppData == 
null || peerNetData == null || appBufferSize < 0) {
+return false;
+}
+if (socketChannel.read(peerNetData) < 0) {
+if (sslEngine.isInboundDone() && sslEngine.isOutboundDone()) {
+return false;
 }
-engResult = null;
-if (hsStatus == SSLEngineResult.HandshakeStatus.NEED_WRAP) {
-out_pkgBuf.clear();
-out_appBuf.clear();
-out_appBuf.put("Hello".getBytes());
-engResult = sslEngine.wrap(out_appBuf, out_pkgBuf);
-out_pkgBuf.flip();
-int remain = out_pkgBuf.limit();
-while (remain != 0) {
-remain -= ch.write(out_pkgBuf);
-if (remain < 0) {
-throw new IOException("Too much bytes sent?");
-}
-}
-} else if (hsStatus == 
SSLEngineResult.HandshakeStatus.NEED_UNWRAP) {
-in_appBuf.clear();
-// One packet may contained multiply operation
-if (in_pkgBuf.position() == 0 || 
!in_pkgBuf.hasRemaining()) {
-in_pkgBuf.clear();
-count = 0;
-try {
-count = readCh.read(in_pkgBuf);
-} catch (SocketTimeoutException ex) {
-if (s_logger

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243331#comment-15243331
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59915144
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
@@ -19,146 +19,198 @@
 
 package com.cloud.utils.testcase;
 
-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
 
-/**
- *
- *
- *
- *
- */
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+public class NioTest {
 
-public class NioTest extends TestCase {
+private static final Logger LOGGER = Logger.getLogger(NioTest.class);
 
-private static final Logger s_logger = Logger.getLogger(NioTest.class);
+final private int totalTestCount = 10;
+private int completedTestCount = 0;
 
-private NioServer _server;
-private NioClient _client;
+private NioServer server;
+private List clients = new ArrayList<>();
+private List maliciousClients = new ArrayList<>();
 
-private Link _clientLink;
+private ExecutorService clientExecutor = 
Executors.newFixedThreadPool(totalTestCount, new 
NamedThreadFactory("NioClientHandler"));;
+private ExecutorService maliciousExecutor = 
Executors.newFixedThreadPool(5*totalTestCount, new 
NamedThreadFactory("MaliciousNioClientHandler"));;
 
-private int _testCount;
-private int _completedCount;
+private Random randomGenerator = new Random();
+private byte[] testBytes;
 
 private boolean isTestsDone() {
 boolean result;
 synchronized (this) {
-result = _testCount == _completedCount;
+result = totalTestCount == completedTestCount;
 }
 return result;
 }
 
-private void getOneMoreTest() {
-synchronized (this) {
-_testCount++;
-}
-}
-
 private void oneMoreTestDone() {
 synchronized (this) {
-_completedCount++;
+completedTestCount++;
 }
 }
 
-@Override
+@Before
 public void setUp() {
-s_logger.info("Test");
+LOGGER.info("Setting up Benchmark Test");
 
-_testCount = 0;
-_completedCount = 0;
-
-_server = new NioServer("NioTestServer", , 5, new 
NioTestServer());
-try {
-_server.start();
-} catch (final NioConnectionException e) {
-fail(e.getMessage());
-}
+completedTestCount = 0;
+testBytes = new byte[100];
+randomGenerator.nextBytes(testBytes);
 
-_client = new NioClient("NioTestServer", "127.0.0.1", , 5, new 
NioTestClient());
+// Server configured with one worker
+server = new NioServer("NioTestServer", 0, 1, new NioTestServer());
 try {
-_client.start();
+server.start();
 } catch (final NioConnectionException e) {
-fail(e.getMessage());
+Assert.fail(e.getMessage());
 }
 
-while (_clientLink == null) {
-try {
-s_logger.debug("Link is not up! Waiting ...");
-Thread.sleep(1000);
-} catch (final InterruptedException e) {
-// TODO Auto-generated catch block
-e.printStackTrace();
+// 5 malicious clients per valid client
+for (int i = 0; i < totalTestCount; i++) {
+f

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1524#comment-1524
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59915248
  
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---
@@ -453,115 +449,192 @@ public static SSLContext initSSLContext(boolean 
isClient) throws GeneralSecurity
 return sslContext;
 }
 
-public static void doHandshake(SocketChannel ch, SSLEngine sslEngine, 
boolean isClient) throws IOException {
-if (s_logger.isTraceEnabled()) {
-s_logger.trace("SSL: begin Handshake, isClient: " + isClient);
+public static ByteBuffer enlargeBuffer(ByteBuffer buffer, final int 
sessionProposedCapacity) {
+if (buffer == null || sessionProposedCapacity < 0) {
+return buffer;
 }
-
-SSLEngineResult engResult;
-SSLSession sslSession = sslEngine.getSession();
-HandshakeStatus hsStatus;
-ByteBuffer in_pkgBuf = 
ByteBuffer.allocate(sslSession.getPacketBufferSize() + 40);
-ByteBuffer in_appBuf = 
ByteBuffer.allocate(sslSession.getApplicationBufferSize() + 40);
-ByteBuffer out_pkgBuf = 
ByteBuffer.allocate(sslSession.getPacketBufferSize() + 40);
-ByteBuffer out_appBuf = 
ByteBuffer.allocate(sslSession.getApplicationBufferSize() + 40);
-int count;
-ch.socket().setSoTimeout(60 * 1000);
-InputStream inStream = ch.socket().getInputStream();
-// Use readCh to make sure the timeout on reading is working
-ReadableByteChannel readCh = Channels.newChannel(inStream);
-
-if (isClient) {
-hsStatus = SSLEngineResult.HandshakeStatus.NEED_WRAP;
+if (sessionProposedCapacity > buffer.capacity()) {
+buffer = ByteBuffer.allocate(sessionProposedCapacity);
 } else {
-hsStatus = SSLEngineResult.HandshakeStatus.NEED_UNWRAP;
+buffer = ByteBuffer.allocate(buffer.capacity() * 2);
 }
+return buffer;
+}
 
-while (hsStatus != SSLEngineResult.HandshakeStatus.FINISHED) {
-if (s_logger.isTraceEnabled()) {
-s_logger.trace("SSL: Handshake status " + hsStatus);
+public static ByteBuffer handleBufferUnderflow(final SSLEngine engine, 
ByteBuffer buffer) {
+if (engine == null || buffer == null) {
+return buffer;
+}
+if (buffer.position() < buffer.limit()) {
+return buffer;
+}
+ByteBuffer replaceBuffer = enlargeBuffer(buffer, 
engine.getSession().getPacketBufferSize());
+buffer.flip();
+replaceBuffer.put(buffer);
+return replaceBuffer;
+}
+
+private static boolean doHandshakeUnwrap(final SocketChannel 
socketChannel, final SSLEngine sslEngine,
+ ByteBuffer peerAppData, 
ByteBuffer peerNetData, final int appBufferSize) throws IOException {
+if (socketChannel == null || sslEngine == null || peerAppData == 
null || peerNetData == null || appBufferSize < 0) {
+return false;
+}
+if (socketChannel.read(peerNetData) < 0) {
+if (sslEngine.isInboundDone() && sslEngine.isOutboundDone()) {
+return false;
 }
-engResult = null;
-if (hsStatus == SSLEngineResult.HandshakeStatus.NEED_WRAP) {
-out_pkgBuf.clear();
-out_appBuf.clear();
-out_appBuf.put("Hello".getBytes());
-engResult = sslEngine.wrap(out_appBuf, out_pkgBuf);
-out_pkgBuf.flip();
-int remain = out_pkgBuf.limit();
-while (remain != 0) {
-remain -= ch.write(out_pkgBuf);
-if (remain < 0) {
-throw new IOException("Too much bytes sent?");
-}
-}
-} else if (hsStatus == 
SSLEngineResult.HandshakeStatus.NEED_UNWRAP) {
-in_appBuf.clear();
-// One packet may contained multiply operation
-if (in_pkgBuf.position() == 0 || 
!in_pkgBuf.hasRemaining()) {
-in_pkgBuf.clear();
-count = 0;
-try {
-count = readCh.read(in_pkgBuf);
-} catch (SocketTimeoutException ex) {
-if (s_logger

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243335#comment-15243335
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rafaelweingartner commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59915315
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
@@ -19,146 +19,198 @@
 
 package com.cloud.utils.testcase;
 
-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
 
-/**
- *
- *
- *
- *
- */
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+public class NioTest {
 
-public class NioTest extends TestCase {
+private static final Logger LOGGER = Logger.getLogger(NioTest.class);
 
-private static final Logger s_logger = Logger.getLogger(NioTest.class);
+final private int totalTestCount = 10;
+private int completedTestCount = 0;
 
-private NioServer _server;
-private NioClient _client;
+private NioServer server;
+private List clients = new ArrayList<>();
+private List maliciousClients = new ArrayList<>();
 
-private Link _clientLink;
+private ExecutorService clientExecutor = 
Executors.newFixedThreadPool(totalTestCount, new 
NamedThreadFactory("NioClientHandler"));;
+private ExecutorService maliciousExecutor = 
Executors.newFixedThreadPool(5*totalTestCount, new 
NamedThreadFactory("MaliciousNioClientHandler"));;
 
-private int _testCount;
-private int _completedCount;
+private Random randomGenerator = new Random();
+private byte[] testBytes;
 
 private boolean isTestsDone() {
 boolean result;
 synchronized (this) {
-result = _testCount == _completedCount;
+result = totalTestCount == completedTestCount;
 }
 return result;
 }
 
-private void getOneMoreTest() {
-synchronized (this) {
-_testCount++;
-}
-}
-
 private void oneMoreTestDone() {
 synchronized (this) {
-_completedCount++;
+completedTestCount++;
 }
 }
 
-@Override
+@Before
 public void setUp() {
-s_logger.info("Test");
+LOGGER.info("Setting up Benchmark Test");
 
-_testCount = 0;
-_completedCount = 0;
-
-_server = new NioServer("NioTestServer", , 5, new 
NioTestServer());
-try {
-_server.start();
-} catch (final NioConnectionException e) {
-fail(e.getMessage());
-}
+completedTestCount = 0;
+testBytes = new byte[100];
+randomGenerator.nextBytes(testBytes);
 
-_client = new NioClient("NioTestServer", "127.0.0.1", , 5, new 
NioTestClient());
+// Server configured with one worker
+server = new NioServer("NioTestServer", 0, 1, new NioTestServer());
 try {
-_client.start();
+server.start();
 } catch (final NioConnectionException e) {
-fail(e.getMessage());
+Assert.fail(e.getMessage());
 }
 
-while (_clientLink == null) {
-try {
-s_logger.debug("Link is not up! Waiting ...");
-Thread.sleep(1000);
-} catch (final InterruptedException e) {
-// TODO Auto-generated catch block
-e.printStackTrace();
+// 5 malicious clients per valid client
+for (int i = 0; i < totalTestCount; i++) {
+

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243337#comment-15243337
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59915385
  
--- Diff: utils/src/main/java/com/cloud/utils/nio/Link.java ---
@@ -453,115 +449,192 @@ public static SSLContext initSSLContext(boolean 
isClient) throws GeneralSecurity
 return sslContext;
 }
 
-public static void doHandshake(SocketChannel ch, SSLEngine sslEngine, 
boolean isClient) throws IOException {
-if (s_logger.isTraceEnabled()) {
-s_logger.trace("SSL: begin Handshake, isClient: " + isClient);
+public static ByteBuffer enlargeBuffer(ByteBuffer buffer, final int 
sessionProposedCapacity) {
+if (buffer == null || sessionProposedCapacity < 0) {
+return buffer;
 }
-
-SSLEngineResult engResult;
-SSLSession sslSession = sslEngine.getSession();
-HandshakeStatus hsStatus;
-ByteBuffer in_pkgBuf = 
ByteBuffer.allocate(sslSession.getPacketBufferSize() + 40);
-ByteBuffer in_appBuf = 
ByteBuffer.allocate(sslSession.getApplicationBufferSize() + 40);
-ByteBuffer out_pkgBuf = 
ByteBuffer.allocate(sslSession.getPacketBufferSize() + 40);
-ByteBuffer out_appBuf = 
ByteBuffer.allocate(sslSession.getApplicationBufferSize() + 40);
-int count;
-ch.socket().setSoTimeout(60 * 1000);
-InputStream inStream = ch.socket().getInputStream();
-// Use readCh to make sure the timeout on reading is working
-ReadableByteChannel readCh = Channels.newChannel(inStream);
-
-if (isClient) {
-hsStatus = SSLEngineResult.HandshakeStatus.NEED_WRAP;
+if (sessionProposedCapacity > buffer.capacity()) {
+buffer = ByteBuffer.allocate(sessionProposedCapacity);
 } else {
-hsStatus = SSLEngineResult.HandshakeStatus.NEED_UNWRAP;
+buffer = ByteBuffer.allocate(buffer.capacity() * 2);
 }
+return buffer;
+}
 
-while (hsStatus != SSLEngineResult.HandshakeStatus.FINISHED) {
-if (s_logger.isTraceEnabled()) {
-s_logger.trace("SSL: Handshake status " + hsStatus);
+public static ByteBuffer handleBufferUnderflow(final SSLEngine engine, 
ByteBuffer buffer) {
+if (engine == null || buffer == null) {
+return buffer;
+}
+if (buffer.position() < buffer.limit()) {
+return buffer;
+}
+ByteBuffer replaceBuffer = enlargeBuffer(buffer, 
engine.getSession().getPacketBufferSize());
+buffer.flip();
+replaceBuffer.put(buffer);
+return replaceBuffer;
+}
+
+private static boolean doHandshakeUnwrap(final SocketChannel 
socketChannel, final SSLEngine sslEngine,
+ ByteBuffer peerAppData, 
ByteBuffer peerNetData, final int appBufferSize) throws IOException {
+if (socketChannel == null || sslEngine == null || peerAppData == 
null || peerNetData == null || appBufferSize < 0) {
+return false;
+}
+if (socketChannel.read(peerNetData) < 0) {
+if (sslEngine.isInboundDone() && sslEngine.isOutboundDone()) {
+return false;
 }
-engResult = null;
-if (hsStatus == SSLEngineResult.HandshakeStatus.NEED_WRAP) {
-out_pkgBuf.clear();
-out_appBuf.clear();
-out_appBuf.put("Hello".getBytes());
-engResult = sslEngine.wrap(out_appBuf, out_pkgBuf);
-out_pkgBuf.flip();
-int remain = out_pkgBuf.limit();
-while (remain != 0) {
-remain -= ch.write(out_pkgBuf);
-if (remain < 0) {
-throw new IOException("Too much bytes sent?");
-}
-}
-} else if (hsStatus == 
SSLEngineResult.HandshakeStatus.NEED_UNWRAP) {
-in_appBuf.clear();
-// One packet may contained multiply operation
-if (in_pkgBuf.position() == 0 || 
!in_pkgBuf.hasRemaining()) {
-in_pkgBuf.clear();
-count = 0;
-try {
-count = readCh.read(in_pkgBuf);
-} catch (SocketTimeoutException ex) {
-if (s_logger

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243354#comment-15243354
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59917981
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
@@ -19,146 +19,198 @@
 
 package com.cloud.utils.testcase;
 
-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
 
-/**
- *
- *
- *
- *
- */
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+public class NioTest {
 
-public class NioTest extends TestCase {
+private static final Logger LOGGER = Logger.getLogger(NioTest.class);
 
-private static final Logger s_logger = Logger.getLogger(NioTest.class);
+final private int totalTestCount = 10;
+private int completedTestCount = 0;
 
-private NioServer _server;
-private NioClient _client;
+private NioServer server;
+private List clients = new ArrayList<>();
+private List maliciousClients = new ArrayList<>();
 
-private Link _clientLink;
+private ExecutorService clientExecutor = 
Executors.newFixedThreadPool(totalTestCount, new 
NamedThreadFactory("NioClientHandler"));;
+private ExecutorService maliciousExecutor = 
Executors.newFixedThreadPool(5*totalTestCount, new 
NamedThreadFactory("MaliciousNioClientHandler"));;
 
-private int _testCount;
-private int _completedCount;
+private Random randomGenerator = new Random();
+private byte[] testBytes;
 
 private boolean isTestsDone() {
 boolean result;
 synchronized (this) {
-result = _testCount == _completedCount;
+result = totalTestCount == completedTestCount;
 }
 return result;
 }
 
-private void getOneMoreTest() {
-synchronized (this) {
-_testCount++;
-}
-}
-
 private void oneMoreTestDone() {
 synchronized (this) {
-_completedCount++;
+completedTestCount++;
 }
 }
 
-@Override
+@Before
 public void setUp() {
-s_logger.info("Test");
+LOGGER.info("Setting up Benchmark Test");
 
-_testCount = 0;
-_completedCount = 0;
-
-_server = new NioServer("NioTestServer", , 5, new 
NioTestServer());
-try {
-_server.start();
-} catch (final NioConnectionException e) {
-fail(e.getMessage());
-}
+completedTestCount = 0;
+testBytes = new byte[100];
+randomGenerator.nextBytes(testBytes);
 
-_client = new NioClient("NioTestServer", "127.0.0.1", , 5, new 
NioTestClient());
+// Server configured with one worker
+server = new NioServer("NioTestServer", 0, 1, new NioTestServer());
 try {
-_client.start();
+server.start();
 } catch (final NioConnectionException e) {
-fail(e.getMessage());
+Assert.fail(e.getMessage());
 }
 
-while (_clientLink == null) {
-try {
-s_logger.debug("Link is not up! Waiting ...");
-Thread.sleep(1000);
-} catch (final InterruptedException e) {
-// TODO Auto-generated catch block
-e.printStackTrace();
+// 5 malicious clients per valid client
+for (int i = 0; i < totalTestCount; i++) {
+f

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243359#comment-15243359
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-210576373
  
Thanks all for the review, I've updated the commits; please re-review and 
advise on any other outstanding issues. Thanks again.


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.
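As an illustration of the fix described in the issue above, here is a minimal sketch, assuming a hypothetical NonBlockingAcceptSketch class (this is not the actual NioConnection code; doHandshake(), registerWithSelector() and closeQuietly() are illustrative placeholders): the selector thread only accepts the socket, while the blocking SSL handshake is pushed onto a worker pool so the main IO loop keeps serving other agents.

```
import java.io.IOException;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Sketch only: accept on the IO thread, handshake on a worker pool.
public class NonBlockingAcceptSketch {

    private final ExecutorService handshakeExecutor = Executors.newFixedThreadPool(4);

    public void onAccept(final ServerSocketChannel serverChannel) throws IOException {
        final SocketChannel clientChannel = serverChannel.accept();
        if (clientChannel == null) {
            return; // nothing pending
        }
        // Hand the expensive part to a worker so accept() returns immediately.
        handshakeExecutor.submit(new Runnable() {
            @Override
            public void run() {
                try {
                    final SSLEngine sslEngine = createServerSslEngine();
                    doHandshake(clientChannel, sslEngine);          // blocking wrap/unwrap loop
                    registerWithSelector(clientChannel, sslEngine); // resume non-blocking IO
                } catch (final Exception e) {
                    closeQuietly(clientChannel);
                }
            }
        });
    }

    private SSLEngine createServerSslEngine() throws NoSuchAlgorithmException {
        final SSLEngine engine = SSLContext.getDefault().createSSLEngine();
        engine.setUseClientMode(false);
        return engine;
    }

    // Placeholders: the real handshake and link registration live in
    // com.cloud.utils.nio.Link and com.cloud.utils.nio.NioConnection.
    private void doHandshake(final SocketChannel channel, final SSLEngine engine) {
    }

    private void registerWithSelector(final SocketChannel channel, final SSLEngine engine) {
    }

    private void closeQuietly(final SocketChannel channel) {
        try {
            channel.close();
        } catch (final IOException ignored) {
        }
    }
}
```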



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243365#comment-15243365
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rafaelweingartner commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59918834
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
@@ -19,146 +19,215 @@
 
 package com.cloud.utils.testcase;
 
-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
 
-/**
- *
- *
- *
- *
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+/* NioTest
--- End diff --

If you are going to use some kind of documentation, I believe the Javadoc 
style would be more appropriate.
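For reference, a minimal illustration of the difference (the class body and wording here are just for the example, not the actual test's comment):

```
/* A plain block comment like this one is ignored by the Javadoc tool. */

/**
 * A Javadoc-style comment is picked up by the Javadoc tool and by IDEs.
 * NioTest exercises NioServer with both regular and malicious clients.
 */
public class NioTest {
}
```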


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243383#comment-15243383
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rafaelweingartner commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r59920010
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
@@ -19,146 +19,198 @@
 
 package com.cloud.utils.testcase;
 
-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
 
-/**
- *
- *
- *
- *
- */
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+public class NioTest {
 
-public class NioTest extends TestCase {
+private static final Logger LOGGER = Logger.getLogger(NioTest.class);
 
-private static final Logger s_logger = Logger.getLogger(NioTest.class);
+final private int totalTestCount = 10;
+private int completedTestCount = 0;
 
-private NioServer _server;
-private NioClient _client;
+private NioServer server;
+private List clients = new ArrayList<>();
+private List maliciousClients = new ArrayList<>();
 
-private Link _clientLink;
+private ExecutorService clientExecutor = 
Executors.newFixedThreadPool(totalTestCount, new 
NamedThreadFactory("NioClientHandler"));;
+private ExecutorService maliciousExecutor = 
Executors.newFixedThreadPool(5*totalTestCount, new 
NamedThreadFactory("MaliciousNioClientHandler"));;
 
-private int _testCount;
-private int _completedCount;
+private Random randomGenerator = new Random();
+private byte[] testBytes;
 
 private boolean isTestsDone() {
 boolean result;
 synchronized (this) {
-result = _testCount == _completedCount;
+result = totalTestCount == completedTestCount;
 }
 return result;
 }
 
-private void getOneMoreTest() {
-synchronized (this) {
-_testCount++;
-}
-}
-
 private void oneMoreTestDone() {
 synchronized (this) {
-_completedCount++;
+completedTestCount++;
 }
 }
 
-@Override
+@Before
 public void setUp() {
-s_logger.info("Test");
+LOGGER.info("Setting up Benchmark Test");
 
-_testCount = 0;
-_completedCount = 0;
-
-_server = new NioServer("NioTestServer", , 5, new 
NioTestServer());
-try {
-_server.start();
-} catch (final NioConnectionException e) {
-fail(e.getMessage());
-}
+completedTestCount = 0;
+testBytes = new byte[100];
+randomGenerator.nextBytes(testBytes);
 
-_client = new NioClient("NioTestServer", "127.0.0.1", , 5, new 
NioTestClient());
+// Server configured with one worker
+server = new NioServer("NioTestServer", 0, 1, new NioTestServer());
 try {
-_client.start();
+server.start();
 } catch (final NioConnectionException e) {
-fail(e.getMessage());
+Assert.fail(e.getMessage());
 }
 
-while (_clientLink == null) {
-try {
-s_logger.debug("Link is not up! Waiting ...");
-Thread.sleep(1000);
-} catch (final InterruptedException e) {
-// TODO Auto-generated catch block
-e.printStackTrace();
+// 5 malicious clients per valid client
+for (int i = 0; i < totalTestCount; i++) {
+

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245292#comment-15245292
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-211261793
  
@swill I've fixed the outstanding issues; can you run your CI on this and 
help merge? Thanks.


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249385#comment-15249385
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-212284556
  
@jburwell fixed the use of the test timeout within the @Test annotation


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253546#comment-15253546
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user bhaisaab commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-213327217
  
@jburwell @GabrielBrascher @rafaelweingartner @swill if you're done with the 
review, please LGTM or share what else should be fixed. Thanks.



> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261540#comment-15261540
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-215309861
  
Failed to build.
```
---
 T E S T S
---
Running com.cloud.utils.testcase.NioTest
2016-04-28 06:23:24,581 INFO  [utils.testcase.NioTest] (main:) Setting up 
Benchmark Test
2016-04-28 06:23:24,879 INFO  [utils.nio.NioServer] (main:) NioConnection 
started and listening on /0:0:0:0:0:0:0:0:58798
2016-04-28 06:23:24,886 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-1:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,886 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-2:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,887 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-4:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,890 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-5:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,891 INFO  [utils.nio.NioClient] (NioClientHandler-1:) 
Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,892 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-6:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,892 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-3:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,893 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-7:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,894 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-8:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,895 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-10:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,924 DEBUG [utils.crypt.EncryptionSecretKeyChecker] 
(pool-1-thread-1:) Encryption Type: null
2016-04-28 06:23:24,928 INFO  [utils.nio.NioClient] (NioClientHandler-2:) 
Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,933 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-11:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,933 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail 
to find the generated keystore. Loading fail-safe one to continue.
2016-04-28 06:23:24,939 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-13:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,941 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-14:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,944 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-12:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,944 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-15:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,944 INFO  [utils.nio.NioClient] (NioClientHandler-3:) 
Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,945 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-16:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,946 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-9:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,946 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-17:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,946 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-18:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,947 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-19:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,947 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-20:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,948 INFO  [utils.nio.NioClient] (NioClientHandler-4:) 
Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,949 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-21:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,949 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-22:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,949 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-23:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,977 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-25:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,977 INFO  [utils.testcase.NioTest] 
(MaliciousNioClientHandler-24:) Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,981 INFO  [utils.nio.NioClient] (NioClientHandler-5:) 
Connecting to 127.0.0.1:58798
2016-04-28 06:23:24,996 DEBUG [utils.testcase.NioTest] (Thread-0:) 0/5 
tests done. Waiting for completion
2016-04-28 06:23:25,103 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail 
to find the generated keystore. Loading fail-safe one to continue.
2016-04-28 06:23:25,161 WARN  [utils.nio.Link] (pool-1-th

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261545#comment-15261545
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-215310531
  
BTW, I built with `-T 2C`, in case that is relevant to understanding why 
it failed...


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261739#comment-15261739
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-215350128
  
@swill there are 25 malicious clients in total, each of which can block one 
of the 5 (max.) server worker threads for 60s; so in the worst case we should 
have waited at least 25*60/5 = 300 seconds. I've fixed the test with the 
maximum possible timeout value; previously the value was chosen for an 
average case.
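A minimal sketch of that worst-case arithmetic expressed as a JUnit timeout (the class and method names here are hypothetical; the real constants live in NioTest):

```
import org.junit.Assert;
import org.junit.Test;

public class WorstCaseTimeoutTest {

    // 25 malicious clients, each able to block a worker for 60s, spread over
    // 5 worker threads: 25 * 60 / 5 = 300 seconds = 300,000 milliseconds.
    private static final long WORST_CASE_MILLIS = 25L * 60L * 1000L / 5L;

    // JUnit fails the test if it runs longer than the given timeout.
    @Test(timeout = 300000)
    public void timeoutCoversWorstCase() {
        Assert.assertEquals(300000L, WORST_CASE_MILLIS);
    }
}
```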


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263857#comment-15263857
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-215675646
  
@swill can you try again with your CI?

@agneya2001 @jburwell @wido @kiwiflyer @nvazquez @DaanHoogland and others - 
please review and share your LGTM, thanks


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264140#comment-15264140
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user kiwiflyer commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-215741459
  
@rhtyd - We'll pull this in for functional testing.


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264342#comment-15264342
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-215815079
  
thanks @kiwiflyer 


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-04-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264355#comment-15264355
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user jburwell commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-215818023
  
LGTM for code review


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266558#comment-15266558
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-216228508
  
This PR is ready for merge, /cc @swill 

tag:mergeready


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266953#comment-15266953
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-216289074
  
@kiwiflyer do you have test results on this one?  Thanks...


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267000#comment-15267000
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user kiwiflyer commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-216296997
  
@swill  I'm a bit behind. I'm building this now.


> CloudStack Server degrades when a lot of connections on port 8250
> -
>
> Key: CLOUDSTACK-9348
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9348
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Reporter: Rohit Yadav
>Assignee: Rohit Yadav
> Fix For: 4.9.0
>
>
> An intermittent issue was found with a large CloudStack deployment, where 
> servers could not keep agents connected on port 8250.
> All connections are handled by accept() in NioConnection:
> https://github.com/apache/cloudstack/blob/master/utils/src/main/java/com/cloud/utils/nio/NioConnection.java#L125
> A new connection is handled by accept() which does blocking SSL handshake. A 
> good fix would be to make this non-blocking and handle expensive tasks in 
> separate threads/pool. This way the main IO loop won't be blocked and can 
> continue to serve other agents/clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267007#comment-15267007
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-216298669
  
No worries.  Thanks...  I am also a bit behind.  I apparently have to just 
assume I won't get any work done on Mondays.  :P




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267575#comment-15267575
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user kiwiflyer commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-216375175
  
I pulled this into a hardware lab on 4.8.1.  I set up a number of fake 
clients and hammered port 8250.  Prior to the patch, the agents ended up in a 
disconnected state after a few minutes.

With the patch applied, my little DoS test is unable to affect the 
connectivity between the management server and the agents.

I also tested some provisioning activities and made sure the agent survived 
taking the management server down and then bringing it back up.

LGTM
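
For anyone wanting to run a similar experiment, a rough sketch of such a "fake client" hammer is below; the host, connection count and sleep time are made up, and this is not the actual tool used in the test above:

```
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

public class Port8250Hammer {
    public static void main(final String[] args) throws IOException, InterruptedException {
        final String host = args.length > 0 ? args[0] : "127.0.0.1";
        final List<Socket> sockets = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            final Socket s = new Socket();
            s.connect(new InetSocketAddress(host, 8250), 5000);
            sockets.add(s); // connect, then just sit idle and never speak SSL
        }
        System.out.println("Holding " + sockets.size() + " idle connections open");
        Thread.sleep(10 * 60 * 1000L); // keep them open while watching agent connectivity
        for (final Socket s : sockets) {
            s.close();
        }
    }
}
```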




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267589#comment-15267589
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-216378613
  
Thank you @kiwiflyer.  👍 

@rhtyd can you force push this PR again to try to get Jenkins green?  
Thanks...  Otherwise this one is ready...




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267620#comment-15267620
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user serverchief commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-216382896
  
@kiwiflyer, 

My testing with this patch: if you have at least several hundred KVM 
nodes connected to 2 management servers via a VIP and take 1 MS down, you will 
notice that the KVM agents shift to the second MS in a matter of seconds, with no noise.

Without this patch, depending on the scale, it may take up to 10 minutes to 
reconnect all hosts, along with lots of noise about hosts being down!




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267963#comment-15267963
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-216421909
  
@swill force pushed; the Jenkins server is not reliable -- as long as 
Travis is green we are alright; the only additional check Jenkins does is the 
rat check, which I think Travis can do as well.

Thanks @serverchief for sharing your experience with this fix

tag:mergeready




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268478#comment-15268478
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-216489209
  
@swill all green now




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268965#comment-15268965
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-216581383
  
Perfect, this one is queued up to be merged...  Thanks...




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270725#comment-15270725
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user asfgit closed the pull request at:

https://github.com/apache/cloudstack/pull/1493




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-04 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270721#comment-15270721
 ] 

ASF subversion and git services commented on CLOUDSTACK-9348:
-

Commit ba77a692391856df468a141f98687ec71373a3d3 in cloudstack's branch 
refs/heads/master from [~rohit.ya...@shapeblue.com]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=ba77a69 ]

CLOUDSTACK-9348: Use non-blocking SSL handshake

- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Replaces the blocking SSL handshake code with non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
  due to an aggressive/malicious client
- Uses separate executor services for handling SSL handshakes

Signed-off-by: Rohit Yadav 




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-04 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270723#comment-15270723
 ] 

ASF subversion and git services commented on CLOUDSTACK-9348:
-

Commit 7ce0e10fbcd949375e43535aae168421ecdaa562 in cloudstack's branch 
refs/heads/master from [~williamstev...@gmail.com]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=7ce0e10 ]

Merge pull request #1493 from shapeblue/nio-fix

CLOUDSTACK-9348: Use non-blocking SSL handshake in NioConnection/Link
- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Replaces the blocking SSL handshake code with non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
  due to an aggressive/malicious client
- Uses separate executor services for handling connect/accept events

Changes are covered by the NioTest so I did not write a new test; advise how we 
can improve this. Further, I tried to invest time in writing a benchmark test 
to reproduce a degraded server but could not write it deterministically 
(sometimes fails/passes but not always). Review, CI testing and feedback 
requested /cc @swill @jburwell @DaanHoogland @wido @remibergsma 
@rafaelweingartner @GabrielBrascher

* pr/1493:
  CLOUDSTACK-9348: Use non-blocking SSL handshake
  CLOUDSTACK-9348: Unit test to demonstrate denial of service attack

Signed-off-by: Will Stevens 




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-04 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270720#comment-15270720
 ] 

ASF subversion and git services commented on CLOUDSTACK-9348:
-

Commit 0154da6417ab1dc0fa9719df4543e72ca5f2c178 in cloudstack's branch 
refs/heads/master from [~rohit.ya...@shapeblue.com]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=0154da6 ]

CLOUDSTACK-9348: Unit test to demonstrate denial of service attack

The NioConnection uses blocking handlers for various events such as connect,
accept, read, write. In case a client connects to NioServer (used by the
agent manager to service agents on port 8250) but fails to participate in the SSL
handshake or just sits idle, this would block the main IO/selector loop in
NioConnection. Such a client could be either malicious or aggressive.

This unit test demonstrates such a malicious client that can perform a
denial-of-service attack on NioServer, blocking it from serving any other client.

Signed-off-by: Rohit Yadav 
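
For readers following along, a hedged sketch of the test idea this commit describes -- well-behaved clients must still be served while idle "malicious" connections are held open. This is not the actual NioTest source (that lives in utils/src/test/java/com/cloud/utils/testcase/NioTest.java); the host, counts and timeouts below are illustrative assumptions:

```
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

import org.junit.Assert;
import org.junit.Test;

public class DosStyleServerTestSketch {

    private static final String HOST = "127.0.0.1"; // assumed test server address
    private static final int PORT = 8250;

    @Test(timeout = 60_000L)
    public void honestClientsStillServedDespiteIdleConnections() throws Exception {
        // "Malicious" clients: connect but never participate in the handshake.
        final List<Socket> idle = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            final Socket s = new Socket();
            s.connect(new InetSocketAddress(HOST, PORT), 5_000);
            idle.add(s);
        }

        // Honest clients must still get their payload through in reasonable time.
        final int honestClients = 5;
        final CountDownLatch done = new CountDownLatch(honestClients);
        for (int i = 0; i < honestClients; i++) {
            new Thread(() -> {
                try (Socket s = new Socket(HOST, PORT)) {
                    final OutputStream out = s.getOutputStream();
                    out.write("known payload".getBytes(StandardCharsets.UTF_8));
                    out.flush();
                    done.countDown();
                } catch (final Exception ignored) {
                    // a failure here simply leaves the latch un-counted
                }
            }).start();
        }

        Assert.assertTrue("honest clients were starved by idle connections",
                done.await(30, TimeUnit.SECONDS));

        for (final Socket s : idle) {
            s.close();
        }
    }
}
```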






[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272518#comment-15272518
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-217188135
  
@rhtyd I am still having problems with the tests in this PR, but now it is 
in master.

This is causing builds to fail...

```
testConnection(com.cloud.utils.testcase.NioTest)  Time elapsed: 300.073 sec 
 <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 30 
milliseconds
at java.lang.Thread.sleep(Native Method)
at com.cloud.utils.testcase.NioTest.testConnection(NioTest.java:146)
```




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272521#comment-15272521
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-217188537
  
Is there a reason we need to spend 5 minutes waiting for this test on every 
build?
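
For reference, a JUnit test timeout is just an annotation value, so tightening it is a one-line change; a sketch with an illustrative value (not the real NioTest constant) is below:

```
import org.junit.Test;

public class NioTestTimeoutSketch {
    // Illustrative value only; the real NioTest uses its own TESTTIMEOUT constant.
    private static final long TEST_TIMEOUT_MS = 60_000L;

    @Test(timeout = TEST_TIMEOUT_MS)
    public void testConnection() throws Exception {
        // drive the clients and wait for completion here; with this value the test
        // fails after one minute instead of blocking the build for five.
    }
}
```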




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272538#comment-15272538
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r62207167
  
--- Diff: utils/src/test/java/com/cloud/utils/testcase/NioTest.java ---
@@ -19,146 +19,208 @@
 
 package com.cloud.utils.testcase;
 
-import java.nio.channels.ClosedChannelException;
-import java.util.Random;
-
-import junit.framework.TestCase;
-
-import org.apache.log4j.Logger;
-import org.junit.Assert;
-
+import com.cloud.utils.concurrency.NamedThreadFactory;
 import com.cloud.utils.exception.NioConnectionException;
 import com.cloud.utils.nio.HandlerFactory;
 import com.cloud.utils.nio.Link;
 import com.cloud.utils.nio.NioClient;
 import com.cloud.utils.nio.NioServer;
 import com.cloud.utils.nio.Task;
 import com.cloud.utils.nio.Task.Type;
+import org.apache.log4j.Logger;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.nio.channels.ClosedChannelException;
+import java.nio.channels.Selector;
+import java.nio.channels.SocketChannel;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
 
 /**
- *
- *
- *
- *
+ * NioTest demonstrates that NioServer can function without getting its 
main IO
+ * loop blocked when an aggressive or malicious client connects to the 
server but
+ * fail to participate in SSL handshake. In this test, we run bunch of 
clients
+ * that send a known payload to the server, to which multiple malicious 
clients
+ * also try to connect and hang.
+ * A malicious client could cause denial-of-service if the server's main 
IO loop
+ * along with SSL handshake was blocking. A passing tests shows that 
NioServer
+ * can still function in case of connection load and that the main IO loop 
along
+ * with SSL handshake is non-blocking with some internal timeout mechanism.
  */
 
-public class NioTest extends TestCase {
+public class NioTest {
+
+private static final Logger LOGGER = Logger.getLogger(NioTest.class);
+
+// Test should fail in due time instead of looping forever
+private static final int TESTTIMEOUT = 30;
 
-private static final Logger s_logger = Logger.getLogger(NioTest.class);
+final private int totalTestCount = 5;
+private int completedTestCount = 0;
 
-private NioServer _server;
-private NioClient _client;
+private NioServer server;
+private List clients = new ArrayList<>();
+private List maliciousClients = new ArrayList<>();
 
-private Link _clientLink;
+private ExecutorService clientExecutor = 
Executors.newFixedThreadPool(totalTestCount, new 
NamedThreadFactory("NioClientHandler"));;
+private ExecutorService maliciousExecutor = 
Executors.newFixedThreadPool(5*totalTestCount, new 
NamedThreadFactory("MaliciousNioClientHandler"));;
 
-private int _testCount;
-private int _completedCount;
+private Random randomGenerator = new Random();
+private byte[] testBytes;
 
 private boolean isTestsDone() {
 boolean result;
 synchronized (this) {
-result = _testCount == _completedCount;
+result = totalTestCount == completedTestCount;
--- End diff --

Isn't this wrong?  

Shouldn't it be: 
```
result = (totalTestCount -1) == completedTestCount;
```

You are only launching `totalTestCount` tests, `0 to totalTestCount-1`.  
`completedTestCount` is also `0` based, so when they all complete it should max 
out at `totalTestCount-1`.

Can you clarify?




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272539#comment-15272539
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r62207228
  

@rhtyd ^



[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272663#comment-15272663
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1493#discussion_r62219725
  

@swill I'll try to reproduce and fix with a patch that reduces the numbers. 
Counting tests from 0 to len-1 still gives a total of `len`, so this is correct. 
Consider: 0 to 4 is `0, 1, 2, 3, 4` --> that's 5 runs/rounds/counts.



[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272691#comment-15272691
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-217220334
  
@rhtyd but `totalTestCount = 5` and I don't think that `completedTestCount` 
will ever be larger than `4`, so I don't know how that check could be right...




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272724#comment-15272724
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


GitHub user rhtyd opened a pull request:

https://github.com/apache/cloudstack/pull/1534

CLOUDSTACK-9348: Optimize NioTest and NioConnection main loop

- Reduces SSL handshake timeout to 15s, previously this was only 10s in
  commit debfcdef788ce0d51be06db0ef10f6815f9b563b
- Adds an aggressive explicit wakeup to save the Nio main IO loop/handler from
  getting blocked
- Fixes NioTest to fail/succeed in about 60s, previously this was 300s
- Due to aggressive wakeup usage, NioTest should complete in less than 5s on most
  systems. On a virtualized environment this may increase slightly due to thread
  and CPU burst/scheduling delays.

/cc @swill  please review and merge.
Sorry about the previous values, they were not optimized for a virtualized 
env. The aggressive selector.wakeup will ensure the main IO loop does not get 
blocked by malicious users, regardless of any timeout (SSL handshake etc.).
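
As an aside, a rough illustration of the explicit-wakeup idea (not the PR's actual code): work scheduled from outside the IO thread is queued, and selector.wakeup() makes a blocked select() return so the main loop can pick it up promptly. Class and method names are made up for the sketch.

```
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class WakeupSketch {
    private final Selector selector;
    private final Queue<SocketChannel> pendingRegistrations = new ConcurrentLinkedQueue<>();

    WakeupSketch(final Selector selector) {
        this.selector = selector;
    }

    // Called from a handshake worker thread once a channel is ready for reads.
    void scheduleRead(final SocketChannel channel) {
        pendingRegistrations.add(channel);
        selector.wakeup(); // unblock select() even if no IO event is pending
    }

    // Called at the top of each pass of the main IO loop, on the selector thread.
    void drainPending() throws IOException {
        SocketChannel channel;
        while ((channel = pendingRegistrations.poll()) != null) {
            channel.register(selector, SelectionKey.OP_READ);
        }
    }
}
```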

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shapeblue/cloudstack niotest-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/cloudstack/pull/1534.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1534


commit ea22869593f68a3a34b12aeb23c2bb6c34efd365
Author: Rohit Yadav 
Date:   2016-05-05T17:49:33Z

CLOUDSTACK-9348: Optimize NioTest and NioConnection main loop

- Reduces SSL handshake timeout to 15s, previously this was only 10s in
  commit debfcdef788ce0d51be06db0ef10f6815f9b563b
- Adds an aggressive explicit wakeup to save the Nio main IO loop/handler from
  getting blocked
- Fixes NioTest to fail/succeed in about 60s, previously this was 300s
- Due to aggressive wakeup usage, NioTest should complete in less than 5s on most
  systems. On a virtualized environment this may increase slightly due to thread
  and CPU burst/scheduling delays.

Signed-off-by: Rohit Yadav 






[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272725#comment-15272725
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-217226167
  
@swill I've run my tests, please review and merge this -- 
https://github.com/apache/cloudstack/pull/1534





[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272727#comment-15272727
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1534#issuecomment-217226501
  
@DaanHoogland @jburwell @wido @swill and others -- please review; this 
mainly fixes the NioTest, which was failing, so if it's okay and works for Travis 
and Will's CI, let's merge this. Thanks.




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272718#comment-15272718
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-217225573
  
@swill I'm pushing a fix for you.
The initial value is 0; as clients send data it's incremented by 1. At the 
end, the total number of payloads sent is expected to match the number received 
by the server. If the test count is 5, then the completed test count is also 5: 
the loop runs 5 clients with indexes/ids 0, 1, 2, 3, 4 -- count the number of 
clients created.
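
For what it's worth, the counting argument can be sanity-checked with a trivial snippet; the values are the ones from this discussion, but the code itself is not from NioTest:

```
public class CountingCheck {
    public static void main(final String[] args) {
        final int totalTestCount = 5;
        int completedTestCount = 0;
        for (int clientId = 0; clientId < totalTestCount; clientId++) { // ids 0..4 -> 5 clients
            completedTestCount++; // one increment per client that finishes
        }
        System.out.println(completedTestCount == totalTestCount); // prints true
    }
}
```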




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272732#comment-15272732
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-217227191
  
In my last run, not a single test passed in the time frame; something is 
wrong.  Previously it was failing at 4/5, but this time it timed out without a 
single one of the 5 tests passing...

```
2016-05-05 19:46:17,659 DEBUG [utils.testcase.NioTest] (Time-limited test:) 
0/5 tests done. Waiting for completion
2016-05-05 19:46:18,660 DEBUG [utils.testcase.NioTest] (Time-limited test:) 
0/5 tests done. Waiting for completion
2016-05-05 19:46:19,660 DEBUG [utils.testcase.NioTest] (Time-limited test:) 
0/5 tests done. Waiting for completion
2016-05-05 19:46:20,367 INFO  [utils.testcase.NioTest] (main:) Clients 
stopped.
2016-05-05 19:46:20,367 INFO  [utils.testcase.NioTest] (main:) Server 
stopped.
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 300.095 sec 
<<< FAILURE! - in com.cloud.utils.testcase.NioTest
testConnection(com.cloud.utils.testcase.NioTest)  Time elapsed: 300.095 sec 
 <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 30 
milliseconds
at java.lang.Thread.sleep(Native Method)
at com.cloud.utils.testcase.NioTest.testConnection(NioTest.java:146)
```




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272733#comment-15272733
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-217227342
  
Thanks, will review...




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272918#comment-15272918
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-217252920
  
@rhtyd I have some bad news on this PR.  I have been having issues in CI 
ever since this got merged into master.  When the tests don't run (and fail, 
which causes the CI run to fail), the DeployDatacenter script will fail.  
It looks like this code is treating the hosts as malicious clients.  We get a 
handshake and then things fail.  We basically get a `Failed to add host` error.

I can get you more details if you need.  

I will test the #1534 PR to see if that fixes things, but I am a bit 
concerned about this PR right now...




[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273025#comment-15273025
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-217270975
  
@swill sure, thanks. Please try with PR #1534 and, if you still hit the issue, 
revert the commit locally first, run against your environment, and confirm that 
it works without the Nio fix (make sure both the mgmt server and the KVM agent 
have both of the PR fixes, or, if you revert, make sure to rebuild the mgmt 
server and KVM agent with the reverted commits). In that case I'll try to 
reproduce and fix the addHost error. 


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273037#comment-15273037
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-217272766
  
I confirmed that reverting this PR locally does fix my DeployDatacenter 
issues.

I did an initial test with #1534; it got past the DeployDatacenter phase and 
started testing, but it did not run the Nio tests (apparently that test only 
runs sometimes?).  I stopped that run, cleaned everything up, and am running CI 
against #1534 again to see if I can get it to run the tests (and pass) and also 
come back with a clean CI run.  I will update that PR with the status later 
tonight.

Thanks for looking into this quickly to unblock our ability to do CI.  👍 


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273081#comment-15273081
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1534#issuecomment-217278185
  
Well the test runs a lot faster/cleaner now.  👍 

```
Running com.cloud.utils.testcase.NioTest
2016-05-05 22:53:54,828 INFO  [utils.testcase.NioTest] (main:) Setting up Benchmark Test
2016-05-05 22:53:54,861 INFO  [utils.nio.NioServer] (main:) NioConnection started and listening on /0:0:0:0:0:0:0:0:41317
2016-05-05 22:53:54,874 INFO  [utils.testcase.NioTest] (MaliciousNioClientHandler-1:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,874 INFO  [utils.testcase.NioTest] (MaliciousNioClientHandler-3:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,874 INFO  [utils.testcase.NioTest] (MaliciousNioClientHandler-4:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,882 DEBUG [utils.testcase.NioTest] (Time-limited test:) 0/4 tests done. Waiting for completion
2016-05-05 22:53:54,875 INFO  [utils.testcase.NioTest] (MaliciousNioClientHandler-2:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,885 INFO  [utils.nio.NioClient] (NioClientHandler-4:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,884 INFO  [utils.nio.NioClient] (NioClientHandler-3:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,878 INFO  [utils.nio.NioClient] (NioClientHandler-2:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,877 INFO  [utils.nio.NioClient] (NioClientHandler-1:) Connecting to 127.0.0.1:41317
2016-05-05 22:53:54,899 DEBUG [utils.crypt.EncryptionSecretKeyChecker] (pool-1-thread-1:) Encryption Type: null
2016-05-05 22:53:54,902 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,039 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,045 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,054 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,112 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,119 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,126 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,145 WARN  [utils.nio.Link] (pool-1-thread-1:) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
2016-05-05 22:53:55,886 DEBUG [utils.testcase.NioTest] (Time-limited test:) 0/4 tests done. Waiting for completion
2016-05-05 22:53:56,152 INFO  [utils.nio.NioClient] (NioClientHandler-3:) SSL: Handshake done
2016-05-05 22:53:56,152 INFO  [utils.nio.NioClient] (NioClientHandler-3:) Connected to 127.0.0.1:41317
2016-05-05 22:53:56,198 INFO  [utils.testcase.NioTest] (NioTestClient-2-Handler-1:) Client: Received CONNECT task
2016-05-05 22:53:56,258 INFO  [utils.testcase.NioTest] (NioTestClient-2-Handler-1:) Sending data to server
2016-05-05 22:53:56,236 INFO  [utils.nio.NioClient] (NioClientHandler-1:) SSL: Handshake done
2016-05-05 22:53:56,259 INFO  [utils.nio.NioClient] (NioClientHandler-1:) Connected to 127.0.0.1:41317
2016-05-05 22:53:56,232 INFO  [utils.nio.NioClient] (NioClientHandler-4:) SSL: Handshake done
2016-05-05 22:53:56,260 INFO  [utils.nio.NioClient] (NioClientHandler-4:) Connected to 127.0.0.1:41317
2016-05-05 22:53:56,225 INFO  [utils.nio.NioClient] (NioClientHandler-2:) SSL: Handshake done
2016-05-05 22:53:56,260 INFO  [utils.nio.NioClient] (NioClientHandler-2:) Connected to 127.0.0.1:41317
2016-05-05 22:53:56,260 INFO  [utils.testcase.NioTest] (NioTestClient-0-Handler-1:) Client: Received CONNECT task
2016-05-05 22:53:56,285 INFO  [utils.testcase.NioTest] (NioTestClient-0-Handler-1:) Sending data to server
2016-05-05 22:53:56,285 INFO  [utils.testcase.NioTest] (NioTestClient-3-Handler-1:) Client: Received CONNECT task
2016-05-05 22:53:56,331 INFO  [utils.testcase.NioTest] (NioTestClient-3-Handler-1:) Sending data to server
2016-05-05 22:53:56,284 INFO  [utils.testcase.NioTest] (NioTestClient-1-Handler-1:) Client: Received CONNECT task
2016-05-05 22:53:56,368 INFO  [utils.testcase.NioTest] (NioTestClient-1-Handler-1:) Sending data to server
2016-05-05 22:53:56,286 INFO  [utils.testcase.NioTes

[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273133#comment-15273133
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1493#issuecomment-217286830
  
@swill the NioTest is a unit test and only runs as part of the build. If you hit 
any issues, feel free to revert the commit and share some details on how I may 
be able to reproduce and fix them. Thanks.


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273138#comment-15273138
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1534#issuecomment-217287468
  
@swill thanks for sharing. I made the NioConnection main IO loop more aggressive 
and reduced the SSL handshake timeout to 15s (it was originally 10s, but over 
the last year, since we did not know the root cause or the details, I had 
increased it to 60s in the Link class instead of fixing the core issue). This 
sort of optimization helps CloudStack reconnect and handle clients quickly; even 
thousands of malicious clients won't be able to block the main IO loop from 
handling other requests.
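
For illustration, a hedged sketch of how a handshake might be bounded by such a timeout so that a slow or malicious client cannot stall connection handling; the 15s value mirrors the comment above, and the class and method names are hypothetical rather than the actual Link/NioConnection code:

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HandshakeWithTimeoutSketch {

    private static final long HANDSHAKE_TIMEOUT_SECONDS = 15; // value taken from the comment above
    private final ExecutorService sslPool = Executors.newCachedThreadPool();

    public boolean handshake(SocketChannel channel) {
        Future<Boolean> result = sslPool.submit(() -> performSslHandshake(channel));
        try {
            // Only the worker waits here; the selector thread is never blocked.
            return result.get(HANDSHAKE_TIMEOUT_SECONDS, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            result.cancel(true); // give up on slow or malicious clients
            closeQuietly(channel);
            return false;
        } catch (Exception e) {
            closeQuietly(channel);
            return false;
        }
    }

    // After a successful handshake the channel is registered for reads; the
    // explicit wakeup keeps the main select() loop from sleeping through it.
    // Production code often queues registrations for the selector thread
    // instead of registering from a worker thread directly.
    private void registerForRead(Selector selector, SocketChannel channel) throws IOException {
        selector.wakeup();
        channel.register(selector, SelectionKey.OP_READ);
    }

    private boolean performSslHandshake(SocketChannel channel) {
        return true; // placeholder for an SSLEngine wrap/unwrap loop
    }

    private void closeQuietly(SocketChannel channel) {
        try {
            channel.close();
        } catch (Exception ignored) {
        }
    }
}
```

Pairing the timeout with an explicit selector wakeup keeps the main IO loop responsive when worker threads hand channels back for registration.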


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273922#comment-15273922
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user rhtyd commented on the pull request:

https://github.com/apache/cloudstack/pull/1534#issuecomment-217413262
  
@swill this PR is ready for a CI test run and merge, thanks


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274027#comment-15274027
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user jburwell commented on the pull request:

https://github.com/apache/cloudstack/pull/1534#issuecomment-217442413
  
LGTM based on code review


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274498#comment-15274498
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1534#issuecomment-217519767
  


### CI RESULTS

```
Tests Run: 85
  Skipped: 0
   Failed: 2
   Errors: 0
 Duration: 4h 31m 04s
```

**Summary of the problem(s):**
```
FAIL: Test redundant router internals
--
Traceback (most recent call last):
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_routers_network_ops.py", line 290, in test_01_RVR_Network_FW_PF_SSH_default_routes_egress_true
    "Attempt to retrieve google.com index page should be successful!"
AssertionError: Attempt to retrieve google.com index page should be successful!
--
Additional details in: /tmp/MarvinLogs/test_network_C0JJZR/results.txt
```

```
FAIL: test_02_vpc_privategw_static_routes (integration.smoke.test_privategw_acl.TestPrivateGwACL)
--
Traceback (most recent call last):
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_privategw_acl.py", line 253, in test_02_vpc_privategw_static_routes
    self.performVPCTests(vpc_off)
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_privategw_acl.py", line 304, in performVPCTests
    privateGw_1 = self.createPvtGw(vpc_1, "10.0.3.100", "10.0.3.101", acl1.id, vlan_1)
  File "/data/git/cs1/cloudstack/test/integration/smoke/test_privategw_acl.py", line 472, in createPvtGw
    self.fail("Failed to create Private Gateway ==> %s" % e)
AssertionError: Failed to create Private Gateway ==> Execute cmd: createprivategateway failed, due to: errorCode: 431, errorText:Network with vlan vlan://100 already exists in zone 1
--
Additional details in: /tmp/MarvinLogs/test_network_C0JJZR/results.txt
```



**Associated Uploads**

**`/tmp/MarvinLogs/DeployDataCenter__May_06_2016_15_29_07_FHJDSL:`**
* [dc_entries.obj](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/DeployDataCenter__May_06_2016_15_29_07_FHJDSL/dc_entries.obj)
* [failed_plus_exceptions.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/DeployDataCenter__May_06_2016_15_29_07_FHJDSL/failed_plus_exceptions.txt)
* [runinfo.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/DeployDataCenter__May_06_2016_15_29_07_FHJDSL/runinfo.txt)

**`/tmp/MarvinLogs/test_network_C0JJZR:`**
* [failed_plus_exceptions.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/test_network_C0JJZR/failed_plus_exceptions.txt)
* [results.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/test_network_C0JJZR/results.txt)
* [runinfo.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/test_network_C0JJZR/runinfo.txt)

**`/tmp/MarvinLogs/test_vpc_routers_7C30C4:`**
* [failed_plus_exceptions.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/test_vpc_routers_7C30C4/failed_plus_exceptions.txt)
* [results.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/test_vpc_routers_7C30C4/results.txt)
* [runinfo.txt](https://objects-east.cloud.ca/v1/e465abe2f9ae4478b9fff416eab61bd9/PR1534/tmp/MarvinLogs/test_vpc_routers_7C30C4/runinfo.txt)


Uploads will be available until `2016-07-06 02:00:00 +0200 CEST`

*Comment created by [`upr comment`](https://github.com/cloudops/upr).*



[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274509#comment-15274509
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9348:


Github user swill commented on the pull request:

https://github.com/apache/cloudstack/pull/1534#issuecomment-217521378
  
I am not concerned about the two failures.  One happens randomly in my 
environment and one is a cleanup issue between test runs which is not related 
to this PR.

Since `master` is currently broken due to some issues with #1493, I am 
going to merge this right away...


[jira] [Commented] (CLOUDSTACK-9348) CloudStack Server degrades when a lot of connections on port 8250

2016-05-06 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274517#comment-15274517
 ] 

ASF subversion and git services commented on CLOUDSTACK-9348:
-

Commit 9f970f28b18534dffe33196ead60ea861f501fa9 in cloudstack's branch 
refs/heads/master from [~williamstev...@gmail.com]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=9f970f2 ]

Merge pull request #1534 from shapeblue/niotest-fix

CLOUDSTACK-9348: Optimize NioTest and NioConnection main loop

- Reduces the SSL handshake timeout to 15s; previously this was only 10s in
  commit debfcdef788ce0d51be06db0ef10f6815f9b563b
- Adds an aggressive explicit wakeup to save the Nio main IO loop/handler from
  getting blocked
- Fixes NioTest to fail/succeed in about 60s; previously this was 300s
- Due to the aggressive wakeup usage, NioTest should complete in less than 5s on
  most systems. On virtualized environments this may increase slightly due to
  thread and CPU burst/scheduling delays.

/cc @swill please review and merge.
Sorry about the previous values; they were not optimized for virtualized 
environments. The aggressive selector.wakeup() ensures the main IO loop does not 
get blocked by malicious users, regardless of any timeout (SSL handshake etc.).

* pr/1534:
  CLOUDSTACK-9348: Optimize NioTest and NioConnection main loop

Signed-off-by: Will Stevens 
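
For illustration, a hedged JUnit 4 sketch of the time-bounding idea in the bullets above (succeed as soon as all clients report completion, fail after roughly 60 seconds, instead of always waiting out a long fixed sleep); the test and helper names are hypothetical and this is not the real NioTest:

```java
import static org.junit.Assert.assertTrue;

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

import org.junit.Test;

public class TimeBoundedNioTestSketch {

    private static final int CLIENT_COUNT = 4;
    private final CountDownLatch completed = new CountDownLatch(CLIENT_COUNT);

    // Client handlers call this once their send/receive round trip is done.
    void onClientDone() {
        completed.countDown();
    }

    @Test
    public void clientsFinishWithinDeadline() throws InterruptedException {
        startServerAndClients(); // hypothetical setup helper
        // Returns as soon as all clients are done; fails after ~60s at most,
        // rather than sleeping for a fixed 300s regardless of progress.
        assertTrue("clients did not complete in time",
                completed.await(60, TimeUnit.SECONDS));
    }

    private void startServerAndClients() {
        for (int i = 0; i < CLIENT_COUNT; i++) {
            new Thread(() -> {
                // Placeholder for connect + SSL handshake + data exchange.
                onClientDone();
            }, "client-" + i).start();
        }
    }
}
```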

