We have done more investigations. An error happens when the script
/usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh is
executed:
MGMT LOGS:
2016-08-29 16:00:11,342 DEBUG [c.c.a.t.Request]
(AgentManager-Handler-12:null) (logid:) Seq 8-1837187172990452475:
Processing: { Ans: , MgmtId: 90520741415395, via: 8, Ver: v1, Flags:
10,
[{"com.cloud.agent.api.GetRouterAlertsAnswer":{"result":false,"details":"/usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh:
1: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh:
h!/bin/bash: not found","wait":0}}] }
2016-08-29 16:00:11,345 DEBUG [c.c.a.t.Request]
(RouterStatusMonitor-1:ctx-9cdc1e33) (logid:5f04bf82) Seq
8-1837187172990452475: Received: { Ans: , MgmtId: 90520741415395, via:
8(kvm702), Ver: v1, Flags: 10, { GetRouterAlertsAnswer } }
*2016-08-29 16:00:11,345 DEBUG [c.c.a.m.AgentManagerImpl]
(RouterStatusMonitor-1:ctx-9cdc1e33) (logid:5f04bf82) Details from
executing class com.cloud.agent.api.routing.GetRouterAlertsCommand:
/usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: 1:
/usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh:
h!/bin/bash: not found*
2016-08-29 16:00:11,345 WARN
[c.c.n.r.VirtualNetworkApplianceManagerImpl]
(RouterStatusMonitor-1:ctx-9cdc1e33) (logid:5f04bf82) Unable to get
alerts from router r-221-VM
/usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: 1:
/usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh:
h!/bin/bash: not found
2016-08-29 16:00:11,411 INFO [o.a.c.e.o.NetworkOrchestrator]
(Network-Scavenger-1:ctx-64537bea) (logid:0c5d8104)
NetworkGarbageCollector uses '600' seconds for GC interval.
2016-08-29 16:00:11,416 DEBUG [o.a.c.e.o.NetworkOrchestrator]
(Network-Scavenger-1:ctx-64537bea) (logid:0c5d8104) We found network 219
to be free for the first time. Adding it to the list: 1472479211
2016-08-29 16:00:12,392 DEBUG [c.c.s.StatsCollector]
(StatsCollector-6:ctx-c1aeb528) (logid:a364cc05) StorageCollector is
running...
2016-08-29 16:00:12,403 DEBUG [c.c.h.o.r.Ovm3HypervisorGuru]
(StatsCollector-6:ctx-c1aeb528) (logid:a364cc05)
getCommandHostDelegation: class com.cloud.agent.api.GetStorageStatsCommand
For debugging we modified the "router_proxy.sh" slightly on the KVM Host:
kvm# grep -A 2 DEBUG
/usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh -n
46:# DEBUG CS Upgrade
*47-echo "$(date): ssh -p 3922 -q -o StrictHostKeyChecking=no -i $cert
root@$domRIp /opt/cloud/bin/$script $*" >> /tmp/router.txt*
48-ssh -p 3922 -q -o StrictHostKeyChecking=no -i $cert root@$domRIp
"/opt/cloud/bin/$script $*"
kvm# tail /tmp/router.txt
Mon Aug 29 16:10:11 CEST 2016: ssh -p 3922 -q -o
StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud [email protected]
/opt/cloud/bin/netusage.sh -g
Mon Aug 29 16:10:11 CEST 2016: ssh -p 3922 -q -o
StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud [email protected]
/opt/cloud/bin/netusage.sh -g
On the router with IP 169.254.2.67 we see the failed logins as well:
root@r-221-VM:~# tail -n 2 /var/log/auth.log
Aug 29 14:10:11 r-221-VM sshd[7074]: Connection closed by 169.254.0.1
[preauth]
Aug 29 14:10:11 r-221-VM sshd[7075]: Connection closed by 169.254.0.1
[preauth]
These commands did not succeed. If we execute the command manually we
get prompted for the key passphrase.
kvm# ssh -p 3922 -o StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud
[email protected]
Enter passphrase for key '/root/.ssh/id_rsa.cloud':
can someone confirm that the rsa key is protected with a passhrase? this
looks suspicious to us...
regards
martin
On 08/29/2016 10:09 AM, martin kolly wrote:
> thanks Simon and Rohit for the valuable inputs! After applying the
> following procedure the ssl errors are gone and host state is UP.
>
> # service cloudstack-management stop
>
> mysql> delete from configuration where name = "ssl.keystore" ;
> # mv /etc/cloudstack/management/cloudmanagementserver.keystore
> /etc/cloudstack/management/cloudmanagementserver.keystore.old
>
> # service cloudstack-management start
>
> # file /etc/cloudstack/management/cloudmanagementserver.keystore
> /etc/cloudstack/management/cloudmanagementserver.keystore: Java KeyStore
> # file /etc/cloudstack/management/cloudmanagementserver.keystore.old
> /etc/cloudstack/management/cloudmanagementserver.keystore.old: data
>
> #java -version
> java version "1.7.0_111"
>
> However new routers are not deployed, we still see entries that KVM
> hosts are unreachable (logs underneath).
>
> After starting a new router it is shown with "virsh list" on the KVM:
> kvm# virsh list | grep r-233-VM
> 111 r-233-VM running
>
> After some seconds the router is deleted,
>
> kvm# virsh list | grep r-233-VM
> #
>
> kvm#virsh -v
> 1.2.2
>
> (libvirt was already restarted)
>
>
>
> ############ Logs/Info KVM #################
>
> kvm # dpkg -l | grep cloudstack
> ii cloudstack-agent
> 4.9.0 all CloudStack agent
> ii cloudstack-common
> 4.9.0 all A common package
> which contains files which are shared by several CloudStack packages
>
> kvm# df -kh | grep cloud
> 10.100.12.9:/export/cloud 188G 137G 51G 73%
> /mnt/5db02c19-1e8f-3591-bdb4-02608362521e
>
> kvm# tail -f /var/log/cloudstack/agent/agent.log
> ...
> 2016-08-29 06:55:06.203+0000: 4425: error :
> qemuMonitorFindBalloonObjectPath:1032 : internal error: Cannot determine
> balloon device path
> 2016-08-29 06:55:06.223+0000: 4426: error :
> qemuMonitorFindBalloonObjectPath:1032 : internal error: Cannot determine
> balloon device path
> 2016-08-29 06:55:24.062+0000: 4425: warning : qemuDomainObjTaint:1628 :
> Domain id=112 name='r-233-VM' uuid=d9dcd37a-242d-43ac-a18e-79a4bfa86ebb
> is tainted: high-privileges
> 2016-08-29 06:55:49.066+0000: 4423: error : qemuMonitorIO:656 : internal
> error: End of file from monitor
>
>
>
>
> ############ Logs/Info Management Server ########
>
> # dpkg -l | grep cloudstack
> ii cloudstack-agent
> 4.9.0 all CloudStack agent
> ii cloudstack-common
> 4.9.0 all A common package
> which contains files which are shared by several CloudStack packages
> ii cloudstack-management
> 4.9.0 all CloudStack server library
> ii cloudstack-usage
> 4.9.0 all CloudStack usage monitor
>
> # mysql -u root cloud -e "select id,name,path from cloud.storage_pool
> where pool_type='Filesystem'"
> +----+----------------------+-------------------------+
> | id | name | path |
> +----+----------------------+-------------------------+
> | 1 | kvm704 Local Storage | /var/lib/libvirt/images |
> | 2 | kvm701 Local Storage | /var/lib/libvirt/images |
> | 3 | kvm702 Local Storage | /var/lib/libvirt/images |
> | 4 | kvm703 Local Storage | /var/lib/libvirt/images |
> +----+----------------------+-------------------------+
>
> #tail -f /var/log/cloudstack/management/management-server.log | grep -v
> DEBUG
> ...
> 2016-08-29 08:55:42,484 WARN [o.a.c.alerts]
> (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a)
> (logid:337b354e) alertType:: 9 // dataCenterId:: 1 // podId:: 1 //
> clusterId:: null // message:: Command:
> com.cloud.agent.api.GetDomRVersionCommand failed while starting virtual
> router
> 2016-08-29 08:55:42,494 ERROR
> [c.c.n.r.VirtualNetworkApplianceManagerImpl]
> (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a)
> (logid:337b354e) GetDomRVersionCmd failed
> 2016-08-29 08:55:42,495 WARN
> [c.c.n.r.VirtualNetworkApplianceManagerImpl]
> (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a)
> (logid:337b354e) Command: com.cloud.agent.api.GetDomRVersionCommand
> failed while starting virtual router
> 2016-08-29 08:55:42,495 INFO [c.c.v.VirtualMachineManagerImpl]
> (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a)
> (logid:337b354e) The guru did not like the answers so stopping
> VM[DomainRouter|r-233-VM]
> 2016-08-29 08:55:49,184 INFO [o.a.c.f.j.i.AsyncJobManagerImpl]
> (AsyncJobMgr-Heartbeat-1:ctx-5bd7c179) (logid:86c0376e) Begin cleanup
> expired async-jobs
> 2016-08-29 08:55:49,192 INFO [o.a.c.f.j.i.AsyncJobManagerImpl]
> (AsyncJobMgr-Heartbeat-1:ctx-5bd7c179) (logid:86c0376e) End cleanup
> expired async-jobs
> 2016-08-29 08:55:50,861 ERROR [c.c.v.VirtualMachineManagerImpl]
> (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a)
> (logid:337b354e) Failed to start instance VM[DomainRouter|r-233-VM]
> com.cloud.utils.exception.ExecutionException: Unable to start
> VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart,
> not retrying
> at
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1084)
> at
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592)
> at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
> at
> com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4753)
> at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102)
> at
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554)
> at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> at
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 2016-08-29 08:55:51,136 ERROR [c.c.v.VmWorkJobHandlerProxy]
> (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a)
> (logid:337b354e) Invocation exception, caused by:
> com.cloud.exception.AgentUnavailableException: Resource [Host:8] is
> unreachable: Host 8: Unable to start instance due to Unable to start
> VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart,
> not retrying
> 2016-08-29 08:55:51,136 INFO [c.c.v.VmWorkJobHandlerProxy]
> (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a)
> (logid:337b354e) Rethrow exception
> com.cloud.exception.AgentUnavailableException: Resource [Host:8] is
> unreachable: Host 8: Unable to start instance due to Unable to start
> VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart,
> not retrying
> 2016-08-29 08:55:51,137 ERROR [c.c.v.VmWorkJobDispatcher]
> (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739) (logid:337b354e)
> Unable to complete AsyncJobVO {id:2739, userId: 2, accountId: 2,
> instanceType: null, instanceId: null, cmd: com.cloud.vm.VmWorkStart,
> cmdInfo:
> rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAAAAAAAAAACAAAAAAAAAAIAAAAAAAAA6XQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAAAAAADHcIAAAAEAAAAAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw,
> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> result: null, initMsid: 90520741415395, completeMsid: null, lastUpdated:
> null, lastPolled: null, created: Mon Aug 29 08:55:21 CEST 2016}, job
> origin:2738
> com.cloud.exception.AgentUnavailableException: Resource [Host:8] is
> unreachable: Host 8: Unable to start instance due to Unable to start
> VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart,
> not retrying
> at
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1120)
> at
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592)
> at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
> at
> com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4753)
> at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102)
> at
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554)
> at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> at
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: com.cloud.utils.exception.ExecutionException: Unable to
> start VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in
> finalizeStart, not retrying
> at
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1084)
> at
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592)
> at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source)
> ... 17 more
> 2016-08-29 08:55:51,168 INFO [o.a.c.f.j.i.AsyncJobMonitor]
> (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739) (logid:337b354e)
> Remove job-2739 from job monitoring
> 2016-08-29 08:55:53,197 INFO [o.a.c.f.j.i.AsyncJobMonitor]
> (Work-Job-Executor-113:ctx-6411b64b job-2738/job-2740) (logid:db064d99)
> Add job-2740 into job monitoring
> 2016-08-29 08:55:53,218 INFO [o.a.c.f.j.i.AsyncJobMonitor]
> (Work-Job-Executor-113:ctx-6411b64b job-2738/job-2740) (logid:337b354e)
> Remove job-2740 from job monitoring
> 2016-08-29 08:55:53,247 ERROR [c.c.a.ApiAsyncJobDispatcher]
> (API-Job-Executor-68:ctx-d2eb7557 job-2738) (logid:337b354e) Unexpected
> exception while executing
> org.apache.cloudstack.api.command.admin.router.StartRouterCmd
> com.cloud.exception.AgentUnavailableException: Resource [Host:8] is
> unreachable: Host 8: Unable to start instance due to Unable to start
> VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart,
> not retrying
> at
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1120)
> at
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592)
> at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
> at
> com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4753)
> at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102)
> at
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554)
> at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> at
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: com.cloud.utils.exception.ExecutionException: Unable to
> start VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in
> finalizeStart, not retrying
> at
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1084)
> at
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592)
> at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source)
> ... 17 more
> 2016-08-29 08:55:53,256 INFO [o.a.c.f.j.i.AsyncJobMonitor]
> (API-Job-Executor-68:ctx-d2eb7557 job-2738) (logid:337b354e) Remove
> job-2738 from job monitoring
>
>
>
>
> regards
> Martin
>
>
>
>
> On 08/26/2016 08:47 PM, Rohit Yadav wrote:
>> Hi Martin,
>>
>>
>> I checked my openssl connect output and my management server on port 8250
>> was indeed returning valid certificates. On further investigation, when
>> connections to management server (NioServer, handled by Link class that
>> reads /etc/cloudstack/management/cloudmanagementserver.keystore) are
>> received it uses a keystore file
>> (/etc/cloudstack/management/cloudmanagementserver.keystore). Please check if
>> that's valid, or backup your keystore file and do what Simon has suggested.
>>
>>
>>
>> For reference, here the output from my 4.9.0 mgmt server running on Ubuntu
>> 14.04.4:
>>
>>
>> $ openssl s_client -tls1 -connect 192.168.1.11:8250
>> CONNECTED(00000003)
>> depth=0 C = Unknown, O = bluebox, OU = bluebox, CN = Cloudstack User
>> verify error:num=18:self signed certificate
>> verify return:1
>> depth=0 C = Unknown, O = bluebox, OU = bluebox, CN = Cloudstack User
>> verify return:1
>> ---
>> Certificate chain
>> 0 s:/C=Unknown/O=bluebox/OU=bluebox/CN=Cloudstack User
>> i:/C=Unknown/O=bluebox/OU=bluebox/CN=Cloudstack User
>> ---
>> Server certificate
>> -----BEGIN CERTIFICATE-----
>> MIIDPzCCAiegAwIBAgIEFHHX2DANBgkqhkiG9w0BAQsFADBQMRAwDgYDVQQGEwdV
>> bmtub3duMRAwDgYDVQQKEwdibHVlYm94MRAwDgYDVQQLEwdibHVlYm94MRgwFgYD
>> VQQDEw9DbG91ZHN0YWNrIFVzZXIwHhcNMTUwNTA2MTM0MjMzWhcNMjUwNTAzMTM0
>> MjMzWjBQMRAwDgYDVQQGEwdVbmtub3duMRAwDgYDVQQKEwdibHVlYm94MRAwDgYD
>> VQQLEwdibHVlYm94MRgwFgYDVQQDEw9DbG91ZHN0YWNrIFVzZXIwggEiMA0GCSqG
>> SIb3DQEBAQUAA4IBDwAwggEKAoIBAQC0wN8kfOJMzwlbrOnBj/jjvjjIwDVpYLtH
>> WoKkNB+rzzKEUxaYwaQxe6E3M536ZuqcaJBIqcYPwTIkWyulvuHuJpSQak4VbuDV
>> f7dqt5RacLFT0jUciqTvL5QDCrk0uNugKkWgEvNtokVGSBwLPVEcdcGWpku1EpeH
>> vMYmpOkcWgbC8Z9D7QTlVw6oEWbPAtKr+gDrXdOFnpPPI45rteatIIgKm1Q6JjZM
>> qrUKfqt7s8ts6ZgdAN2WmtieSsnUX1su9SJMYg2J8LK7UJeGqiNtE+g944GPqtnW
>> aDldjharq54e79ug2ktxw29I3ulpRD/vgxwZmcPJrePUKwY91KEjAgMBAAGjITAf
>> MB0GA1UdDgQWBBSUJvLY8RL/1fTVqj1rT8136da4yzANBgkqhkiG9w0BAQsFAAOC
>> AQEAgDdFIgLvOH/UgRp2nnFUVcMp+uchSLj8CbCkukJBrUwrmJHp3Os+H1ggk8Vt
>> j3conj06zJBNN/E0J8pcpagE1aR+l4R8WxF3g/Oc7bNyrUlkGSQ82vavg9sEkwHY
>> eQY/4wj8CprICs9JilgZ6keeWNWgAW1goLZSzGVwz5eE0lPuc2Dg3laR5RsuTxie
>> dgQhpbOx3UZun+dhuP5NUHc+KWyrNvSZNN8FruO602KWZwm0Hndl7RVbkNEd0kxq
>> FhFK4Scc2HBrKMUrPTzO1nGCgR1gA015C2MFfmjeW49VTi95WnY8DDG2euUYAtpl
>> lLVuNxJxq7eDJfP/M9kxSIKgrQ==
>> -----END CERTIFICATE-----
>> subject=/C=Unknown/O=bluebox/OU=bluebox/CN=Cloudstack User
>> issuer=/C=Unknown/O=bluebox/OU=bluebox/CN=Cloudstack User
>> ---
>> No client certificate CA names sent
>> Server Temp Key: ECDH, P-256, 256 bits
>> ---
>> SSL handshake has read 1325 bytes and written 331 bytes
>> ---
>> New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-SHA
>> Server public key is 2048 bit
>> Secure Renegotiation IS supported
>> Compression: NONE
>> Expansion: NONE
>> No ALPN negotiated
>> SSL-Session:
>> Protocol : TLSv1
>> Cipher : ECDHE-RSA-AES256-SHA
>> Session-ID:
>> 57C08E5D841D9B74AB533A0000BDBE4275424640B5B7D1985D438B275598436F
>> Session-ID-ctx:
>> Master-Key:
>> E1DE7D7464766DCECD82088561A7D928519CA20BA91FDF4AA38BB2E51AB7C27D3F1ABA5788ADED2AD49FE8597EC4A344
>> Key-Arg : None
>> PSK identity: None
>> PSK identity hint: None
>> SRP username: None
>> Start Time: 1472237149
>> Timeout : 7200 (sec)
>> Verify return code: 18 (self signed certificate)
>> ---
>>
>>
>>
>>
>> Regards.
>>
>> ________________________________
>> From: martin kolly <[email protected]>
>> Sent: 26 August 2016 20:36:02
>> To: [email protected]
>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601
>>
>>
>> good point, thanks Simon! with openssl we receive a response on port 8250.
>>
>> # telnet 10.100.12.10 8250
>> Trying 10.100.12.10...
>> Connected to 10.100.12.10.
>> Escape character is '^]'.
>> Connection closed by foreign host.
>>
>> # nc -zv 10.100.12.10 8250
>> Connection to 10.100.12.10 8250 port [tcp/*] succeeded!
>>
>> # openssl s_client -tls1 -connect 10.100.12.10:8250
>> CONNECTED(00000003)
>> write:errno=104
>> ---
>> no peer certificate available
>> ---
>> No client certificate CA names sent
>> ---
>> SSL handshake has read 0 bytes and written 0 bytes
>> ---
>> New, (NONE), Cipher is (NONE)
>> Secure Renegotiation IS NOT supported
>> Compression: NONE
>> Expansion: NONE
>> SSL-Session:
>> Protocol : TLSv1
>> Cipher : 0000
>> Session-ID:
>> Session-ID-ctx:
>> Master-Key:
>> Key-Arg : None
>> PSK identity: None
>> PSK identity hint: None
>> SRP username: None
>> Start Time: 1472223447
>> Timeout : 7200 (sec)
>> Verify return code: 0 (ok)
>> ---
>>
>>
>> On 08/26/2016 04:49 PM, Simon Weller wrote:
>>> Martin,
>>>
>>>
>>> Are you able to actually telnet to 8250 from the host to the mgmt server?
>>>
>>>
>>> - Si
>>>
>>>
>>> ________________________________
>>> From: martin kolly <[email protected]>
>>> Sent: Friday, August 26, 2016 9:41 AM
>>> To: [email protected]
>>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601
>>>
>>> Hi Rohit
>>>
>>> We highly appreciate your efforts! Unfortunately it still does not work.
>>> - ulimit is increased on mgmt server
>>> - jar file replaced
>>> - we confirm that cloudstack-agent 4.9.0 is installed
>>>
>>> MGMT Server
>>> # wget
>>> https://github.com/rhtyd/cloudstack/releases/download/4.9.0-nioinbound/cloud-utils-4.9.0.jar
>>> -O cloud-utils-4.9.0.jar.patch
>>> # md5sum cloud-utils-4.9.0.jar.patch
>>> c4496f42cc6741f562ac645c3a3d8a0c cloud-utils-4.9.0.jar.patch
>>> # md5sum
>>> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar
>>> c4496f42cc6741f562ac645c3a3d8a0c
>>> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar
>>>
>>> # ulimit -a
>>> core file size (blocks, -c) 0
>>> data seg size (kbytes, -d) unlimited
>>> scheduling priority (-e) 0
>>> file size (blocks, -f) unlimited
>>> pending signals (-i) 64109
>>> max locked memory (kbytes, -l) 64
>>> max memory size (kbytes, -m) unlimited
>>> open files (-n) 10240
>>> pipe size (512 bytes, -p) 8
>>> POSIX message queues (bytes, -q) 819200
>>> real-time priority (-r) 0
>>> stack size (kbytes, -s) 8192
>>> cpu time (seconds, -t) unlimited
>>> max user processes (-u) 64109
>>> virtual memory (kbytes, -v) unlimited
>>> file locks (-x) unlimited
>>>
>>> KVM Server
>>> # wget
>>> https://github.com/rhtyd/cloudstack/releases/download/4.9.0-nioinbound/cloud-utils-4.9.0.jar
>>> -O cloud-utils-4.9.0.jar.patch
>>> # md5sum cloud-utils-4.9.0.jar.patch
>>> c4496f42cc6741f562ac645c3a3d8a0c cloud-utils-4.9.0.jar.patch
>>> # md5sum /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar
>>> c4496f42cc6741f562ac645c3a3d8a0c
>>> /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar
>>>
>>> # apt-cache policy cloudstack-agent
>>> cloudstack-agent:
>>> Installed: 4.9.0
>>> Candidate: 4.9.0
>>> Version table:
>>> *** 4.9.0 0
>>> 500 http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/
>>> ./ Packages
>>> 100 /var/lib/dpkg/status
>>>
>>> The logs are attached. By the way: the error message "Caught the Exception
>>> in VmIpFetchTask" was already there with 4.8 release.
>>>
>>> Thanks
>>> Martin
>>>
>>> On 08/25/2016 07:44 PM, Rohit Yadav wrote:
>>>
>>> Hi Martin,
>>>
>>>
>>> Thanks for sharing. Alright, I'm not sure what's causing issue but based on
>>> the logs seems like only KVM agents are having issues while connecting to
>>> mgmt server as I don't see any Nio related exceptions in the management
>>> server logs.
>>>
>>>
>>> I could not see the cloudstack-agent version in the logs, I'm assuming that
>>> they were all upgraded to 4.9.0, and there are no conflicting jars at
>>> /usr/share/cloudstack-agent/lib.
>>>
>>>
>>> First, can you make sure mgmt server has enough ulimit. I found that
>>> Ubuntu/Debian's init.d script don't override this while CentOS
>>> initd/systemd script sets ulimit. On your mgmt server, edit
>>> /etc/init.d/cloudstack-management and add ulimit -n 10240 just before the
>>> mgmt server is started in the 'state' section (for me it was at around line
>>> #147 where it logs a message that it's starting the cloudstack-management
>>> server).
>>>
>>>
>>> Next, if this still does not solve the issue -- I created a special
>>> cloud-utils.jar for you that you need to place on your mgmt server and on
>>> the KVM agents and restart the mgmt server. This will increase verbosity of
>>> the error while reduce the Nio polling loop timeout (from 100ms to 10ms).
>>> On KVM agents, the error from the logs is that during SSL handshake inbound
>>> connection/stream gets closed, and we want to know the exception message.
>>> Please get the jar from here:
>>>
>>> https://github.com/rhtyd/cloudstack/releases/tag/4.9.0-nioinbound and place
>>> them at:
>>>
>>> /usr/share/cloudstack-agent/lib/ (on kvm host)
>>>
>>> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/ (on mgmt
>>> server host)
>>>
>>>
>>> Let me know what worked for you, and if it still failed can you share the
>>> mgmt server and agent logs once again. Thanks.
>>>
>>>
>>> Regards.
>>>
>>> ________________________________
>>> From: martin kolly
>>> <[email protected]><mailto:[email protected]>
>>> Sent: 25 August 2016 20:50:08
>>> To: [email protected]<mailto:[email protected]>
>>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601
>>>
>>> Hi Rohit
>>>
>>> We are running java version 1.7.0.111 on KVM and management server.
>>> mgmt# java -version
>>> java version "1.7.0_111"
>>> kvm# java -version
>>> java version "1.7.0_111"
>>>
>>> We get the same error message. Attached are the logs with TRACE enabled.
>>>
>>> "success consists of going from failure to failure without loss of
>>> enthusiasm."
>>>
>>> regards
>>> martin
>>>
>>> On 08/25/2016 02:02 PM, Rohit Yadav wrote:
>>>
>>> Hi Martin,
>>>
>>>
>>> Thanks for sharing, on the surface there does not seem to be any issue in
>>> configuration causing the failures. I'm personally running KVM and Ubuntu
>>> hosts based env without issues, I'm on Ubuntu 14.04.4 (Linux bluebox
>>> 3.16.0-45-generic #60~14.04.1-Ubuntu) and java 1.7.0_79. Can you try
>>> upgrading your JRE7 to latest (openjdk-7-jre, 7u111-2.6.7-0ubuntu0.14.04.3)
>>> on all mgmt server and kvm hosts?
>>>
>>>
>>> If upgrading your JRE does not help, can you increase the logging verbosity
>>> for both the agent and management server (in /etc/cloudstack/{agent,
>>> management} there would be a log4j file, edit that and replace DEBUG/INFO
>>> with TRACE for class/keys com.cloud and org.apache.cloudstack) and re-share
>>> logs when the failures occur? I want to see what additional information we
>>> can get from logs when it tries to connect to host 10.100.12.10 on port:
>>> 8250.
>>>
>>>
>>> Regards.
>>>
>>> ________________________________
>>> From: martin kolly
>>> <[email protected]><mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>
>>> Sent: 25 August 2016 17:11:06
>>> To:
>>> [email protected]<mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>
>>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601
>>>
>>>
>>> @Simon: We have one management server with local DB. KVMs connect
>>> directly to the management server without any security/loadbalancing
>>> device.
>>>
>>> Thanks
>>> Martin
>>>
>>> On 08/25/2016 12:41 PM, Simon Weller wrote:
>>>
>>>
>>> Martin,
>>>
>>> Can you provide more detail about your haproxy setup?
>>> Are you running it on separate servers, or on the management server itself?
>>>
>>> - Si
>>>
>>> Simon Weller/ENA
>>> (615) 312-6068
>>>
>>> [email protected]<mailto:[email protected]>
>>> www.shapeblue.com<http://www.shapeblue.com>
>>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK
>>> @shapeblue
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: martin kolly
>>> [[email protected]<mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>]
>>> Received: Thursday, 25 Aug 2016, 5:04AM
>>> To: Rohit Yadav
>>> [[email protected]<mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>];
>>>
>>> [email protected]<mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>
>>>
>>> [[email protected]<mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>]
>>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601
>>>
>>>
>>> thanks for your reply.
>>>
>>> This morning we repeated the upgrade process from 4.8 to 4.9 with the
>>> following repository:
>>> http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/.
>>> <http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/><http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/><http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/><http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/>
>>> Unfortunately we run into the same issue:
>>>
>>> /2016-08-25 09:49:00,660 INFO [utils.nio.NioClient] (main:null)
>>> (logid:) Connecting to 10.100.12.10:8250//
>>> //2016-08-25 09:49:00,668 WARN [utils.nio.Link] (main:null) (logid:)
>>> This SSL engine was forced to close inbound due to end of stream.//
>>> //2016-08-25 09:49:00,668 ERROR [utils.nio.NioClient] (main:null)
>>> (logid:) SSL Handshake failed while connecting to host: 10.100.12.10
>>> port: 8250//
>>> //2016-08-25 09:49:00,668 ERROR [utils.nio.NioConnection] (main:null)
>>> (logid:) Unable to initialize the threads.//
>>> //java.io.IOException: SSL Handshake failed while connecting to host:
>>> 10.100.12.10 port: 8250//
>>> // at com.cloud.utils.nio.NioClient.init(NioClient.java:67)//
>>> // at com.cloud.utils.nio.NioConnection.start(NioConnection.java:88)//
>>> // at com.cloud.agent.Agent.start(Agent.java:237)//
>>> // at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:399)//
>>> // at
>>> com.cloud.agent.AgentShell.launchAgentFromClassInfo(AgentShell.java:367)//
>>> // at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:351)//
>>> // at com.cloud.agent.AgentShell.start(AgentShell.java:456)//
>>> // at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)//
>>> // at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)//
>>> // at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)//
>>> // at java.lang.reflect.Method.invoke(Method.java:606)//
>>> // at
>>> org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)//
>>> //2016-08-25 09:49:00,669 INFO [utils.exception.CSExceptionErrorCode]
>>> (main:null) (logid:) Could not find exception:
>>> com.cloud.utils.exception.NioConnectionException in error code list for
>>> exceptions//
>>> //2016-08-25 09:49:00,669 WARN [cloud.agent.Agent] (main:null) (logid:)
>>> NIO Connection Exception
>>> com.cloud.utils.exception.NioConnectionException: SSL Handshake failed
>>> while connecting to host: 10.100.12.10 port: 8250//
>>> //2016-08-25 09:49:00,670 INFO [cloud.agent.Agent] (main:null) (logid:)
>>> Attempted to connect to the server, but received an unexpected
>>> exception, trying again.../
>>>
>>> *KVM Hosts:
>>> */# java -version
>>> java version "1.7.0_95"
>>> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
>>> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
>>>
>>> # dpkg --get-selections | grep -e 'jdk' -e 'java'
>>> ca-certificates-java install
>>> java-common install
>>> libcommons-daemon-java install
>>> openjdk-7-jre-headless:amd64 install
>>> tzdata-java install
>>>
>>> # apt-cache policy cloudstack-agent
>>> cloudstack-agent:
>>> Installed: 4.9.0
>>> Candidate: 4.9.0
>>> Version table:
>>> *** 4.9.0 0
>>> 500
>>> http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/ ./ Packages
>>> 100 /var/lib/dpkg/status
>>>
>>> # find /usr/share/ -name "cloud-utils*.jar"
>>> /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar
>>> # md5sum /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar
>>> a8de7306d7c80b5a73e93b83afdd119f
>>> /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar
>>>
>>>
>>> /*Management Server:
>>> */# java -version//
>>> //java version "1.7.0_95"//
>>> //OpenJDK Runtime Environment (IcedTea 2.6.4)
>>> (7u95-2.6.4-0ubuntu0.14.04.1)//
>>> //OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)//
>>> //
>>> //# dpkg --get-selections | grep -e 'jdk' -e 'java'//
>>> //ca-certificates-java install//
>>> //java-common install//
>>> //libcommons-collections3-java install//
>>> //libcommons-daemon-java install//
>>> //libcommons-dbcp-java install//
>>> //libcommons-pool-java install//
>>> //libecj-java install//
>>> //libgeronimo-jta-1.1-spec-java install//
>>> //libmysql-java install//
>>> //libservlet2.5-java install//
>>> //libtomcat6-java install//
>>> //openjdk-7-jre-headless:amd64 install//
>>> //tzdata-java install//
>>> //
>>> //# apt-cache policy cloudstack-management//
>>> //cloudstack-management://
>>> // Installed: 4.9.0//
>>> // Candidate: 4.9.0//
>>> // Version table://
>>> // *** 4.9.0 0//
>>> // 500
>>> http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/ ./ Packages//
>>> // 100 /var/lib/dpkg/status///
>>>
>>> /# find /usr/share/ -name "cloud-utils*.jar"//
>>> ///usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar//
>>> ///usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar//
>>> ///usr/share/cloudstack-usage/lib/cloud-utils-4.9.0.jar//
>>> //# md5sum
>>> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar//
>>> //a8de7306d7c80b5a73e93b83afdd119f
>>> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar//
>>> //# md5sum /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar//
>>> //a8de7306d7c80b5a73e93b83afdd119f
>>> /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar//
>>> //# md5sum /usr/share/cloudstack-usage/lib/cloud-utils-4.9.0.jar//
>>> //a8de7306d7c80b5a73e93b83afdd119f
>>> /usr/share/cloudstack-usage/lib/cloud-utils-4.9.0.jar/
>>>
>>> The classpath.conf was not modified:
>>> /# cat /etc/cloudstack/management/classpath.conf
>>> #!/bin/bash
>>> #...
>>>
>>> SYSTEMJARS=""
>>> SCP=$(build-classpath $SYSTEMJARS 2>/dev/null) ; if [ $? != 0 ] ; then
>>> export SCP="" ; fi
>>> MCP=""
>>> DCP="/usr/share/tomcat6/bin/bootstrap.jar:/usr/share/tomcat6/bin/tomcat-juli.jar"
>>> CLASSPATH=$SCP:$DCP:$MCP:/etc/cloudstack/management:/usr/share/cloudstack-management/setup
>>> for jarfile in ""/* ; do
>>> if [ ! -e "$jarfile" ] ; then continue ; fi
>>> CLASSPATH=$jarfile:$CLASSPATH
>>> done
>>> for plugin in ""/* ; do
>>> if [ ! -e "$plugin" ] ; then continue ; fi
>>> CLASSPATH=$plugin:$CLASSPATH
>>> done
>>> for vendorconf in "/etc/cloudstack/management"/vendor/* ; do
>>> if [ ! -d "$vendorconf" ] ; then continue ; fi
>>> CLASSPATH=$vendorconf:$CLASSPATH
>>> done
>>> export CLASSPATH
>>> if ([ -z "$JAVA_HOME" ] || [ ! -d "$JAVA_HOME" ]) && [ -d
>>> /usr/lib/jvm/jre-1.7.0 ]; then
>>> export JAVA_HOME=/usr/lib/jvm/jre-1.7.0
>>> fi
>>> PATH=$JAVA_HOME/bin:/sbin:/usr/sbin:$PATH
>>> export PATH/
>>>
>>> Regards
>>> Martin
>>>
>>> On 08/24/2016 06:56 PM, Rohit Yadav wrote:
>>>
>>>
>>> Martin,
>>>
>>>
>>> Were you able to fix your issue after installing packages from the
>>> repo Will shared and restarting the services?
>>>
>>> I've not personally tested the apt-get.eu repo, but I had earlier
>>> built this repo which I'm personally using in my local KVM-trusty
>>> based cloud: http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/
>>>
>>>
>>> If you're still getting the error, can you share the JRE version
>>> you're running, both on the mgmt server and on the KVM hosts? You can
>>> run java -version, or share output of "dpkg --get-selections | grep -e
>>> 'jdk' -e 'java'". Are you running CloudStack with any additional plugins?
>>>
>>> From the logs, looks like there are mixed jar files,
>>> NioConnectionException class was not found -- something's wrong with
>>> your installation. there must be a cloud-utils jar file make sure your
>>> installation don't have multiple copies/versions of jars
>>> (somewhere) in the in /usr/share/cloudstack-common and in
>>> /usr/share/cloudstack-management/webapps/client/ paths:
>>>
>>> Could not find exception:
>>> com.cloud.utils.exception.NioConnectionException in error code list for
>>> exceptions
>>> The error "Unable to initialize the threads." suggests, JVM was not
>>> able to spawn threads. I would like to know your JRE version and any
>>> other settings configured in /etc/cloudstack/management/classpath.conf
>>> (and there are bunch of other files where JAVA_OPTS might have been
>>> overridden). Note: For now you should only be using JRE1.7.
>>>
>>>
>>> Regards.
>>>
>>> [email protected]<mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>
>>> www.shapeblue.com<http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com>
>>> @shapeblue
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------
>>> *From:* martin kolly
>>> <[email protected]><mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>
>>> *Sent:* 24 August 2016 19:53:26
>>> *To:*
>>> [email protected]<mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>;
>>> Rohit Yadav
>>> *Subject:* Re: CS 4.9 NIO Selector wait time PR-1601
>>>
>>> Thanks Will!
>>>
>>> yes the repo is pointing to 4.9 release for all KVMs and for the
>>> management server:
>>> /cloudstack:~# cat /etc/apt/sources.list.d/cloudstack.list //
>>> //deb http://cloudstack.apt-get.eu/ubuntu trusty 4.9/
>>>
>>> All KVM agents and the mgmt server are upgraded to release 4.9 based
>>> on the documentation.We have restarted all the cloudstack-agents and
>>> the cloudstack-management service as well.
>>>
>>> Network traces are showing packets from KVM <-> Mgmt on port 8250.
>>> there is no security device in between.
>>>
>>> thanks
>>> fanfarlo
>>>
>>>
>>>
>>>
>>> On 08/24/2016 04:13 PM, Will Stevens wrote:
>>>
>>>
>>> @rohit, I am guessing they should be installing the cloudstack-agent using
>>> the following repo right? That is what is described in the upgrade (trusty
>>> instead of precise though).
>>>
>>> http://cloudstack.apt-get.eu/ubuntu/dists/trusty/4.9/
>>>
>>> @fanfarlo, are your repo's setup to point to the new 4.9 version?
>>>
>>> cheers,
>>>
>>> will
>>>
>>> On Wed, Aug 24, 2016 at 9:46 AM, Rohit Yadav
>>> <[email protected]><mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>
>>> wrote:
>>>
>>>
>>>
>>> The PR and fix already exists in 4.9.0 release. Please make sure to
>>> upgrade all of your management server(s) and KVM agents and then also
>>> restart them after the upgrade.
>>>
>>>
>>> If you are seeing SSL handshake failures, it could be due to network or
>>> security issue and most likely due to mismatch between CloudStack mgmt
>>> server and KVM agent version.
>>>
>>>
>>> Regards.
>>>
>>> [email protected]<mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>
>>> www.shapeblue.com<http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com>
>>> @shapeblue
>>>
>>>
>>>
>>> ------------------------------
>>> *From:* Will Stevens
>>> <[email protected]><mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>
>>> *Sent:* 24 August 2016 18:17:17
>>> *To:*
>>> [email protected]<mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>;
>>> Rohit Yadav
>>> *Subject:* Re: CS 4.9 NIO Selector wait time PR-1601
>>>
>>>
>>> That PR is already merged, so you don't have to do anything to get that
>>> code, you already have it.
>>>
>>> @rohit, can you review this? I think this is a similar to the issue Simon
>>> reported earlier.
>>>
>>> Will
>>>
>>> On Aug 24, 2016 6:56 AM, "fanfarlo"
>>> <[email protected]><mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>
>>> wrote:
>>>
>>>
>>>
>>> hi all
>>>
>>> We have the following environment:
>>> - OS: Debian 14.04 (hypervisors and management)
>>> - 4 KVM Hosts
>>> - Cloudstack Release 4.9 with local database
>>>
>>> Since we upgraded to Release 4.9 the KVM hosts no longer connect to the
>>> management Server. Upgrade procedure was followed as described:
>>> http://docs.cloudstack.apache.org/projects/cloudstack-releas
>>> e-notes/en/4.9.0/upgrade/upgrade-4.8.html
>>>
>>>
>>> On the KVM hosts we have the following error message:
>>> /2016-08-24 10:42:49,678 INFO [utils.exception.CSExceptionErrorCode]
>>> (main:null) (logid:) Could not find exception:
>>> com.cloud.utils.exception.NioConnectionException in error code list for
>>> exceptions
>>> 2016-08-24 10:42:49,678 WARN [cloud.agent.Agent] (main:null) (logid:)
>>> NIO Connection Exception
>>> com.cloud.utils.exception.NioConnectionException: SSL Handshake failed
>>> while connecting to host: 10.100.12.10 port: 8250
>>> 2016-08-24 10:42:49,678 INFO [cloud.agent.Agent] (main:null) (logid:)
>>> Attempted to connect to the server, but received an unexpected
>>> exception, trying again...
>>> 2016-08-24 10:42:54,679 INFO [utils.nio.NioClient] (main:null) (logid:)
>>> Connecting to 10.100.12.10:8250
>>> 2016-08-24 10:42:54,684 WARN [utils.nio.Link] (main:null) (logid:) This
>>> SSL engine was forced to close inbound due to end of stream.
>>> 2016-08-24 10:42:54,684 ERROR [utils.nio.NioClient] (main:null) (logid:)
>>> SSL Handshake failed while connecting to host: 10.100.12.10 port: 8250
>>> 2016-08-24 10:42:54,685 ERROR [utils.nio.NioConnection] (main:null)
>>> (logid:) Unable to initialize the threads.
>>> java.io.IOException: SSL Handshake failed while connecting to host:
>>> 10.100.12.10 port: 8250
>>> at com.cloud.utils.nio.NioClient.init(NioClient.java:67)
>>> at com.cloud.utils.nio.NioConnection.start(NioConnection.java:88)
>>> at com.cloud.agent.Agent.start(Agent.java:237)
>>> at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:399)
>>> at
>>> com.cloud.agent.AgentShell.launchAgentFromClassInfo(AgentShell.java:367)
>>> at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:351)
>>> at com.cloud.agent.AgentShell.start(AgentShell.java:456)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
>>> ssorImpl.java:57)
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>> thodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>> at
>>> org.apache.commons.daemon.support.DaemonLoader.start(DaemonL
>>> oader.java:243)
>>> 2016-08-24 10:42:54,685 INFO [utils.exception.CSExceptionErrorCode]
>>> (main:null) (logid:) Could not find exception:
>>> com.cloud.utils.exception.NioConnectionException in error code list for
>>> exceptions
>>> 2016-08-24 10:42:54,685 WARN [cloud.agent.Agent] (main:null) (logid:)
>>> NIO Connection Exception
>>> com.cloud.utils.exception.NioConnectionException: SSL Handshake failed
>>> while connecting to host: 10.100.12.10 port: 8250
>>> 2016-08-24 10:42:54,686 INFO [cloud.agent.Agent] (main:null) (logid:)
>>> Attempted to connect to the server, but received an unexpected
>>> exception, trying again.../
>>>
>>>
>>> Port is open on the management server, there is no firewall in between.
>>> We found that there was a bug report here:
>>> https://issues.apache.org/jira/browse/CLOUDSTACK-9348. There is a PR
>>> changing the NIO Selector wait time:
>>> https://github.com/apache/cloudstack/pull/1601 which was merged into
>>> master branch.
>>>
>>> Since we installed Release 4.9 we probably need to patch the
>>> NioConection.class as described in PR1601 , right?
>>>
>>> kvm03# unzip -v /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar |
>>> grep NioConnection
>>> 3923 Defl:N 1778 55% 2016-08-02 09:28 05aaf7d5
>>> com/cloud/utils/nio/NioConnection$1.class
>>> 881 Defl:N 495 44% 2016-08-02 09:28 e378984c
>>> com/cloud/utils/nio/NioConnection$ChangeRequest.class
>>> 15410 Defl:N 7130 54% 2016-08-02 09:28 b3281f5a
>>> com/cloud/utils/nio/NioConnection.class
>>> 1134 Defl:N 584 49% 2016-08-02 09:28 8d5cb4a8
>>> com/cloud/utils/exception/NioConnectionException.class
>>>
>>> Due to a lack of java expertise we have some basic questions:
>>> - Is there a patched jar file available ? public build server?
>>> - Do we need to create the jar from sources ? procedure?
>>> - How do we apply the patch ?
>>>
>>> many thanks!
>>> fanfarlo
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> [email protected]<mailto:[email protected]><mailto:[email protected]><mailto:[email protected]>
>>> www.shapeblue.com<http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com>
>>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK
>>> @shapeblue
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>> [email protected]
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK
>> @shapeblue
>>
>>
>>
>>
>