Copying this mail thread to @dev to seek more help.


-------- Forwarded Message --------
Subject:        Re: XenServer is disconnected after CS hosts shutdown
Date:   Wed, 22 Jul 2015 21:03:13 +0800
From:   tony_caot...@163.com
Reply-To:       us...@cloudstack.apache.org
To:     us...@cloudstack.apache.org, opsrunb...@gmail.com



Hey! Help, please...

Some news.
I think the cause is that the ACS host can't communicate with the XenServer
host.
ACS keeps outputting logs like this:

2015-07-22 20:42:13,555 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentManager-Handler-7:null) Seq 5-8174877748607582212: Forwarding Seq
5-8174877748607582212:  { Cmd , MgmtId: 279278805451459, via: 5, Ver:
v1, Flags: 100111, [{"com.cloud.agent.api.MaintainCommand":{"wait":0}}]
} to 280345368052992

I am not sure whether the ACS status is wrong or some services on
XenServer are not running.

On XenServer, I found that *xenheartbeat.sh is not running.*
*(/bin/bash /opt/cloud/bin/xenheartbeat.sh
00d8e0d0-8561-4b3d-9044-cbc496ff22cc 120 60)*
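One way to check for the heartbeat process is to scan the process list on the XenServer host. The sketch below is only an illustration: it assumes the command lines were captured as text (for example with `ps -eo args`), the helper name `find_heartbeat` and the `sshd` sample line are made up for this example, and the arguments after the script path are returned without being interpreted.

```python
def find_heartbeat(ps_output, script="xenheartbeat.sh"):
    """Return the arguments of the first process running `script`, or None.

    `ps_output` is assumed to hold one command line per text line.
    """
    for line in ps_output.splitlines():
        words = line.split()
        if not words or words[0] == "grep":
            continue  # skip blank lines and a "grep xenheartbeat.sh" helper
        for i, word in enumerate(words):
            if word == script or word.endswith("/" + script):
                return words[i + 1:]
    return None


# Sample input: the second line is the command quoted in this mail,
# the first line is an invented unrelated process.
sample = """\
/usr/sbin/sshd
/bin/bash /opt/cloud/bin/xenheartbeat.sh 00d8e0d0-8561-4b3d-9044-cbc496ff22cc 120 60
"""
print(find_heartbeat(sample))
```

A result of `None` would match the situation described here, where the script is not running.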

As some operations on the XenServer host were pending, the host could not be
deleted from the web UI.

I found a temporary workaround:

1. Delete the jobs from the DB table cloud.vm_work_job.
2. Delete the XenServer host from the DB table cloud.host.
3. Add the XenServer host back from the web UI.

Then it works.

Does anyone have an idea about this?

Could anyone tell me what ACS does on a XenServer host when the host is
added?

Thanks,

-----------
Cao Tong

On 07/22/2015 04:26 PM, tony_caot...@163.com wrote:

@prashant, the following are the answers to your questions:

1. Yes, primary storage is connected fine for my XenServer.

2. No, the XenServer password has not changed.

3. Yes, the web UI is fine, and I can log in.

4. Before the reboot, I unmanaged and disabled the resources, and after the
reboot I enabled all of them.

5. The hosts' state is Up.

6. No yum update anywhere.

7. The system VMs' status is fine, I think.

-----------
Cao Tong

On 07/22/2015 04:13 PM, tony_caot...@163.com wrote:

Hi,

After reinstalling, I got the problem again.

So I will describe it once again.

What my environment looks like:

I have an ACS server host and a XenServer host. After both rebooted, I
cannot create a VM on XenServer through ACS.
KVM and NFS are running together on the ACS manager host.

The status of the new VM is always 'Starting' in the web UI, but I can create
a new VM using XenCenter.

------------- ERR LOGS ----------
2015-07-22 15:56:56,357 DEBUG [c.c.s.StorageManagerImpl]
(StatsCollector-3:ctx-1aa2e8c9) Unable to send storage pool command
to Pool[4|NetworkFilesystem] via 4
com.cloud.exception.OperationTimedoutException: Commands
2829104990918803478 to Host 4 timed out after 3600

2015-07-22 15:56:56,358 INFO  [c.c.s.StatsCollector]
(StatsCollector-3:ctx-1aa2e8c9) Unable to reach
Pool[4|NetworkFilesystem]
com.cloud.exception.StorageUnavailableException: Resource
[StoragePool:4] is unreachable: Unable to send command to the pool


------------- and there are lots of DEBUG messages, repeated again and again -----------

2015-07-22 15:36:12,887 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentManager-Handler-14:null) Seq 4-8064821032713715922: Forwarding
Seq 4-8064821032713715922:  { Cmd , MgmtId: 227448510156211, via: 4,
Ver: v1, Flags: 100111,
[{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] } to
116784073679673
2015-07-22 15:36:12,889 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentManager-Handler-10:null) Seq 4-8064821032713715883: Forwarding
Seq 4-8064821032713715883:  { Cmd , MgmtId: 227448510156211, via: 4,
Ver: v1, Flags: 100111,
[{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/1/5/af949612-838f-3a6d-931b-312e612db740.vhd","origUrl":"http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2","uuid":"80b60e46-3017-11e5-8736-00259091a13a","id":5,"format":"VHD","accountId":1,"checksum":"905cec879afd9c9d22ecc8036131a180","hvm":false,"displayText":"CentOS
5.6(64-bit) no GUI
(XenServer)","imageDataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://10.0.0.100/storage/secondary","_role":"Image"}},"name":"centos56-x86_64-xen","hypervisorType":"XenServer"}},"destTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"origUrl":"http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2","uuid":"80b60e46-3017-11e5-8736-00259091a13a","id":5,"format":"VHD","accountId":1,"checksum":"905cec879afd9c9d22ecc8036131a180","hvm":false,"displayText":"CentOS
5.6(64-bit) no GUI
(XenServer)","imageDataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"2df26406-31bf-3a95-8a61-f5008defd9a0","id":4,"poolType":"NetworkFilesystem","host":"10.0.0.100","path":"/storage/xen/primary","port":2049,"url":"NetworkFilesystem://10.0.0.100/storage/xen/primary/?ROLE=Primary&STOREUUID=2df26406-31bf-3a95-8a61-f5008defd9a0"}},"name":"centos56-x86_64-xen","hypervisorType":"XenServer"}},"executeInSequence":true,"options":{},"wait":10800}}]
} to 116784073679673


-----------------------------------------

Does anyone have any ideas? Thanks.

-----------
Cao Tong

On 07/21/2015 06:14 PM, tony_caot...@163.com wrote:

Thanks all,

I have already reinstalled my hosts to prepare a new, clean
environment and restart my research.

-----------
Cao Tong

On 07/20/2015 09:24 PM, Prashant s wrote:
Some questions:

Can you please tell me ...

1. Is your NFS storage or your primary Storage Repository in connected
mode, with no red cross mark on it in XenCenter?
2. Did you change any passwords on the XenServers?
3. Is the CloudStack web UI up? Can you log in to the CloudStack
web page?
4. Are the zone, pod, or clusters in an unmanaged or disabled state?
5. Are all the hosts in the connected state?
6. Did you run yum update on host reboot on the CS manager VM?
7. System VMs are stateless; you can kill them and CS will recreate
new ones, so don't worry :-)


Thanks,
Prashant



On Mon, Jul 20, 2015 at 3:47 AM, <tony_caot...@163.com> wrote:

Hi, I restarted all hosts (one manager and one XenServer) again.


The following is the error log.


2015-07-20 15:33:49,688 INFO [c.c.u.e.CSExceptionErrorCode] (StatsCollector-3:ctx-692a5392) Could not find exception: com.cloud.exception.OperationTimedoutException in error code list for exceptions
2015-07-20 15:33:49,688 WARN  [c.c.a.m.AgentAttache] (StatsCollector-3:ctx-692a5392) Seq 1-3176445112179752972: Timed out on null
2015-07-20 15:33:49,689 DEBUG [c.c.a.m.AgentAttache] (StatsCollector-3:ctx-692a5392) Seq 1-3176445112179752972: Cancelling.
2015-07-20 15:33:49,689 DEBUG [c.c.s.StorageManagerImpl] (StatsCollector-3:ctx-692a5392) Unable to send storage pool command to Pool[1|NetworkFilesystem] via 1
com.cloud.exception.OperationTimedoutException: Commands 3176445112179752972 to Host 1 timed out after 3600
        at com.cloud.agent.manager.AgentAttache.send(AgentAttache.java:436)
        at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:433)
        at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:362)
        at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:1000)
        at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:392)
        at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:406)
        at com.cloud.server.StatsCollector$StorageCollector.runInContext(StatsCollector.java:642)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.lang.Thread.run(Thread.java:745)
2015-07-20 15:33:49,689 INFO  [c.c.s.StatsCollector] (StatsCollector-3:ctx-692a5392) Unable to reach Pool[1|NetworkFilesystem]
com.cloud.exception.StorageUnavailableException: Resource [StoragePool:1] is unreachable: Unable to send command to the pool
        at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:1010)
        at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:392)
        at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:406)
        at com.cloud.server.StatsCollector$StorageCollector.runInContext(StatsCollector.java:642)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

-----------
Cao Tong


On 07/20/2015 02:52 PM, tony_caot...@163.com wrote:

No, no IP addresses were changed.

1. On XenServer I cannot log in to the system VMs using the internal IP, like
'169.254.1.112'. There should be a bridge network for this, right? It is
gone.

2. I tried to delete the XenServer host from CS in the web UI; it also failed
with lots of logs like the following, then memory filled up and the
management server went down...

2015-07-20 14:47:30,580 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentManager-Handler-15:null) Seq 1-7282039122481381399:
Forwarding Seq
1-7282039122481381399:  { Cmd , MgmtId: 104062526015411, via: 1,
Ver: v1,
Flags: 100111,
[{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] } to
192405008094602
2015-07-20 14:47:30,582 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentManager-Handler-5:null) Seq 1-7282039122481381399:
Forwarding Seq
1-7282039122481381399:  { Cmd , MgmtId: 104062526015411, via: 1,
Ver: v1,
Flags: 100111,
[{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] } to
192405008094602
2015-07-20 14:47:30,583 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentManager-Handler-1:null) Seq 1-7282039122481381399:
Forwarding Seq
1-7282039122481381399:  { Cmd , MgmtId: 104062526015411, via: 1,
Ver: v1,
Flags: 100111,
[{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] } to
192405008094602
2015-07-20 14:47:30,584 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentManager-Handler-14:null) Seq 1-7282039122481381399:
Forwarding Seq
1-7282039122481381399:  { Cmd , MgmtId: 104062526015411, via: 1,
Ver: v1,
Flags: 100111,
[{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] } to
192405008094602
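
To see how often the same command is being forwarded, the repeated entries can be counted per sequence number. The following is a rough sketch, not a CloudStack tool: the regular expression is inferred only from the log lines quoted in this thread, the function name `count_forwarded` is hypothetical, and whitespace is collapsed first because the log lines were wrapped in the mail.

```python
import re
from collections import Counter

# Matches one "Forwarding" entry after whitespace has been collapsed.
# The pattern is an assumption based on the entries quoted in this thread.
FORWARD_RE = re.compile(
    r'Seq (\d+-\d+): Forwarding Seq \1: \{ Cmd , MgmtId: (\d+), via: (\d+),'
    r'.*?\[\{"([\w.]+)":.*?\} to (\d+)'
)


def count_forwarded(log_text):
    """Count forwarded entries per (sequence, command class, target id)."""
    flat = " ".join(log_text.split())  # undo mail/log line wrapping
    counts = Counter()
    for seq, _mgmt_id, _via, command, target in FORWARD_RE.findall(flat):
        counts[(seq, command.rsplit(".", 1)[-1], target)] += 1
    return counts


# Two of the entries quoted above, wrapped as they appear in the mail.
sample = """
2015-07-20 14:47:30,580 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentManager-Handler-15:null) Seq 1-7282039122481381399:
Forwarding Seq
1-7282039122481381399:  { Cmd , MgmtId: 104062526015411, via: 1,
Ver: v1,
Flags: 100111,
[{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] } to
192405008094602
2015-07-20 14:47:30,582 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentManager-Handler-5:null) Seq 1-7282039122481381399:
Forwarding Seq
1-7282039122481381399:  { Cmd , MgmtId: 104062526015411, via: 1,
Ver: v1,
Flags: 100111,
[{"com.cloud.agent.api.MaintainCommand":{"wait":0}}] } to
192405008094602
"""

for key, n in count_forwarded(sample).most_common():
    print(n, key)
```

A single sequence number with a large and growing count would suggest the same command is being forwarded over and over rather than many distinct commands being sent.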


My guess: is some service or daemon that works for CS not up
on XenServer?


-----------
Cao Tong
On 07/20/2015 02:35 PM, Rajani Karuturi wrote:

Did the management server IP change?
The management server IP in the configuration table is used by
the system VMs.
select * from configuration where name like 'host';

If it changed, correct the value in the DB and restart the system VMs.


~Rajani

On Mon, Jul 20, 2015 at 11:56 AM,<tony_caot...@163.com>  wrote:

  Hello,
I shut down my CS manager and XenServer last weekend, and now the SSVM
and CPVM are disconnected; those two were running on XenServer. So what
should I do right now?
Please, can anybody help me? Thanks.

On XenServer I found that the three system VMs are not running.
My XenServer seems unable to reconnect to the CS manager, and it seems not
to be under the control of CS.


What are the right steps to shut down all the CS group machines and
resume them?
How can I get my XenServer reconnected?


Thanks,

--
-----------
Cao Tong















