We found that the secondary storage NFS was not working properly. We
restored the NFS service and rebooted the host that was in Alert state. Now it works!
On 26/01/2015 16:32, Somesh Naidu wrote:
From the logs it appears that the agent connected, but I can't say what happened
next. We need further logs.
There are quite a few things that you could verify, for example:
1. netstat shows a connection between the management server (on port 8250) and the system VM.
2. The disk on the system VM hasn't run out of space.
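As a rough sketch, the two checks above could be scripted in Python. This is only an illustration under assumptions: the function names `can_connect` and `disk_used_percent` are made up for this example, and a successful TCP connection to port 8250 only shows that the port is reachable, which is weaker than netstat showing an established agent connection.

```python
import shutil
import socket


def can_connect(host: str, port: int = 8250, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only tests reachability of the port, it does
    not prove that the agent itself holds an established connection.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def disk_used_percent(path: str = "/") -> float:
    """Return the percentage of used space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total
```

For example, running `can_connect("manager_ip")` from the system VM (with `manager_ip` replaced by the real management server address) and `disk_used_percent("/")` gives a quick first impression before digging into the logs.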
You could stop and start the VM to see whether it recovers from the
situation.
You may also try other checks, including running the health check script
described here:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM,+templates,+Secondary+storage+troubleshooting
Regards,
Somesh
-----Original Message-----
From: Ugo Vasi [mailto:ugo.v...@procne.it]
Sent: Monday, January 26, 2015 10:05 AM
To: users@cloudstack.apache.org
Subject: agent host in alert state
Hi all,
we have installed CloudStack 4.3.0 in advanced network mode on Ubuntu
systems with only the KVM hypervisor.
Today we received this series of notifications (by email):
1) Host disconnected, name: agent_name (id:7), availability zone:
zone_name, pod: pod_name
2) Host is down, name: agent_name (id:7), availability zone:
zone_name, pod: pod_name
3) Host disconnected, name: agent_name (id:7), availability zone:
zone_name, pod: pod_name
4) Host is down, name: agent_name (id:7), availability zone:
zone_name, pod: pod_name
5) Unable to restart vm_name which was running on host name:
agent_name(id:7), availability zone: zone_name, pod: pod_name
The agent host was neither shut down nor rebooted, and the virtual machines
are still running.
In the agent log I found messages like these:
2015-01-26 15:04:40,728 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
Cannot connect because we still have 5 commands in progress.
2015-01-26 15:04:45,729 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
Lost connection to the server. Dealing with the remaining commands...
2015-01-26 15:04:45,729 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
Cannot connect because we still have 5 commands in progress.
2015-01-26 15:04:50,729 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
Lost connection to the server. Dealing with the remaining commands...
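To see how often this pattern repeats, the log could be summarized with a small Python script. This is a minimal sketch: the function name `summarize_agent_log` is invented for this example, and the two message texts it matches are taken only from the lines quoted above.

```python
import re

# Message texts copied from the agent log excerpt above.
PENDING = re.compile(
    r"Cannot connect because we still have (\d+) commands in progress"
)
LOST = "Lost connection to the server"


def summarize_agent_log(lines):
    """Count reconnect failures and track the largest pending-command count."""
    blocked = 0      # "Cannot connect because ..." messages
    lost = 0         # "Lost connection to the server" messages
    max_pending = 0  # highest number of commands reported in progress
    for line in lines:
        match = PENDING.search(line)
        if match:
            blocked += 1
            max_pending = max(max_pending, int(match.group(1)))
        elif LOST in line:
            lost += 1
    return {
        "blocked_reconnects": blocked,
        "lost_connection": lost,
        "max_pending": max_pending,
    }
```

It can be fed any iterable of lines, for example an open file object for the agent log (on many installs this is /var/log/cloudstack/agent/agent.log, but that path is an assumption here). A pending-command count that never drops suggests the agent is stuck on commands that never finish.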
I tried restarting the agent service, and after a few minutes the log says:
2015-01-26 15:05:42,207 INFO [utils.nio.NioClient]
(Agent-Selector:null) Connecting to manager_ip:8250
2015-01-26 15:05:42,489 INFO [utils.nio.NioClient]
(Agent-Selector:null) SSL: Handshake done
2015-01-26 15:05:42,489 INFO [utils.nio.NioClient]
(Agent-Selector:null) Connected to manager_ip:8250
But in the management interface I still see this agent in the Alert state.
Any ideas on how to resolve this problem?
--
U g o V a s i <ugo.v...@procne.it>
P r o c n e s.r.l >)
via Cotonificio 45 33010 Tavagnacco IT
phone: +390432486523 fax: +390432486523