Define “reinstall the host”: do you just mean ‘yum remove ovirt* vdsm*’ followed by ‘yum install ovirt* vdsm*’, or completely reinstalling the OS, setting up Gluster again, etc.?

On Jan 6, 2016, at 4:15 AM, Eliraz Levi <el...@redhat.com> wrote:

Hi Will, how are you?

The log first points to certificate issues:

2016-01-04 00:02:11,259 ERROR [org.ovirt.engine.core.vdsbroker.jsonrpc.JsonRpcVdsServer] (DefaultQuartzScheduler_Worker-81) [] Failed to get peer certification for host 'ovirt-node-02': SSL session is invalid
2016-01-04 00:02:11,259 ERROR [org.ovirt.engine.core.bll.CertificationValidityChecker] (DefaultQuartzScheduler_Worker-81) [] Failed to retrieve peer certifications for host 'ovirt-node-02'

So the first thing we should do is try to solve this problem. Please try to reinstall the host.

Thanks.
Eliraz :)

----- Original Message -----
From: "Will Dennis" <wden...@nec-labs.com>
To: "Eliraz Levi" <el...@redhat.com>, "users" <users@ovirt.org>
Sent: Tuesday, 5 January, 2016 5:46:23 AM
Subject: Re: [ovirt-users] host status "Non Operational" - how to diagnose & fix?

I must admit I’m getting a bit weary of fighting oVirt problems at this point… Before I move on to deploying any VMs onto my new infra, I’d like to get the base infra working… I’m still experiencing a “Non Operational” problem on my “ovirt-node-02” host:

http://s1096.photobucket.com/user/willdennis/media/ovirt-node-02_problem.png.html

I have pored thru the logs (all the engine logs, plus the syslogs from the engine VM and my three hypervisor/storage hosts) and I can’t pin down why the one node is having a problem… Of course with how voluminous all these logs are, it’s kind of like looking for a needle in a haystack, and I’m not even sure what the needle looks like, or if it’s even a needle :-/ I have also rebooted this host in the past few days; that did not fix the problem either.
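
One way to shrink the haystack is to filter the engine logs down to ERROR and WARN lines that mention the affected host. The following is only a minimal sketch, assuming every line follows the same layout as the two engine.log entries quoted above; `errors_for_host` and `scan_file` are hypothetical helper names, not part of oVirt.

```python
import gzip
import re

# Layout assumed from the two engine.log lines quoted in this thread:
# "<date> <time>,<ms> <LEVEL> [<logger>] (<thread>) [<correlation id>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\] "
    r"\((?P<thread>[^)]*)\) \[(?P<corr>[^\]]*)\] (?P<msg>.*)$"
)


def errors_for_host(lines, host, levels=("ERROR", "WARN")):
    """Yield (timestamp, short logger name, message) for lines mentioning host."""
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m and m.group("level") in levels and host in m.group("msg"):
            short_logger = m.group("logger").rsplit(".", 1)[-1]
            yield m.group("ts"), short_logger, m.group("msg")


def scan_file(path, host):
    """Scan a plain or gzip-compressed log file, such as a rotated engine log."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as fh:
        yield from errors_for_host(fh, host)
```

For example, `for ts, logger, msg in scan_file("engine.log-20160103.gz", "ovirt-node-02"): print(ts, logger, msg)` would list only the entries for that host. Lines that do not match the assumed layout (stack traces, continuation lines) are skipped, so this narrows the search rather than replacing a full read of the log.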

Note that in the screenshot I posted above, the webadmin hosts screen says that -node-01 has one VM running, and the others 0… You’d think that would be the HE VM running on there, but it’s actually on -node-02:

$ ansible istgroup-ovirt -f 1 -i prod -u root -m shell -a "hosted-engine --vm-status | grep -e '^Hostname' -e '^Engine'"
ovirt-node-01 | success | rc=0 >>
Hostname : ovirt-node-01
Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
Hostname : ovirt-node-02
Engine status : {"health": "good", "vm": "up", "detail": "up"}
Hostname : ovirt-node-03
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}

ovirt-node-02 | success | rc=0 >>
Hostname : ovirt-node-01
Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
Hostname : ovirt-node-02
Engine status : {"health": "good", "vm": "up", "detail": "up"}
Hostname : ovirt-node-03
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}

ovirt-node-03 | success | rc=0 >>
Hostname : ovirt-node-01
Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
Hostname : ovirt-node-02
Engine status : {"health": "good", "vm": "up", "detail": "up"}
Hostname : ovirt-node-03
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}

So it looks like the webadmin UI is wrong as well… It would be awesome if the UI would give a reason for the “Non Operational” status somehow… Or if there was a troubleshooter that could be used to analyze the problem… As it is, being so new to all of this, I am completely at the list’s mercy to figure this out.
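
The comparison between what each host reports and what the webadmin UI shows can also be done programmatically. The sketch below parses the grep-filtered `hosted-engine --vm-status` output shown above and lists the hosts whose engine status reports the VM as up. It assumes the "Hostname" and "Engine status" lines always come in pairs and that the status value is valid JSON, as in this output; `parse_vm_status` and `hosts_running_engine` are hypothetical names used only for illustration.

```python
import json


def parse_vm_status(text):
    """Map hostname -> engine status dict from Hostname / Engine status line pairs."""
    status = {}
    host = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if not sep:
            continue  # not a "key : value" line
        key, value = key.strip(), value.strip()
        if key == "Hostname":
            host = value
        elif key == "Engine status" and host is not None:
            status[host] = json.loads(value)
            host = None
    return status


def hosts_running_engine(status):
    """Return the sorted hostnames whose engine status reports the VM as up."""
    return sorted(h for h, s in status.items() if s.get("vm") == "up")


sample = """\
Hostname : ovirt-node-01
Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
Hostname : ovirt-node-02
Engine status : {"health": "good", "vm": "up", "detail": "up"}
Hostname : ovirt-node-03
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
"""

print(hosts_running_engine(parse_vm_status(sample)))  # prints ['ovirt-node-02']
```

Running this against the output of each host in turn would show whether all three agree on where the engine VM lives; here they do, which is what points at the webadmin VM count rather than the hosted-engine agents as the inconsistent piece.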

This software has such promise, so I’ll keep working thru these issues, but it sure hasn’t been a smooth ride so far…

On Jan 4, 2016, at 7:54 AM, Will Dennis <wden...@nec-labs.com> wrote:

I put all of the engine logs up there now… Try engine.log-20160103.gz

http://i1096.photobucket.com/albums/g330/willdennis/ovirt-node-02_problem.png