[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
the Hosted Engine setup finished?*

d) Apr 08 22:48:27 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-08 22:48:27.134+: 29309: warning : qemuGetProcessInfo:1404 : cannot parse process status data
Apr 08 22:48:27 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-08 22:48:27.134+: 29309: error : virNetDevTapInterfaceStats:764 : internal error: /proc/net/dev: Interface not found
Apr 08 23:09:39 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-08 23:09:39.844+: 29307: error : virNetSocketReadWire:1806 : End of file while reading data: Input/output error
Apr 09 01:05:26 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-09 01:05:26.660+: 29307: error : virNetSocketReadWire:1806 : End of file while reading data: Input/output error

5 & 6. The broker log is continually printing this error:

MainThread::INFO::2020-04-09 08:07:31,438::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.6 started
MainThread::DEBUG::2020-04-09 08:07:31,438::broker::55::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Running broker
MainThread::DEBUG::2020-04-09 08:07:31,438::broker::120::ovirt_hosted_engine_ha.broker.broker.Broker::(_get_monitor) Starting monitor
MainThread::INFO::2020-04-09 08:07:31,438::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
MainThread::INFO::2020-04-09 08:07:31,439::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
MainThread::INFO::2020-04-09 08:07:31,440::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2020-04-09 08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2020-04-09 08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
MainThread::INFO::2020-04-09 08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2020-04-09 08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2020-04-09 08:07:31,442::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2020-04-09 08:07:31,442::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2020-04-09 08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2020-04-09 08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2020-04-09 08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2020-04-09 08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2020-04-09 08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2020-04-09 08:07:31,444::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2020-04-09 08:07:31,444::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
MainThread::DEBUG::2020-04-09 08:07:31,444::broker::128::ovirt_hosted_engine_ha.broker.broker.Broker::(_get_storage_broker) Starting storage broker
MainThread::DEBUG::2020-04-09 08:07:31,444::storage_backends::369::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting to VDSM
MainThread::DEBUG::2020-04-09 08:07:31,444::util::384::ovirt_hosted_engine_ha.lib.storage_backends::(__log_debug) Creating a new json-rpc connection to VDSM
Client localhost:54321::DEBUG::2020-04-09 08:07:31,453::concurrent::258::root::(run) START thread localhost:54321, started daemon 139992488138496)> (func= method Reactor.process_requests of object at 0x7f528acabc90>>, args=(), kwargs={})
Client localhost:54321::DEBUG::2020-04-09 08:07:31,459::stompclient::138::yajsonrpc.protocols.stomp.AsyncClient::(_process_connected) Stomp connection established
MainThread::DEBUG::2020-04-09 08:07:31,467::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending response
MainThread::INFO::2020-04-09 08:07:31,530::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting the storage
MainThread::INFO::2020-04-09 08:07:31,531::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::DEBUG::2020-04-09 08:07:31,531::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending response
MainThread::DEBUG::2020-04-09 08:07:31,534::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending response
MainThread::DEBUG::2020-04-09 08:07:32,199::storage_server::158::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(_validate_pre_connected_path) Storage domain a6cea67d-dbfb-45cf-a775-b4d0d47b26f2 is not available
MainThread::INFO::2020-04-09 08:07:32,199::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::DEBUG::2020-04-09 08:07:32,199::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending response
MainThread::DEBUG::2020-04-09 08:07:32,814::storage_server::363::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) [{u'status': 0, u'id': u'e29cf818-5ee5-46e1-85c1-8aeefa33e95d'}]
MainThread::INFO::2020-04-09 08:07:32,814::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
MainThread::DEBUG::2020-04-09 08:07:32,815::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending response
MainThread::DEBUG::2020-04-09 08:07:33,129::storage_server::420::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Error refreshing storage domain: Command StorageDomain.getStats with args {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed: (code=350, message=Error in storage domain action: (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
MainThread::DEBUG::2020-04-09 08:07:33,130::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending response
MainThread::DEBUG::2020-04-09 08:07:33,795::storage_backends::208::ovirt_hosted_engine_ha.lib.storage_backends::(_get_sector_size) Command StorageDomain.getInfo with args {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed: (code=350, message=Error in storage domain action: (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
MainThread::WARNING::2020-04-09 08:07:33,795::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Command StorageDomain.getInfo with args {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed: (code=350, message=Error in storage domain action: (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))

The UUID it is moaning about is indeed the one that the HA sits on and is the one I listed the contents of in step 2 above.

So why can't it see this domain?

Thanks, Shareef.

On Thu, Apr 9, 2020 at 6:12 AM Strahil Nikolov wrote:

> On April 9, 2020 1:51:05 AM GMT+03:00, Shareef Jalloq <shar...@jalloq.co.uk> wrote:
> >Don't know if this is useful or not, but I just tried to shutdown and start another VM on one of the hosts and get the following error:
> >
> >virsh # start scratch
> >
> >error: Failed to start domain scratch
> >
> >error: Network not found: no network with matching name 'vdsm-ovirtmgmt'
> >
> >Is this not referring to the interface name as the network is called 'ovirtmgnt'.
> >
> >On Wed, Apr 8, 2020 at 11:35 PM Shareef Jalloq wrote:
> >
> >> Hmmm, virsh tells me the HE is running but it hasn't come up and the agent.log is full of the same errors.
> >>
> >> On Wed, Apr 8, 2020 at 11:31 PM Shareef Jalloq
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
Monitor::(_discover_submonitors) >> >Loaded submonitor mem-free >> > >> >MainThread::INFO::2020-04-09 >> >> >08:07:31,444::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> >Loaded submonitor engine-health >> > >> >MainThread::INFO::2020-04-09 >> >> >08:07:31,444::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> >Finished loading submonitors >> > >> >MainThread::DEBUG::2020-04-09 >> >> >08:07:31,444::broker::128::ovirt_hosted_engine_ha.broker.broker.Broker::(_get_storage_broker) >> >Starting storage broker >> > >> >MainThread::DEBUG::2020-04-09 >> >> >08:07:31,444::storage_backends::369::ovirt_hosted_engine_ha.lib.storage_backends::(connect) >> >Connecting to VDSM >> > >> >MainThread::DEBUG::2020-04-09 >> >> >08:07:31,444::util::384::ovirt_hosted_engine_ha.lib.storage_backends::(__log_debug) >> >Creating a new json-rpc connection to VDSM >> > >> >Client localhost:54321::DEBUG::2020-04-09 >> >08:07:31,453::concurrent::258::root::(run) START thread > >localhost:54321, started daemon 139992488138496)> (func=> >Reactor.process_requests of > >0x7f528acabc90>>, args=(), kwargs={}) >> > >> >Client localhost:54321::DEBUG::2020-04-09 >> >> >08:07:31,459::stompclient::138::yajsonrpc.protocols.stomp.AsyncClient::(_process_connected) >> >Stomp connection established >> > >> >MainThread::DEBUG::2020-04-09 >> >08:07:31,467::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending >> >response >> > >> >MainThread::INFO::2020-04-09 >> >> >08:07:31,530::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect) >> >Connecting the storage >> > >> >MainThread::INFO::2020-04-09 >> >> >08:07:31,531::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) >> >Connecting storage server >> > >> >MainThread::DEBUG::2020-04-09 >> >08:07:31,531::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending >> >response >> > >> >MainThread::DEBUG::2020-04-09 
>> >08:07:31,534::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending >> >response >> > >> >MainThread::DEBUG::2020-04-09 >> >> >08:07:32,199::storage_server::158::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(_validate_pre_connected_path) >> >Storage domain a6cea67d-dbfb-45cf-a775-b4d0d47b26f2 is not available >> > >> >MainThread::INFO::2020-04-09 >> >> >08:07:32,199::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) >> >Connecting storage server >> > >> >MainThread::DEBUG::2020-04-09 >> >08:07:32,199::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending >> >response >> > >> >MainThread::DEBUG::2020-04-09 >> >> >08:07:32,814::storage_server::363::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) >> >[{u'status': 0, u'id': u'e29cf818-5ee5-46e1-85c1-8aeefa33e95d'}] >> > >> >MainThread::INFO::2020-04-09 >> >> >08:07:32,814::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) >> >Refreshing the storage domain >> > >> >MainThread::DEBUG::2020-04-09 >> >08:07:32,815::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending >> >response >> > >> >MainThread::DEBUG::2020-04-09 >> >> >08:07:33,129::storage_server::420::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) >> >Error refreshing storage domain: Command StorageDomain.getStats with >> >args >> >{'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed: >> > >> >(code=350, message=Error in storage domain action: >> >(u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',)) >> > >> >MainThread::DEBUG::2020-04-09 >> >08:07:33,130::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending >> >response >> > >> >MainThread::DEBUG::2020-04-09 >> >> >08:07:33,795::storage_backends::208::ovirt_hosted_engine_ha.lib.storage_backends::(_get_sector_size) >> >Command StorageDomain.getInfo with args {'storagedomainID': >> >'a6cea6
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
1::DEBUG::2020-04-09 > >08:07:31,453::concurrent::258::root::(run) START thread >localhost:54321, started daemon 139992488138496)> (func= >Reactor.process_requests of >0x7f528acabc90>>, args=(), kwargs={}) > > > >Client localhost:54321::DEBUG::2020-04-09 > > >08:07:31,459::stompclient::138::yajsonrpc.protocols.stomp.AsyncClient::(_process_connected) > >Stomp connection established > > > >MainThread::DEBUG::2020-04-09 > >08:07:31,467::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending > >response > > > >MainThread::INFO::2020-04-09 > > >08:07:31,530::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect) > >Connecting the storage > > > >MainThread::INFO::2020-04-09 > > >08:07:31,531::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) > >Connecting storage server > > > >MainThread::DEBUG::2020-04-09 > >08:07:31,531::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending > >response > > > >MainThread::DEBUG::2020-04-09 > >08:07:31,534::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending > >response > > > >MainThread::DEBUG::2020-04-09 > > >08:07:32,199::storage_server::158::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(_validate_pre_connected_path) > >Storage domain a6cea67d-dbfb-45cf-a775-b4d0d47b26f2 is not available > > > >MainThread::INFO::2020-04-09 > > >08:07:32,199::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) > >Connecting storage server > > > >MainThread::DEBUG::2020-04-09 > >08:07:32,199::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending > >response > > > >MainThread::DEBUG::2020-04-09 > > >08:07:32,814::storage_server::363::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) > >[{u'status': 0, u'id': u'e29cf818-5ee5-46e1-85c1-8aeefa33e95d'}] > > > >MainThread::INFO::2020-04-09 > > 
>08:07:32,814::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) > >Refreshing the storage domain > > > >MainThread::DEBUG::2020-04-09 > >08:07:32,815::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending > >response > > > >MainThread::DEBUG::2020-04-09 > > >08:07:33,129::storage_server::420::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) > >Error refreshing storage domain: Command StorageDomain.getStats with > >args > >{'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed: > > > >(code=350, message=Error in storage domain action: > >(u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',)) > > > >MainThread::DEBUG::2020-04-09 > >08:07:33,130::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending > >response > > > >MainThread::DEBUG::2020-04-09 > > >08:07:33,795::storage_backends::208::ovirt_hosted_engine_ha.lib.storage_backends::(_get_sector_size) > >Command StorageDomain.getInfo with args {'storagedomainID': > >'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed: > > > >(code=350, message=Error in storage domain action: > >(u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',)) > > > >MainThread::WARNING::2020-04-09 > > >08:07:33,795::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) > >Can't connect vdsm storage: Command StorageDomain.getInfo with args > >{'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed: > > > >(code=350, message=Error in storage domain action: > >(u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',)) > > > > > >The UUID it is moaning about is indeed the one that the HA sits on and > >is > >the one I listed the contents of in step 2 above. > > > > > >So why can't it see this domain? > > > > > >Thanks, Shareef. 
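On the "why can't it see this domain?" question, it may help to pull every failing VDSM call out of broker.log to confirm that all failures point at the same storage domain. Below is a rough Python sketch. The pattern is modelled only on the failure text quoted above, and the names `FAILURE` and `storage_failures` are made up for illustration, so treat it as an assumption rather than an official log format.

```python
import re

# Pattern modelled on the failure text quoted in this thread; this is an
# assumption about the message wording, not an official VDSM format.
FAILURE = re.compile(
    r"Command (?P<command>[\w.]+) with\s+args\s+"
    r"\{'storagedomainID':\s+'(?P<sd_uuid>[0-9a-f-]{36})'\}\s+failed:\s*"
    r"\(code=(?P<code>\d+)"
)


def storage_failures(log_text):
    """Return sorted, de-duplicated (command, storage domain UUID, code) triples."""
    found = set()
    for match in FAILURE.finditer(log_text):
        found.add(
            (match.group("command"), match.group("sd_uuid"), int(match.group("code")))
        )
    return sorted(found)
```

Feeding it the contents of /var/log/ovirt-hosted-engine-ha/broker.log should list each failing command once. If every triple carries the same UUID, the problem is confined to that one domain.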
> > > >On Thu, Apr 9, 2020 at 6:12 AM Strahil Nikolov > >wrote: > > > >> On April 9, 2020 1:51:05 AM GMT+03:00, Shareef Jalloq < > >> shar...@jalloq.co.uk> wrote: > >> >Don't know if this is useful or not, but I just tried to shutdown > >and > >> >start > >> >another VM on one of the hosts and get the following error: > >> > > >> >virsh # start scratch > >> > > >> >error: Failed to start domain scratch > >> > > >> >error: Network not found: no ne
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
Do these files exist on the hosted engine? I am an oVirt newbie, but it sounds like file or disk corruption. How much actual storage space is left on the volume with the .prob files? Or on the hosted engine disk? Can you ssh into the hosted engine, put it in global maintenance and rerun engine-setup? What happened to trigger these errors? I'm coming in a bit late to the conversation.

>Mar 23 18:02:59 ovirt-node-01.phoelex.com supervdsmd[29409]: *failed to load module nvdimm: libbd_nvdimm.so.2: cannot open shared object file: No such file or directory*
>
>c) Apr 09 08:05:13 ovirt-node-01.phoelex.com vdsm[4801]: *ERROR failed to retrieve Hosted Engine HA score '[Errno 2] No such file or directory'Is the Hosted Engine setup finished?*

Eric Evans
Digital Data Services LLC.
304.660.9080

-----Original Message-----
From: Strahil Nikolov
Sent: Thursday, April 9, 2020 12:57 PM
To: Shareef Jalloq
Cc: eev...@digitaldatatechs.com; Ovirt Users
Subject: [ovirt-users] Re: ovirt-engine unresponsive - how to rescue?

On April 9, 2020 11:12:30 AM GMT+03:00, Shareef Jalloq wrote:
>OK, let's go through this. I'm looking at the node that at least still has some VMs running. virsh also tells me that the HostedEngine VM is running but it's unresponsive and I can't shut it down.
>
>1. All storage domains exist and are mounted.
>2. The ha_agent exists:
>
>[root@ovirt-node-01 ovirt-hosted-engine-ha]# ls /rhev/data-center/mnt/nas-01.phoelex.com\:_volume2_vmstore/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/
>
>dom_md ha_agent images master
>
>3. There are two links
>
>[root@ovirt-node-01 ovirt-hosted-engine-ha]# ll /rhev/data-center/mnt/nas-01.phoelex.com\:_volume2_vmstore/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/ha_agent/
>
>total 8
>
>lrwxrwxrwx. 1 vdsm kvm 132 Apr 2 14:50 hosted-engine.lockspace -> /var/run/vdsm/storage/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/ffb90b82-42fe-4253-85d5-aaec8c280aaf/90e68791-0c6f-406a-89ac-e0d86c631604
>
>lrwxrwxrwx. 1 vdsm kvm 132 Apr 2 14:50 hosted-engine.metadata -> /var/run/vdsm/storage/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/2161aed0-7250-4c1d-b667-ac94f60af17e/6b818e33-f80a-48cc-a59c-bba641e027d4
>
>4. The services exist but all seem to have some sort of warning:
>
>a) Apr 08 18:10:55 ovirt-node-01.phoelex.com sanlock[1728]: *2020-04-08 18:10:55 1744152 [36796]: s16 delta_renew long write time 10 sec*
>
>b) Mar 23 18:02:59 ovirt-node-01.phoelex.com supervdsmd[29409]: *failed to load module nvdimm: libbd_nvdimm.so.2: cannot open shared object file: No such file or directory*
>
>c) Apr 09 08:05:13 ovirt-node-01.phoelex.com vdsm[4801]: *ERROR failed to retrieve Hosted Engine HA score '[Errno 2] No such file or directory'Is the Hosted Engine setup finished?*
>
>d) Apr 08 22:48:27 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-08 22:48:27.134+: 29309: warning : qemuGetProcessInfo:1404 : cannot parse process status data
>
>Apr 08 22:48:27 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-08 22:48:27.134+: 29309: error : virNetDevTapInterfaceStats:764 : internal error: /proc/net/dev: Interface not found
>
>Apr 08 23:09:39 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-08 23:09:39.844+: 29307: error : virNetSocketReadWire:1806 : End of file while reading data: Input/output error
>
>Apr 09 01:05:26 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-09 01:05:26.660+: 29307: error : virNetSocketReadWire:1806 : End of file while reading data: Input/output error
>
>5 & 6. The broker log is continually printing this error:
>
>MainThread::INFO::2020-04-09 08:07:31,438::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.6 started
>
>MainThread::DEBUG::2020-04-09 08:07:31,438::broker::55::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Running broker
>
>MainThread::DEBUG::2020-04-09 08:07:31,438::broker::120::ovirt_hosted_engine_ha.broker.broker.Broker::(_get_monitor) Starting monitor
>
>MainThread::INFO::2020-04-09 08:07:31,438::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
>
>MainThread::INFO::2020-04-09 08:07:31,439::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
>
>MainThread::INFO::2020-04-09 08:07:31,440::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
>
>MainThread::INFO::2020-04-09 08:07:31,441::monitor::49::o
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
ageBroker::(__init__) >Can't connect vdsm storage: Command StorageDomain.getInfo with args >{'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed: > >(code=350, message=Error in storage domain action: >(u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',)) > > >The UUID it is moaning about is indeed the one that the HA sits on and >is >the one I listed the contents of in step 2 above. > > >So why can't it see this domain? > > >Thanks, Shareef. > >On Thu, Apr 9, 2020 at 6:12 AM Strahil Nikolov >wrote: > >> On April 9, 2020 1:51:05 AM GMT+03:00, Shareef Jalloq < >> shar...@jalloq.co.uk> wrote: >> >Don't know if this is useful or not, but I just tried to shutdown >and >> >start >> >another VM on one of the hosts and get the following error: >> > >> >virsh # start scratch >> > >> >error: Failed to start domain scratch >> > >> >error: Network not found: no network with matching name >> >'vdsm-ovirtmgmt' >> > >> >Is this not referring to the interface name as the network is called >> >'ovirtmgnt'. >> > >> >On Wed, Apr 8, 2020 at 11:35 PM Shareef Jalloq > >> >wrote: >> > >> >> Hmmm, virsh tells me the HE is running but it hasn't come up and >the >> >> agent.log is full of the same errors. >> >> >> >> On Wed, Apr 8, 2020 at 11:31 PM Shareef Jalloq > >> >> wrote: >> >> >> >>> Ah hah! Ok, so I've managed to start it using virsh on the >second >> >host >> >>> but my first host is still dead. >> >>> >> >>> First of all, what are these 56,317 .prob- files that get dumped >to >> >the >> >>> NFS mounts? >> >>> >> >>> Secondly, why doesn't the node mount the NFS directories at boot? >> >Is >> >>> that the issue with this particular node? >> >>> >> >>> On Wed, Apr 8, 2020 at 11:12 PM >wrote: >> >>> >> >>>> Did you try virsh list --inactive >> >>>> >> >>>> >> >>>> >> >>>> Eric Evans >> >>>> >> >>>> Digital Data Services LLC. 
>> >>>> >> >>>> 304.660.9080 >> >>>> >> >>>> >> >>>> >> >>>> *From:* Shareef Jalloq >> >>>> *Sent:* Wednesday, April 8, 2020 5:58 PM >> >>>> *To:* Strahil Nikolov >> >>>> *Cc:* Ovirt Users >> >>>> *Subject:* [ovirt-users] Re: ovirt-engine unresponsive - how to >> >rescue? >> >>>> >> >>>> >> >>>> >> >>>> I've now shut down the VMs on one host and rebooted it but the >> >agent >> >>>> service doesn't start. If I run 'hosted-engine --vm-status' I >get: >> >>>> >> >>>> >> >>>> >> >>>> The hosted engine configuration has not been retrieved from >shared >> >>>> storage. Please ensure that ovirt-ha-agent is running and the >> >storage >> >>>> server is reachable. >> >>>> >> >>>> >> >>>> >> >>>> and indeed if I list the mounts under /rhev/data-center/mnt, >only >> >one of >> >>>> the directories is mounted. I have 3 NFS mounts, one ISO Domain >> >and two >> >>>> Data Domains. Only one Data Domain has mounted and this has >lots >> >of .prob >> >>>> files in. So why haven't the other NFS exports been mounted? >> >>>> >> >>>> >> >>>> >> >>>> Manually mounting them doesn't seem to have helped much either. >I >> >can >> >>>> start the broker service but the agent service says no. Same >error >> >as the >> >>>> one in my last email. >> >>>> >> >>>> >> >>>> >> >>>> Shareef. >> >>>> >> >>>> >> >>>> >> >>>> On Wed, Apr 8, 2020 at 9:57 PM Shareef Jalloq >> > >> >>>> wrote: >> >>>> >> >>>> Right, still down. I've run virsh and it doesn't know anything >> >
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
ork is called > >'ovirtmgnt'. > > > >On Wed, Apr 8, 2020 at 11:35 PM Shareef Jalloq > >wrote: > > > >> Hmmm, virsh tells me the HE is running but it hasn't come up and the > >> agent.log is full of the same errors. > >> > >> On Wed, Apr 8, 2020 at 11:31 PM Shareef Jalloq > >> wrote: > >> > >>> Ah hah! Ok, so I've managed to start it using virsh on the second > >host > >>> but my first host is still dead. > >>> > >>> First of all, what are these 56,317 .prob- files that get dumped to > >the > >>> NFS mounts? > >>> > >>> Secondly, why doesn't the node mount the NFS directories at boot? > >Is > >>> that the issue with this particular node? > >>> > >>> On Wed, Apr 8, 2020 at 11:12 PM wrote: > >>> > >>>> Did you try virsh list --inactive > >>>> > >>>> > >>>> > >>>> Eric Evans > >>>> > >>>> Digital Data Services LLC. > >>>> > >>>> 304.660.9080 > >>>> > >>>> > >>>> > >>>> *From:* Shareef Jalloq > >>>> *Sent:* Wednesday, April 8, 2020 5:58 PM > >>>> *To:* Strahil Nikolov > >>>> *Cc:* Ovirt Users > >>>> *Subject:* [ovirt-users] Re: ovirt-engine unresponsive - how to > >rescue? > >>>> > >>>> > >>>> > >>>> I've now shut down the VMs on one host and rebooted it but the > >agent > >>>> service doesn't start. If I run 'hosted-engine --vm-status' I get: > >>>> > >>>> > >>>> > >>>> The hosted engine configuration has not been retrieved from shared > >>>> storage. Please ensure that ovirt-ha-agent is running and the > >storage > >>>> server is reachable. > >>>> > >>>> > >>>> > >>>> and indeed if I list the mounts under /rhev/data-center/mnt, only > >one of > >>>> the directories is mounted. I have 3 NFS mounts, one ISO Domain > >and two > >>>> Data Domains. Only one Data Domain has mounted and this has lots > >of .prob > >>>> files in. So why haven't the other NFS exports been mounted? > >>>> > >>>> > >>>> > >>>> Manually mounting them doesn't seem to have helped much either. I > >can > >>>> start the broker service but the agent service says no. 
Same error > >as the > >>>> one in my last email. > >>>> > >>>> > >>>> > >>>> Shareef. > >>>> > >>>> > >>>> > >>>> On Wed, Apr 8, 2020 at 9:57 PM Shareef Jalloq > > > >>>> wrote: > >>>> > >>>> Right, still down. I've run virsh and it doesn't know anything > >about > >>>> the engine vm. > >>>> > >>>> > >>>> > >>>> I've restarted the broker and agent services and I still get > >nothing in > >>>> virsh->list. > >>>> > >>>> > >>>> > >>>> In the logs under /var/log/ovirt-hosted-engine-ha I see lots of > >errors: > >>>> > >>>> > >>>> > >>>> broker.log: > >>>> > >>>> > >>>> > >>>> MainThread::INFO::2020-04-08 > >>>> > > >20:56:20,138::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) > >>>> ovirt-hosted-engine-ha broker 2.3.6 started > >>>> > >>>> MainThread::INFO::2020-04-08 > >>>> > > >20:56:20,138::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > >>>> Searching for submonitors in > >>>> > >/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors > >>>> > >>>> MainThread::INFO::2020-04-08 > >>>> > > >20:56:20,138::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > >>>> Loaded submonitor network > >>>> > >>>> MainThread::INFO::2020-04-08 > >>>> > > >20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
On April 9, 2020 1:51:05 AM GMT+03:00, Shareef Jalloq wrote: >Don't know if this is useful or not, but I just tried to shutdown and >start >another VM on one of the hosts and get the following error: > >virsh # start scratch > >error: Failed to start domain scratch > >error: Network not found: no network with matching name >'vdsm-ovirtmgmt' > >Is this not referring to the interface name as the network is called >'ovirtmgnt'. > >On Wed, Apr 8, 2020 at 11:35 PM Shareef Jalloq >wrote: > >> Hmmm, virsh tells me the HE is running but it hasn't come up and the >> agent.log is full of the same errors. >> >> On Wed, Apr 8, 2020 at 11:31 PM Shareef Jalloq >> wrote: >> >>> Ah hah! Ok, so I've managed to start it using virsh on the second >host >>> but my first host is still dead. >>> >>> First of all, what are these 56,317 .prob- files that get dumped to >the >>> NFS mounts? >>> >>> Secondly, why doesn't the node mount the NFS directories at boot? >Is >>> that the issue with this particular node? >>> >>> On Wed, Apr 8, 2020 at 11:12 PM wrote: >>> >>>> Did you try virsh list --inactive >>>> >>>> >>>> >>>> Eric Evans >>>> >>>> Digital Data Services LLC. >>>> >>>> 304.660.9080 >>>> >>>> >>>> >>>> *From:* Shareef Jalloq >>>> *Sent:* Wednesday, April 8, 2020 5:58 PM >>>> *To:* Strahil Nikolov >>>> *Cc:* Ovirt Users >>>> *Subject:* [ovirt-users] Re: ovirt-engine unresponsive - how to >rescue? >>>> >>>> >>>> >>>> I've now shut down the VMs on one host and rebooted it but the >agent >>>> service doesn't start. If I run 'hosted-engine --vm-status' I get: >>>> >>>> >>>> >>>> The hosted engine configuration has not been retrieved from shared >>>> storage. Please ensure that ovirt-ha-agent is running and the >storage >>>> server is reachable. >>>> >>>> >>>> >>>> and indeed if I list the mounts under /rhev/data-center/mnt, only >one of >>>> the directories is mounted. I have 3 NFS mounts, one ISO Domain >and two >>>> Data Domains. 
Only one Data Domain has mounted and this has lots >of .prob >>>> files in. So why haven't the other NFS exports been mounted? >>>> >>>> >>>> >>>> Manually mounting them doesn't seem to have helped much either. I >can >>>> start the broker service but the agent service says no. Same error >as the >>>> one in my last email. >>>> >>>> >>>> >>>> Shareef. >>>> >>>> >>>> >>>> On Wed, Apr 8, 2020 at 9:57 PM Shareef Jalloq > >>>> wrote: >>>> >>>> Right, still down. I've run virsh and it doesn't know anything >about >>>> the engine vm. >>>> >>>> >>>> >>>> I've restarted the broker and agent services and I still get >nothing in >>>> virsh->list. >>>> >>>> >>>> >>>> In the logs under /var/log/ovirt-hosted-engine-ha I see lots of >errors: >>>> >>>> >>>> >>>> broker.log: >>>> >>>> >>>> >>>> MainThread::INFO::2020-04-08 >>>> >20:56:20,138::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) >>>> ovirt-hosted-engine-ha broker 2.3.6 started >>>> >>>> MainThread::INFO::2020-04-08 >>>> >20:56:20,138::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >>>> Searching for submonitors in >>>> >/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors >>>> >>>> MainThread::INFO::2020-04-08 >>>> >20:56:20,138::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >>>> Loaded submonitor network >>>> >>>> MainThread::INFO::2020-04-08 >>>> >20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >>>> Loaded submonitor cpu-load-no-engine >>>> >>>> MainThread::INFO::2020-04-08 >>>> >20:56:20,140::monitor::49::ovirt
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
Don't know if this is useful or not, but I just tried to shut down and start another VM on one of the hosts and got the following error:

virsh # start scratch
error: Failed to start domain scratch
error: Network not found: no network with matching name 'vdsm-ovirtmgmt'

Is this not referring to the interface name, as the network is called 'ovirtmgnt'?

On Wed, Apr 8, 2020 at 11:35 PM Shareef Jalloq wrote:

> Hmmm, virsh tells me the HE is running but it hasn't come up and the
> agent.log is full of the same errors.
>
> On Wed, Apr 8, 2020 at 11:31 PM Shareef Jalloq
> wrote:
>
>> Ah hah! Ok, so I've managed to start it using virsh on the second host
>> but my first host is still dead.
>>
>> First of all, what are these 56,317 .prob- files that get dumped to the
>> NFS mounts?
>>
>> Secondly, why doesn't the node mount the NFS directories at boot? Is
>> that the issue with this particular node?
>>
>> On Wed, Apr 8, 2020 at 11:12 PM wrote:
>>
>>> Did you try virsh list --inactive
>>>
>>> Eric Evans
>>> Digital Data Services LLC.
>>> 304.660.9080
>>>
>>> *From:* Shareef Jalloq
>>> *Sent:* Wednesday, April 8, 2020 5:58 PM
>>> *To:* Strahil Nikolov
>>> *Cc:* Ovirt Users
>>> *Subject:* [ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
>>>
>>> I've now shut down the VMs on one host and rebooted it but the agent
>>> service doesn't start. If I run 'hosted-engine --vm-status' I get:
>>>
>>> The hosted engine configuration has not been retrieved from shared
>>> storage. Please ensure that ovirt-ha-agent is running and the storage
>>> server is reachable.
>>>
>>> and indeed if I list the mounts under /rhev/data-center/mnt, only one of
>>> the directories is mounted. I have 3 NFS mounts, one ISO Domain and two
>>> Data Domains. Only one Data Domain has mounted and this has lots of .prob
>>> files in. So why haven't the other NFS exports been mounted?
>>> >>> >>> >>> Manually mounting them doesn't seem to have helped much either. I can >>> start the broker service but the agent service says no. Same error as the >>> one in my last email. >>> >>> >>> >>> Shareef. >>> >>> >>> >>> On Wed, Apr 8, 2020 at 9:57 PM Shareef Jalloq >>> wrote: >>> >>> Right, still down. I've run virsh and it doesn't know anything about >>> the engine vm. >>> >>> >>> >>> I've restarted the broker and agent services and I still get nothing in >>> virsh->list. >>> >>> >>> >>> In the logs under /var/log/ovirt-hosted-engine-ha I see lots of errors: >>> >>> >>> >>> broker.log: >>> >>> >>> >>> MainThread::INFO::2020-04-08 >>> 20:56:20,138::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) >>> ovirt-hosted-engine-ha broker 2.3.6 started >>> >>> MainThread::INFO::2020-04-08 >>> 20:56:20,138::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >>> Searching for submonitors in >>> /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors >>> >>> MainThread::INFO::2020-04-08 >>> 20:56:20,138::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >>> Loaded submonitor network >>> >>> MainThread::INFO::2020-04-08 >>> 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >>> Loaded submonitor cpu-load-no-engine >>> >>> MainThread::INFO::2020-04-08 >>> 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >>> Loaded submonitor mgmt-bridge >>> >>> MainThread::INFO::2020-04-08 >>> 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >>> Loaded submonitor network >>> >>> MainThread::INFO::2020-04-08 >>> 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >>> Loaded submonitor cpu-load >>> >>> MainThread::INFO::2020-04-08 >>> 20:56:20,141::monitor::49::ovirt_hosted_engi
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
Hmmm, virsh tells me the HE is running but it hasn't come up and the agent.log is full of the same errors.

On Wed, Apr 8, 2020 at 11:31 PM Shareef Jalloq wrote:

> Ah hah! Ok, so I've managed to start it using virsh on the second host
> but my first host is still dead.
>
> First of all, what are these 56,317 .prob- files that get dumped to the
> NFS mounts?
>
> Secondly, why doesn't the node mount the NFS directories at boot? Is that
> the issue with this particular node?
>
> On Wed, Apr 8, 2020 at 11:12 PM wrote:
>
>> Did you try virsh list --inactive
>>
>> Eric Evans
>> Digital Data Services LLC.
>> 304.660.9080
>>
>> *From:* Shareef Jalloq
>> *Sent:* Wednesday, April 8, 2020 5:58 PM
>> *To:* Strahil Nikolov
>> *Cc:* Ovirt Users
>> *Subject:* [ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
>>
>> I've now shut down the VMs on one host and rebooted it but the agent
>> service doesn't start. If I run 'hosted-engine --vm-status' I get:
>>
>> The hosted engine configuration has not been retrieved from shared
>> storage. Please ensure that ovirt-ha-agent is running and the storage
>> server is reachable.
>>
>> and indeed if I list the mounts under /rhev/data-center/mnt, only one of
>> the directories is mounted. I have 3 NFS mounts, one ISO Domain and two
>> Data Domains. Only one Data Domain has mounted and this has lots of .prob
>> files in. So why haven't the other NFS exports been mounted?
>>
>> Manually mounting them doesn't seem to have helped much either. I can
>> start the broker service but the agent service says no. Same error as the
>> one in my last email.
>>
>> Shareef.
>>
>> On Wed, Apr 8, 2020 at 9:57 PM Shareef Jalloq
>> wrote:
>>
>> Right, still down. I've run virsh and it doesn't know anything about the
>> engine vm.
>>
>> I've restarted the broker and agent services and I still get nothing in
>> virsh->list.
>> >> >> >> In the logs under /var/log/ovirt-hosted-engine-ha I see lots of errors: >> >> >> >> broker.log: >> >> >> >> MainThread::INFO::2020-04-08 >> 20:56:20,138::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) >> ovirt-hosted-engine-ha broker 2.3.6 started >> >> MainThread::INFO::2020-04-08 >> 20:56:20,138::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> Searching for submonitors in >> /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors >> >> MainThread::INFO::2020-04-08 >> 20:56:20,138::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> Loaded submonitor network >> >> MainThread::INFO::2020-04-08 >> 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> Loaded submonitor cpu-load-no-engine >> >> MainThread::INFO::2020-04-08 >> 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> Loaded submonitor mgmt-bridge >> >> MainThread::INFO::2020-04-08 >> 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> Loaded submonitor network >> >> MainThread::INFO::2020-04-08 >> 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> Loaded submonitor cpu-load >> >> MainThread::INFO::2020-04-08 >> 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> Loaded submonitor engine-health >> >> MainThread::INFO::2020-04-08 >> 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> Loaded submonitor mgmt-bridge >> >> MainThread::INFO::2020-04-08 >> 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> Loaded submonitor cpu-load-no-engine >> >> MainThread::INFO::2020-04-08 >> 
20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> Loaded submonitor cpu-load >> >> MainThread::INFO::2020-04-08 >> 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> Loaded submonitor mem-free >> >> MainThread::INFO::20
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
Ah hah! Ok, so I've managed to start it using virsh on the second host but my first host is still dead.

First of all, what are these 56,317 .prob- files that get dumped to the NFS mounts?

Secondly, why doesn't the node mount the NFS directories at boot? Is that the issue with this particular node?

On Wed, Apr 8, 2020 at 11:12 PM wrote:

> Did you try virsh list --inactive
>
> Eric Evans
> Digital Data Services LLC.
> 304.660.9080
>
> *From:* Shareef Jalloq
> *Sent:* Wednesday, April 8, 2020 5:58 PM
> *To:* Strahil Nikolov
> *Cc:* Ovirt Users
> *Subject:* [ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
>
> I've now shut down the VMs on one host and rebooted it but the agent
> service doesn't start. If I run 'hosted-engine --vm-status' I get:
>
> The hosted engine configuration has not been retrieved from shared
> storage. Please ensure that ovirt-ha-agent is running and the storage
> server is reachable.
>
> and indeed if I list the mounts under /rhev/data-center/mnt, only one of
> the directories is mounted. I have 3 NFS mounts, one ISO Domain and two
> Data Domains. Only one Data Domain has mounted and this has lots of .prob
> files in. So why haven't the other NFS exports been mounted?
>
> Manually mounting them doesn't seem to have helped much either. I can
> start the broker service but the agent service says no. Same error as the
> one in my last email.
>
> Shareef.
>
> On Wed, Apr 8, 2020 at 9:57 PM Shareef Jalloq
> wrote:
>
> Right, still down. I've run virsh and it doesn't know anything about the
> engine vm.
>
> I've restarted the broker and agent services and I still get nothing in
> virsh->list.
> > > > In the logs under /var/log/ovirt-hosted-engine-ha I see lots of errors: > > > > broker.log: > > > > MainThread::INFO::2020-04-08 > 20:56:20,138::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) > ovirt-hosted-engine-ha broker 2.3.6 started > > MainThread::INFO::2020-04-08 > 20:56:20,138::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Searching for submonitors in > /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors > > MainThread::INFO::2020-04-08 > 20:56:20,138::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor network > > MainThread::INFO::2020-04-08 > 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor cpu-load-no-engine > > MainThread::INFO::2020-04-08 > 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor mgmt-bridge > > MainThread::INFO::2020-04-08 > 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor network > > MainThread::INFO::2020-04-08 > 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor cpu-load > > MainThread::INFO::2020-04-08 > 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor engine-health > > MainThread::INFO::2020-04-08 > 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor mgmt-bridge > > MainThread::INFO::2020-04-08 > 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor cpu-load-no-engine > > MainThread::INFO::2020-04-08 > 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor cpu-load > > 
MainThread::INFO::2020-04-08 > 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor mem-free > > MainThread::INFO::2020-04-08 > 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor storage-domain > > MainThread::INFO::2020-04-08 > 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor storage-domain > > MainThread::INFO::2020-04-08 > 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor mem-free > > MainThread::INFO::2020-04-08 > 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) > Loaded submonitor engine-health > >
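Since the broker.log excerpt above is long and repetitive, a small script can make the pattern easier to see. This is a minimal Python sketch, assuming the entry layout inferred from the lines quoted in this thread (thread, level, timestamp, module, line number, logger, function, message). The names `ENTRY`, `parse_entry` and `count_messages` are hypothetical helpers, not part of ovirt-hosted-engine-ha.

```python
import re
from collections import Counter

# Assumed entry layout, inferred only from the broker.log lines quoted in
# this thread (not from the ovirt-hosted-engine-ha sources):
#   thread::LEVEL::date time,ms::module::line::logger::(function) message
ENTRY = re.compile(
    r"^(?P<thread>.+?)::(?P<level>[A-Z]+)::"
    r"(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<line>\d+)::"
    r"(?P<logger>[\w.]+)::\((?P<func>\w+)\) (?P<message>.*)$"
)


def parse_entry(text):
    """Split one log entry into named fields, or return None if it does not fit."""
    match = ENTRY.match(text.strip())
    return match.groupdict() if match else None


def count_messages(lines):
    """Tally (level, message) pairs so repeated entries stand out."""
    counts = Counter()
    for text in lines:
        entry = parse_entry(text)
        if entry is not None:
            counts[(entry["level"], entry["message"])] += 1
    return counts
```

Run over the un-wrapped log file (one entry per line), `count_messages` shows how often each message appears. Filtering the parsed entries on `level` would narrow the output to WARNING and ERROR entries.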
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
Did you try 'virsh list --inactive'?

Eric Evans
Digital Data Services LLC.
304.660.9080

From: Shareef Jalloq
Sent: Wednesday, April 8, 2020 5:58 PM
To: Strahil Nikolov
Cc: Ovirt Users
Subject: [ovirt-users] Re: ovirt-engine unresponsive - how to rescue?

I've now shut down the VMs on one host and rebooted it but the agent service doesn't start. If I run 'hosted-engine --vm-status' I get:

The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.

and indeed if I list the mounts under /rhev/data-center/mnt, only one of the directories is mounted. I have 3 NFS mounts, one ISO Domain and two Data Domains. Only one Data Domain has mounted and this has lots of .prob files in. So why haven't the other NFS exports been mounted?

Manually mounting them doesn't seem to have helped much either. I can start the broker service but the agent service says no. Same error as the one in my last email.

Shareef.

On Wed, Apr 8, 2020 at 9:57 PM Shareef Jalloq <shar...@jalloq.co.uk> wrote:

Right, still down. I've run virsh and it doesn't know anything about the engine vm.

I've restarted the broker and agent services and I still get nothing in virsh->list.
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
I've now shut down the VMs on one host and rebooted it but the agent service doesn't start. If I run 'hosted-engine --vm-status' I get:

The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.

and indeed if I list the mounts under /rhev/data-center/mnt, only one of the directories is mounted. I have 3 NFS mounts, one ISO Domain and two Data Domains. Only one Data Domain has mounted and this has lots of .prob files in it. So why haven't the other NFS exports been mounted?

Manually mounting them doesn't seem to have helped much either. I can start the broker service but the agent service says no. Same error as the one in my last email.

Shareef.

On Wed, Apr 8, 2020 at 9:57 PM Shareef Jalloq wrote:

> Right, still down. I've run virsh and it doesn't know anything about the
> engine vm.
>
> I've restarted the broker and agent services and I still get nothing in
> virsh->list.
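To see at a glance which of the storage directories under /rhev/data-center/mnt actually have something mounted on them, a small helper along these lines may be useful. It is only a sketch: the function name check_mounts is made up for illustration, and it assumes the util-linux 'mountpoint' tool is installed on the host.

```shell
# Sketch: report, for each directory directly under a mount root,
# whether a filesystem is currently mounted on it.
# check_mounts is a hypothetical helper name, not part of oVirt.
check_mounts() {
    root=${1:-/rhev/data-center/mnt}
    for dir in "$root"/*/; do
        dir=${dir%/}                 # drop the trailing slash from the glob
        [ -d "$dir" ] || continue    # skip the literal pattern if nothing matched
        if mountpoint -q "$dir"; then
            echo "mounted:     $dir"
        else
            echo "NOT mounted: $dir"
        fi
    done
}

# Usage on a host:
#   check_mounts
#   check_mounts /rhev/data-center/mnt
```

A directory reported as NOT mounted only means nothing is mounted there right now; it does not say why the NFS export failed to mount.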
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
Right, still down. I've run virsh and it doesn't know anything about the engine vm.

I've restarted the broker and agent services and I still get nothing in virsh->list.

In the logs under /var/log/ovirt-hosted-engine-ha I see lots of errors:

broker.log:

MainThread::INFO::2020-04-08 20:56:20,138::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.6 started
MainThread::INFO::2020-04-08 20:56:20,138::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
MainThread::INFO::2020-04-08 20:56:20,138::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
MainThread::INFO::2020-04-08 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2020-04-08 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2020-04-08 20:56:20,143::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
MainThread::INFO::2020-04-08 20:56:20,197::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting the storage
MainThread::INFO::2020-04-08 20:56:20,197::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2020-04-08 20:56:20,414::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2020-04-08 20:56:20,628::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
MainThread::WARNING::2020-04-08 20:56:21,057::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Command StorageDomain.getInfo with args {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
(code=350, message=Error in storage domain action: (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
MainThread::INFO::2020-04-08 20:56:21,901::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.6 started
MainThread::INFO::2020-04-08 20:56:21,901::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors

agent.log:

MainThread::ERROR::2020-04-08 20:57:00,799::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
MainThread::INFO::2020-04-08 20:57:00,799::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
MainThread::INFO::2020-04-08 20:57:11,144::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 2.3.6 started
MainThread::INFO::2020-04-08 20:57:11,182::hosted_engine::234::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Foun
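Most of the broker.log excerpt above is INFO noise; the interesting entries are the WARNING and ERROR ones. A minimal sketch for pulling just those out, assuming the log keeps the "Thread::LEVEL::timestamp::..." layout shown in this message (the function name ha_log_problems is made up for illustration):

```shell
# Sketch: print only WARNING and ERROR entries from an
# ovirt-hosted-engine-ha log. Records are "::"-separated and the
# second field is the log level, as in the excerpts in this thread.
# ha_log_problems is a hypothetical helper name.
ha_log_problems() {
    awk -F'::' '$2 == "WARNING" || $2 == "ERROR"' "$@"
}

# Usage on a host (paths as given in the message above):
#   ha_log_problems /var/log/ovirt-hosted-engine-ha/broker.log
#   ha_log_problems /var/log/ovirt-hosted-engine-ha/agent.log
```

Continuation lines that belong to a multi-line message (for example a traceback, or the "(code=350, ...)" line) do not carry a level field, so this filter drops them; open the log around the reported timestamp to read the full entry.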
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
If you haven’t got this resolved, log into the host and use ‘saslpasswd’ without the quotes. Then virsh start and use the password you set on the local account. I’m not sure it will work, but has worked for regular vm’s.

Eric Evans
Digital Data Services LLC.
304.660.9080

From: Shareef Jalloq
Sent: Wednesday, April 8, 2020 11:51 AM
To: users@ovirt.org
Subject: [ovirt-users] ovirt-engine unresponsive - how to rescue?

So my engine has gone down and I can't ssh into it either. If I try to log into the web-ui of the node it is running on, I get redirected because the node can't reach the engine.

What are my next steps?

Shareef.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/R2IBM5HIKK73AHQK3M3YWQB427PTFLVJ/
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
If you haven’t got this resolved, log into the host and use ‘saslpasswd’ without the quotes. Then virsh start and use the password you set on the local account. I’m not sure it will work, but has worked for regular vm’s.

Eric Evans
Digital Data Services LLC.
304.660.9080

From: Maton, Brett
Sent: Wednesday, April 8, 2020 12:09 PM
To: Shareef Jalloq
Cc: Ovirt Users
Subject: [ovirt-users] Re: ovirt-engine unresponsive - how to rescue?

First steps, on one of your hosts as root:

To get information:
hosted-engine --vm-status

To start the engine:
hosted-engine --vm-start

On Wed, 8 Apr 2020 at 17:00, Shareef Jalloq <shar...@jalloq.co.uk> wrote:

> So my engine has gone down and I can't ssh into it either. If I try to log into the web-ui of the node it is running on, I get redirected because the node can't reach the engine.
>
> What are my next steps?
>
> Shareef.
> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/W7BP57OCIRSW5CDRQWR5MIKJUH3ISLCQ/

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/B4GWVFBHALIZLI7MGXP5ZQ63PS327CB2/
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
On April 8, 2020 7:47:20 PM GMT+03:00, "Maton, Brett" wrote:
>On the host you tried to restart the engine on:
>
>Add an alias to virsh (authenticates with virsh_auth.conf)
>
>alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'
>
>Then run virsh:
>
>virsh
>
>virsh # list
>
>HostedEngine should be in the list, try and resume the engine:
>
>virsh # resume HostedEngine

This has to be resolved:

Engine status : unknown stale-data

Run 'hosted-engine --vm-status' again. If it remains the same, restart ovirt-ha-broker.service & ovirt-ha-agent.service.

Verify that the engine's storage is available. Then monitor the broker & agent logs in /var/log/ovirt-hosted-engine-ha.

Best Regards,
Strahil Nikolov

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/NAOURFNIYPHOXX2T4I7JCWF4EXUQYLX6/
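The restart-and-watch sequence suggested above could be wrapped up roughly as follows. This is a sketch, not an official procedure: the service names and log directory come from this thread, while the function names and the DRY_RUN switch are invented here so the commands can be previewed before touching a production host.

```shell
# Sketch: restart the hosted-engine HA services, then follow their logs.
# Set DRY_RUN=1 to print the commands instead of running them.
# run, restart_ha_services and follow_ha_logs are hypothetical helper names.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restart_ha_services() {
    # Broker first, then agent; stop if the broker restart fails.
    run systemctl restart ovirt-ha-broker.service &&
    run systemctl restart ovirt-ha-agent.service
}

follow_ha_logs() {
    log_dir=/var/log/ovirt-hosted-engine-ha
    # -F keeps following across log rotation; interrupt with Ctrl-C.
    run tail -F "$log_dir/broker.log" "$log_dir/agent.log"
}

# Usage on a host, as root:
#   restart_ha_services && follow_ha_logs
```

Running 'hosted-engine --vm-status' again after a minute or two shows whether the "unknown stale-data" state has cleared.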
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
On the host you tried to restart the engine on:

Add an alias to virsh (authenticates with virsh_auth.conf)

alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'

Then run virsh:

virsh

virsh # list
 Id    Name            State
 xx    HostedEngine    Paused
 xx    **              running
 ...
 xx    **              running

HostedEngine should be in the list, try and resume the engine:

virsh # resume HostedEngine

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/6FABGPYS5WWFMW3ZT2DL6VLRP2Z54PHJ/
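An alias only takes effect in an interactive shell, so for scripts a small function doing the same job as the alias above may be handier. A sketch, assuming the same virsh_auth.conf path as in the message; the name he_virsh and the VIRSH override variable are invented for illustration (the override just makes it possible to preview the command line):

```shell
# Sketch: run virsh against the local libvirt using the hosted-engine
# auth file, equivalent to the alias in the message above.
# he_virsh and the VIRSH variable are hypothetical, not part of oVirt.
he_virsh() {
    # Quoting the URI keeps the shell from treating "?" as a glob character.
    "${VIRSH:-virsh}" -c 'qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf' "$@"
}

# Usage on a host, as root:
#   he_virsh list --all            # running and inactive domains
#   he_virsh resume HostedEngine   # only if HostedEngine shows as paused
```

If HostedEngine does not appear even with 'list --all', libvirt on that host has no definition for the engine VM, and resuming is not an option there.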
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
Thanks!

The status hangs due to, I guess, the VM being down

[root@ovirt-node-01 ~]# hosted-engine --vm-start
VM exists and is down, cleaning up and restarting
VM in WaitForLaunch

but this doesn't seem to do anything. OK, after a while I get a status of it being barfed...

--== Host ovirt-node-00.phoelex.com (id: 1) status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : ovirt-node-00.phoelex.com
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 9c4a034b
local_conf_timestamp               : 523362
Host timestamp                     : 523608
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=523608 (Wed Apr 8 16:17:11 2020)
        host-id=1
        score=3400
        vm_conf_refresh_time=523362 (Wed Apr 8 16:13:06 2020)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineDown
        stopped=False

--== Host ovirt-node-01.phoelex.com (id: 2) status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : ovirt-node-01.phoelex.com
Host ID                            : 2
Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down_unexpected", "detail": "Down"}
Score                              : 0
stopped                            : False
Local maintenance                  : False
crc32                              : 5045f2eb
local_conf_timestamp               : 1737037
Host timestamp                     : 1737283
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=1737283 (Wed Apr 8 16:16:17 2020)
        host-id=2
        score=0
        vm_conf_refresh_time=1737037 (Wed Apr 8 16:12:11 2020)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUnexpectedlyDown
        stopped=False

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/IZJ3GFUD2JI4U2KZGMSEOFZLGALDVU3Z/
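The full status dump is long; when comparing hosts, the hostname, score and engine status are usually what matter. A rough sketch that condenses the output to one line per host. It assumes the "key : value" layout and field order (Hostname, then Engine status, then Score) seen in the output above; he_summary is a made-up name.

```shell
# Sketch: condense `hosted-engine --vm-status` output to one
# tab-separated line per host. Reads stdin or the files given.
# he_summary is a hypothetical helper name.
he_summary() {
    awk '
        # Strip everything up to and including the first colon.
        function value(line) { sub(/^[^:]*:[ \t]*/, "", line); return line }
        /^Hostname[ \t]*:/      { host = value($0) }
        /^Engine status[ \t]*:/ { engine = value($0) }
        /^Score[ \t]*:/         { print host "\tscore=" value($0) "\tengine=" engine }
    ' "$@"
}

# Usage on a host, as root:
#   hosted-engine --vm-status | he_summary
```

A host reporting "unknown stale-data" has not updated its metadata on the shared storage recently, so its score and state should not be trusted until that clears.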
[ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
First steps, on one of your hosts as root:

To get information:
hosted-engine --vm-status

To start the engine:
hosted-engine --vm-start

On Wed, 8 Apr 2020 at 17:00, Shareef Jalloq wrote:

> So my engine has gone down and I can't ssh into it either. If I try to
> log into the web-ui of the node it is running on, I get redirected because
> the node can't reach the engine.
>
> What are my next steps?
>
> Shareef.

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/SESPMFFDWZFFBXETP35HNT5SFNU2Z6HK/