On Mon, Jan 6, 2020 at 6:01 PM Bob Franzke <bob.fran...@mdaemon.com> wrote:
>
> I just had some VMs go offline over the weekend. I really cannot figure out 
> how to tell why without the engine working.

If you suspect that they failed, as opposed to being shut down from
inside them (by an admin or whatever), then you can check vdsm logs on
the host they ran on, and might find a clue.
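
As a rough sketch only: assuming the default vdsm log directory
/var/log/vdsm (older logs there are normally rotated and xz-compressed)
and that you know the VM's name or ID, something like the following
would pull out the warnings and errors that mention it. The function
name find_vm_events is made up for this example.

```shell
#!/bin/sh
# Print WARN/ERROR lines from the vdsm logs that mention a given VM.
# Usage: find_vm_events <vm-name-or-uuid> [log-dir]
# find_vm_events is a hypothetical helper, not part of vdsm.
find_vm_events() {
    vm="$1"
    log_dir="${2:-/var/log/vdsm}"   # assumed default log location on the host

    for f in "$log_dir"/vdsm.log*; do
        [ -e "$f" ] || continue     # skip if the glob matched nothing
        case "$f" in
            *.xz) xz -dc "$f" ;;    # rotated, compressed logs
            *)    cat "$f" ;;       # current, plain-text log
        esac
    done | grep -F -- "$vm" | grep -E 'ERROR|WARN'
}

# Example (run on the host the VM was running on):
#   find_vm_events my-vm-name
```

The libvirt/qemu log for the VM on the same host (normally
/var/log/libvirt/qemu/<vm-name>.log) may also be worth a look.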

> I don’t really need to 'control' the VMs, but it seems that without the engine 
> it's not just the control aspect. It’s the visibility it gives you into the 
> state of your environment. We also used oVirt as a lab setup for users to 
> access and build VMs as needed. This is completely offline now without a 
> working engine. 
> Seems like having an engine available all the time would be pretty important 
> generally.
>
> I have never understood the idea of having the machine that controls VMs 
> being in the same infrastructure it's controlling. Seems very 'chicken or the 
> egg' sort of thing to me. If the engine decides to move itself from one host 
> to another, and it fails for some reason because the process of moving itself 
> caused a problem (stopping services, etc.), then I'm not sure what you would 
> end up with there. Seems very iffy to me, but maybe I am reading too much into it. 
> Again I admittedly don’t know enough about Ovirt to know if this thinking is 
> off base or not. My own experience with networking systems means you would 
> never set things up like this. Each system is autonomous and can take over 
> for the other if one part fails. But then again, if Ovirt Engine had been set 
> up this way, maybe I wouldn't be in the position I am now with no working 
> engine. Lots to sort out. Thanks for the help.

Each host participating in the hosted-engine cluster has two small
daemons, called agent and broker, in the package
ovirt-hosted-engine-ha, that should take care of the engine VM.

You are right that this is a chicken-and-egg problem, and this is the
solution that oVirt includes.
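
If you do move to hosted-engine at some point, a quick sanity check on
each host might look like the sketch below. It assumes the systemd
units are named ovirt-ha-agent and ovirt-ha-broker (the usual names,
but please verify on your version); check_ha_services is just a name I
picked for the example.

```shell
#!/bin/sh
# Report whether the two hosted-engine HA daemons are active on this host.
# Returns 0 if both are active, 1 otherwise.
# check_ha_services is a hypothetical helper, not shipped with oVirt.
check_ha_services() {
    rc=0
    for svc in ovirt-ha-agent ovirt-ha-broker; do
        if systemctl is-active --quiet "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
            rc=1
        fi
    done
    return "$rc"
}

# On a hosted-engine host you could then run, for example:
#   check_ha_services && hosted-engine --vm-status
```

'hosted-engine --vm-status' then shows how each host currently sees
the engine VM.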

>
> -----Original Message-----
> From: Yedidyah Bar David (d...@redhat.com) <d...@redhat.com>
> Sent: Monday, January 6, 2020 8:26 AM
> To: Bob Franzke <bob.fran...@mdaemon.com>
> Cc: users <users@ovirt.org>
> Subject: [ovirt-users] Re: OVirt Engine Server Died - Steps for Rebuilding 
> the Ovirt Engine System
>
> On Mon, Jan 6, 2020 at 4:19 PM Bob Franzke <bob.fran...@mdaemon.com> wrote:
> >
> > So I am getting the impression that without a working ovirt engine, you are 
> > sort of cooked from being able to control VMs such that your whole 
> > organization can potentially come down to the availability of a single 
> > machine? Is this really correct?
>
> Correct.
>
> This does not mean that the engine itself is necessarily critical - if it's 
> down, your VMs should still be ok. If _controlling_ VMs is considered 
> critical for you, then yes - you do need to make sure your engine is alive 
> and well.
>
> > Are there HA options available for the engine server itself?
>
> The standard option is using hosted-engine with several hosts - you get HA 
> out-of-the-box.
>
> I also heard about people using standalone active/standby clustering/HA 
> solutions for the engine.
>
> >
> > -----Original Message-----
> > From: Yedidyah Bar David (d...@redhat.com) <d...@redhat.com>
> > Sent: Monday, January 6, 2020 12:57 AM
> > To: Bob Franzke <bob.fran...@mdaemon.com>
> > Cc: users <users@ovirt.org>
> > Subject: [ovirt-users] Re: OVirt Engine Server Died - Steps for
> > Rebuilding the Ovirt Engine System
> >
> > On Mon, Jan 6, 2020 at 12:00 AM Bob Franzke <bob.fran...@mdaemon.com> wrote:
> > >
> > > Thanks for the reply here. Still waiting on a server to rebuild this 
> > > with. Should be here tomorrow. The engine was running on bare metal 
> > > server, and was not a VM.
> > >
> > > > In the meantime we had a few of the VMs go dark for some reason. I 
> > > discovered the vdsm-client commands and tried figuring out what happened. 
> > > Is there any way I can start a VM via command line on one of the VM 
> > > hosts? Is the vdsm-client command the way to do this without a working 
> > > engine?
> >
> > It is, in principle, but that's not supported and is risky - because the 
> > engine will not know what you do.
> >
> > See also e.g.:
> >
> > > https://www.ovirt.org/develop/release-management/features/integration/cockpit.html
> >
> > >
> > > -----Original Message-----
> > > From: Yedidyah Bar David (d...@redhat.com) <d...@redhat.com>
> > > Sent: Tuesday, December 24, 2019 1:50 AM
> > > To: Bob Franzke <bob.fran...@mdaemon.com>
> > > Cc: users <users@ovirt.org>
> > > Subject: [ovirt-users] Re: OVirt Engine Server Died - Steps for
> > > Rebuilding the Ovirt Engine System
> > >
> > > On Mon, Dec 23, 2019 at 7:08 PM Bob Franzke <bob.fran...@mdaemon.com> 
> > > wrote:
> > > >
> > > > > Which nightly backups? Do they run engine-backup?
> > > >
> > > > Yes sorry. The backups are the backups created when running the 
> > > > engine-backup script. So I have the files and the DB backed up and off 
> > > > onto different storage. I just grabbed a copy of the entire /etc 
> > > > directory as well just in case there was something needed in there that 
> > > > is not included in the engine-backup solution.
> > > >
> > > > > > In either case, assuming this is a production env, I suggest first 
> > > > > > testing on a separate env to see how it all looks.
> > > >
> > > > This is a production environment. My plan is to get a new server 
> > > > ordered and built, removing the old server from the equation (old 
> > > > server is old and needs to be replaced anyway). Then rebuild the Ovirt 
> > > > bits and restore the data from my backups.
> > >
> > > I assume, from your first post, that you refer to the host running the 
> > > engine, and that this is a standalone engine, not hosted-engine.
> > > Right? Meaning, it's running on bare-metal, not inside a VM managed by 
> > > itself.
> > >
> > > For testing you can try stuff on an isolated VM somewhere, no need to 
> > > wait for your new server to arrive.
> > >
> > > >
> > > > > I mostly just needed a quick list of setup steps to take here. From 
> > > > > what I gather I need to basically:
> > > >
> > > > 1. reinstall CentOS
> > > > > 2. Reconfigure storage (this server has several iSCSI LUNs it's attached 
> > > > > to currently. I don’t know if they are required for this or what).
> > >
> > > > I obviously have no idea what your storage design and requirements are, 
> > > > but this is largely a local matter, unrelated to the hosts that run VMs. 
> > > The engine machine's storage is (normally) not used for that, only for 
> > > the engine itself (and its db, etc.).
> > >
> > > > > 3. Install PostGreSQL (maybe? Or does the ovirt engine script do
> > > > > this for you?)
> > > > > 3. Install Ovirt/run ovirt-engine script maybe?
> > >
> > > > Add the relevant repo by installing the relevant ovirt-release* package 
> > > > (see the web site), and then 'yum install ovirt-engine' - this should 
> > > > pull in postgresql etc. for you.
> > >
> > > > 4. Restore DB and data
> > >
> > > Yes. Run basically 'engine-backup --mode=restore' and then 
> > > 'engine-setup'. Please check the backup/restore documentation on the web 
> > > site.
> > > If your current engine used only defaults (meaning, engine+dwh+their DBs 
> > > all on the engine machine, provisioned by engine-setup), then the restore 
> > > command should be something like:
> > >
> > > > engine-backup --mode=restore --file=your-backup-file --provision-all-databases
> > >
> > > > Again, please test on a test VM somewhere, and make sure it's isolated - 
> > > > that it can't reach your hosts and start to manage them (unless that's 
> > > > what you want, of course).
> > >
> > > >
> > > > > I am not sure of the details of the list outlined above (what to run 
> > > > > where, etc.). I am looking for consultants to help me out here as it's 
> > > > > clear I am a bit behind the curve on this one. So far not much has 
> > > > worked out on that front. Does the above list seem reasonable in terms 
> > > > of needed steps to get this going again?
> > >
> > > See above.
> > >
> > > For consultants, you might want to check:
> > >
> > > > https://www.ovirt.org/community/user-stories/users-and-providers.html
> > >
> > > And/or post again to the list with a subject line that's more likely to 
> > > attract them ("Looking for an oVirt consultant...").
> > >
> > > Good luck and best regards,
> > >
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Yedidyah Bar David (d...@redhat.com) <d...@redhat.com>
> > > > Sent: Sunday, December 22, 2019 1:58 AM
> > > > To: bob.fran...@mdaemon.com
> > > > Cc: users <users@ovirt.org>
> > > > Subject: [ovirt-users] Re: OVirt Engine Server Died - Steps for
> > > > Rebuilding the Ovirt Engine System
> > > >
> > > > On Fri, Dec 20, 2019 at 8:55 PM <bob.fran...@mdaemon.com> wrote:
> > > > >
> > > > > Full disclosure here.....I am not an Ovirt Expert. I am a network 
> > > > > Engineer that has been forced to take over sysadmin duties for a 
> > > > > departed co-worker. I have little experience with Ovirt so apologies 
> > > > > up front for anything I say that comes across as stupid or "RTM" 
> > > > > questions. Normally I would do just that but I am in a bind and am 
> > > > > trying to figure this out quickly. We have an OVirt installation 
> > > > > setup that consists of 4 nodes and a server that hosts the 
> > > > > ovirt-engine all running CentOS 7. The server that hosts the engine 
> > > > > has a pair of failing hard drives and I need to replace the hardware 
> > > > > ASAP. Need to outline the steps needed to build a new server to serve 
> > > > > as and replace the ovirt engine server. I have backed up the entire 
> > > > > /etc directory and the backups being done nightly by the engine 
> > > > > itself.
> > > >
> > > > Which nightly backups? Do they run engine-backup?
> > > >
> > > > > > I also backed up the iscsi info and took a printout of the whole disk 
> > > > > > arrangement. The disk has gotten so bad at this point that the DB 
> > > > > > won't back up any longer. I get a fatal:backup failed error when 
> > > > > > trying to run the ovirt backup tool. Also the Ovirt management site 
> > > > > > is not rendering and I am not sure why.
> > > > >
> > > > > Is there anything else I need to make sure I backup in order to 
> > > > > migrate the engine from one server to another?
> > > >
> > > > Generally speaking, if you used engine-backup for backups, it should be 
> > > > enough - it backs up all it needs from /etc.
> > > >
> > > > If you didn't use that, /etc won't be enough. You also need a database 
> > > > backup.
> > > >
> > > > > If you do not have a backup of the database, you'll need to create a 
> > > > > new engine from scratch. You can then import the existing storage 
> > > > > domains and add the hosts. This will require downtime, and you'll lose 
> > > > > some stuff, so if you do have an engine-backup backup, better use that.
> > > >
> > > > > In either case, assuming this is a production env, I suggest first 
> > > > > testing on a separate env to see how it all looks.
> > > >
> > > > > Also, until I can get the engine running again, is there any tool 
> > > > > available to manage the VMs on the hosts themselves. The VMs on the 
> > > > > hosts are running but need a way to manage them if needed in case 
> > > > > something happens while the engine is being repaired.
> > > >
> > > > Some management is possible via cockpit. It's much less than what the 
> > > > engine allows.
> > > >
> > > > If you search the list archives, you can find suggestions by people to 
> > > > directly use libvirt/virsh after poking a bit inside your storage 
> > > > domain. I'd not recommend doing that, unless you know very well what 
> > > > you are doing and have no other solution (e.g. if storage is corrupted 
> > > > enough so that import to a new engine fails).
> > > >
> > > > > Any info on this as well as what to backup and the steps to move the 
> > > > > engine from one server to another would be much much appreciated.
> > > >
> > > > You can search the site for backup, restore, and import storage domain, 
> > > > and should find the relevant pages. Please note that the pages under 
> > > > /develop are written during development and are usually not updated 
> > > > after a feature is complete. The official documentation is under 
> > > > /documentation. That, in turn, is often outdated as well :-(.
> > > > You can use RHV docs in addition. These are more up-to-date and should 
> > > > be 99% applicable to oVirt.
> > > >
> > > > > > Sorry I know this is a real RTM type post but I am in a bind and need a 
> > > > > solution rather quickly. Thanks in advance.
> > > >
> > > > Good luck!
> > > > --
> > > > Didi
> > > > _______________________________________________
> > > > > Users mailing list -- users@ovirt.org
> > > > > To unsubscribe send an email to users-le...@ovirt.org
> > > > > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> > > > > oVirt Code of Conduct:
> > > > > https://www.ovirt.org/community/about/community-guidelines/
> > > > > List Archives:
> > > > > https://lists.ovirt.org/archives/list/users@ovirt.org/message/FU4AIR7SCTQOQRWLPLPUH5XHDXYI4DD7/
> > > >
> > >
> > >
> > > --
> > > Didi
> > >
> >
> >
> > --
> > Didi
> >
>
>
> --
> Didi
>


-- 
Didi
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TLRTWH25E7HYSLVO5JACNBROPBJUHSFZ/
