Re: [ovirt-users] moving storage away from a single point of failure

2015-09-24 Thread Nicolas Ecarnot

On 25/09/2015 01:57, Donny Davis wrote:

Gluster is pretty stable, you shouldn't have any issues. It works best
when there are more than 2 or 3 nodes though.


Hi,

At one site, I have an oVirt setup made of 3 nodes acting as 
compute+storage based on gluster, plus another standalone engine.
The replica-3 setup seems to do a good job, even when I stress-tested it 
brutally.
But as I have the opportunity to add more storage nodes, I'm interested 
in the comment above.


When adding another node, what is the effect on the duration of a 
rebuild when recovering from a crash? What is the effect on performance?


Regards,

--
Nicolas ECARNOT
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Add external provider (Failed with error PROVIDER_FAILURE and code 5050) foreman 1.9

2015-09-24 Thread Nathanaël Blanchet

Hello,
I have a working foreman 1.9.1 installed with katello 2.3.
As required, ruby193-rubygem-ovirt_provision_plugin-1.0.1-1.el7 is also 
installed on the same host.
But the issue is the same as below when testing in "add external 
provider" from ovirt 3.5.4.

Is it a known bug?


Re: [ovirt-users] HA - Fencing not working when host with engine gets shutdown

2015-09-24 Thread Michael Hölzl
Thanks for the help! I will definitely stay tuned for updates on this
matter.

Michael

On 09/24/2015 03:13 PM, Martin Perina wrote:
> I created a bug covering this:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1266099
>
> - Original Message -
>> From: "Martin Sivak" 
>> To: "Michael Hölzl" 
>> Cc: "Martin Perina" , users@ovirt.org
>> Sent: Thursday, September 24, 2015 2:59:52 PM
>> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine 
>> gets shutdown
>>
>> Hi Michael,
>>
>> Martin summed the situation neatly, I would just add that this issue
>> is not limited to the size of your setup. The same would happen to HA
>> VMs running on the same host as the hosted engine even if the cluster
>> had 50 hosts...
>>
>> About the recommended way of engine deployment: It really is about
>> whether you can tolerate your engine to be down for a longer time
>> (starting another host using a backup db).
>>
>> Hosted engine restores your management in an automated way and without
>> any data loss. However I agree that the fact that you have to tend to
>> your HA VMs manually after an engine restart is not nice. Fortunately
>> that should only happen when your host (or vdsm) dies and does not
>> come up for an extended period of time.
>>
>> The summary would be: there will be no HA handling if the host
>> running the engine is down, independently of whether the deployment is
>> hosted engine or standalone engine. If the issue is related to the
>> software only then there is no real difference.
>>
>> - When a host with the standalone engine dies, the VMs are fine, but
>> if anything happens while the engine is down (and reinstalling a
>> standalone engine takes time + you need a very fresh db backup) you
>> might again face issues with HA VMs being down or not starting when
>> the engine comes up.
>>
>> - When a hosted engine dies because of a host failure, some VMs
>> generally disappear with it. The engine will come up automatically and
>> HA VMs from the original hosts have to be manually pushed to work.
>> This requires some manual action, but I see it as less demanding than
>> the first case.
>>
>> - When a hosted engine VM is stopped properly by the tooling it will
>> be restarted elsewhere and it will be able to connect to the original
>> host just fine. The engine will then make sure that all HA VMs are up
>> even if the VMs died while the engine was down.
>>
>> So I would recommend hosted engine based deployment. And ask for a bit
>> of patience as we have a plan how to mitigate the second case to some
>> extent without compromising the fencing storm prevention.
>>
>> Best regards
>>
>> --
>> Martin Sivak
>> msi...@redhat.com
>> SLA RHEV-M
>>
>>
>> On Thu, Sep 24, 2015 at 2:31 PM, Michael Hölzl  wrote:
>>> Ok, thanks!
>>>
>>> So, I would still like to know if you would recommend not to use hosted
>>> engines but rather another machine for the engine?
>>>
>>> On 09/24/2015 01:24 PM, Martin Perina wrote:
>>>> - Original Message -
>>>>> From: "Michael Hölzl" 
>>>>> To: "Martin Perina" , "Eli Mesika"
>>>>> Cc: "Doron Fediuck" , users@ovirt.org
>>>>> Sent: Thursday, September 24, 2015 12:35:13 PM
>>>>> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine
>>>>> gets shutdown
>>>>>
>>>>> Hi,
>>>>>
>>>>> thanks for the detailed answer! In principle, I understand the issue
>>>>> now. However, I cannot fully follow the argument that this is a corner
>>>>> case. In a smaller or medium sized company, I would assume that such a
>>>>> setup, consisting of two machines with a hosted engine, is not uncommon.
>>>>> Especially as there is documentation online which describes how to
>>>>> deploy this setup. Does that mean that hosted engines are in general not
>>>>> recommended?
>>>>>
>>>>> I am also wondering why the fencing could not be triggered by the hosted
>>>>> engine after the DisableFenceAtStartupInSec timeout? In the events log
>>>>> of the engine I keep on getting the message "Host hosted_engine_2 is not
>>>>> responding. It will stay in Connecting state for a grace period of 120
>>>>> seconds and after that an attempt to fence the host will be issued.",
>>>>> which would indicate that the engine is actually trying to fence the non
>>>>> responsive host.
>>>> Unfortunately this message is a bit misleading: it is shown every time
>>>> we start handling a network exception for the host, and it is fired before
>>>> the logic which decides whether to start or skip the fencing process (this
>>>> misleading message is fixed in 3.6). But in the current logic we really
>>>> execute fencing only when the host status is about to change from Connecting
>>>> to NonResponsive, and this happens only the first time, when we are still in
>>>> the DisableFenceAtStartupInSec interval. During all other attempts the host
>>>> is already in status NonResponsive, so fencing is skipped.
>>>>
>>>>> On 09/24/2015 11:50 AM, Martin Perina wrote:
>>>>>> - Original Message -
>>>>>>> From: "Eli M
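
The fencing behaviour Martin Perina describes in this thread can be modelled as a tiny state machine. The following is only an illustrative Python sketch, not engine code: the `Host` class, the `on_network_exception` function and the status constants are hypothetical names, and the 300-second default is a placeholder value. Only `DisableFenceAtStartupInSec` and the two status names come from the thread.

```python
from dataclasses import dataclass

CONNECTING = "Connecting"
NON_RESPONSIVE = "NonResponsive"


@dataclass
class Host:
    name: str
    status: str = CONNECTING


def on_network_exception(host, seconds_since_engine_start,
                         disable_fence_at_startup_sec=300):
    """Return True if fencing would be executed for this attempt.

    Hypothetical model of the logic described in the thread.
    """
    # Fencing is only considered on the Connecting -> NonResponsive transition.
    if host.status != CONNECTING:
        return False  # host is already NonResponsive, so fencing is skipped
    host.status = NON_RESPONSIVE
    # While the engine is still inside the startup interval, fencing is disabled.
    return seconds_since_engine_start >= disable_fence_at_startup_sec
```

In this model, a host that becomes NonResponsive while the engine is still inside the startup interval is never fenced by later attempts, which matches the symptom reported above.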

Re: [ovirt-users] Permission Issues

2015-09-24 Thread Budur Nagaraju
I understand that, but I am now facing a different issue: without the
"attach disk profile" permission selected, the user cannot create an
instance.

The problem is that with this permission selected, users can also add
additional disk space to the deployed instance, which I do not want. Is
there any way to resolve this?

On Fri, Sep 25, 2015 at 11:13 AM, Oved Ourfali  wrote:

> You should put the user role on the relevant VM. Permissions in ovirt are
> a combination of user, role and object.
>
> If you put the UserRole on the cluster, the user will see all VMs in the
> cluster. If on a VM, he will only see this VM, if on the DC he will see all
> VMs in this DC, and if on the entire system then he will see all VMs in the
> system.
>
> Hope I helped,
> Oved
> On Sep 24, 2015 9:07 AM, "Budur Nagaraju"  wrote:
>
>> HI
>>
>> I have created a user with the "user role permissions". When logged in, the
>> user is able to view all the VMs. By default this should not happen. Is there
>> any solution to resolve this?
>>
>> Thanks,
>> Nagaraju
>>
>>


Re: [ovirt-users] Permission Issues

2015-09-24 Thread Oved Ourfali
You should put the user role on the relevant VM. Permissions in ovirt are a
combination of user, role and object.

If you put the UserRole on the cluster, the user will see all VMs in the
cluster. If on a VM, he will only see this VM, if on the DC he will see all
VMs in this DC, and if on the entire system then he will see all VMs in the
system.
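
The scoping rule above (a permission is a user, a role and an object, and a grant on a container covers the VMs inside it) can be illustrated with a short Python sketch. This is a simplified model, not oVirt code: the object names, the `PARENT` table and the `visible_vms` function are all made up for the illustration.

```python
# Hypothetical object hierarchy: each object names its parent (None = root).
PARENT = {
    "system": None,
    "dc1": "system",
    "cluster1": "dc1",
    "cluster2": "dc1",
    "vm-a": "cluster1",
    "vm-b": "cluster1",
    "vm-c": "cluster2",
}
VMS = ["vm-a", "vm-b", "vm-c"]


def visible_vms(permissions, user, role="UserRole"):
    """Return the VMs a user can see, given (user, role, object) triples."""
    granted = {obj for (u, r, obj) in permissions if u == user and r == role}
    result = []
    for vm in VMS:
        node = vm
        # Walk up from the VM: a grant on the VM itself or on any
        # ancestor (cluster, data center, system) makes it visible.
        while node is not None:
            if node in granted:
                result.append(vm)
                break
            node = PARENT[node]
    return result
```

With this model, `visible_vms([("alice", "UserRole", "vm-a")], "alice")` yields only `vm-a`, while the same role granted on `cluster1` yields both VMs in that cluster.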

Hope I helped,
Oved
On Sep 24, 2015 9:07 AM, "Budur Nagaraju"  wrote:

> HI
>
> I have created a user with the "user role permissions". When logged in, the
> user is able to view all the VMs. By default this should not happen. Is there
> any solution to resolve this?
>
> Thanks,
> Nagaraju
>
>


Re: [ovirt-users] VNC console behind NAT

2015-09-24 Thread Alan Murrell


On 24/09/15 12:47 AM, Michal Skrivanek wrote:
> 
> btw, what's the reason for VNC? lack of SPICE client on a particular platform?

One of my guests is running Zentyal.  For some reason the console does
not display very well when using SPICE but with VNC it is fine.

Zentyal is built on top of Ubuntu.  Oddly, if I install a guest VM with
just the same version of Ubuntu that Zentyal is built on, SPICE works
fine with that.

Not sure what to make of that.

Regards,

Alan



Re: [ovirt-users] Permission Issues

2015-09-24 Thread Budur Nagaraju
HI

I have created a new role and applied the permissions below. When I create a
new instance from the chosen template, I get the error "user is not
authorized to perform this action".


Under System: enabled the Login Permissions checkbox
Under VM: Basic Operations and Remote Login
Provisioning Operations: Create and Create Instance
Disk Profile: Attach Disk Profile

Thanks,
Nagaraju




On Fri, Sep 25, 2015 at 5:47 AM, Donny Davis  wrote:

> Sure is.
>
> Where are you setting the permissions at?
>
> Read this thread
>
> http://lists.ovirt.org/pipermail/users/2015-January/030981.html
> HI
>
> I have created a user with the "user role permissions". When logged in, the
> user is able to view all the VMs. By default this should not happen. Is there
> any solution to resolve this?
>
> Thanks,
> Nagaraju
>
>


Re: [ovirt-users] Older hardware deployment

2015-09-24 Thread Chas Ecomm
It gives the “Host CPU type is not compatible with Cluster Properties” message, 
and it doesn’t matter what CPU compatibility level I set it to (I’ve tried all 
of them) it gives the same error.  I’ve seen a few others with this kind of 
issue but not a solid description of how to modify the database to include 
processor families prior to Conroe.

 

From: Donny Davis [mailto:do...@cloudspin.me] 
Sent: Thursday, September 24, 2015 7:19 PM
To: Chas Ecomm 
Cc: users 
Subject: Re: [ovirt-users] Older hardware deployment

 

What errors are you getting?

On Sep 22, 2015 5:47 PM, "Chas Ecomm" <chash...@speakfree.net> wrote:

I’ve searched on this issue all over the place and not found a full answer.

 

I’ve got a lab with older Dell PE860 servers with Pentium D processors.  We are 
attempting to prove out that we can move from VMware to KVM and oVirt, but with 
3.5 we can’t get these servers to enter into a cluster due to the age of the 
processors.

 

I’ve seen it referenced in a few list entries to modify ‘the database’ to put 
CPUInfo entries in for the processor capabilities of this processor, but it is 
unclear to me and I’ve been unable to find any documentation that speaks to 
what database and how to modify it safely.

 

I know we could update these servers and get better power consumption, heat 
generation, etc., but it’s not in the cards and I can’t move my newer servers 
until we prove it out with these older ones, which ran ESXi 5.5 with absolutely 
no issue.  This cluster will be a functional test and small tools cluster 
ultimately, so I don’t really care about their lower end specs, either.

 

Would installing something older than 3.5 help me here?  Or is there some doc 
somewhere I’ve just not found that describes this process of updating the DB 
with these CPU parameters.

 

TIA!!




[ovirt-users] vm cannot be started

2015-09-24 Thread qinglong.d...@horebdata.cn
Hi all,
I have installed ovirt-hosted-engine-setup on one machine, and it uses 
iSCSI shared storage on another machine. I created and sealed a Windows XP 
template and then created a VM based on the template. When the VM was started 
for the first time, I attached the floppy to it and used sysprep to initialize 
it. Then I shut down the VM, and now I cannot start it again. Here are the logs:
VM test is down with error. Exit message: internal error Process exited 
while reading console log output: char device redirected to /dev/pts/1
2015-09-25T01:29:31.222028Z qemu-kvm: -drive 
file=/var/run/vdsm/payload/6be72664-a80c-4d78-9a5b-f2bbe37c5b2e.4ebf24c33f6111e0dae20466f370de53.img,if=none,id=drive-fdc0-0-0,format=raw,serial=:
 could not open disk image 
/var/run/vdsm/payload/6be72664-a80c-4d78-9a5b-f2bbe37c5b2e.4ebf24c33f6111e0dae20466f370de53.img:
 Permission denied.
Can anyone help? Thanks!


Dolny


Re: [ovirt-users] Older hardware deployment

2015-09-24 Thread Donny Davis
What errors are you getting?
On Sep 22, 2015 5:47 PM, "Chas Ecomm"  wrote:

> I’ve searched on this issue all over the place and not found a full answer.
>
>
>
> I’ve got a lab with older Dell PE860 servers with Pentium D processors.
> We are attempting to prove out that we can move from VMware to KVM and
> oVirt, but with 3.5 we can’t get these servers to enter into a cluster due
> to the age of the processors.
>
>
>
> I’ve seen it referenced in a few list entries to modify ‘the database’ to
> put CPUInfo entries in for the processor capabilities of this processor,
> but it is unclear to me and I’ve been unable to find any documentation that
> speaks to what database and how to modify it safely.
>
>
>
> I know we could update these servers and get better power consumption,
> heat generation, etc., but it’s not in the cards and I can’t move my newer
> servers until we prove it out with these older ones, which ran ESXi 5.5
> with absolutely no issue.  This cluster will be a functional test and small
> tools cluster ultimately, so I don’t really care about their lower end
> specs, either.
>
>
>
> Would installing something older than 3.5 help me here?  Or is there some
> doc somewhere I’ve just not found that describes this process of updating
> the DB with these CPU parameters.
>
>
>
> TIA!!
>


Re: [ovirt-users] Permission Issues

2015-09-24 Thread Donny Davis
Sure is.

Where are you setting the permissions at?

Read this thread

http://lists.ovirt.org/pipermail/users/2015-January/030981.html
HI

I have created a user with the "user role permissions". When logged in, the
user is able to view all the VMs. By default this should not happen. Is there
any solution to resolve this?

Thanks,
Nagaraju




Re: [ovirt-users] Systemd-Script to put the node in "maintenance" on shutdown

2015-09-24 Thread Donny Davis
I was just trying to do the same thing, I was going to use power
management, and the api.

I asked if this feature already exists in the cluster policy.

With the cluster policy you can control how the vms get distributed, and
then I was going to write a script on the engine that would use the api to
put the host in maintenance, then use the api to tell power management to
turn off the host... When load gets above 75%, the script would power a
server on, and then take it out of maintenance. The cluster policy would
take care of the rest.
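
The decision logic of such a script could look roughly like the Python sketch below. It only plans the steps and leaves the actual API calls out. The function name, the action labels, the `low_load` threshold and the `min_active` floor are assumptions made for the illustration; only the 75% figure and the order of the steps come from the description above.

```python
def plan_actions(cluster_load_percent, active_hosts, powered_off_hosts,
                 high_load=75, low_load=30, min_active=1):
    """Return a list of (action, host) steps for one pass of the script."""
    if cluster_load_percent > high_load and powered_off_hosts:
        host = powered_off_hosts[0]
        # Power the host on first, then take it out of maintenance.
        return [("power_on", host), ("activate", host)]
    if cluster_load_percent < low_load and len(active_hosts) > min_active:
        host = active_hosts[-1]
        # Maintenance migrates the VMs away; only then power the host off.
        return [("maintenance", host), ("power_off", host)]
    return []
```

Keeping the planning separate from the API calls makes the thresholds easy to test before the script is allowed to power anything off.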

Anyone else doing something like this?

-Donny D
On Sep 24, 2015 2:18 AM, "Nir Soffer"  wrote:

> On Fri, Sep 18, 2015 at 4:26 PM, Luca Bertoncello <
> l.bertonce...@queo-group.com> wrote:
>
>> Hi again,
>>
>> I'm trying to write a systemd-script (for CentOS7) in order to
>> automatically put the host in "maintenance" on shutdown and to activate it
>> after boot.
>> I wrote a Python script that does that, and it works: I can start it
>> and see the host go into "maintenance" with all VMs migrated.
>>
>> Unfortunately I can't call this script on shutdown/reboot and wait until
>> all VMs are migrated and the host is in maintenance.
>>
>
> I don't think this will work, since you must put the host into
> maintenance, and wait until all vms were migrated before you reboot the
> host.
>
> All this can be done only by controlling engine, not from the host that is
> going to shutdown.
>
> If you want to trigger this from the host itself, I would write an
> ovirt-shutdown tool, ask engine to put the host into maintenance, wait
> until all vms migrate, and then invoke the real shutdown command.
>
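
The sequence suggested above (ask the engine for maintenance, wait until the VMs are gone, then run the real shutdown) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `request_maintenance`, `running_vm_count` and `real_shutdown` are hypothetical hooks that the caller would implement against the engine API and the local shutdown command; none of them is an existing oVirt function.

```python
import time


def shutdown_via_engine(host, request_maintenance, running_vm_count,
                        real_shutdown, poll_seconds=5.0,
                        timeout_seconds=600.0, sleep=time.sleep):
    """Put `host` into maintenance, wait for its VMs to leave, then shut down.

    The three callables are hypothetical hooks supplied by the caller.
    Returns True if the shutdown command was invoked, False on timeout.
    """
    request_maintenance(host)
    waited = 0.0
    while running_vm_count(host) > 0:
        if waited >= timeout_seconds:
            return False  # never shut down a host that still runs VMs
        sleep(poll_seconds)
        waited += poll_seconds
    real_shutdown(host)
    return True
```

The timeout keeps a buggy tool from blocking forever, and returning False instead of shutting down leaves the decision to the operator.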
> I guess it would be more useful to run this tool not on the host you want
> to reboot but on another machine.
>
> It may be possible to somehow get systemd to use this tool instead of the
> shutdown command, but I don't think it is a good idea. This will make your
> host fail to shutdown if your tool is buggy.
>
> Maybe ask on systemd mailing list about this.
>
>
>>
>> Here my script:
>>
>> [Unit]
>> Description=oVirt interface for managing host
>> After=remote-fs.target vdsmd.service multipathd.service libvirtd.service
>> time-sync.target iscsid.service rpcbind.service supervdsmd.service
>> sanlock.service vdsm-network.service
>> Wants=remote-fs.target vdsmd.service multipathd.service libvirtd.service
>> time-sync.target iscsid.service rpcbind.service supervdsmd.service
>> sanlock.service vdsm-network.service
>>
>> [Service]
>> Type=simple
>> RemainAfterExit=yes
>> ExecStart=/usr/local/bin/ovirt-maintenance.sh active
>> ExecStop=/usr/local/bin/ovirt-maintenance.sh maintenance
>> KillMode=none
>>
>> [Install]
>> WantedBy=multi-user.target
>>
>> Could someone help me and say what I'm doing wrong?
>>
>> Thanks a lot
>>
>> Kind regards
>>
>> Luca Bertoncello
>>
>> --
>> Visit our websites:
>>
>> www.queo.biz  Agency for brand management and communication
>> www.queoflow.com  IT consulting and custom software development
>>
>> Luca Bertoncello
>> Administrator
>> Phone: +49 351 21 30 38 0
>> Fax: +49 351 21 30 38 99
>> E-Mail: l.bertonce...@queo-group.com
>>
>> queo GmbH
>> Tharandter Str. 13
>> 01159 Dresden
>> Registered office: Dresden
>> Commercial register entry: Amtsgericht Dresden HRB 22352
>> Managing directors: Rüdiger Henke, André Pinkert
>> VAT ID no.: DE234220077


[ovirt-users] Power Management

2015-09-24 Thread Donny Davis
Is there any way to physically turn servers off through the cluster policy?

I.e., I have ten nodes and want to migrate VMs off underutilized servers and
turn those servers off.

Thanks

Donny D


Re: [ovirt-users] moving storage away from a single point of failure

2015-09-24 Thread Donny Davis
Gluster is pretty stable, you shouldn't have any issues. It works best when
there are more than 2 or 3 nodes though.

What hardware do you have?
On Sep 24, 2015 3:44 PM, "Michael Kleinpaste" <
michael.kleinpa...@sharperlending.com> wrote:

> I thought I had read where Gluster had corrected this behavior.  That's
> disappointing.
>
> On Tue, Sep 22, 2015 at 4:18 AM Alastair Neil 
> wrote:
>
>> My own experience with gluster for VMs is that it is just fine until you
>> need to bring down a node and need the VM's to be live.  I have a replica 3
>> gluster server and, while the VMs are fine while the node is down, when it
>> is brought back up, gluster attempts to heal the files on the downed node
>> and the ensuing i/o freezes the VM's until the heal is complete, and with
>> many VM's on a storage volume that can take hours.  I have migrated all my
>> critical VMs back onto NFS.   There are changes coming soon in gluster that
>> will hopefully mitigate this (better granularity in the data heals, i/o
>> throttling during heals etc.)  but for now I am keeping most of my VMs on
>> nfs.
>>
>> The alternative is to set the quorum so that the VM volume goes read only
>> when a node goes down.  This may seem mad, but at least your VMs are frozen
>> only while a node is down and not for hours afterwards.
>>
>>
>>
>> On 22 September 2015 at 05:32, Daniel Helgenberger <
>> daniel.helgenber...@m-box.de> wrote:
>>
>>>
>>>
>>> On 18.09.2015 23:04, Robert Story wrote:
>>> > Hi,
>>>
>>> Hello Robert,
>>>
>>> >
>>> > I'm running oVirt 3.5 in our lab, and currently I'm using NFS to a single
>>> > server. I'd like to move away from having a single point of failure.
>>>
>>> In this case have a look at iSCSI or FC storage. If you have redundant
>>> controllers and switches, the setup should be reliable enough?
>>>
>>> > Watching the mailing list, all the issues with gluster getting out of sync
>>> > and replica issues have me nervous about gluster, plus I just have 2
>>> > machines with lots of drive bays for storage.
>>>
>>> Still, I would stick to gluster if you want a replicated storage:
>>>  - It is supported out of the box and you get active support from lots
>>>    of users here
>>>  - Replica3 will solve most out of sync cases
>>>  - I dare say other replicated storage backends do suffer from the same
>>>    issues, this is by design.
>>>
>>> Two things you should keep in mind when running gluster in production:
>>>  - Do not run compute and storage on the same hosts
>>>  - Do not (yet) use Gluster as storage for Hosted Engine
>>>
>>> > I've been reading about GFS2
>>> > and DRBD, and wanted opinions on if either is a good/bad idea, or to see if
>>> > there are other alternatives.
>>> >
>>> > My oVirt setup is currently 5 nodes and about 25 VMs, might double in size
>>> > eventually, but probably won't get much bigger than that.
>>>
>>> In the end, it is quite easy to migrate storage domains. If you are
>>> satisfied with your lab setup, put it in production and add storage later
>>> and move the disks. Afterwards, remove old storage domains.
>>>
>>> My two cents with gluster: it has been running quite stably for some time
>>> now if you do not touch it. I never had issues when adding bricks, though
>>> removing and replacing them can be very tricky.
>>>
>>> HTH,
>>>
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > Robert
>>> >
>>>
>>> --
>>> Daniel Helgenberger
>>> m box bewegtbild GmbH
>>>
>>> P: +49/30/2408781-22
>>> F: +49/30/2408781-10
>>>
>>> ACKERSTR. 19
>>> D-10115 BERLIN
>>>
>>>
>>> www.m-box.de  www.monkeymen.tv
>>>
>>> Managing directors: Martin Retschitzegger / Michaela Göllner
>>> Commercial register: Amtsgericht Charlottenburg / HRB 112767
> --
> *Michael Kleinpaste*
> Senior Systems Administrator
> SharperLending, LLC.
> www.SharperLending.com
> michael.kleinpa...@sharperlending.com
> (509) 324-1230   Fax: (509) 324-1234
>


Re: [ovirt-users] Virtual appliance import question/problem

2015-09-24 Thread Donny Davis
You could go old school and create an empty VM with the right number of
disks, set up the way you want it, then run dd if=/your/disks/one/at/a/time
of=/the/empty/disks/created/by/ovirt

This happens on the storage server
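
For readers who would rather script the copy than type dd once per disk, the same block-by-block overwrite can be sketched in Python. This is only an illustration: `copy_image` is a made-up helper, the paths passed to it would be your own, and it assumes the target disk already exists and is at least as large as the source.

```python
def copy_image(src_path, dst_path, block_size=1024 * 1024):
    """Copy src over an existing dst block by block.

    Roughly what `dd if=src of=dst bs=1M conv=notrunc` does.
    Returns the number of bytes copied.
    """
    copied = 0
    # "r+b" overwrites the existing target in place without truncating it.
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            copied += len(block)
    return copied
```

As with dd, this copies bytes verbatim, so the source image format has to be the one the target disk is expected to hold.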
On Sep 24, 2015 5:58 PM, "Maurice James"  wrote:

> It seems that oVirt is not recognizing the qcow2 images. I must convert them
> to raw, which will not work for me because they are too large when expanded
>
>
> --
> *From: *"Nir Soffer" 
> *To: *"Maurice James" 
> *Cc: *"Shahar Havivi" , "users" 
> *Sent: *Wednesday, September 23, 2015 2:40:36 PM
> *Subject: *Re: [ovirt-users] Virtual appliance import question/problem
>
> On Wed, Sep 23, 2015 at 9:26 PM, Maurice James 
> wrote:
>
>>
>> What about the problem with the IDE drives?
>>
>
> I think ide is limited to 4 drives, not sure why you see a limit of 3
> drives.
>
> Vdsm logs showing the errors you get would be very helpful to understand
> this.
>
> Please open an ovirt bug for this, and attach vdsm logs.
>
> But if you can use ide drives (hdX?), why not use virtio (vdX)? You can
> have 16 of these.
>
>
>> "The second problem is that these disks are scsi and it does not seem to
>> work using the virtio-scsi selection. I tried selecting the IDE option, but
>> there is a limit to the number of IDE disks that I can use."
>>
>>
>>
>> --
>> *From: *"Nir Soffer" 
>> *To: *"Maurice James" 
>> *Cc: *"Shahar Havivi" , "users" 
>> *Sent: *Wednesday, September 23, 2015 1:16:49 PM
>>
>> *Subject: *Re: [ovirt-users] Virtual appliance import question/problem
>>
>> On Wed, Sep 23, 2015 at 2:36 PM, Maurice James 
>> wrote:
>>
>>>
>>> To convert the images I used:
>>> qemu-img convert 250.qcow2 -O raw 250.img -p
>>>
>>
>> Sure this will expand the file to the full size, but why do you need raw
>> image? ovirt works with qcow images.
>>
>>
>>> oVirt will not allow me to have more than 3 IDE devices on a VM
>>>
>>
>> What do you mean by "it does not seem to work using the virtio-scsi
>> selection."?
>>
>>
>>> Doesn't 3.6 only work on RHEL/CentOS 7?
>>>
>>
>> And Fedora 21/22.
>>
>>
>>>
>>>
>>>
>>> --
>>> *From: *"Nir Soffer" 
>>> *To: *"Maurice James" , "Shahar Havivi" <
>>> shav...@redhat.com>
>>> *Cc: *"users" 
>>> *Sent: *Tuesday, September 22, 2015 7:37:58 PM
>>> *Subject: *Re: [ovirt-users] Virtual appliance import question/problem
>>>
>>>
>>> On Mon, Sep 21, 2015 at 7:57 PM, Maurice James 
>>> wrote:
>>>
>>>> I have a virtual mail security appliance that I am trying to import
>>>> into oVirt 3.5.4. The appliance was built for kvm. It has a total of 5 scsi
>>>> disks. I can convert and copy the OS disk only because it expands itself
>>>> to full size.
>>>>
>>>> The first problem that I have is that the disks expand to their full
>>>> size when I convert them to an oVirt format
>>>>
>>>
>>> How do you convert to ovirt format?
>>>
>>>
>>>>
>>>> OS Disk
>>>> mail.qcow2 (74M) converts to main.img (294M)
>>>>
>>>> Storage disks
>>>> 250.qcow2 (256K) converts to 250.img  (250GB)
>>>> 1024.qcow2 (256K) converts to 1024.img (1TB)
>>>> 2048.qcow2 (256K) converts to 2048.img (2TB)
>>>> 4096.qcow2 (256K) converts to 4096.img (4TB)
>>>> 8192.qcow2 (256K) converts to 8192.img (8TB)
>>>>
>>>> The second problem is that these disks are scsi and it does not seem to
>>>> work using the virtio-scsi selection. I tried selecting the IDE option, but
>>>> there is a limit to the number of IDE disks that I can use.
>>>>
>>>
>>> Can you provide more details about "does not seem to work"?
>>>
>>>
>>>>
>>>> Virtualbox has no issues running the appliance that was distributed in
>>>> the ova format. Any help would be appreciated
>>>>
>>>
>>> ovirt-3.6 beta supports import from ova format; maybe you'd like to try it?
>>>
>>> Nir
>>>


Re: [ovirt-users] RHEL 6 support

2015-09-24 Thread matthew lagoe
I would imagine so, that said you could always build the package yourself.

 

From: users-boun...@ovirt.org [mailto:users-boun...@ovirt.org] On Behalf Of 
Maurice James
Sent: Thursday, September 24, 2015 3:54 PM
To: users 
Subject: [ovirt-users] RHEL 6 support

 

Is this going to be permanent? RHEL 6 is not EOL until 2020 

 


RHEL 6.7 - CentOS 6.7 and similar 


*   Upgrade of All-in-One on EL6 is not supported in 3.6. VDSM and the 
packages requiring it are not built anymore for EL6 



[ovirt-users] RHEL 6 support

2015-09-24 Thread Maurice James
Is this going to be permanent? RHEL 6 is not EOL until 2020 

RHEL 6.7 - CentOS 6.7 and similar 


* Upgrade of All-in-One on EL6 is not supported in 3.6. VDSM and the 
packages requiring it are not built anymore for EL6 


Re: [ovirt-users] Virtual appliance import question/problem

2015-09-24 Thread Maurice James
It seems that oVirt is not recognizing the qcow2 images. I must convert them to 
raw, which will not work for me because they are too large when expanded 


- Original Message -

From: "Nir Soffer"  
To: "Maurice James"  
Cc: "Shahar Havivi" , "users"  
Sent: Wednesday, September 23, 2015 2:40:36 PM 
Subject: Re: [ovirt-users] Virtual appliance import question/problem 

On Wed, Sep 23, 2015 at 9:26 PM, Maurice James < mja...@media-node.com > wrote: 




What about the problem with the IDE drives? 




I think ide is limited to 4 drives, not sure why you see a limit of 3 drives. 

Vdsm logs showing the errors you get would be very helpful to understand this. 

Please open an ovirt bug for this, and attach vdsm logs. 

But if you can use ide drives (hdX?), why not use virtio (vdX)? You can have 16 
of these. 





"The second problem is that these disks are scsi and it does not seem to work 
using the virtio-scsi selection. I tried selecting the IDE option, but there is 
a limit to the number of IDE disks that I can use." 




From: "Nir Soffer" < nsof...@redhat.com > 
To: "Maurice James" < mja...@media-node.com > 
Cc: "Shahar Havivi" < shav...@redhat.com >, "users" < users@ovirt.org > 
Sent: Wednesday, September 23, 2015 1:16:49 PM 

Subject: Re: [ovirt-users] Virtual appliance import question/problem 

On Wed, Sep 23, 2015 at 2:36 PM, Maurice James < mja...@media-node.com > wrote: 




To convert the images I used: 
qemu-img convert 250.qcow2 -O raw 250.img -p 




Sure this will expand the file to the full size, but why do you need raw image? 
ovirt works with qcow images. 



oVirt will not allow me to have more than 3 IDE devices on a VM 




What do you mean by " it does not seem to work using the virtio-scsi 
selection."? 



Doesn't 3.6 only work on RHEL/CentOS 7? 




And Fedora 21/22. 








From: "Nir Soffer" < nsof...@redhat.com > 
To: "Maurice James" < mja...@media-node.com >, "Shahar Havivi" < 
shav...@redhat.com > 
Cc: "users" < users@ovirt.org > 
Sent: Tuesday, September 22, 2015 7:37:58 PM 
Subject: Re: [ovirt-users] Virtual appliance import question/problem 


On Mon, Sep 21, 2015 at 7:57 PM, Maurice James < mja...@media-node.com > wrote: 



I have a virtual mail security appliance that I am trying to import into oVirt 
3.5.4. The appliance was built for kvm. It has a total of 5 scsi disks. I can 
convert and copy the OS disk only because it expands itself to full size. 

The first problem that I have is that the disks expand to their full size when 
I convert them to an oVirt format 




How do you convert to ovirt format? 





OS Disk 
mail.qcow2 (74M) converts to main.img (294M) 


Storage disks 
250.qcow2 (256K) converts to 250.img (250GB) 
1024.qcow2 (256K) converts to 1024.img (1TB) 
2048.qcow2 (256K) converts to 2048.img (2TB) 
4096.qcow2 (256K) converts to 4096.img (4TB) 
8192.qcow2 (256K) converts to 8192.img (8TB) 



The second problem is that these disks are scsi and it does not seem to work 
using the virtio-scsi selection. I tried selecting the IDE option, but there is 
a limit to the number of IDE disks that I can use. 




Can you provide more details about "does not seem to work"? 





Virtualbox has no issues running the appliance that was distributed in the ova 
format. Any help would be appreciated 




ovirt-3.6 beta supports import from OVA format; maybe you would like to try it? 

Nir 








Re: [ovirt-users] moving storage away from a single point of failure

2015-09-24 Thread Michael Kleinpaste
I thought I had read that Gluster had corrected this behavior.  That's
disappointing.

On Tue, Sep 22, 2015 at 4:18 AM Alastair Neil  wrote:

> My own experience with gluster for VMs is that it is just fine until you
> need to bring down a node and need the VMs to be live.  I have a replica 3
> gluster server and, while the VMs are fine while the node is down, when it
> is brought back up, gluster attempts to heal the files on the downed node
> and the ensuing I/O freezes the VMs until the heal is complete, and with
> many VMs on a storage volume that can take hours.  I have migrated all my
> critical VMs back onto NFS.  There are changes coming soon in gluster that
> will hopefully mitigate this (better granularity in the data heals, I/O
> throttling during heals etc.)  but for now I am keeping most of my VMs on
> NFS.
>
> The alternative is to set the quorum so that the VM volume goes read only
> when a node goes down.  This may seem mad, but at least your VMs are frozen
> only while a node is down and not for hours afterwards.
>
>
>
> On 22 September 2015 at 05:32, Daniel Helgenberger <
> daniel.helgenber...@m-box.de> wrote:
>
>>
>>
>> On 18.09.2015 23:04, Robert Story wrote:
>> > Hi,
>>
>> Hello Robert,
>>
>> >
>> > I'm running oVirt 3.5 in our lab, and currently I'm using NFS to a
>> single
>> > server. I'd like to move away from having a single point of failure.
>>
>> In this case have a look at iSCSI or FC storage. If you have redundant
>> controllers and switches
>> the setup should be reliable enough?
>>
>> > Watching the mailing list, all the issues with gluster getting out of
>> sync
>> > and replica issues has me nervous about gluster, plus I just have 2
>> > machines with lots of drive bays for storage.
>>
>> Still, I would stick to gluster if you want a replicated storage:
>>  - It is supported out of the box and you get active support from lots of
>> users here
>>  - Replica3 will solve most out of sync cases
>>  - I dare say other replicated storage backends do suffer from the same
>> issues, this is by design.
>>
>> Two things you should keep in mind when running gluster in production:
>>  - Do not run compute and storage on the same hosts
>>  - Do not (yet) use Gluster as storage for Hosted Engine
>>
>> > I've been reading about GFS2
>> > and DRBD, and wanted opinions on if either is a good/bad idea, or to
>> see if
>> > there are other alternatives.
>> >
>> > My oVirt setup is currently 5 nodes and about 25 VMs, might double in
>> size
>> > eventually, but probably won't get much bigger than that.
>>
>> In the end, it is quite easy to migrate storage domains. If you are
>> satisfied with your lab
>> setup, put it in production and add storage later and move the disks.
>> Afterwards, remove old
>> storage domains.
>>
>> My two cents on gluster: it has been running quite stably for some time
>> now, as long as you do not touch it.
>> I never had issues when adding bricks, though removing and replacing them
>> can be very tricky.
>>
>> HTH,
>>
>> >
>> >
>> > Thanks,
>> >
>> > Robert
>> >
>>
>> --
>> Daniel Helgenberger
>> m box bewegtbild GmbH
>>
>> P: +49/30/2408781-22
>> F: +49/30/2408781-10
>>
>> ACKERSTR. 19
>> D-10115 BERLIN
>>
>>
>> www.m-box.de  www.monkeymen.tv
>>
>> Geschäftsführer: Martin Retschitzegger / Michaela Göllner
>> Handeslregister: Amtsgericht Charlottenburg / HRB 112767
-- 
*Michael Kleinpaste*
Senior Systems Administrator
SharperLending, LLC.
www.SharperLending.com
michael.kleinpa...@sharperlending.com
(509) 324-1230   Fax: (509) 324-1234


Re: [ovirt-users] engine-image-uploader error

2015-09-24 Thread Maurice James
I'm going to run engine-setup again because I'm getting the same error when 
trying to access the shell 



- Original Message -

From: "Simone Tiraboschi"  
To: "Maurice James"  
Cc: "users"  
Sent: Thursday, September 24, 2015 12:59:06 PM 
Subject: Re: [ovirt-users] engine-image-uploader error 



On Thu, Sep 24, 2015 at 5:55 PM, Maurice James < mja...@media-node.com > wrote: 





I'm trying to upload an image using the engine-image-uploader and I'm receiving 
the following error on the CLI 

[root@saturn home]# engine-image-uploader -e SaturnExport upload vm.ova 
ERROR: Unable to connect to REST API at https://xxx.xxx.net:443/api 
Reason: Internal Server Error 


engine.log is also attached 
SELinux is disabled 
Firewall is disabled 

When I try to access that link in a browser I get the following 

HTTP Status 500 - 



Did it try to authenticate you? 
Can you also please attach server.log? 






type Exception report 

message 

description The server encountered an internal error () that prevented it from 
fulfilling this request. 

exception 
javax.servlet.ServletException: Servlet.init() for servlet 
org.ovirt.engine.api.restapi.BackendApplication threw exception

org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:489)

org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:153)

org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
org.jboss.web.rewrite.RewriteValve.invoke(RewriteValve.java:466)

org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368)
org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:505)

org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:445)
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930)
java.lang.Thread.run(Thread.java:745) 


root cause 
java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate 
MessageBodyWriter

org.jboss.resteasy.plugins.providers.RegisterBuiltin.register(RegisterBuiltin.java:35)

org.jboss.resteasy.spi.ResteasyDeployment.start(ResteasyDeployment.java:211)

org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.init(ServletContainerDispatcher.java:67)

org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.init(HttpServletDispatcher.java:36)

org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:489)

org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:153)

org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
org.jboss.web.rewrite.RewriteValve.invoke(RewriteValve.java:466)

org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368)
org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:505)

org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:445)
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930)
java.lang.Thread.run(Thread.java:745) 


root cause 
java.lang.RuntimeException: Unable to instantiate MessageBodyWriter

org.jboss.resteasy.spi.ResteasyProviderFactory.registerProvider(ResteasyProviderFactory.java:772)

org.jboss.resteasy.plugins.providers.RegisterBuiltin.registerProviders(RegisterBuiltin.java:70)

org.jboss.resteasy.plugins.providers.RegisterBuiltin.register(RegisterBuiltin.java:31)

org.jboss.resteasy.spi.ResteasyDeployment.start(ResteasyDeployment.java:211)

org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.init(ServletContainerDispatcher.java:67)

org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.init(HttpServletDispatcher.java:36)

org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:489)

org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:153)

org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
org.jboss.web.rewrite.RewriteValve.invoke(RewriteValve.java:466)

org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368)
org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:505)

org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:445)
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930)
java.lang.Thread.run(Thread.java:745) 


root cause 
java.lang.RuntimeException: Failed to construct public 
org.ovirt.engine.api.pdf.FOPMessageBodyWriter()

org.jboss.resteasy.core.ConstructorInjectorImpl.construct(ConstructorInjectorImpl.java:144)

org.jboss.resteasy.spi.ResteasyProviderFactory.getProviderInstance(ResteasyProviderFactory

Re: [ovirt-users] ovirt 3.6.0 Sixth Beta Release

2015-09-24 Thread Simone Tiraboschi
On Thu, Sep 24, 2015 at 5:17 PM, Rudi Schmitz  wrote:

>
> I have a simple setup. The machine is a node. I am going to deploy from iSCSI
> on the network but don't get that far.  I installed the 3.6 sixth beta release
> node ISO on a machine. The IP address is set up. Then I try to deploy hosted
> engine over ssh. I input a local http URL of the CentOS 7 ISO and hit Deploy.
>
> The tui stops and I get:
>
> An error appeared in the UI: AttributeError("'TransactionProgressDialog'
> object has no attribute 'event'",)
> Press ENTER to logout ...
> or enter 's' to drop to shell
>

Could you please attach the logs?

You can also use the engine appliance, which greatly speeds up the
deployment process because it is ready to use; check here:
http://www.ovirt.org/Features/HEApplianceFlow

On a regular el7 host you can get it via
 yum install ovirt-engine-appliance
yum is disabled on node, but you can still download the OVA directly
from here:
http://jenkins.ovirt.org/job/ovirt-appliance-engine_3.6_build-artifacts-el7_merged/

The whole setup should take about 15 minutes on commodity hardware.


>
> The http iso file does exist and is working. Is the deploy hosted engine
> on a node working for others?
>


Re: [ovirt-users] engine-image-uploader error

2015-09-24 Thread Simone Tiraboschi
On Thu, Sep 24, 2015 at 5:55 PM, Maurice James 
wrote:

>
>
> I'm trying to upload an image using the engine-image-uploader and I'm
> receiving the following error on the CLI
>
> [root@saturn home]# engine-image-uploader -e SaturnExport upload vm.ova
> ERROR: Unable to connect to REST API at https://xxx.xxx.net:443/api
> Reason: Internal Server Error
>
>
> engine.log is also attached
> SELinux is disabled
> Firewall is disabled
>
> When I try to access that link in a browser I get the following
>
> HTTP Status 500 -
>
Did it try to authenticate you?
Can you also please attach server.log?


> --
>
> *type* Exception report
>
> *message*
>
> *description* The server encountered an internal error () that prevented
> it from fulfilling this request.
>
> *exception*
>
> javax.servlet.ServletException: Servlet.init() for servlet 
> org.ovirt.engine.api.restapi.BackendApplication threw exception
>   
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:489)
>   
> org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:153)
>   
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>   org.jboss.web.rewrite.RewriteValve.invoke(RewriteValve.java:466)
>   
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368)
>   org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:505)
>   
> org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:445)
>   org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930)
>   java.lang.Thread.run(Thread.java:745)
>
> *root cause*
>
> java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate 
> MessageBodyWriter
>   
> org.jboss.resteasy.plugins.providers.RegisterBuiltin.register(RegisterBuiltin.java:35)
>   
> org.jboss.resteasy.spi.ResteasyDeployment.start(ResteasyDeployment.java:211)
>   
> org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.init(ServletContainerDispatcher.java:67)
>   
> org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.init(HttpServletDispatcher.java:36)
>   
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:489)
>   
> org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:153)
>   
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>   org.jboss.web.rewrite.RewriteValve.invoke(RewriteValve.java:466)
>   
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368)
>   org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:505)
>   
> org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:445)
>   org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930)
>   java.lang.Thread.run(Thread.java:745)
>
> *root cause*
>
> java.lang.RuntimeException: Unable to instantiate MessageBodyWriter
>   
> org.jboss.resteasy.spi.ResteasyProviderFactory.registerProvider(ResteasyProviderFactory.java:772)
>   
> org.jboss.resteasy.plugins.providers.RegisterBuiltin.registerProviders(RegisterBuiltin.java:70)
>   
> org.jboss.resteasy.plugins.providers.RegisterBuiltin.register(RegisterBuiltin.java:31)
>   
> org.jboss.resteasy.spi.ResteasyDeployment.start(ResteasyDeployment.java:211)
>   
> org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.init(ServletContainerDispatcher.java:67)
>   
> org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.init(HttpServletDispatcher.java:36)
>   
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:489)
>   
> org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:153)
>   
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>   org.jboss.web.rewrite.RewriteValve.invoke(RewriteValve.java:466)
>   
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368)
>   org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:505)
>   
> org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:445)
>   org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930)
>   java.lang.Thread.run(Thread.java:745)
>
> *root cause*
>
> java.lang.RuntimeException: Failed to construct public 
> org.ovirt.engine.api.pdf.FOPMessageBodyWriter()
>   
> org.jboss.resteasy.core.ConstructorInjectorImpl.construct(ConstructorInjectorImpl.java:144)
>   
> org.jboss.resteasy.spi.ResteasyProviderFactory.getProviderInstance(ResteasyProviderFactory.java:1039)
>   
> org.jboss.resteasy.spi.ResteasyProviderFactory.addMessageBodyWriter(ResteasyProviderFactory.java:519)
>   
> org.jboss.resteasy.spi.ResteasyProviderFactory.registerPro

Re: [ovirt-users] vmware import hangs after click load button on 3.5 rc5

2015-09-24 Thread Nir Soffer
On Thu, Sep 24, 2015 at 6:26 PM, Nir Soffer  wrote:

> On Thu, Sep 24, 2015 at 5:24 PM, Ian Fraser  wrote:
>
>> Hi Nir,
>>
>>
>>
>> https://gerrit.ovirt.org/46634/ has improved the situation, even though the
>> “Import Virtual Machine(s)” window still hangs. I queried all 3 vCenter
>> hosts, then ran:
>>
>>
>>
>> grep -e v2v -e libvirtE -e ERROR /var/log/vdsm/vdsm.log
>>
>>
>>
>> I get the following much more helpful information:
>>
>>
>>
>> Thread-19193::ERROR::2015-09-24
>> 13:05:20,542::v2v::158::root::(get_external_vms) error getting domain xml
>> for vm 'devsquid': internal error: Invalid or not yet handled value
>> 'emptyBackingString' for VMX entry 'ide1:0.fileName' for device type
>> 'cdrom-image'
>>
>> Thread-19193::ERROR::2015-09-24
>> 13:05:23,975::v2v::158::root::(get_external_vms) error getting domain xml
>> for vm 'HD2012': internal error: Expecting VMX entry 'ethernet0.virtualDev'
>> to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000' but found 'e1000e'
>>
>> Thread-19193::ERROR::2015-09-24
>> 13:05:41,102::v2v::158::root::(get_external_vms) error getting domain xml
>> for vm 'QA11 2012 R2': internal error: Expecting VMX entry
>> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
>> but found 'e1000e'
>>
>> Thread-19193::ERROR::2015-09-24
>> 13:05:43,643::v2v::158::root::(get_external_vms) error getting domain xml
>> for vm 'HDWin8': internal error: Expecting VMX entry 'ethernet0.virtualDev'
>> to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000' but found 'e1000e'
>>
>> Thread-19193::ERROR::2015-09-24
>> 13:05:45,861::v2v::158::root::(get_external_vms) error getting domain xml
>> for vm 'HDwin10-2': internal error: Expecting VMX entry
>> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
>> but found 'e1000e'
>>
>> Thread-19193::ERROR::2015-09-24
>> 13:05:51,421::v2v::158::root::(get_external_vms) error getting domain xml
>> for vm 'HDWin10': internal error: Expecting VMX entry
>> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
>> but found 'e1000e'
>>
>> Thread-19193::ERROR::2015-09-24
>> 13:06:05,973::v2v::158::root::(get_external_vms) error getting domain xml
>> for vm 'wsus-ash-01': internal error: Expecting VMX entry
>> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
>> but found 'e1000e'
>>
>> Thread-19193::ERROR::2015-09-24
>> 13:06:06,841::v2v::158::root::(get_external_vms) error getting domain xml
>> for vm 'Hactar2': internal error: Expecting VMX entry 'numvcpus' to be an
>> unsigned integer (1 or a multiple of 2) but found 3
>>
>> Thread-19193::ERROR::2015-09-24
>> 13:06:21,972::v2v::158::root::(get_external_vms) error getting domain xml
>> for vm 'squid3.2': internal error: Invalid or not yet handled value
>> 'emptyBackingString' for VMX entry 'ide1:0.fileName' for device type
>> 'cdrom-image'
>>
>> Thread-19382::ERROR::2015-09-24
>> 13:06:45,322::v2v::158::root::(get_external_vms) error getting domain xml
>> for vm 'sequoia-demo-01': internal error: Expecting VMX entry
>> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
>> but found 'e1000e'
>>
>> Thread-19382::ERROR::2015-09-24
>> 13:06:47,515::v2v::158::root::(get_external_vms) error getting domain xml
>> for vm 'sales-demo-01': internal error: Expecting VMX entry
>> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
>> but found 'e1000e'
>>
>> Thread-19382::ERROR::2015-09-24
>> 13:06:50,535::v2v::158::root::(get_external_vms) error getting domain xml
>> for vm 'rdpgw-demo-01': internal error: Expecting VMX entry
>> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
>> but found 'e1000e'
>>
>> Thread-19382::ERROR::2015-09-24
>> 13:06:54,202::v2v::688::root::(_add_disk_info) Error getting disk size
>>
>>   File "/usr/share/vdsm/v2v.py", line 686, in _add_disk_info
>>
>> if ret is None: raise libvirtError ('virStorageVolGetInfo() failed',
>> vol=self)
>>
>> libvirtError: internal error: Could not search in datastore
>> 'MD3220i-LUN0': FileNotFound - File [MD3220i-LUN0] isostore was not found
>>
>
> Thanks for testing the patch. This info should be very useful to libvirt
> guys.
>
>
>> Looks like it is continuing with some errors but still getting tripped on
>> others. (vdsm.log attached)
>>
>
> I will look at the log later - the fact that the engine front end still
> hangs means there is another bug hiding in vdsm.
>

Looking in the log, we see:

1. Request started

Thread-19382::DEBUG::2015-09-24
13:06:41,079::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
'Host.getExternalVMs' in bridge with {u'username': u'ian_ad...@asm.org.uk',
u'password': '', u'uri':
u'vpx://ian_admin%40asm.org...@vcenter.asm.org.uk/Ashford_House/vh-ash-03.asm.org.uk?no_verify=1'}

2. (Lot of errors...)

3. Request ends

Thread-19382::DEBUG::2015-09-24
13:06:57,237::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return
'Host.getExternalVMs' in bridge with

Re: [ovirt-users] vmware import hangs after click load button on 3.5 rc5

2015-09-24 Thread Nir Soffer
On Thu, Sep 24, 2015 at 5:24 PM, Ian Fraser  wrote:

> Hi Nir,
>
>
>
> https://gerrit.ovirt.org/46634/ has improved the situation, even though the
> “Import Virtual Machine(s)” window still hangs. I queried all 3 vCenter
> hosts, then ran:
>
>
>
> grep -e v2v -e libvirtE -e ERROR /var/log/vdsm/vdsm.log
>
>
>
> I get the following much more helpful information:
>
>
>
> Thread-19193::ERROR::2015-09-24
> 13:05:20,542::v2v::158::root::(get_external_vms) error getting domain xml
> for vm 'devsquid': internal error: Invalid or not yet handled value
> 'emptyBackingString' for VMX entry 'ide1:0.fileName' for device type
> 'cdrom-image'
>
> Thread-19193::ERROR::2015-09-24
> 13:05:23,975::v2v::158::root::(get_external_vms) error getting domain xml
> for vm 'HD2012': internal error: Expecting VMX entry 'ethernet0.virtualDev'
> to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000' but found 'e1000e'
>
> Thread-19193::ERROR::2015-09-24
> 13:05:41,102::v2v::158::root::(get_external_vms) error getting domain xml
> for vm 'QA11 2012 R2': internal error: Expecting VMX entry
> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
> but found 'e1000e'
>
> Thread-19193::ERROR::2015-09-24
> 13:05:43,643::v2v::158::root::(get_external_vms) error getting domain xml
> for vm 'HDWin8': internal error: Expecting VMX entry 'ethernet0.virtualDev'
> to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000' but found 'e1000e'
>
> Thread-19193::ERROR::2015-09-24
> 13:05:45,861::v2v::158::root::(get_external_vms) error getting domain xml
> for vm 'HDwin10-2': internal error: Expecting VMX entry
> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
> but found 'e1000e'
>
> Thread-19193::ERROR::2015-09-24
> 13:05:51,421::v2v::158::root::(get_external_vms) error getting domain xml
> for vm 'HDWin10': internal error: Expecting VMX entry
> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
> but found 'e1000e'
>
> Thread-19193::ERROR::2015-09-24
> 13:06:05,973::v2v::158::root::(get_external_vms) error getting domain xml
> for vm 'wsus-ash-01': internal error: Expecting VMX entry
> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
> but found 'e1000e'
>
> Thread-19193::ERROR::2015-09-24
> 13:06:06,841::v2v::158::root::(get_external_vms) error getting domain xml
> for vm 'Hactar2': internal error: Expecting VMX entry 'numvcpus' to be an
> unsigned integer (1 or a multiple of 2) but found 3
>
> Thread-19193::ERROR::2015-09-24
> 13:06:21,972::v2v::158::root::(get_external_vms) error getting domain xml
> for vm 'squid3.2': internal error: Invalid or not yet handled value
> 'emptyBackingString' for VMX entry 'ide1:0.fileName' for device type
> 'cdrom-image'
>
> Thread-19382::ERROR::2015-09-24
> 13:06:45,322::v2v::158::root::(get_external_vms) error getting domain xml
> for vm 'sequoia-demo-01': internal error: Expecting VMX entry
> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
> but found 'e1000e'
>
> Thread-19382::ERROR::2015-09-24
> 13:06:47,515::v2v::158::root::(get_external_vms) error getting domain xml
> for vm 'sales-demo-01': internal error: Expecting VMX entry
> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
> but found 'e1000e'
>
> Thread-19382::ERROR::2015-09-24
> 13:06:50,535::v2v::158::root::(get_external_vms) error getting domain xml
> for vm 'rdpgw-demo-01': internal error: Expecting VMX entry
> 'ethernet0.virtualDev' to be 'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000'
> but found 'e1000e'
>
> Thread-19382::ERROR::2015-09-24
> 13:06:54,202::v2v::688::root::(_add_disk_info) Error getting disk size
>
>   File "/usr/share/vdsm/v2v.py", line 686, in _add_disk_info
>
> if ret is None: raise libvirtError ('virStorageVolGetInfo() failed',
> vol=self)
>
> libvirtError: internal error: Could not search in datastore
> 'MD3220i-LUN0': FileNotFound - File [MD3220i-LUN0] isostore was not found
>

Thanks for testing the patch. This info should be very useful to libvirt
guys.
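
Since most of the errors quoted above repeat the same few libvirt messages, it may help to group them by message before reporting them. A rough Python sketch, assuming each record sits on a single line in the real vdsm.log (the mail client wrapped them here); the function name group_failures and the sample list are only for illustration:

```python
import re
from collections import defaultdict

# Matches the get_external_vms error records quoted above.
VM_ERROR = re.compile(r"error getting domain xml for vm '([^']+)': (.*)")


def group_failures(lines):
    """Map each distinct error message to the list of VMs that hit it."""
    by_reason = defaultdict(list)
    for line in lines:
        match = VM_ERROR.search(line)
        if match:
            vm, reason = match.groups()
            by_reason[reason.strip()].append(vm)
    return dict(by_reason)


E1000E = ("internal error: Expecting VMX entry 'ethernet0.virtualDev' to be "
          "'vlance' or 'vmxnet' or 'vmxnet3' or 'e1000' but found 'e1000e'")
PREFIX = "Thread-19193::ERROR::2015-09-24 13:05:23,975::v2v::158::root::"

sample = [
    PREFIX + "(get_external_vms) error getting domain xml for vm 'HD2012': " + E1000E,
    PREFIX + "(get_external_vms) error getting domain xml for vm 'QA11 2012 R2': " + E1000E,
    PREFIX + "(get_external_vms) error getting domain xml for vm 'Hactar2': "
    "internal error: Expecting VMX entry 'numvcpus' to be an unsigned integer "
    "(1 or a multiple of 2) but found 3",
    "Thread-19382::ERROR::2015-09-24 13:06:54,202::v2v::688::root::"
    "(_add_disk_info) Error getting disk size",  # no VM name, so it is skipped
]

for reason, vms in group_failures(sample).items():
    print(len(vms), "VM(s):", ", ".join(vms))
    print("   ", reason)
```

To run it on a real log, pass the open file instead of the sample list, for example group_failures(open("/var/log/vdsm/vdsm.log")).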


> Looks like it is continuing with some errors but still getting tripped on
> others. (vdsm.log attached)
>

I will look at the log later - the fact that the engine front end still
hangs means there is another bug hiding in vdsm.


> Sadly none of these errors are showing in the events in the front end
> which would be more helpful.
>

Please open a bug about that - the UI should display every VM found, with
error information for those that cannot be imported.


>
>
> I have been playing divide and conquer with my 80 VMs and it seems there
> are still things that libvirt doesn’t like about some of those VMs. Is
> there a way I can run these commands from the command line so I can pin the
> problems down and alert the libvirt guys directly?
>

Yes, use:

vdsClient -s 0 getExternalVMs   

You should see the uri format sent by the engine in vdsm logs.
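
For reference, the uri that shows up in the vdsm log for this request has the shape vpx://user@vcenter/datacenter/esx_host?no_verify=1, with the '@' inside the user name percent-encoded as %40. A small Python sketch that builds such a string; the function name vpx_uri and all host and account names below are placeholders, not values taken from this thread:

```python
from urllib.parse import quote


def vpx_uri(username, vcenter, datacenter, esx_host, verify=False):
    """Build a vpx:// URI of the shape seen in the vdsm log.

    The user name is percent-encoded so that an '@' in it (user@domain)
    does not clash with the '@' separating the user from the host.
    """
    user = quote(username, safe="")
    uri = f"vpx://{user}@{vcenter}/{quote(datacenter)}/{esx_host}"
    if not verify:
        # Same query parameter as in the logged request.
        uri += "?no_verify=1"
    return uri


print(vpx_uri("admin@example.org", "vcenter.example.org",
              "Main_DC", "esx01.example.org"))
# prints vpx://admin%40example.org@vcenter.example.org/Main_DC/esx01.example.org?no_verify=1
```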


> If so where would I find what commands are being run? Should I still
> create BZ against ovirt

[ovirt-users] ovirt 3.6.0 Sixth Beta Release

2015-09-24 Thread Rudi Schmitz
I have a simple setup. The machine is a node. I am going to deploy from iSCSI
on the network but don't get that far.  I installed the 3.6 sixth beta release
node ISO on a machine. The IP address is set up. Then I try to deploy hosted
engine over ssh. I input a local http URL of the CentOS 7 ISO and hit Deploy.

The tui stops and I get:

An error appeared in the UI: AttributeError("'TransactionProgressDialog'
object has no attribute 'event'",)
Press ENTER to logout ...
or enter 's' to drop to shell

The http iso file does exist and is working. Is the deploy hosted engine on
a node working for others?


[ovirt-users] MAC address recycling

2015-09-24 Thread Daniel Helgenberger
Hello,

I recently experienced an issue with MAC address use in oVirt with Foreman[1].

Bottom line, a mac address was recycled causing an issue where I could not 
rebuild a host because of a stale DHCP reservation record.

What is the current behavior regarding the reuse of MAC addresses for new VMs?
Can I somehow delay the recycling of a MAC?

Thanks!
-- 
Daniel Helgenberger
m box bewegtbild GmbH

P: +49/30/2408781-22
F: +49/30/2408781-10

ACKERSTR. 19
D-10115 BERLIN


www.m-box.de  www.monkeymen.tv

Geschäftsführer: Martin Retschitzegger / Michaela Göllner
Handeslregister: Amtsgericht Charlottenburg / HRB 112767


Re: [ovirt-users] HA - Fencing not working when host with engine gets shutdown

2015-09-24 Thread Martin Perina
I created a bug covering this:

https://bugzilla.redhat.com/show_bug.cgi?id=1266099

- Original Message -
> From: "Martin Sivak" 
> To: "Michael Hölzl" 
> Cc: "Martin Perina" , users@ovirt.org
> Sent: Thursday, September 24, 2015 2:59:52 PM
> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine 
> gets shutdown
> 
> Hi Michael,
> 
> Martin summed up the situation neatly, I would just add that this issue
> is not limited to the size of your setup. The same would happen to HA
> VMs running on the same host as the hosted engine even if the cluster
> had 50 hosts...
> 
> About the recommended way of engine deployment: It really is about
> whether you can tolerate your engine to be down for a longer time
> (starting another host using a backup db).
> 
> Hosted engine restores your management in an automated way and without
> any data loss. However I agree that the fact that you have to tend to
> your HA VMs manually after an engine restart is not nice. Fortunately
> that should only happen when your host (or vdsm) dies and does not
> come up for an extended period of time.
> 
> The summary would be: there will be no HA handling if the host
> running the engine is down, independently of whether the deployment is
> hosted engine or standalone engine. If the issue is related to the
> software only then there is no real difference.
> 
> - When a host with the standalone engine dies, the VMs are fine, but
> if anything happens while the engine is down (and reinstalling a
> standalone engine takes time + you need a very fresh db backup) you
> might again face issues with HA VMs being down or not starting when
> the engine comes up.
> 
> - When a hosted engine dies because of a host failure, some VMs
> generally disappear with it. The engine will come up automatically and
> HA VMs from the original hosts have to be manually pushed to work.
> This requires some manual action, but I see it as less demanding than
> the first case.
> 
> - When a hosted engine VM is stopped properly by the tooling it will
> be restarted elsewhere and it will be able to connect to the original
> host just fine. The engine will then make sure that all HA VMs are up
> even if the VMs died while the engine was down.
> 
> So I would recommend hosted engine based deployment. And ask for a bit
> of patience as we have a plan how to mitigate the second case to some
> extent without compromising the fencing storm prevention.
> 
> Best regards
> 
> --
> Martin Sivak
> msi...@redhat.com
> SLA RHEV-M
> 
> 
> On Thu, Sep 24, 2015 at 2:31 PM, Michael Hölzl  wrote:
> > Ok, thanks!
> >
> > So, I would still like to know if you would recommend not to use hosted
> > engines but rather another machine for the engine?
> >
> > On 09/24/2015 01:24 PM, Martin Perina wrote:
> >>
> >> - Original Message -
> >>> From: "Michael Hölzl" 
> >>> To: "Martin Perina" , "Eli Mesika"
> >>> 
> >>> Cc: "Doron Fediuck" , users@ovirt.org
> >>> Sent: Thursday, September 24, 2015 12:35:13 PM
> >>> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine
> >>> gets shutdown
> >>>
> >>> Hi,
> >>>
> >>> thanks for the detailed answer! In principle, I understand the issue
> >>> now. However, I can not fully follow the argument that this is a corner
> >>> case. In a smaller or medium sized company, I would assume that such a
> >>> setup, consisting of two machines with a hosted engine, is not uncommon.
> >>> Especially as there is documentation online which describes how to
> >>> deploy this setup. Does that mean that hosted engines are in general not
> >>> recommended?
> >>>
> >>> I am also wondering why the fencing could not be triggered by the hosted
> >>> engine after the DisableFenceAtStartupInSec timeout? In the events log
> >>> of the engine I keep on getting the message "Host hosted_engine_2 is not
> >>> responding. It will stay in Connecting state for a grace period of 120
> >>> seconds and after that an attempt to fence the host will be issued.",
> >>> which would indicate that the engine is actually trying to fence the non
> >>> responsive host.
> >> Unfortunately this message is a bit misleading. It is shown every time
> >> we start handling a network exception for the host, and it is fired before
> >> the logic that decides whether to start or skip the fencing process (the
> >> message is fixed in 3.6). In the current logic we really execute fencing
> >> only when the host status is about to change from Connecting to
> >> NonResponsive, and this happens only the first time, when we are still in
> >> the DisableFenceAtStartupInSec interval. During all other attempts the
> >> host is already in status Non Responsive, so fencing is skipped.
> >>
> >>> On 09/24/2015 11:50 AM, Martin Perina wrote:
> >>>> - Original Message -
> >>>>> From: "Eli Mesika" 
> >>>>> To: "Martin Perina" , "Doron Fediuck"
> >>>>> 
> >>>>> Cc: "Michael Hölzl" , users@ovirt.org
> >>>>> Sent: Thursday, September 24, 2015 11:38:39 AM
> >>>>> Subject: Re

Re: [ovirt-users] HA - Fencing not working when host with engine gets shutdown

2015-09-24 Thread Martin Sivak
Hi Michael,

Martin summed up the situation neatly. I would just add that this issue
does not depend on the size of your setup. The same would happen to HA
VMs running on the same host as the hosted engine even if the cluster
had 50 hosts...

About the recommended way of engine deployment: it really comes down to
whether you can tolerate your engine being down for a longer time
(while you start another host using a backup db).

Hosted engine restores your management in an automated way and without
any data loss. However, I agree that having to tend to
your HA VMs manually after an engine restart is not nice. Fortunately,
that should only happen when your host (or vdsm) dies and does not
come up for an extended period of time.

The summary would be: there will be no HA handling if the host
running the engine is down, regardless of whether the deployment is
hosted engine or standalone engine. If the issue is related to the
software only, then there is no real difference.

- When a host with the standalone engine dies, the VMs are fine, but
if anything happens while the engine is down (and reinstalling a
standalone engine takes time + you need a very fresh db backup) you
might again face issues with HA VMs being down or not starting when
the engine comes up.

- When a hosted engine dies because of a host failure, some VMs
generally disappear with it. The engine will come up automatically, but
HA VMs from the original host have to be brought back up manually.
This requires some manual action, but I see it as less demanding than
the first case.

- When a hosted engine VM is stopped properly by the tooling it will
be restarted elsewhere and it will be able to connect to the original
host just fine. The engine will then make sure that all HA VMs are up
even if the VMs died while the engine was down.

So I would recommend hosted engine based deployment. And ask for a bit
of patience as we have a plan for mitigating the second case to some
extent without compromising the fencing storm prevention.

Best regards

--
Martin Sivak
msi...@redhat.com
SLA RHEV-M


On Thu, Sep 24, 2015 at 2:31 PM, Michael Hölzl  wrote:
> Ok, thanks!
>
> So, I would still like to know if you would recommend not to use hosted
> engines but rather another machine for the engine?
>
> On 09/24/2015 01:24 PM, Martin Perina wrote:
>>
>> - Original Message -
>>> From: "Michael Hölzl" 
>>> To: "Martin Perina" , "Eli Mesika" 
>>> Cc: "Doron Fediuck" , users@ovirt.org
>>> Sent: Thursday, September 24, 2015 12:35:13 PM
>>> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine 
>>> gets shutdown
>>>
>>> Hi,
>>>
>>> thanks for the detailed answer! In principle, I understand the issue
>>> now. However, I can not fully follow the argument that this is a corner
>>> case. In a smaller or medium sized company, I would assume that such a
>>> setup, consisting of two machine with a hosted engine, is not uncommon.
>>> Especially as there is documentation online which describes how to
>>> deploy this setup. Does that mean that hosted engines are in general not
>>> recommended?
>>>
>>> I am also wondering why the fencing could not be triggered by the hosted
>>> engine after the DisableFenceAtStartupInSec timeout? In the events log
>>> of the engine I keep on getting the message "Host hosted_engine_2 is not
>>> responding. It will stay in Connecting state for a grace period of 120
>>> seconds and after that an attempt to fence the host will be issued.",
>>> which would indicate that the engine is actually trying to fence the non
>>> responsive host.
>> Unfortunately this is a bit misleading message, it's shown every time that
>> we start handling network exception for the host and it's fired before
>> the logic which manages to start/skip fencing process (this misleading
>> message is fixed in 3.6). But in current logic we really execute fencing
>> only when host status is about to change from Connecting to NonResponsive
>> and this happens only for the 1st time when we are still in
>> DisableFenceAtStartupInSec interval. During all other attempts the host is
>> already in status Non Responsive, so fencing is skipped.
>>
>>> On 09/24/2015 11:50 AM, Martin Perina wrote:
>>>> - Original Message -
>>>>> From: "Eli Mesika" 
>>>>> To: "Martin Perina" , "Doron Fediuck"
>>>>> 
>>>>> Cc: "Michael Hölzl" , users@ovirt.org
>>>>> Sent: Thursday, September 24, 2015 11:38:39 AM
>>>>> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine
>>>>> gets shutdown
>>>>>
>>>>>
>>>>>
>>>>> - Original Message -
>>>>>> From: "Martin Perina" 
>>>>>> To: "Michael Hölzl" 
>>>>>> Cc: users@ovirt.org
>>>>>> Sent: Thursday, September 24, 2015 11:02:21 AM
>>>>>> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine
>>>>>> gets shutdown
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> sorry for the late response, but you hit a "corner case" :-(
>>>>>>
>>>>>> Let me start explain you a few things first:
>>>>>>
>>>>>> After sta

Re: [ovirt-users] HA - Fencing not working when host with engine gets shutdown

2015-09-24 Thread Michael Hölzl
Ok, thanks!

So, I would still like to know if you would recommend not to use hosted
engines but rather another machine for the engine?

On 09/24/2015 01:24 PM, Martin Perina wrote:
>
> - Original Message -
>> From: "Michael Hölzl" 
>> To: "Martin Perina" , "Eli Mesika" 
>> Cc: "Doron Fediuck" , users@ovirt.org
>> Sent: Thursday, September 24, 2015 12:35:13 PM
>> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine 
>> gets shutdown
>>
>> Hi,
>>
>> thanks for the detailed answer! In principle, I understand the issue
>> now. However, I can not fully follow the argument that this is a corner
>> case. In a smaller or medium sized company, I would assume that such a
>> setup, consisting of two machine with a hosted engine, is not uncommon.
>> Especially as there is documentation online which describes how to
>> deploy this setup. Does that mean that hosted engines are in general not
>> recommended?
>>
>> I am also wondering why the fencing could not be triggered by the hosted
>> engine after the DisableFenceAtStartupInSec timeout? In the events log
>> of the engine I keep on getting the message "Host hosted_engine_2 is not
>> responding. It will stay in Connecting state for a grace period of 120
>> seconds and after that an attempt to fence the host will be issued.",
>> which would indicate that the engine is actually trying to fence the non
>> responsive host.
> Unfortunately this is a bit misleading message, it's shown every time that
> we start handling network exception for the host and it's fired before
> the logic which manages to start/skip fencing process (this misleading
> message is fixed in 3.6). But in current logic we really execute fencing
> only when host status is about to change from Connecting to NonResponsive
> and this happens only for the 1st time when we are still in
> DisableFenceAtStartupInSec interval. During all other attempts the host is
> already in status Non Responsive, so fencing is skipped.
>
>> On 09/24/2015 11:50 AM, Martin Perina wrote:
>>> - Original Message -
>>>> From: "Eli Mesika" 
>>>> To: "Martin Perina" , "Doron Fediuck"
>>>> 
>>>> Cc: "Michael Hölzl" , users@ovirt.org
>>>> Sent: Thursday, September 24, 2015 11:38:39 AM
>>>> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine
>>>> gets shutdown
>>>>
>>>>
>>>>
>>>> - Original Message -
>>>>> From: "Martin Perina" 
>>>>> To: "Michael Hölzl" 
>>>>> Cc: users@ovirt.org
>>>>> Sent: Thursday, September 24, 2015 11:02:21 AM
>>>>> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine
>>>>> gets shutdown
>>>>>
>>>>> Hi,
>>>>>
>>>>> sorry for the late response, but you hit a "corner case" :-(
>>>>>
>>>>> Let me start explain you a few things first:
>>>>>
>>>>> After startup of engine there's an internval during which fencing is
>>>>> disabled. It's called DisableFenceAtStartupInSec and by default it's
>>>>> set to 5 minutes. It can be changed using
>>>>>
>>>>>    engine-config -s DisableFenceAtStartupInSec
>>>>>
>>>>> but please do that with caution.
>>>>>
>>>>> Why do we have such timeout? It's a prevention of fencing storm, which
>>>>> could happen in during power issues in whole DC: when both engine and
>>>>> hosts are started, for huge hosts it may take a lot of time until become
>>>>> up and VDSM start to communicate with engine. So usually engine is
>>>>> started
>>>>> first and without this interval engine will start fencing for hosts which
>>>>> are just starting ...
>>>>>
>>>>> Another thing: if we cannot properly fence the host, we cannot determine
>>>>> if there's not just communication issue between engine and host, so we
>>>>> cannot restart HA VMs on another host. The only thing we can do is to
>>>>> offer "Mark host as rebooted" manual option to administrator. If
>>>>> administrator execution this option, we try to restart HA VMs on
>>>>> different
>>>>> host ASAP, because admin took the responsibility of validation that VMs
>>>>> are really not running.
>>>>>
>>>>>
>>>>> When engine is started, following actions related to fencing are taken:
>>>>>
>>>>> 1. Get status of all hosts from DB and schedule Non Responding Treatment
>>>>>    after DisableFenceAtStartupInSec timeout is passed
>>>>>
>>>>> 2. Try to communicate with all host and refresh their status
>>>>>
>>>>>
>>>>> If some host become Non Resposive during DisableFenceAtStartupInSec
>>>>> interval
>>>>> we skip fencing and administator will see message in Events tab that host
>>>>> is Non Responsive, but fencing is disabled due to startup interval. So
>>>>> administrator have to take care of such host manually.
>>>>>
>>>>>
>>>>> Now what happened in your case:
>>>>>
>>>>>  1. Hosted engine VM is running on host1 with other VMs
>>>>>  2. Status of host1 and host2 is Up
>>>>>  3. You kill/shutdown host1 -> hosted engine VM is also shut down -> no
>>>>>  engine
>>>>> is running to detect issue with host1 and change its status to Non
>>>>> Responsive
>>>>>  4. In the

[ovirt-users] [ANN] oVirt 3.5.5 Second Release Candidate is now available for testing

2015-09-24 Thread Sandro Bonazzola
The oVirt Project is pleased to announce the availability
of the Second Release Candidate of oVirt 3.5.5 for testing, as of September
24th, 2015.

This release is available now for
Red Hat Enterprise Linux 6.7, CentOS Linux 6.7 (or similar) and
Red Hat Enterprise Linux 7.1, CentOS Linux 7.1 (or similar).

This release supports Hypervisor Hosts running
Red Hat Enterprise Linux 6.7, CentOS Linux 6.7 (or similar),
Red Hat Enterprise Linux 7.1, CentOS Linux 7.1 (or similar) and Fedora 21.

This release of oVirt 3.5.5 includes new DWH and reports packages.
See the release notes [1] for an initial list of bugs fixed.

Please refer to release notes [1] for Installation / Upgrade instructions.
New oVirt Node ISO will be available soon as well[2].

Please note that mirrors[3] usually need about one day to
synchronize.

Please refer to the release notes for known issues in this release.
Please add yourself to the test page[4] if you're testing this release.

[1] http://www.ovirt.org/OVirt_3.5.5_Release_Notes
[2] http://plain.resources.ovirt.org/pub/ovirt-3.5-pre/iso/
[3] http://www.ovirt.org/Repository_mirrors#Current_mirrors
[4] http://www.ovirt.org/Testing/oVirt_3.5.5_Testing

-- 
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com


Re: [ovirt-users] HA - Fencing not working when host with engine gets shutdown

2015-09-24 Thread Martin Perina


- Original Message -
> From: "Michael Hölzl" 
> To: "Martin Perina" , "Eli Mesika" 
> Cc: "Doron Fediuck" , users@ovirt.org
> Sent: Thursday, September 24, 2015 12:35:13 PM
> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine 
> gets shutdown
> 
> Hi,
> 
> thanks for the detailed answer! In principle, I understand the issue
> now. However, I can not fully follow the argument that this is a corner
> case. In a smaller or medium sized company, I would assume that such a
> setup, consisting of two machine with a hosted engine, is not uncommon.
> Especially as there is documentation online which describes how to
> deploy this setup. Does that mean that hosted engines are in general not
> recommended?
> 
> I am also wondering why the fencing could not be triggered by the hosted
> engine after the DisableFenceAtStartupInSec timeout? In the events log
> of the engine I keep on getting the message "Host hosted_engine_2 is not
> responding. It will stay in Connecting state for a grace period of 120
> seconds and after that an attempt to fence the host will be issued.",
> which would indicate that the engine is actually trying to fence the non
> responsive host.

Unfortunately, this message is a bit misleading. It is shown every time
we start handling a network exception for the host, and it is fired before
the logic that decides whether to start or skip the fencing process (the
misleading message is fixed in 3.6). In the current logic we really execute
fencing only when the host status is about to change from Connecting to
NonResponsive, and this happens only the first time, when we are still in
the DisableFenceAtStartupInSec interval. During all other attempts the host
is already in status NonResponsive, so fencing is skipped.
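
The decision described above can be sketched roughly as follows. This is a
simplified model for illustration only, not the engine's actual code: the
names HostStatus, Host and handle_network_error are hypothetical, the
120-second Connecting grace period is not modelled, and the 300-second
default stands for the 5-minute DisableFenceAtStartupInSec default mentioned
earlier in the thread.

```python
from dataclasses import dataclass
from enum import Enum


class HostStatus(Enum):
    UP = "Up"
    CONNECTING = "Connecting"
    NON_RESPONSIVE = "NonResponsive"


@dataclass
class Host:
    name: str
    status: HostStatus = HostStatus.UP


def handle_network_error(host, seconds_since_engine_start,
                         disable_fence_at_startup_sec=300):
    """Return the action taken for one failed attempt to reach the host.

    Hypothetical model: fencing is considered only at the moment the host
    changes to NonResponsive. Later attempts find it already NonResponsive
    and skip fencing, even after the startup interval has passed.
    """
    if host.status is HostStatus.NON_RESPONSIVE:
        return "skip: host already NonResponsive"

    # Status transition; this is the only point where fencing can start.
    host.status = HostStatus.NON_RESPONSIVE
    if seconds_since_engine_start < disable_fence_at_startup_sec:
        return "skip: fencing disabled during startup interval"
    return "fence"


if __name__ == "__main__":
    host1 = Host("host1")
    # First failure happens 60s after the engine came up on host2.
    print(handle_network_error(host1, seconds_since_engine_start=60))
    # A later failure, long after the startup interval has passed.
    print(handle_network_error(host1, seconds_since_engine_start=600))
```

In this model the second call never fences host1, which matches what was
observed: the status change already happened inside the startup interval.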

> 
> On 09/24/2015 11:50 AM, Martin Perina wrote:
> >
> > - Original Message -
> >> From: "Eli Mesika" 
> >> To: "Martin Perina" , "Doron Fediuck"
> >> 
> >> Cc: "Michael Hölzl" , users@ovirt.org
> >> Sent: Thursday, September 24, 2015 11:38:39 AM
> >> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine
> >> gets shutdown
> >>
> >>
> >>
> >> - Original Message -
> >>> From: "Martin Perina" 
> >>> To: "Michael Hölzl" 
> >>> Cc: users@ovirt.org
> >>> Sent: Thursday, September 24, 2015 11:02:21 AM
> >>> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine
> >>> gets shutdown
> >>>
> >>> Hi,
> >>>
> >>> sorry for the late response, but you hit a "corner case" :-(
> >>>
> >>> Let me start explain you a few things first:
> >>>
> >>> After startup of engine there's an internval during which fencing is
> >>> disabled. It's called DisableFenceAtStartupInSec and by default it's
> >>> set to 5 minutes. It can be changed using
> >>>
> >>>engine-config -s DisableFenceAtStartupInSec
> >>>
> >>> but please do that with caution.
> >>>
> >>> Why do we have such timeout? It's a prevention of fencing storm, which
> >>> could happen in during power issues in whole DC: when both engine and
> >>> hosts are started, for huge hosts it may take a lot of time until become
> >>> up and VDSM start to communicate with engine. So usually engine is
> >>> started
> >>> first and without this interval engine will start fencing for hosts which
> >>> are just starting ...
> >>>
> >>> Another thing: if we cannot properly fence the host, we cannot determine
> >>> if there's not just communication issue between engine and host, so we
> >>> cannot restart HA VMs on another host. The only thing we can do is to
> >>> offer "Mark host as rebooted" manual option to administrator. If
> >>> administrator execution this option, we try to restart HA VMs on
> >>> different
> >>> host ASAP, because admin took the responsibility of validation that VMs
> >>> are really not running.
> >>>
> >>>
> >>> When engine is started, following actions related to fencing are taken:
> >>>
> >>> 1. Get status of all hosts from DB and schedule Non Responding Treatment
> >>>after DisableFenceAtStartupInSec timeout is passed
> >>>
> >>> 2. Try to communicate with all host and refresh their status
> >>>
> >>>
> >>> If some host become Non Resposive during DisableFenceAtStartupInSec
> >>> interval
> >>> we skip fencing and administator will see message in Events tab that host
> >>> is Non Responsive, but fencing is disabled due to startup interval. So
> >>> administrator have to take care of such host manually.
> >>>
> >>>
> >>> Now what happened in your case:
> >>>
> >>>  1. Hosted engine VM is running on host1 with other VMs
> >>>  2. Status of host1 and host2 is Up
> >>>  3. You kill/shutdown host1 -> hosted engine VM is also shut down -> no
> >>>  engine
> >>> is running to detect issue with host1 and change its status to Non
> >>> Responsive
> >>>  4. In the meantime hosted engine VM is started on host2 -> it will read
> >>>  host
> >>> status from DB, but all hosts are up -> it will try to communicate
> >>> with
> >>> host1,
> >>> but it's unreachable 

Re: [ovirt-users] HA - Fencing not working when host with engine gets shutdown

2015-09-24 Thread Michael Hölzl
Hi,

thanks for the detailed answer! In principle, I understand the issue
now. However, I cannot fully follow the argument that this is a corner
case. In a small or medium-sized company, I would assume that such a
setup, consisting of two machines with a hosted engine, is not uncommon.
Especially as there is documentation online which describes how to
deploy this setup. Does that mean that hosted engines are in general not
recommended?

I am also wondering why the fencing could not be triggered by the hosted
engine after the DisableFenceAtStartupInSec timeout? In the events log
of the engine I keep on getting the message "Host hosted_engine_2 is not
responding. It will stay in Connecting state for a grace period of 120
seconds and after that an attempt to fence the host will be issued.",
which would indicate that the engine is actually trying to fence the non
responsive host.

On 09/24/2015 11:50 AM, Martin Perina wrote:
>
> - Original Message -
>> From: "Eli Mesika" 
>> To: "Martin Perina" , "Doron Fediuck" 
>> 
>> Cc: "Michael Hölzl" , users@ovirt.org
>> Sent: Thursday, September 24, 2015 11:38:39 AM
>> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine 
>> gets shutdown
>>
>>
>>
>> - Original Message -
>>> From: "Martin Perina" 
>>> To: "Michael Hölzl" 
>>> Cc: users@ovirt.org
>>> Sent: Thursday, September 24, 2015 11:02:21 AM
>>> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine
>>> gets shutdown
>>>
>>> Hi,
>>>
>>> sorry for the late response, but you hit a "corner case" :-(
>>>
>>> Let me start explain you a few things first:
>>>
>>> After startup of engine there's an internval during which fencing is
>>> disabled. It's called DisableFenceAtStartupInSec and by default it's
>>> set to 5 minutes. It can be changed using
>>>
>>>engine-config -s DisableFenceAtStartupInSec
>>>
>>> but please do that with caution.
>>>
>>> Why do we have such timeout? It's a prevention of fencing storm, which
>>> could happen in during power issues in whole DC: when both engine and
>>> hosts are started, for huge hosts it may take a lot of time until become
>>> up and VDSM start to communicate with engine. So usually engine is started
>>> first and without this interval engine will start fencing for hosts which
>>> are just starting ...
>>>
>>> Another thing: if we cannot properly fence the host, we cannot determine
>>> if there's not just communication issue between engine and host, so we
>>> cannot restart HA VMs on another host. The only thing we can do is to
>>> offer "Mark host as rebooted" manual option to administrator. If
>>> administrator execution this option, we try to restart HA VMs on different
>>> host ASAP, because admin took the responsibility of validation that VMs
>>> are really not running.
>>>
>>>
>>> When engine is started, following actions related to fencing are taken:
>>>
>>> 1. Get status of all hosts from DB and schedule Non Responding Treatment
>>>after DisableFenceAtStartupInSec timeout is passed
>>>
>>> 2. Try to communicate with all host and refresh their status
>>>
>>>
>>> If some host become Non Resposive during DisableFenceAtStartupInSec
>>> interval
>>> we skip fencing and administator will see message in Events tab that host
>>> is Non Responsive, but fencing is disabled due to startup interval. So
>>> administrator have to take care of such host manually.
>>>
>>>
>>> Now what happened in your case:
>>>
>>>  1. Hosted engine VM is running on host1 with other VMs
>>>  2. Status of host1 and host2 is Up
>>>  3. You kill/shutdown host1 -> hosted engine VM is also shut down -> no
>>>  engine
>>> is running to detect issue with host1 and change its status to Non
>>> Responsive
>>>  4. In the meantime hosted engine VM is started on host2 -> it will read
>>>  host
>>> status from DB, but all hosts are up -> it will try to communicate with
>>> host1,
>>> but it's unreachable -> so it changes host1 status Non Responsive and
>>> starts
>>> Non Responsive Treatment for host1 -> Non Responsive Treatment is
>>> aborted
>>> because engine is still in DisableFenceAtStartupInSec
>>>
>>>
>>> So in normal deployment (without hosted engine) admin is notified that
>>> host,
>>> where engine is running, crashed and was rebooted, so he has to take a look
>>> and do manual steps if needed.
>>>
>>> In hosted engine deployment it's an issue because hosted engine VM can be
>>> restart
>>> on different host also in other cases then crashes (for example if host is
>>> overloaded hosted engine can stop hosted engine VM and restart it on
>>> different
>>> host, but this shouldn't happen too often).
>>>
>>> At the moment the only solution for this is manual: let administrator to be
>>> notified that host engine VM is restarted on different host, so
>>> administrator
>>> can check manually what was the cause for this restart and execute manual
>>> steps
>>> if needed.
>>>
>>> So to summarize: at the moment I don't

[ovirt-users] Strange disk behaviour within VMs

2015-09-24 Thread Morten A. Middelthon

Hi,

last Sunday I experienced a power outage with one of my two oVirt 
hypervisors. After power was restored I experienced some weirdness:


- on one of the VMs running on this hypervisor the boot disk changed, so 
it was no longer able to boot. Looking at the console the VM would hang 
on "Booting from hard disk". After I noticed that the wrong virtual disk 
was marked as OS/bootable I got it booting again after correcting it to 
the proper boot disk. This was done from the oVirt management server.


- on another VM I tried today to add another virtual disk to expand a 
LVM volume. In dmesg I can see the new device:

[17167560.005768]  vdc: unknown partition table

However, when I tried to run pvcreate I got an error message saying that 
this was already marked as an LVM disk, and then running pvs gives me the 
following error:


# pvs
  Couldn't find device with uuid 7dDcyq-TZ6I-96Im-lfjL-cTUv-nff1-B11Mm7.
  PV             VG              Fmt  Attr PSize    PFree
  /dev/vda5      rit-kvm-ssweb02 lvm2 a--    59.76g    0
  /dev/vdb1      vg_syncshare    lvm2 a--   500.00g    0
  /dev/vdc1      VG_SYNCSHARE01  lvm2 a--   400.00g    0
  unknown device VG_SYNCSHARE01  lvm2 a-m  1024.00g    0

As you can see there's already a PV called /dev/vdc1, as well as another 
one named "unknown device". These two PVs belong to a VG that does NOT 
belong to this VM, VG_SYNCSHARE01. The uuids for these two PVs are:


--- Physical volume ---
  PV Name   unknown device
  VG Name   VG_SYNCSHARE01
  PV Size   1024.00 GiB / not usable 3.00 MiB
  Allocatable   yes (but full)
  PE Size   4.00 MiB
  Total PE  262143
  Free PE   0
  Allocated PE  262143
  PV UUID   7dDcyq-TZ6I-96Im-lfjL-cTUv-nff1-B11Mm7

  --- Physical volume ---
  PV Name   /dev/vdc1
  VG Name   VG_SYNCSHARE01
  PV Size   400.00 GiB / not usable 3.00 MiB
  Allocatable   yes (but full)
  PE Size   4.00 MiB
  Total PE  102399
  Free PE   0
  Allocated PE  102399
  PV UUID   oKSDoo-3pxU-0uue-zQ0H-kv1N-lyPa-P2M2FY

The two PVs that don't belong on this VM actually belong to a 
totally different VM.


On VM number two:

# pvdisplay
  --- Physical volume ---
  PV Name   /dev/vdb1
  VG Name   VG_SYNCSHARE01
  PV Size   1024.00 GiB / not usable 3.00 MiB
  Allocatable   yes (but full)
  PE Size   4.00 MiB
  Total PE  262143
  Free PE   0
  Allocated PE  262143
  PV UUID   7dDcyq-TZ6I-96Im-lfjL-cTUv-nff1-B11Mm7

  --- Physical volume ---
  PV Name   /dev/vdd1
  VG Name   VG_SYNCSHARE01
  PV Size   400.00 GiB / not usable 3.00 MiB
  Allocatable   yes (but full)
  PE Size   4.00 MiB
  Total PE  102399
  Free PE   0
  Allocated PE  102399
  PV UUID   oKSDoo-3pxU-0uue-zQ0H-kv1N-lyPa-P2M2FY

As you can see, same uuid and VG name, but two different VMs.
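
To make this kind of comparison repeatable, here is a minimal sketch that
takes the text output of pvdisplay from two VMs and lists the PV UUIDs that
appear on both. It assumes the "PV Name" / "PV UUID" line layout shown
above; the function names pv_uuids and shared_pvs are made up for this
example.

```python
import re


def pv_uuids(pvdisplay_output):
    """Map PV UUID -> PV name, parsed from `pvdisplay` text output."""
    uuids = {}
    name = None
    for raw in pvdisplay_output.splitlines():
        line = raw.strip()
        match = re.match(r"PV Name\s+(.+)", line)
        if match:
            # Remember the name until the matching UUID line shows up.
            name = match.group(1).strip()
            continue
        match = re.match(r"PV UUID\s+(\S+)", line)
        if match:
            uuids[match.group(1)] = name
    return uuids


def shared_pvs(output_a, output_b):
    """Return {uuid: (name on VM A, name on VM B)} for UUIDs seen on both."""
    a = pv_uuids(output_a)
    b = pv_uuids(output_b)
    return {uuid: (a[uuid], b[uuid]) for uuid in sorted(a.keys() & b.keys())}


if __name__ == "__main__":
    vm1 = """
      PV Name   unknown device
      PV UUID   7dDcyq-TZ6I-96Im-lfjL-cTUv-nff1-B11Mm7
      PV Name   /dev/vdc1
      PV UUID   oKSDoo-3pxU-0uue-zQ0H-kv1N-lyPa-P2M2FY
    """
    vm2 = """
      PV Name   /dev/vdb1
      PV UUID   7dDcyq-TZ6I-96Im-lfjL-cTUv-nff1-B11Mm7
      PV Name   /dev/vdd1
      PV UUID   oKSDoo-3pxU-0uue-zQ0H-kv1N-lyPa-P2M2FY
    """
    for uuid, (name_a, name_b) in shared_pvs(vm1, vm2).items():
        print(uuid, name_a, name_b)
```

Any UUID reported by this check means the same physical volume is visible
from both VMs, which should not happen unless the disk is deliberately
shared.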

My setup:
oVirt manager: oVirt 3.5 running on CentOS 6.7
oVirt hypervisors: two oVirt 3.5 servers running on CentOS 6.7

At the time of the power outage mentioned earlier I was running 
oVirt 3.4. I upgraded today and rebooted the manager and both 
hypervisors, but NOT the VMs.


Virtual machines:
Debian wheezy 7.9 x86_64

Storage:
HP LeftHand iSCSI

I have tried to locate error messages in the logs that could be related 
to this behaviour, but so far no luck :(


--
Morten A. Middelthon
Email: mor...@flipp.net



Re: [ovirt-users] HA - Fencing not working when host with engine gets shutdown

2015-09-24 Thread Martin Perina


- Original Message -
> From: "Eli Mesika" 
> To: "Martin Perina" , "Doron Fediuck" 
> 
> Cc: "Michael Hölzl" , users@ovirt.org
> Sent: Thursday, September 24, 2015 11:38:39 AM
> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine 
> gets shutdown
> 
> 
> 
> - Original Message -
> > From: "Martin Perina" 
> > To: "Michael Hölzl" 
> > Cc: users@ovirt.org
> > Sent: Thursday, September 24, 2015 11:02:21 AM
> > Subject: Re: [ovirt-users] HA - Fencing not working when host with engine
> > gets shutdown
> > 
> > Hi,
> > 
> > sorry for the late response, but you hit a "corner case" :-(
> > 
> > Let me start explain you a few things first:
> > 
> > After startup of engine there's an internval during which fencing is
> > disabled. It's called DisableFenceAtStartupInSec and by default it's
> > set to 5 minutes. It can be changed using
> > 
> >engine-config -s DisableFenceAtStartupInSec
> > 
> > but please do that with caution.
> > 
> > Why do we have such timeout? It's a prevention of fencing storm, which
> > could happen in during power issues in whole DC: when both engine and
> > hosts are started, for huge hosts it may take a lot of time until become
> > up and VDSM start to communicate with engine. So usually engine is started
> > first and without this interval engine will start fencing for hosts which
> > are just starting ...
> > 
> > Another thing: if we cannot properly fence the host, we cannot determine
> > if there's not just communication issue between engine and host, so we
> > cannot restart HA VMs on another host. The only thing we can do is to
> > offer "Mark host as rebooted" manual option to administrator. If
> > administrator execution this option, we try to restart HA VMs on different
> > host ASAP, because admin took the responsibility of validation that VMs
> > are really not running.
> > 
> > 
> > When engine is started, following actions related to fencing are taken:
> > 
> > 1. Get status of all hosts from DB and schedule Non Responding Treatment
> >after DisableFenceAtStartupInSec timeout is passed
> > 
> > 2. Try to communicate with all host and refresh their status
> > 
> > 
> > If some host become Non Resposive during DisableFenceAtStartupInSec
> > interval
> > we skip fencing and administator will see message in Events tab that host
> > is Non Responsive, but fencing is disabled due to startup interval. So
> > administrator have to take care of such host manually.
> > 
> > 
> > Now what happened in your case:
> > 
> >  1. Hosted engine VM is running on host1 with other VMs
> >  2. Status of host1 and host2 is Up
> >  3. You kill/shutdown host1 -> hosted engine VM is also shut down -> no
> >  engine
> > is running to detect issue with host1 and change its status to Non
> > Responsive
> >  4. In the meantime hosted engine VM is started on host2 -> it will read
> >  host
> > status from DB, but all hosts are up -> it will try to communicate with
> > host1,
> > but it's unreachable -> so it changes host1 status Non Responsive and
> > starts
> > Non Responsive Treatment for host1 -> Non Responsive Treatment is
> > aborted
> > because engine is still in DisableFenceAtStartupInSec
> > 
> > 
> > So in normal deployment (without hosted engine) admin is notified that
> > host,
> > where engine is running, crashed and was rebooted, so he has to take a look
> > and do manual steps if needed.
> > 
> > In hosted engine deployment it's an issue because hosted engine VM can be
> > restart
> > on different host also in other cases then crashes (for example if host is
> > overloaded hosted engine can stop hosted engine VM and restart it on
> > different
> > host, but this shouldn't happen too often).
> > 
> > At the moment the only solution for this is manual: let administrator to be
> > notified that host engine VM is restarted on different host, so
> > administrator
> > can check manually what was the cause for this restart and execute manual
> > steps
> > if needed.
> > 
> > So to summarize: at the moment I don't see any reliable automatic solution
> > for this :-( and fencing storm prevention is more important. But feel free
> > to
> > create
> > a bug for this issue, maybe we can think of at least some improvement for
> > this use
> > case.
> 
> Thanks for the detailed explanation Martin
> Really a corner case, lets see if we got more inputs on that from other users
> Maybe when hosted engine VM is restarted on another node we can ask for the
> reason and act accordingly
> Doron, with current implementation, is the reason for hosted engine VM
> restart stored anywhere ?

I have already discussed this with Martin Sivak, and hosted engine doesn't
touch the engine db at all. We discussed the following possible solution,
which we could implement in master and maybe in 3.6 if agreed:

 1. Just after the engine starts, we can read from the db the name of the
host the hosted engine VM is running on and store it somewhere in memory
   

Re: [ovirt-users] HA - Fencing not working when host with engine gets shutdown

2015-09-24 Thread Eli Mesika


- Original Message -
> From: "Martin Perina" 
> To: "Michael Hölzl" 
> Cc: users@ovirt.org
> Sent: Thursday, September 24, 2015 11:02:21 AM
> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine 
> gets shutdown
> 
> Hi,
> 
> sorry for the late response, but you hit a "corner case" :-(
> 
> Let me start by explaining a few things first:
> 
> After engine startup there's an interval during which fencing is
> disabled. It's called DisableFenceAtStartupInSec and by default it's
> set to 5 minutes. It can be changed using
> 
>    engine-config -s DisableFenceAtStartupInSec=<seconds>
> 
> but please do that with caution.
> 
> Why do we have such a timeout? It prevents a fencing storm, which
> could happen during power issues in the whole DC: when both the engine and
> the hosts are started, huge hosts may take a long time to come up and for
> VDSM to start communicating with the engine. So the engine is usually started
> first, and without this interval it would start fencing hosts which
> are just starting ...
> 
> Another thing: if we cannot properly fence the host, we cannot rule out
> that there is just a communication issue between engine and host, so we
> cannot restart HA VMs on another host. The only thing we can do is to
> offer the manual "Mark host as rebooted" option to the administrator. If the
> administrator executes this option, we try to restart HA VMs on a different
> host ASAP, because the admin took responsibility for validating that the VMs
> are really not running.
> 
> 
> When the engine is started, the following fencing-related actions are taken:
> 
> 1. Get the status of all hosts from the DB and schedule Non Responsive
>    Treatment after the DisableFenceAtStartupInSec timeout has passed
> 
> 2. Try to communicate with all hosts and refresh their status
> 
> 
> If a host becomes Non Responsive during the DisableFenceAtStartupInSec
> interval, we skip fencing and the administrator will see a message in the
> Events tab that the host is Non Responsive, but fencing is disabled due to
> the startup interval. So the administrator has to take care of such a host
> manually.
> 
> 
> Now what happened in your case:
> 
>  1. Hosted engine VM is running on host1 with other VMs
>  2. Status of host1 and host2 is Up
>  3. You kill/shut down host1 -> hosted engine VM is also shut down -> no
>     engine is running to detect the issue with host1 and change its status
>     to Non Responsive
>  4. In the meantime the hosted engine VM is started on host2 -> it reads
>     the host statuses from the DB, but all hosts are Up -> it tries to
>     communicate with host1, but it's unreachable -> so it changes host1's
>     status to Non Responsive and starts Non Responsive Treatment for
>     host1 -> Non Responsive Treatment is aborted because the engine is
>     still in the DisableFenceAtStartupInSec interval
> 
> 
> So in a normal deployment (without hosted engine) the admin is notified that
> the host where the engine is running crashed and was rebooted, so he has to
> take a look and do manual steps if needed.
> 
> In a hosted engine deployment it's an issue because the hosted engine VM can
> be restarted on a different host in cases other than crashes as well (for
> example, if a host is overloaded, hosted engine can stop the hosted engine VM
> and restart it on a different host, but this shouldn't happen too often).
> 
> At the moment the only solution for this is manual: have the administrator
> notified that the hosted engine VM was restarted on a different host, so the
> administrator can check manually what caused the restart and execute manual
> steps if needed.
> 
> So to summarize: at the moment I don't see any reliable automatic solution
> for this :-( and fencing storm prevention is more important. But feel free to
> create a bug for this issue; maybe we can think of at least some improvement
> for this use case.

Thanks for the detailed explanation, Martin.
It really is a corner case; let's see if we get more input on it from other
users.
Maybe when the hosted engine VM is restarted on another node we can ask for
the reason and act accordingly.
Doron, with the current implementation, is the reason for a hosted engine VM
restart stored anywhere?

> 
> 
> Thanks
> 
> Martin Perina
> 
> - Original Message -
> > From: "Michael Hölzl" 
> > To: "Martin Perina" 
> > Cc: users@ovirt.org
> > Sent: Monday, September 21, 2015 4:47:06 PM
> > Subject: Re: [ovirt-users] HA - Fencing not working when host with engine
> > gets shutdown
> > 
> > Hi,
> > 
> > The whole engine.log including the shutdown time (was performed around
> > 9:19)
> > http://pastebin.com/cdY9uTkJ
> > 
> > vdsm.log of host01 (the host which kept on running and took over the
> > engine) split into 3 uploads (limit of 512 kB of pastebin):
> > 1 : http://pastebin.com/dr9jNTek
> > 2 : http://pastebin.com/cuyHL6ne
> > 3 : http://pastebin.com/7x2ZQy1y
> > 
> > Michael
> > 
> > On 09/21/2015 03:00 PM, Martin Perina wrote:
> > > Hi,
> > >
> > > could you please post whole engine.log (from the time which you turned
> > > off
> > > the

Re: [ovirt-users] backup and restore the vm

2015-09-24 Thread Eli Mesika


- Original Message -
> From: "Budur Nagaraju" 
> To: users@ovirt.org
> Sent: Tuesday, September 22, 2015 1:24:17 PM
> Subject: [ovirt-users] backup and restore the vm
> 
> Hi
> 
> Can you please point me to the tools for backing up and restoring VM images
> in oVirt?

Please take a look at
http://www.ovirt.org/Features/Backup-Restore_API_Integration

> 
> Thanks,
> Nagaraju
> 
> 


[ovirt-users] Backup storage domain for the meta data ONLY?

2015-09-24 Thread Nicolas Ecarnot

Hi,

In a lab, I have an oVirt test datacenter using a very old SAN (iSCSI 
setup).

There are very few VMs on it, and none of them are critical.

Some risky maintenance is planned on the SAN, and the risk is to
lose it all.
To be honest, losing these VMs is bearable, but losing the setup
would be painful (it takes quite long to set everything back up).


This is not a hosted setup, so I guess most of the config lies in the
manager, and this part will stay untouched.


My question: is there any benefit in adding a small additional temporary
storage domain where some metadata (I don't know which) could be duplicated
(knowing that I do not have enough room to export the VMs)?

What additional setup would be saved this way?

--
Nicolas ECARNOT


Re: [ovirt-users] HA - Fencing not working when host with engine gets shutdown

2015-09-24 Thread Martin Perina
Hi,

sorry for the late response, but you hit a "corner case" :-(

Let me start by explaining a few things first:

After startup of the engine there's an interval during which fencing is
disabled. It's called DisableFenceAtStartupInSec and by default it's
set to 5 minutes. It can be changed using

   engine-config -s DisableFenceAtStartupInSec=<seconds>

but please do that with caution.

Why do we have such a timeout? It prevents a fencing storm, which could
happen during power issues affecting the whole DC: when both the engine and
the hosts are starting, large hosts may take a long time to come up and for
VDSM to start communicating with the engine. The engine usually starts
first, and without this interval it would start fencing hosts which
are just starting ...

Another thing: if we cannot properly fence the host, we cannot rule out
that there is just a communication issue between the engine and the host,
so we cannot restart HA VMs on another host. The only thing we can do is
offer the manual option "Mark host as rebooted" to the administrator. If the
administrator executes this option, we try to restart HA VMs on a different
host ASAP, because the admin took responsibility for validating that the VMs
are really not running.
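
The rule in the paragraph above can be written down as a tiny predicate. This is only an illustrative sketch, not engine code: the function name and its two boolean parameters are made up for the example.

```python
def can_restart_ha_vms(fence_succeeded: bool, admin_marked_rebooted: bool) -> bool:
    """Return True when it is safe to restart HA VMs on another host.

    Hypothetical helper: HA VMs may only be restarted elsewhere if we are
    sure they are no longer running on the failed host, i.e. the host was
    fenced successfully or an administrator manually marked it as rebooted.
    """
    return fence_succeeded or admin_marked_rebooted


# Fencing failed and nobody confirmed the reboot: the VMs could still be
# running behind a network partition, so they must not be started twice.
print(can_restart_ha_vms(fence_succeeded=False, admin_marked_rebooted=False))
# The administrator took responsibility via the manual option.
print(can_restart_ha_vms(fence_succeeded=False, admin_marked_rebooted=True))
```

The point of the restriction is to avoid running the same VM on two hosts at once, which would corrupt its disks on shared storage.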


When the engine is started, the following fencing-related actions are taken:

1. Get the status of all hosts from the DB and schedule Non Responsive
   Treatment after the DisableFenceAtStartupInSec timeout has passed

2. Try to communicate with all hosts and refresh their status


If a host becomes Non Responsive during the DisableFenceAtStartupInSec
interval, we skip fencing and the administrator will see a message in the
Events tab that the host is Non Responsive, but fencing is disabled due to
the startup interval. So the administrator has to take care of such a host
manually.
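
As a rough illustration of the behaviour described above, the decision could be modelled like this. It is a simplified sketch under stated assumptions: the names `FenceDecision` and `decide_fencing` are invented for the example, the 300 second default comes from the "5 minutes" mentioned above, and the real engine logic is more involved.

```python
from dataclasses import dataclass

# Default startup interval: 5 minutes, as stated above.
DISABLE_FENCE_AT_STARTUP_SEC = 300


@dataclass
class FenceDecision:
    fence: bool   # whether the engine would go on to fence the host
    event: str    # roughly what the administrator would see in the Events tab


def decide_fencing(engine_uptime_sec: float,
                   host_responsive: bool,
                   startup_interval_sec: float = DISABLE_FENCE_AT_STARTUP_SEC) -> FenceDecision:
    """Hypothetical model of the startup rule: no fencing inside the interval."""
    if host_responsive:
        return FenceDecision(False, "host is Up")
    if engine_uptime_sec < startup_interval_sec:
        # Inside the startup interval: only report, leave the host to the admin.
        return FenceDecision(
            False, "host is Non Responsive, fencing disabled due to startup interval")
    return FenceDecision(True, "host is Non Responsive, starting Non Responsive Treatment")


# A host found unreachable 30 seconds after engine start is not fenced.
print(decide_fencing(engine_uptime_sec=30, host_responsive=False))
# The same host found unreachable after 10 minutes would be fenced.
print(decide_fencing(engine_uptime_sec=600, host_responsive=False))
```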


Now what happened in your case:

 1. Hosted engine VM is running on host1 with other VMs
 2. Status of host1 and host2 is Up
 3. You kill/shut down host1 -> hosted engine VM is also shut down -> no
    engine is running to detect the issue with host1 and change its status
    to Non Responsive
 4. In the meantime the hosted engine VM is started on host2 -> it reads
    the host statuses from the DB, but all hosts are Up -> it tries to
    communicate with host1, but it's unreachable -> so it changes host1's
    status to Non Responsive and starts Non Responsive Treatment for
    host1 -> Non Responsive Treatment is aborted because the engine is
    still in the DisableFenceAtStartupInSec interval
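
The sequence in the list above can be replayed with a small, purely illustrative script. The function name, its parameters and the event strings are assumptions made for the example; it only shows why the outcome depends on whether host1 is found unreachable before or after the startup interval ends.

```python
def replay_hosted_engine_failover(detect_after_sec: float,
                                  startup_interval_sec: float = 300) -> list:
    """Replay the scenario: host1 (running the engine VM) is shut down.

    detect_after_sec is a hypothetical value: how long after the engine
    restarts on host2 it notices that host1 is unreachable.
    """
    events = [
        "host1 shut down; hosted engine VM goes down with it",
        "hosted engine VM started on host2; engine uptime is 0s",
        "engine reads host statuses from DB: host1=Up, host2=Up",
        f"t={detect_after_sec}s: host1 unreachable, status set to Non Responsive",
    ]
    if detect_after_sec < startup_interval_sec:
        events.append("Non Responsive Treatment aborted: startup interval still active")
    else:
        events.append("Non Responsive Treatment runs: host1 is fenced")
    return events


# In practice the freshly started engine notices the dead host almost
# immediately, so the treatment is aborted.
for line in replay_hosted_engine_failover(detect_after_sec=20):
    print(line)
```

Because the engine always restarts right after the host failure in this scenario, detection practically always falls inside the interval, which is why the case cannot be handled automatically today.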


So in a normal deployment (without hosted engine) the admin is notified that
the host where the engine is running crashed and was rebooted, so he has to
take a look and do manual steps if needed.

In a hosted engine deployment it's an issue because the hosted engine VM can
be restarted on a different host in cases other than crashes as well (for
example, if a host is overloaded, hosted engine can stop the hosted engine VM
and restart it on a different host, but this shouldn't happen too often).

At the moment the only solution for this is manual: have the administrator
notified that the hosted engine VM was restarted on a different host, so the
administrator can check manually what caused the restart and execute manual
steps if needed.

So to summarize: at the moment I don't see any reliable automatic solution
for this :-( and fencing storm prevention is more important. But feel free to
create a bug for this issue; maybe we can think of at least some improvement
for this use case.


Thanks

Martin Perina

- Original Message -
> From: "Michael Hölzl" 
> To: "Martin Perina" 
> Cc: users@ovirt.org
> Sent: Monday, September 21, 2015 4:47:06 PM
> Subject: Re: [ovirt-users] HA - Fencing not working when host with engine 
> gets shutdown
> 
> Hi,
> 
> The whole engine.log including the shutdown time (was performed around 9:19)
> http://pastebin.com/cdY9uTkJ
> 
> vdsm.log of host01 (the host which kept on running and took over the
> engine) split into 3 uploads (limit of 512 kB of pastebin):
> 1 : http://pastebin.com/dr9jNTek
> 2 : http://pastebin.com/cuyHL6ne
> 3 : http://pastebin.com/7x2ZQy1y
> 
> Michael
> 
> On 09/21/2015 03:00 PM, Martin Perina wrote:
> > Hi,
> >
> > could you please post whole engine.log (from the time which you turned off
> > the host with engine VM) and also vdsm.log from both hosts?
> >
> > Thanks
> >
> > Martin Perina
> >
> > - Original Message -
> >> From: "Michael Hölzl" 
> >> To: users@ovirt.org
> >> Sent: Monday, September 21, 2015 10:27:08 AM
> >> Subject: [ovirt-users] HA - Fencing not working when host with engine gets
> >>shutdown
> >>
> >> Hi all,
> >>
> >> we are trying to setup an ovirt environment with two hosts, both
> >> connected to a ISCSI storage device, a hosted engine and power
> >> management configured over ILO. So far it seems to work fine in our
> >> testing setup and starting/stopping VMs works smoothly with proper
> >> scheduling between those hosts. So we wanted to test HA for the VMs now
> >> and started to manually shutdown a host while there are still VMs
> >> running on that machine (to simu

Re: [ovirt-users] moving storage away from a single point of failure

2015-09-24 Thread Alan Murrell


On 22/09/15 02:32 AM, Daniel Helgenberger wrote:
> - Do not run compute and storage on the same hosts

Is the Engine considered to be the "Compute" part of things?

Regards,

Alan



Re: [ovirt-users] VNC console behind NAT

2015-09-24 Thread Michal Skrivanek

On Sep 24, 2015, at 09:43 , Alan Murrell  wrote:

> 
> 
> On 23/09/15 07:44 AM, Michal Skrivanek wrote:
>> you would need a full proxy. Is websocket proxy & noVNC in browser an option?
> 
> I have tried both of those and have never been able to get them to work
> for me.

It is a bit more difficult to set up, and it is not fully equivalent
(spice_html5 in particular is really alpha quality).
But with the right deployment of websocket-proxy it should work just fine
(if you rerun engine-setup, it should ask whether you want to run the proxy
on the engine host; alternatively it can be deployed on a separate host).
Then make sure you really did import the engine's CA into the browser.

btw, what's the reason for VNC? lack of SPICE client on a particular platform?

Thanks,
michal



Re: [ovirt-users] VNC console behind NAT

2015-09-24 Thread Alan Murrell


On 23/09/15 07:44 AM, Michal Skrivanek wrote:
> you would need a full proxy. Is websocket proxy & noVNC in browser an option?

I have tried both of those and have never been able to get them to work
for me.
