Re: [ovirt-users] [Gluster-users] HA storage based on two nodes with one point of failure

2015-06-07 Thread Юрий Полторацкий
2015-06-08 8:32 GMT+03:00 Ravishankar N :

>
>
> On 06/08/2015 02:38 AM, Юрий Полторацкий wrote:
>
> Hi,
>
> I have set up a lab with the config listed below and got an unexpected
> result. Could someone please tell me where I went wrong?
>
> I am testing oVirt. Data Center has two clusters: the first as a computing
> with three nodes (node1, node2, node3); the second as a storage (node5,
> node6) based on glusterfs (replica 2).
>
> I want the storage to be HA. I have read the following:
> For a replicated volume with two nodes and one brick on each machine, if
> the server-side quorum is enabled and one of the nodes goes offline, the
> other node will also be taken offline because of the quorum configuration.
> As a result, the high availability provided by the replication is
> ineffective. To prevent this situation, a dummy node can be added to the
> trusted storage pool which does not contain any bricks. This ensures that
> even if one of the nodes which contains data goes offline, the other node
> will remain online. Note that if the dummy node and one of the data nodes
> go offline, the brick on the other node will also be taken offline,
> resulting in data unavailability.
>
> So, I have added my "Engine" (not self-hosted) as a dummy node without a
> brick and have configured quorum as listed below:
> cluster.quorum-type: fixed
> cluster.quorum-count: 1
> cluster.server-quorum-type: server
> cluster.server-quorum-ratio: 51%
>
>
> Then I ran a VM and dropped the network link on node6; after about an hour
> I brought the link back, and after a while I got a split-brain. But why? No
> one could write to the brick on node6: the VM was running on node3 and
> node1 was the SPM.
>
>
>
> It could have happened that after node6 came up, the client(s) saw a
> temporary disconnect of node 5 and a write happened at that time. When
> node 5 was connected again, we had AFR xattrs on both nodes blaming each
> other, causing the split-brain. For a replica 2 setup, it is best to set
> the client-quorum to auto instead of fixed. What this means is that the
> first node of the replica must always be up for writes to be permitted. If
> the first node goes down, the volume becomes read-only.
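> A minimal sketch of that change, assuming the volume is still named vol3:
>
>   # switch the client-side quorum to auto; quorum-count is not used in that mode
>   gluster volume set vol3 cluster.quorum-type auto
>   gluster volume reset vol3 cluster.quorum-count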
>
Yes, at first I tested with client-quorum auto, but my VMs were paused when
the first node went down, and this is not acceptable.

OK, I understand now: there is no way to have fault-tolerant storage with
only two servers using GlusterFS. I have to get another one.

Thanks.


> For better availability, it would be better to use a replica 3 volume
> (again with client-quorum set to auto). If you are using glusterfs 3.7,
> you can also consider using the arbiter configuration [1] for replica 3.
>
> [1]
> https://github.com/gluster/glusterfs/blob/master/doc/features/afr-arbiter-volumes.md
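>
> As a rough sketch (the volume name, hosts and brick paths below are only
> placeholders), an arbiter volume on glusterfs 3.7 is created with:
>
>   # the third brick stores only metadata, so it needs very little space
>   gluster volume create datavol replica 3 arbiter 1 \
>       host1:/bricks/b1 host2:/bricks/b2 host3:/bricks/arbiter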
>
> Thanks,
> Ravi
>
>
>  Gluster's log from node6:
> Jun 07 15:35:06 node6.virt.local etc-glusterfs-glusterd.vol[28491]:
> [2015-06-07 12:35:06.106270] C [MSGID: 106002]
> [glusterd-server-quorum.c:356:glusterd_do_volume_quorum_action]
> 0-management: Server quorum lost for volume vol3. Stopping local bricks.
> Jun 07 16:30:06 node6.virt.local etc-glusterfs-glusterd.vol[28491]:
> [2015-06-07 13:30:06.261505] C [MSGID: 106003]
> [glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action]
> 0-management: Server quorum regained for volume vol3. Starting local bricks.
>
>
> gluster> volume heal vol3 info
> Brick node5.virt.local:/storage/brick12/
> /5d0bb2f3-f903-4349-b6a5-25b549affe5f/dom_md/ids - Is in split-brain
>
> Number of entries: 1
>
> Brick node6.virt.local:/storage/brick13/
> /5d0bb2f3-f903-4349-b6a5-25b549affe5f/dom_md/ids - Is in split-brain
>
> Number of entries: 1
>
>
> gluster> volume info vol3
>
> Volume Name: vol3
> Type: Replicate
> Volume ID: 69ba8c68-6593-41ca-b1d9-40b3be50ac80
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: node5.virt.local:/storage/brick12
> Brick2: node6.virt.local:/storage/brick13
> Options Reconfigured:
> storage.owner-gid: 36
> storage.owner-uid: 36
> cluster.server-quorum-type: server
> cluster.quorum-type: fixed
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> auth.allow: *
> user.cifs: disable
> nfs.disable: on
> performance.readdir-ahead: on
> cluster.quorum-count: 1
> cluster.server-quorum-ratio: 51%
>
>
>
> On 06.06.2015 12:09, Юрий Полторацкий wrote:
>
> Hi,
>
>  I want to build HA storage based on two servers. I want that if one
> goes down, my storage will remain available in RW mode.
> 
>  If I use replica 2, then split-brain can occur. To avoid this I would
> use a quorum. If I understand correctly, I can use quorum on the client
> side, on the server side, or on both. I want to add a dummy node without
> a brick.

Re: [ovirt-users] Comma Seperated Value doesn't accept by custom properties

2015-06-07 Thread Punit Dambiwal
Hi Dan,

The results are below :-

[root@mgmt ~]# engine-config -g UserDefinedVMProperties
UserDefinedVMProperties:  version: 3.0
UserDefinedVMProperties:  version: 3.1
UserDefinedVMProperties:  version: 3.2
UserDefinedVMProperties:  version: 3.3
UserDefinedVMProperties:  version: 3.4
UserDefinedVMProperties: noipspoof=^[0-9.]*$ version: 3.5
[root@mgmt ~]#

I used the following command to set this :-

[root@mgmt ~]# engine-config -s
"UserDefinedVMProperties=noipspoof=^[0-9.]*$"


On Fri, Jun 5, 2015 at 10:16 PM, Dan Kenigsberg  wrote:

> On Fri, Jun 05, 2015 at 03:28:11PM +0800, Punit Dambiwal wrote:
> > Hi,
> >
> > I have installed the noipspoof VDSM hook... the hook documentation says
> > that you can use multiple IP addresses as a comma-separated list... but
> > it's not working...
> >
> > A single IP works without any issue... but I cannot add multiple IP
> > addresses in this field...
>
> what is your
>
>  engine-config -g UserDefinedVMProperties
>
> the regexp there should accept commas, too.
>


Re: [ovirt-users] Why not bond0

2015-06-07 Thread Alan Murrell


On 06/04/2015 03:23 AM, Dan Kenigsberg wrote:
> Do not use tlb or alb in a bridge, ever! It does not work, that's it.
> The reason is that it mangles source MACs in transmitted frames and ARPs.
> When it is possible, just use mode 4 (LACP). That should always be
> possible because all enterprise switches support it. Generally, for 99% of
> use cases, you *should* use mode 4. There is no reason to use other modes.
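>
> For reference, a host-side mode 4 bond is typically defined along these
> lines (a sketch; device names and the miimon value are just examples):
>
>   # /etc/sysconfig/network-scripts/ifcfg-bond0
>   DEVICE=bond0
>   BONDING_OPTS="mode=4 miimon=100"
>   BRIDGE=ovirtmgmt
>   ONBOOT=yes
>   BOOTPROTO=none
>
>   # /etc/sysconfig/network-scripts/ifcfg-em1 (one such file per slave NIC)
>   DEVICE=em1
>   MASTER=bond0
>   SLAVE=yes
>   ONBOOT=yes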

I think 99% may be a bit high... what if one wants to split the bond
between two or more non-stacked switches, which I would imagine is a
pretty common scenario?  My understanding is that if mode 4 (LACP) is used,
all the interfaces need to be connected to the same switch, no?

Regards,

Alan



[ovirt-users] VM Clone/Export vanishes PostgreSQL Database in VM

2015-06-07 Thread Matt .
Hi Guys,

I have done some tests with a VM running Foreman, which uses PostgreSQL.

Every time I clone a working VM, or export it and power it on again, the
Foreman database is almost truncated.

I have checked everything: snapshots, etc.

Now it has happened on a running Foreman 1.8.1 VM without any snapshots
being taken.

Is there something wrong with oVirt here?

I'm getting tired of restoring from backups.

Any idea is welcome.

Thanks,

Matt


Re: [ovirt-users] HA storage based on two nodes with one point of failure

2015-06-07 Thread Юрий Полторацкий

Hi,

I have set up a lab with the config listed below and got an unexpected
result. Could someone please tell me where I went wrong?


I am testing oVirt. Data Center has two clusters: the first as a 
computing with three nodes (node1, node2, node3); the second as a 
storage (node5, node6) based on glusterfs (replica 2).


I want the storage to be HA. I have read the following:
For a replicated volume with two nodes and one brick on each machine, if 
the server-side quorum is enabled and one of the nodes goes offline, the 
other node will also be taken offline because of the quorum 
configuration. As a result, the high availability provided by the 
replication is ineffective. To prevent this situation, a dummy node can 
be added to the trusted storage pool which does not contain any bricks. 
This ensures that even if one of the nodes which contains data goes 
offline, the other node will remain online. Note that if the dummy node
and one of the data nodes go offline, the brick on the other node will
also be taken offline, resulting in data unavailability.


So, I have added my "Engine" (not self-hosted) as a dummy node without a 
brick and have configured quorum as listed below:

cluster.quorum-type: fixed
cluster.quorum-count: 1
cluster.server-quorum-type: server
cluster.server-quorum-ratio: 51%
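
For reference, this is roughly how such a setup is applied (a sketch; the
dummy host name is only a placeholder, and server-quorum-ratio is a
cluster-wide option set on "all"):

# add the brick-less dummy node to the trusted storage pool
gluster peer probe engine.virt.local

# per-volume quorum options
gluster volume set vol3 cluster.quorum-type fixed
gluster volume set vol3 cluster.quorum-count 1
gluster volume set vol3 cluster.server-quorum-type server

# cluster-wide option
gluster volume set all cluster.server-quorum-ratio 51%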


Then I ran a VM and dropped the network link on node6; after about an hour
I brought the link back, and after a while I got a split-brain. But why? No
one could write to the brick on node6: the VM was running on node3 and
node1 was the SPM.


Gluster's log from node6:
Jun 07 15:35:06 node6.virt.local etc-glusterfs-glusterd.vol[28491]: 
[2015-06-07 12:35:06.106270] C [MSGID: 106002] 
[glusterd-server-quorum.c:356:glusterd_do_volume_quorum_action] 
0-management: Server quorum lost for volume vol3. Stopping local bricks.
Jun 07 16:30:06 node6.virt.local etc-glusterfs-glusterd.vol[28491]: 
[2015-06-07 13:30:06.261505] C [MSGID: 106003] 
[glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action] 
0-management: Server quorum regained for volume vol3. Starting local bricks.



gluster> volume heal vol3 info
Brick node5.virt.local:/storage/brick12/
/5d0bb2f3-f903-4349-b6a5-25b549affe5f/dom_md/ids - Is in split-brain

Number of entries: 1

Brick node6.virt.local:/storage/brick13/
/5d0bb2f3-f903-4349-b6a5-25b549affe5f/dom_md/ids - Is in split-brain

Number of entries: 1


gluster> volume info vol3

Volume Name: vol3
Type: Replicate
Volume ID: 69ba8c68-6593-41ca-b1d9-40b3be50ac80
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: node5.virt.local:/storage/brick12
Brick2: node6.virt.local:/storage/brick13
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
cluster.server-quorum-type: server
cluster.quorum-type: fixed
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
auth.allow: *
user.cifs: disable
nfs.disable: on
performance.readdir-ahead: on
cluster.quorum-count: 1
cluster.server-quorum-ratio: 51%



On 06.06.2015 12:09, Юрий Полторацкий wrote:

Hi,

I want to build HA storage based on two servers. I want that if one
goes down, my storage remains available in RW mode.


If I use replica 2, then split-brain can occur. To avoid this I would
use a quorum. If I understand correctly, I can use quorum on the client
side, on the server side, or on both. I want to add a dummy node without
a brick and use the following config:


cluster.quorum-type: fixed
cluster.quorum-count: 1
cluster.server-quorum-type: server
cluster.server-quorum-ratio: 51%

I expect that the client will have RW access as long as at least one brick
is alive. On the other hand, if the server quorum is not met, the bricks
will become RO.


Say, HOST1 with a brick BRICK1, HOST2 with a brick BRICK2, and HOST3 
without a brick.


Once HOST1 loses its network connection, server quorum is no longer met on
that node and the brick BRICK1 stops accepting writes. But on HOST2 there
is no problem with server quorum (HOST2 + HOST3 > 51%), so BRICK2 is still
writable. There is no problem with client quorum either: one brick is
alive, so the client can write to it.


I have made a lab using KVM on my desktop, and it seems to work as
expected.


The main question is:
Can I use such storage in production?

Thanks.





[ovirt-users] Need Advice on Hosted Engine installation

2015-06-07 Thread wodel youchi
Hi,

We are installing a new oVirt platform for a client. This is our first
deployment.

We chose the Hosted Engine install.

Here is the hardware and the network configuration we used; be patient :)

We have two hypervisors for now.

Each server has 8 NIC ports; we will use bonding.

A NAS with NFSv4 for the engine VM, the ISO domain, and the export domain.

And an iSCSI SAN for the rest of the VMs (data domain).

At first we chose the following network configuration:

10.0.0.X/16 as the DMZ (ovirtmgmt will use this network)

172.16.1.0/24 as the storage network; we've configured our NFS server to
export only on this network.

-
We didn't configure the bonding manually at first; we thought that we could
do it once the engine was up, but we couldn't :-(

For ovirtmgmt, if we create the bond after the engine's installation, we
cannot modify the configuration; we get the error message "network in use".

For the storage network, we couldn't attach this logical network to the NIC
used for storage, because when we do that we lose the connection to NFS
for the engine VM...

So we had to export the engine VM's NFS share on the DMZ, and we had to
configure bonding before starting the hosted engine installation.

We also configured bonding on the storage network, but after the engine
installation we couldn't attach this bond to the Storage logical network;
we got the error message "you have to specify network address if static
protocol is used", even though the IP address is specified: it's the IP
address of the bond... :-(

---
Questions:

- Did we miss anything about the network configuration capabilities of
oVirt, or is the hosted engine really a special case?
- What is the best way to configure oVirt's Hosted Engine storage?
- For the rest of the network, do we have to configure bonding and bridging
from the GUI only (not manually)?

thanks in advance.


Re: [ovirt-users] [Centos7.1] [Ovirt 3.5.2] hosted engine, is there a way to resume deploy

2015-06-07 Thread Yedidyah Bar David
- Original Message -
> From: "wodel youchi" 
> To: "users" 
> Sent: Sunday, June 7, 2015 3:17:50 PM
> Subject: [ovirt-users] [Centos7.1] [Ovirt 3.5.2] hosted engine,   is 
> there a way to resume deploy
> 
> Hi,
> 
> I tried to deploy ovirt hosted engine 3.5.2 on a Centos7.1
> 
> I messed things up with the datacenter naming: I used something other than
> Default, and the result was that, after the DB welcome message between the
> engine and the hypervisor, there was an error (I didn't catch it; I was
> using the screen command over ssh without a log :-( ),

You should be able to find it in the log, in /var/log/ovirt-hosted-engine-setup.

> and the last steps were not done, so I ended up with the engine up but
> without the hypervisor being registered.
> 
> Is there a way to force the registration again, or will I have to deploy
> from the beginning and reinstall the engine VM?

This depends on your exact status, including whether there is anything in the
engine db and in the hosted-engine metadata file.

What's the output of 'hosted-engine --vm-status' on the host?

You can try to just run 'hosted-engine --deploy' again, replying 'no' to
'is this an additional host deploy?'.

If the engine db is "dirty", you can run engine-cleanup/engine-setup again
inside the engine VM, prior to trying the deploy again.

There is currently no simple way to clean up the metadata file; see [1] for
that.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1116469
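
Put together, the attempt sketched above would look roughly like this (the
exact order depends on your status):

# on the host
hosted-engine --vm-status

# if the engine db is dirty, inside the engine VM first:
engine-cleanup
engine-setup

# then, back on the host, answering 'no' to the additional-host question:
hosted-engine --deploy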

Best,
-- 
Didi


[ovirt-users] [Centos7.1] [Ovirt 3.5.2] hosted engine, is there a way to resume deploy

2015-06-07 Thread wodel youchi
Hi,

I tried to deploy ovirt hosted engine 3.5.2 on a Centos7.1

I messed things up with the datacenter naming: I used something other than
Default, and the result was that, after the DB welcome message between the
engine and the hypervisor, there was an error (I didn't catch it; I was
using the screen command over ssh without a log :-( ), and the last steps
were not done, so I ended up with the engine up but without the hypervisor
being registered.

Is there a way to force the registration again, or will I have to deploy
from the beginning and reinstall the engine VM?

thanks.


Re: [ovirt-users] Comma Seperated Value doesn't accept by custom properties

2015-06-07 Thread Eli Mesika


- Original Message -
> From: "Dan Kenigsberg" 
> To: "Punit Dambiwal" 
> Cc: users@ovirt.org
> Sent: Friday, June 5, 2015 5:16:43 PM
> Subject: Re: [ovirt-users] Comma Seperated Value doesn't accept by custom 
> properties
> 
> On Fri, Jun 05, 2015 at 03:28:11PM +0800, Punit Dambiwal wrote:
> > Hi,
> > 
> > I have installed the noipspoof VDSM hook... the hook documentation says
> > that you can use multiple IP addresses as a comma-separated list... but
> > it's not working...
> > 
> > A single IP works without any issue... but I cannot add multiple IP
> > addresses in this field...
> 
> what is your
> 
>  engine-config -g UserDefinedVMProperties
> 
> the regexp there should accept commas, too.

As far as I can see, the separator is ";" and not ",".

packaging/etc/engine-config/engine-config.properties:UserDefinedVMProperties.description="User defined VM properties"
packaging/etc/engine-config/engine-config.properties:UserDefinedVMProperties.type=UserDefinedVMProperties
packaging/etc/engine-config/engine-config.properties:UserDefinedVMProperties.mergable=true
packaging/etc/engine-config/engine-config.properties:UserDefinedVMProperties.delimiter=;
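
Multiple custom properties inside one UserDefinedVMProperties value are
separated by ';', e.g. (a sketch; the second property is only a
placeholder):

[root@mgmt ~]# engine-config -s "UserDefinedVMProperties=noipspoof=^[0-9.,]*$;extraprop=^.*$"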





Re: [ovirt-users] Remove default cluster

2015-06-07 Thread Omer Frenkel


- Original Message -
> From: "Nicolas Ecarnot" 
> To: Users@ovirt.org
> Sent: Friday, June 5, 2015 3:48:11 PM
> Subject: [ovirt-users] Remove default cluster
> 
> Hello,
> 
> I finished upgrading a 3.5.1 DC that contained some CentOS 6.6 hosts.
> For that, I added a second cluster in which I progressively moved my
> upgraded hosts into CentOS 7.
> 
> Now my old cluster is empty, and the new one contains my CentOS 7 hosts.
> 
> I'd like to get rid of the old empty cluster, but when trying to delete
> it, oVirt explains that some templates are still used in this cluster.
> Actually, I have only the default Blank template and another custom one.
> While the custom one was easily moved, I have no way to move the Blank
> one because every action is greyed out.
> 
> I guess no one will tell me to play with psql... :)
> 

no.. just wait for 3.6 :)

in 3.6 the blank template is not associated with a cluster:
Bug 1145002 - When Default datacenter deleted, cannot remove Default cluster or associated template


> --
> Nicolas ECARNOT