Re: [ovirt-users] Good practices

Erekle Magradze Mon, 07 Aug 2017 14:12:07 -0700

Hi Fernando,

Indeed, having an arbiter node is always a good idea, and it saves a lot of cost.


Good luck with your setup.

Cheers

Erekle


On 07.08.2017 23:03, FERNANDO FREDIANI wrote:

Thanks for the detailed answer, Erekle.
I conclude that it is worth having an arbiter node in any scenario, in order to avoid wasting more disk space on RAID X + Gluster replication on top of it. The cost seems much lower if you consider the running costs of the whole storage and compare them with the cost of building the arbiter node. Even a fully redundant arbiter service with 2 nodes would be worth it on a larger deployment.
Regards
Fernando

On 07/08/2017 17:07, Erekle Magradze wrote:
Hi Fernando (sorry for misspelling your name, I used a different keyboard),
So let's go with the following scenarios:
1. Let's say you have two servers (replication factor 2), i.e. two bricks per volume. In this case it is strongly recommended to have an arbiter node, the metadata storage that guards against split-brain situations. The arbiter doesn't even need a disk with lots of space; a tiny SSD is enough, but it must be hosted on a separate server. The advantage of such a setup is that you don't need RAID 1 for each brick: the metadata is stored on the arbiter node and brick replacement is easy.
2. If you have an odd number of bricks (let's say 3, i.e. replication factor 3) in your volume and you neither created an arbiter node nor configured the quorum, the entire load of keeping the volume consistent rests on all 3 servers. Each of them is important, each brick contains key information, and they need to cross-check each other (that's what people usually do on their first try with Gluster :) ). In this case replacing a brick is a big pain, and RAID 1 is a good option to have. The disadvantage is losing the space and not having the JBOD option; the advantage is that you don't need an additional arbiter node.
3. You have an odd number of bricks and a configured arbiter node. In this case you can easily go with JBOD; however, a good practice would be to have RAID 1 for the arbiter disks (tiny 128GB SSDs are perfectly sufficient for volumes tens of TB in size).
That's basically it
The rest about reliability and setup scenarios you can find in the Gluster documentation; look especially for the quorum and arbiter node configs and options.
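The split-brain argument behind the scenarios above can be illustrated with a small majority-vote sketch. This is a simplified model, not Gluster's actual quorum implementation: the function name `has_quorum` is hypothetical, and each brick (including the arbiter) is assumed to count as exactly one vote.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """Strict majority: more than half of all voters must be reachable."""
    return 2 * reachable_votes > total_votes


# Two data bricks, no arbiter: after a network split each side sees 1 of 2
# votes, so neither side can safely claim to hold the authoritative copy.
print(has_quorum(1, 2))  # False

# Two data bricks plus one arbiter: the side that still reaches the arbiter
# sees 2 of 3 votes and may keep writing; the isolated side sees 1 of 3.
print(has_quorum(2, 3))  # True
print(has_quorum(1, 3))  # False
```

With an odd number of voters at most one side of a split can hold a majority, which is why the tiny arbiter is enough to break the tie between two data bricks.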
Cheers

Erekle
P.S. What I was mentioning regarding good practice is mostly related to the operation of Gluster, not installation or deployment, i.e. not the conceptual understanding of Gluster (conceptually it's a JBOD system).
On 08/07/2017 05:41 PM, FERNANDO FREDIANI wrote:
Thanks for the clarification, Erekle.
However, I am surprised by this way of operating GlusterFS, as it adds another layer of complexity to the system (either a hardware or software RAID) before the Gluster config and increases the system's overall cost.
An important point to consider: in a RAID configuration you already have space 'wasted' in order to build redundancy (either RAID 1, 5, or 6). Then, when you put GlusterFS on top of several RAIDs, the data is replicated again, so you end up with the same data consuming space once within a group of disks and again across several RAIDs, depending on the Gluster configuration you have (in a RAID 1 config with 2x replication the same data is stored 4 times).
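The space argument above can be put into numbers with a short sketch. The helper names `raid_efficiency` and `usable_fraction` are made up for illustration. The model is deliberately rough: it ignores filesystem overhead, and an arbiter brick is counted as zero data replicas because it only stores metadata.

```python
from fractions import Fraction


def raid_efficiency(level: str, disks: int) -> Fraction:
    """Fraction of raw capacity left after RAID redundancy within one brick."""
    if level == "jbod":
        return Fraction(1)                 # no redundancy inside the brick
    if level == "raid1":
        return Fraction(1, disks)          # every disk holds a full mirror copy
    if level == "raid5":
        return Fraction(disks - 1, disks)  # one disk's worth of parity
    if level == "raid6":
        return Fraction(disks - 2, disks)  # two disks' worth of parity
    raise ValueError(f"unknown RAID level: {level}")


def usable_fraction(level: str, disks: int, data_replicas: int) -> Fraction:
    """Usable share of raw capacity once Gluster replication sits on top."""
    return raid_efficiency(level, disks) / data_replicas


print(usable_fraction("raid1", 2, 2))  # 1/4  (same data stored 4 times)
print(usable_fraction("jbod", 1, 2))   # 1/2  (JBOD bricks, 2 data replicas)
print(usable_fraction("raid6", 7, 2))  # 5/14 (7-disk RAID 6, 2 data replicas)
```

The first line reproduces the "4 times" figure: a two-disk mirror under 2x replication leaves a quarter of the raw capacity usable.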
Yet another downside of having a RAID (especially RAID 5 or 6) is that it considerably reduces write speeds: each group of disks ends up with the write speed of a single disk, since all the other disks of that group have to wait for each other to write as well.
Therefore, if Gluster already replicates data, why does it create this big pain you mentioned, if the data is replicated somewhere else and can still be retrieved both to serve clients and to reconstruct the equivalent disk when it is replaced?
Fernando


On 07/08/2017 10:26, Erekle Magradze wrote:
Hi Frenando,
Here is my experience: if you use a single hard drive as a brick for a Gluster volume and it dies, i.e. it becomes inaccessible, it's a huge hassle to discard that brick and exchange it for another one, since Gluster sometimes still tries to access the broken brick, and that causes (at least it did for me) a big pain. Therefore it's better to have a RAID as the brick, i.e. RAID 1 (mirroring) for each brick. In this case, if a disk goes down, you can easily exchange it and rebuild the RAID without going offline, i.e. without switching off the volume, doing brick manipulations and switching it back on.
Cheers

Erekle


On 08/07/2017 03:04 PM, FERNANDO FREDIANI wrote:
For any RAID 5 or 6 configuration I normally follow a simple golden rule which has given good results so far:
- up to 4 disks: RAID 5
- 5 or more disks: RAID 6
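As a small sketch, the rule of thumb above can be written down as follows. The function names are hypothetical, and the lower bound of 3 disks is an added assumption based on the minimum RAID 5 needs, not part of the rule as stated.

```python
def pick_raid_level(disks: int) -> str:
    """Rule of thumb: RAID 5 for up to 4 disks, RAID 6 for 5 or more."""
    if disks < 3:
        # Assumption added here: RAID 5 needs at least 3 disks.
        raise ValueError("need at least 3 disks for RAID 5")
    return "RAID 5" if disks <= 4 else "RAID 6"


def data_disks(disks: int) -> int:
    """Disks' worth of capacity left after parity for the chosen level."""
    parity = 1 if pick_raid_level(disks) == "RAID 5" else 2
    return disks - parity


for n in (3, 4, 5, 7):
    print(n, pick_raid_level(n), data_disks(n))
```

For the 7-HDD servers discussed in this thread the rule picks RAID 6, leaving 5 disks' worth of capacity per server.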
However, I didn't really understand the recommendation to use any RAID with GlusterFS. I always thought that GlusterFS likes to work in JBOD mode and control the disks (bricks) directly, so you can create whatever distribution rule you wish, and if a single disk fails you just replace it and its data is replicated back from another brick. The only downside of using it this way is that the replication data will flow across all servers, but that is not a big issue.
Can anyone elaborate on using RAID + GlusterFS versus JBOD + GlusterFS?
Thanks
Regards
Fernando


On 07/08/2017 03:46, Devin Acosta wrote:
Moacir,
I have recently installed multiple Red Hat Virtualization hosts for several different companies, and have dealt with the Red Hat Support Team in depth about the optimal configuration for setting up GlusterFS most efficiently, and I wanted to share with you what I learned.
In general the Red Hat Virtualization team frowns upon using each DISK of the system as just a JBOD. Sure, there is some protection from having the data replicated; however, the recommendation is to use RAID 6 (preferred) or RAID 5, or RAID 1 at the very least.
Here is the direct quote from Red Hat when I asked about RAID and bricks:

"A typical Gluster configuration would use RAID underneath the bricks. RAID 6 is most typical as it gives you 2 disk failure protection, but RAID 5 could be used too. Once you have the RAIDed bricks, you'd then apply the desired replication on top of that. The most popular way of doing this would be distributed replicated with 2x replication. In general you'll get better performance with larger bricks. 12 drives is often a sweet spot. Another option would be to create a separate tier using all SSD’s."
In order to do SSD tiering, from my understanding you would need 1 x NVMe drive in each server, or 4 x SSD for the hot tier (it needs to be distributed-replicated for the hot tier if not using NVMe). So with you only having 1 SSD drive in each server, I’d suggest maybe looking into the NVMe option.

Since you're using only 3 servers, what I’d probably suggest is to do (2 Replicas + Arbiter Node). This setup doesn’t require the 3rd server to have big drives at all, as it only stores metadata about the files and not a full copy.
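
A minimal sketch of what the "2 replicas + arbiter" layout looks like on the Gluster command line. It only builds and prints the `gluster volume create ... replica 3 arbiter 1 ...` command; it does not run anything. The volume name, host names and brick paths are hypothetical placeholders, and the helper `arbiter_volume_create_cmd` is made up for illustration.

```python
def arbiter_volume_create_cmd(volname, data_bricks, arbiter_brick):
    """Build the argv for a volume with two data bricks and one arbiter brick."""
    if len(data_bricks) != 2:
        raise ValueError("expected exactly two data bricks")
    # With 'replica 3 arbiter 1', the last brick of each set of three
    # is the arbiter, so it has to come after the two data bricks.
    return ["gluster", "volume", "create", volname,
            "replica", "3", "arbiter", "1",
            *data_bricks, arbiter_brick]


cmd = arbiter_volume_create_cmd(
    "vmstore",                                    # hypothetical volume name
    ["node1:/gluster/brick1/vmstore",             # hypothetical data bricks
     "node2:/gluster/brick1/vmstore"],
    "node3:/gluster/arbiter/vmstore",             # hypothetical arbiter brick
)
print(" ".join(cmd))
```

Brick order matters in this form, which is why the helper takes the arbiter brick as a separate argument instead of one flat list.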

Please see the attached document that was given to me by Red Hat to get more information on this. Hope this information helps you.

--

Devin Acosta, RHCA, RHVCA
Red Hat Certified Architect
On August 6, 2017 at 7:29:29 PM, Moacir Ferreira (moacirferre...@hotmail.com) wrote:
I am planning to assemble an oVirt "pod" made of 3 servers, each with 2 CPU sockets of 12 cores, 256GB RAM, 7 HDD 10K, 1 SSD. The idea is to use GlusterFS to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and a dual 10Gb NIC. So my intention is to create a loop like a server triangle, using the 40Gb NICs for access to the virtualization files (VM .qcow2 images) and to move VMs around the pod (east/west traffic), while using the 10Gb interfaces for providing services to the outside world (north/south traffic).
That said, my first question is: how should I deploy GlusterFS in such an oVirt scenario? My questions are:
1 - Should I create 3 RAIDs (e.g. RAID 5), one on each oVirt node, and then create a GlusterFS volume using them?
2 - Instead, should I create a JBOD array made of all the servers' disks?
3 - What is the best Gluster configuration to provide HA while not consuming too much disk space?
4 - Does an oVirt hypervisor pod like the one I am planning to build, and the virtualization environment, benefit from tiering when using an SSD disk? And if yes, will Gluster do it by default or do I have to configure it to do so?
Bottom line: what is the good practice for using GlusterFS in small pods for enterprises?
Your opinion/feedback will be really appreciated!

Moacir

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
--
Recogizer Group GmbH

Dr.rer.nat. Erekle Magradze
Lead Big Data Engineering & DevOps
Rheinwerkallee 2, 53227 Bonn
Tel: +49 228 29974555

e-mail: erekle.magra...@recogizer.de
Web: www.recogizer.com
Recogizer on LinkedIn: https://www.linkedin.com/company-beta/10039182/
Follow us on Twitter: https://twitter.com/recogizer
-----------------------------------------------------------------
Recogizer Group GmbH
Managing Directors: Oliver Habisch, Carsten Kreutze
Commercial register: Amtsgericht Bonn HRB 20724
Registered office: Bonn; VAT ID: DE294195993
This e-mail contains confidential and/or legally protected information.
If you are not the intended recipient or have received this e-mail in error,
please inform the sender immediately and delete this mail.
Unauthorized copying and unauthorized distribution of this mail and the
information it contains are not permitted.
