Hi Xavier, and thanks for your answers. The servers will have 26 * 8TB disks. I don't want to lose more than two disks to RAID, so my options are HW RAID6 (24+2) or 2 * HW RAID5 (12+1). In both cases I can create two bricks per server using LVM, and use one brick from each server for each of two distributed-disperse volumes. I will test those configurations when the servers arrive.
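To make that concrete, this is roughly what I have in mind per server (the device names, brick paths and volume names below are only placeholders; I can't verify any of this until the hardware arrives):

  # Two HW RAID devices (e.g. 2 * RAID5 12+1), one VG/LV/brick each
  pvcreate /dev/sda /dev/sdb
  vgcreate vg_brick1 /dev/sda
  vgcreate vg_brick2 /dev/sdb
  lvcreate -l 100%FREE -n lv1 vg_brick1
  lvcreate -l 100%FREE -n lv2 vg_brick2
  mkfs.xfs -i size=512 /dev/vg_brick1/lv1
  mkfs.xfs -i size=512 /dev/vg_brick2/lv2
  mkdir -p /bricks/b1 /bricks/b2
  mount /dev/vg_brick1/lv1 /bricks/b1
  mount /dev/vg_brick2/lv2 /bricks/b2

  # Two distributed-disperse volumes, one brick per server in each,
  # e.g. an 8+1 disperse set over nine servers:
  gluster volume create backup1 disperse 9 redundancy 1 \
      srv{1..9}:/bricks/b1/data
  gluster volume create backup2 disperse 9 redundancy 1 \
      srv{1..9}:/bricks/b2/data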
I can go with 8+1 or 16+2 and will run tests when the servers arrive. But 8+2 would be too much: I would lose nearly 25% of the space in that case. As for the client count: this cluster will receive backups from Hadoop nodes, so there will be at least 750-1000 clients sending data at the same time. Can 16+2 * 3 = 54 gluster nodes handle this, or should I increase the node count?

I will check the parameters you mentioned.

Serkan

On Tue, Oct 13, 2015 at 1:43 PM, Xavier Hernandez <xhernan...@datalab.es> wrote:
> +gluster-users
>
> On 13/10/15 12:34, Xavier Hernandez wrote:
>
>> Hi Serkan,
>>
>> On 12/10/15 16:52, Serkan Çoban wrote:
>>
>>> Hi,
>>>
>>> I am planning to use GlusterFS for backup purposes. I write big files
>>> (>100MB) with a throughput of 2-3GB/s. To save space we plan to use
>>> erasure coding. I have some questions about EC and brick planning:
>>>
>>> - I am planning to use a 200TB XFS/ZFS RAID6 volume to hold one brick
>>> per server. Should I increase the brick count? Does increasing the
>>> brick count also increase performance?
>>
>> Using a distributed-dispersed volume increases performance. You can
>> split each RAID6 volume into multiple bricks to create such a volume.
>> A single brick process cannot achieve the maximum throughput of the
>> disk, so creating multiple bricks improves this. However, having too
>> many bricks could be worse, because in your case all requests would
>> go to the same filesystem and compete with each other.
>>
>> Another thing to consider is the size of the RAID volume. A 200TB RAID
>> will need *a lot* of time to reconstruct after a disk failure. A 200TB
>> RAID also means almost 30 8TB disks, and a RAID6 of 30 disks is quite
>> fragile. It may be better to create multiple RAID6 volumes, each with
>> 18 disks at most (16+2 is a good and efficient configuration,
>> especially for XFS on non-hardware RAID). Even in this configuration
>> you can create multiple bricks in each RAID6 volume.
>>
>>> - I plan to use 16+2 for EC. Is this a problem? Should I decrease it
>>> to 12+2 or 10+2? Or is it completely safe to use whatever we want?
>>
>> 16+2 is a very big configuration. It requires a lot of computation
>> power and forces you to grow (if you need to grow the gluster volume
>> at some point) in multiples of 18 bricks.
>>
>> Considering that you are already using RAID6 in your servers, what you
>> are really protecting against with the disperse redundancy is the
>> failure of the servers themselves. Maybe an 8+1 configuration would be
>> enough for your needs, and it requires less computation. If you really
>> need redundancy 2, 8+2 should be ok.
>>
>> Using a number of data bricks that is not a power of 2 has a
>> theoretical performance impact when applications write blocks whose
>> size is a power of 2 (which is the most common case). This means that
>> a 10+2 could perform worse than an 8+2. However, this depends on many
>> other factors, some internal to gluster, like caching, so the real
>> impact could be almost negligible in some cases. You should test it
>> with your workload.
>>
>>> - I understand that the EC calculation is performed on the client
>>> side. Are there any benchmarks of how EC affects CPU usage? For
>>> example, might each 100MB/s of traffic use one CPU core?
>> I don't have a detailed measurement of CPU usage relative to
>> bandwidth, but we have run some tests that seem to indicate that the
>> CPU overhead caused by disperse is quite small for a 4+2
>> configuration. I don't have access to this data right now; when I
>> have it, I'll send it to you.
>>
>> I will also try to run some tests with 8+2 and 16+2 configurations to
>> see the difference.
>>
>>> - Does the number of clients affect cluster performance? Is there
>>> any difference between 100 clients each writing at 20-30MB/s and
>>> 1000 clients each writing at 2-3MB/s?
>>
>> Increasing the number of clients improves performance, but I wouldn't
>> go over 100 clients, as the overhead of managing all of them could
>> have a negative impact on performance. In our tests, the maximum
>> performance is obtained with ~8 parallel clients (if my memory
>> doesn't fail me).
>>
>> You will also probably want to tweak some volume parameters, like
>> server.event-threads, client.event-threads,
>> performance.client-io-threads and server.outstanding-rpc-limit, to
>> increase performance.
>>
>> Xavi
>>
>>> Thank you for your time,
>>> Serkan
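PS: To check that I understand the power-of-2 point with some numbers (this is only my reading of how disperse stripes data, so please correct me if I'm wrong): each write is split into 512-byte words per data brick, so the stripe size is data-bricks * 512 bytes. With 8+2 that is 8 * 512 = 4096 bytes, which divides the usual power-of-2 write sizes evenly (a 128KiB write is exactly 131072 / 4096 = 32 full stripes). With 10+2 the stripe is 10 * 512 = 5120 bytes, and 131072 / 5120 = 25.6, so such writes always end in a partial stripe that needs a read-modify-write cycle. That would explain why a 10+2 can perform worse than an 8+2.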
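And regarding the parameters you mentioned: I assume they are set per volume, along these lines (using the placeholder volume name backup1 from above; the values are only first guesses for my own tests, not recommendations):

  gluster volume set backup1 server.event-threads 4
  gluster volume set backup1 client.event-threads 4
  gluster volume set backup1 performance.client-io-threads on
  gluster volume set backup1 server.outstanding-rpc-limit 128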
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users