Thanks for the info, Karli. I wasn’t aware ZFS dedup was such a dog; I guess I’ll leave that off. My data gets 3.5:1 savings on compression alone. I was aware of striped sets. I will be doing 6x mirror sets striped across 12x disks.
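For reference, this is roughly the pool layout I have in mind on each node: six two-way mirror vdevs striped together, LZ4 on, and dedup left at its default of off. Pool and disk names below are just placeholders:

    zpool create tank \
        mirror sda sdb \
        mirror sdc sdd \
        mirror sde sdf \
        mirror sdg sdh \
        mirror sdi sdj \
        mirror sdk sdl

    # LZ4 is nearly free and my data compresses well
    zfs set compression=lz4 tank
    # dedup is already off by default; just being explicit
    zfs set dedup=off tank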
On top of this design I’m going to try and test Intel Optane DIMM (512GB) as a “Tier” for GlusterFS to try and get further write acceleration. Any issues with GlusterFS “Tier” functionality that anyone is aware of?
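For concreteness, here’s the rough shape of what I’m planning to test, assuming a Gluster version that still ships the tier feature (volume name, hostnames, and brick paths are made up):

    # replica 3 volume with one ZFS-backed brick per node
    gluster volume create gv0 replica 3 \
        node1:/tank/brick node2:/tank/brick node3:/tank/brick
    gluster volume start gv0

    # attach an Optane-backed hot tier, one brick per node
    gluster volume tier gv0 attach replica 3 \
        node1:/optane/hot node2:/optane/hot node3:/optane/hot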
Thank you,
Cody Hill

> On Apr 18, 2019, at 2:32 AM, Karli Sjöberg <ka...@inparadise.se> wrote:
>
> On 17 Apr 2019, at 16:30, Cody Hill <c...@platform9.com> wrote:
>> Hey folks.
>>
>> I’m looking to deploy GlusterFS to host some VMs. I’ve done a lot of reading
>> and would like to implement Deduplication and Compression in this setup. My
>> thought would be to run ZFS to handle the Compression and Deduplication.
>
> You _really_ don't want ZFS doing dedup for any reason.
>
>> ZFS would give me the following benefits:
>> 1. If a single disk fails, rebuilds happen locally instead of over the network
>> 2. ZIL & L2ARC should add a slight performance increase
>
> Adding two really good NVMe SSDs as a mirrored SLOG vdev does a great deal
> for synchronous write performance, turning every random write into large
> streams that the spinning drives handle better.
>
> Don't know how picky Gluster is about synchronicity though; most
> "performance" tweaking suggests setting stuff to async, which I wouldn't
> recommend. It's a huge boost for throughput obviously, not having to wait
> for stuff to actually get written, but it's dangerous.
>
> With mirrored NVMe SLOGs, you could probably get that throughput without
> going asynchronous, which saves you from potential data corruption in a
> sudden power loss.
>
> L2ARC on the other hand does a bit for read latency, but for a general
> purpose file server, in practice, not a huge difference; the working set is
> just too large. Also keep in mind that L2ARC isn't "free". You need more RAM
> to know where you've cached stuff...
>
>> 3. Deduplication and Compression are inline and have pretty good performance
>> with modern hardware (Intel Skylake)
>
> ZFS deduplication has terrible performance. Watch your throughput
> automatically drop from hundreds or thousands of MB/s down to, like, 5. It's
> a feature ;)
>
>> 4. Automated Snapshotting
>>
>> I can then layer GlusterFS on top to handle distribution to allow 3x Replicas
>> of my storage.
>> My question is… Why aren’t more people doing this? Is this a horrible idea
>> for some reason that I’m missing?
>
> While it could save a lot of space in some hypothetical instance, the
> drawbacks can never motivate it. E.g. if you want one node to suddenly die
> and never recover because of RAM exhaustion, go with ZFS dedup ;)
>
>> I’d be very interested to hear your thoughts.
>
> Avoid ZFS dedup at all costs. LZ4 compression on the other hand is awesome;
> definitely use that! It's basically a free performance enhancer that also
> saves space :)
>
> As another person has said, the best performance layout is RAID10 (striped
> mirrors). I understand you'd want to get as much volume as possible with
> RAID-Z/RAID(5|6) since Gluster also replicates/distributes, but it has a huge
> impact on IOPS. If performance is the main concern, do striped mirrors with
> replica 3 in Gluster. My advice is to test thoroughly with different pool
> layouts to see what gives acceptable performance against your volume
> requirements.
>
> /K
>
>> Additional thoughts:
>> I’d like to use Ganesha pNFS to connect to this storage. (Any issues here?)
>> I think I’d need KeepAliveD across these 3x nodes to store in the FSTAB (Is
>> this correct?)
>> I’m also thinking about creating a “Gluster Tier” of 512GB of Intel Optane
>> DIMM to really smooth out write latencies… Any issues here?
>>
>> Thank you,
>> Cody Hill
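P.S. To clarify the KeepAliveD question above: the plan is to float a single virtual IP across the three nodes and mount through it, so clients only ever see one address. The fstab entry I’m picturing looks something like this (the VIP hostname and export path are made up, and pNFS requires NFSv4.1, hence minorversion=1):

    vip.storage.example:/gv0  /mnt/gv0  nfs4  minorversion=1,_netdev  0 0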
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users