Thank you for the advice and help. I do plan on going to 10Gbps networking; I haven't quite jumped off that cliff yet, though.
I did put data-hdd (my main VM storage volume) onto a dedicated 1Gbps network, and I've watched throughput on it and never seen more than 60MB/s achieved (as reported by bwm-ng). I have a separate 1Gbps network for communication and oVirt migration, but I wanted to break that up further (separating VM traffic from migration/management traffic). My three SSD-backed gluster volumes still run on the main network too, as I haven't been able to get them to move to the new network (which I was trying to use for all gluster traffic). I tried bonding, but that seemed to reduce performance rather than improve it.
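For what it's worth, the kind of checks I've been poking at look roughly like the following. The interface names are just placeholders, and the commented-out replace-brick line is only a sketch of one possible way to move a brick onto the new network, not something I've actually run here:

    # See which hostname/IP each brick was created with -- gluster keeps
    # talking to that address regardless of what other networks the host has:
    gluster volume info data-hdd | grep -i brick

    # Watch per-interface throughput (in bytes) on the storage and mgmt NICs;
    # "em1" and "em2" are example names only:
    bwm-ng -u bytes -I em1,em2

    # One possible (untried here) way to migrate a brick to a hostname that
    # resolves on the new gluster network, one brick at a time, letting the
    # heal repopulate the new empty brick afterwards:
    # gluster volume replace-brick data-hdd oldname:/bricks/data-hdd \
    #     newname-gluster:/bricks/data-hdd-new commit force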
--Jim

On Fri, Jul 6, 2018 at 2:52 PM, Jamie Lawrence <jlawre...@squaretrade.com> wrote:
> Hi Jim,
>
> I don't have any targeted suggestions, because there isn't much to latch on to. I can say that Gluster replica three (no arbiters) on dedicated servers serving a couple of oVirt VM clusters here has not had these sorts of issues.
>
> I suspect your long heal times (and the resultant long periods of high load) are at least partly related to 1G networking. That is just a matter of IO - heals of VMs involve moving a lot of bits. My cluster uses bonded 10G NICs on the gluster and oVirt boxes for storage traffic and separate bonded 1G for ovirtmgmt and communication with other machines/people, and we're occasionally hitting the bandwidth ceiling on the storage network. I'm starting to think about 40/100G, different ways of splitting up intensive systems, and considering iSCSI for specific volumes, although I really don't want to go there.
>
> I don't run FreeNAS[1], but I do run FreeBSD as storage servers for their excellent ZFS implementation, mostly for backups. ZFS will make your `heal` problem go away, but not your bandwidth problems, which become worse (because of fewer NICs pushing traffic). 10G hardware is not exactly in impulse-buy territory, but if you can, I'd recommend doing some testing with it. I think at least some of your problems are related.
>
> If that's not possible, my next stops would be optimizing everything I could about sharding and healing, and tuning for the shard size, to squeeze as much performance out of 1G as I could, but that will only go so far.
>
> -j
>
> [1] FreeNAS is just a storage-tuned FreeBSD with a GUI.
>
> > On Jul 6, 2018, at 1:19 PM, Jim Kusznir <j...@palousetech.com> wrote:
> >
> > hi all:
> >
> > Once again my production oVirt cluster is collapsing in on itself. My servers are intermittently unavailable or degrading, customers are noticing and calling in. This seems to be yet another gluster failure that I haven't been able to pin down.
> >
> > I posted about this a while ago, but didn't get anywhere (no replies that I found). The problem started out as a glusterfsd process consuming large amounts of RAM (up to the point where RAM and swap were exhausted and the kernel OOM killer killed off the glusterfsd process). For reasons not clear to me at this time, that resulted in any VMs running on that host and that gluster volume being paused with an I/O error (the glusterfs process is usually unharmed; why it didn't continue I/O with the other servers is confusing to me).
> >
> > I have 3 servers and a total of 4 gluster volumes (engine, iso, data, and data-hdd). The first 3 are replica 2 + arbiter; the 4th (data-hdd) is replica 3. The first 3 are backed by an LVM partition (some thin-provisioned) on an SSD; the 4th is on a Seagate hybrid disk (HDD plus some internal flash for acceleration). data-hdd is the only thing on that disk. Servers are Dell R610s with the PERC 6/i RAID card, with the disks individually passed through to the OS (no RAID enabled).
> >
> > The above RAM usage issue came from the data-hdd volume. Yesterday, I caught one of the glusterfsd processes at high RAM usage before the OOM killer had to run. I was able to migrate the VMs off the machine and, for good measure, reboot the entire machine (after taking the opportunity to run the software updates that oVirt said were pending). Upon booting back up, the necessary volume healing began. However, this time the healing caused all three servers to go to very, very high load averages (I saw just under 200 on one server; typically they've been 40-70), with top reporting I/O wait at 7-20%. The network for this volume is a dedicated gigabit network. According to bwm-ng, the network bandwidth would initially hit 50MB/s (yes, bytes), but tailed off to mostly kB/s for a while. All machines' load averages were still 40+, and "gluster volume heal data-hdd info" reported 5 items needing healing. Servers were intermittently experiencing I/O issues, even on the 3 gluster volumes that appeared largely unaffected. Even OS activities on the hosts themselves (logging in, running commands) would often be very delayed. The oVirt engine was seemingly randomly throwing engine down / engine up / engine failed notifications. Responsiveness on ANY VM was horrific most of the time, with random VMs being inaccessible.
> >
> > I let the gluster heal run overnight. By morning, there were still 5 items needing healing, all three servers were still experiencing high load, and servers were still largely unstable.
> >
> > I've noticed that all of my oVirt outages (and I've had a lot, way more than is acceptable for a production cluster) have come from gluster. I still have 3 VMs whose hard disk images were corrupted by my last gluster crash and that I haven't had time to repair/rebuild yet (I believe that crash was caused by the OOM issue previously mentioned, but I didn't know it at the time).
> >
> > Is gluster really ready for production yet? It seems so unstable to me.... I'm looking at replacing gluster with a dedicated NFS server, likely FreeNAS. Any suggestions? What is the "right" way to do production storage on this (3-node cluster)? Can I get this gluster volume stable enough to get my VMs to run reliably again until I can deploy another storage solution?
> >
> > --Jim
> > _______________________________________________
> > Users mailing list -- users@ovirt.org
> > To unsubscribe send an email to users-le...@ovirt.org
> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> > oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> > List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/YQX3LQFQQPW4JTCB7B6FY2LLR6NA2CB3/
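Following up on the sharding/heal tuning angle above: before I change anything I want to record what the volume is currently set to and what the self-heal daemon is doing. Roughly the commands involved (the option names are standard gluster volume options; the example "set" line is illustrative only, not a value I've tested on this cluster):

    # Dump current shard / self-heal related settings for the volume:
    gluster volume get data-hdd all | grep -E 'shard|heal|shd'

    # What the self-heal daemon still has queued:
    gluster volume heal data-hdd info

    # Per-brick memory usage, to watch a glusterfsd process that is growing
    # toward the OOM killer:
    gluster volume status data-hdd mem

    # Illustrative (untested here) tuning -- more heal threads finish heals
    # faster, but add load and bandwidth pressure on a 1G link:
    # gluster volume set data-hdd cluster.shd-max-threads 4

One caveat I'm aware of: features.shard-block-size only applies to files created after it is changed, so existing VM images keep their current shard size.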
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZATIAASOH3PSS2FBA4K5ANJKAV6AM476/