Thank you for the advice and help.

I do plan on moving to 10Gbps networking; I haven't quite jumped off that
cliff yet, though.

I did put my data-hdd volume (the main VM storage volume) onto a dedicated
1Gbps network, and I've watched throughput on that and never seen more than
60MB/s achieved (as reported by bwm-ng).  I have a separate 1Gbps network
for communication and oVirt migration, but I wanted to break that up
further (separate VM traffic from migration/management traffic).  My three
SSD-backed gluster volumes still run on the main network, as I haven't been
able to move them to the new network (which I was trying to dedicate
entirely to gluster).  I tried bonding, but that seemed to reduce
performance rather than improve it.
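
For reference, the usual way I understand gluster traffic gets pinned to a
dedicated network is to give each peer a storage-only hostname that
resolves to its storage-network IP and to reference those names for the
bricks.  Everything below is illustrative (hostnames, addresses, brick
paths and the interface name are placeholders), and it sketches the
new-volume case; moving an existing volume's bricks is more involved:

    # /etc/hosts on every node (placeholder names and addresses)
    10.10.10.1   gluster1-stor
    10.10.10.2   gluster2-stor
    10.10.10.3   gluster3-stor

    # peers probed and bricks created against the storage hostnames
    gluster peer probe gluster2-stor
    gluster peer probe gluster3-stor
    gluster volume create data-hdd replica 3 \
        gluster1-stor:/bricks/data-hdd/brick \
        gluster2-stor:/bricks/data-hdd/brick \
        gluster3-stor:/bricks/data-hdd/brick

    # confirm traffic is actually leaving on the storage interface
    bwm-ng -I em2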

--Jim

On Fri, Jul 6, 2018 at 2:52 PM, Jamie Lawrence <jlawre...@squaretrade.com>
wrote:

> Hi Jim,
>
> I don't have any targeted suggestions, because there isn't much to latch
> on to. I can say that Gluster replica three (no arbiters) on dedicated
> servers serving a couple of oVirt VM clusters here has not had these sorts
> of issues.
>
> I suspect your long heal times (and the resultant long periods of high
> load) are at least partly related to 1G networking. That is just a matter
> of IO - heals of VMs involve moving a lot of bits. My cluster uses 10G
> bonded NICs on the gluster and ovirt boxes for storage traffic and separate
> bonded 1G for ovirtmgmt and communication with other machines/people, and
> we're occasionally hitting the bandwidth ceiling on the storage network.
> I'm starting to think about 40/100G, different ways of splitting up
> intensive systems, and considering iSCSI for specific volumes, although I
> really don't want to go there.
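>
> As a quick sanity check, it's also worth measuring what the storage links
> actually deliver between two gluster peers with something like iperf3,
> before and after any changes (the hostname below is a placeholder):
>
>     # on one peer
>     iperf3 -s
>     # on another peer, aimed at the first peer's storage-network address
>     iperf3 -c gluster1-stor -t 30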
>
> I don't run FreeNAS[1], but I do run FreeBSD storage servers for its
> excellent ZFS implementation, mostly for backups. ZFS will make your `heal`
> problem go away, but not your bandwidth problems, which become worse
> (because of fewer NICs pushing traffic). 10G hardware is not exactly
> impulse-buy territory, but if you can, I'd recommend doing some testing
> using it. I think at least some of your problems are bandwidth-related.
>
> If that's not possible, my next stop would be optimizing everything I
> could about sharding and healing (the shard size in particular) to squeeze
> as much performance out of 1G as I could, but that will only go so far.
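>
> For example, the knobs I'd look at first are along these lines (treat the
> values as illustrative starting points rather than recommendations; note
> that shard-block-size only applies to newly written files):
>
>     gluster volume set data-hdd cluster.shd-max-threads 2
>     gluster volume set data-hdd cluster.data-self-heal-algorithm full
>     gluster volume set data-hdd features.shard on
>     gluster volume set data-hdd features.shard-block-size 64MB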
>
> -j
>
> [1] FreeNAS is just a storage-tuned FreeBSD with a GUI.
>
> > On Jul 6, 2018, at 1:19 PM, Jim Kusznir <j...@palousetech.com> wrote:
> >
> > hi all:
> >
> > Once again my production oVirt cluster is collapsing in on itself.  My
> servers are intermittently unavailable or degraded, and customers are
> noticing and calling in.  This seems to be yet another gluster failure
> that I haven't been able to pin down.
> >
> > I posted about this a while ago, but didn't get anywhere (no replies
> that I found).  The problem started out as a glusterfsd process consuming
> large amounts of RAM (up to the point where RAM and swap were exhausted
> and the kernel OOM killer killed off the glusterfsd process).  For reasons
> not clear to me at this time, that resulted in any VMs running on that
> host and that gluster volume being paused with an I/O error (the glusterfs
> process is usually unharmed; why it didn't continue I/O with the other
> servers is confusing to me).
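> >
> > A generic way to keep an eye on this, for anyone else hitting it, is to
> > watch glusterfsd's resident memory and the kernel log for OOM kills; for
> > example:
> >
> >     # per-brick glusterfsd memory usage (RSS, in KB)
> >     ps -C glusterfsd -o pid,rss,args
> >     # check whether the OOM killer has fired recently
> >     dmesg -T | grep -i 'out of memory'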
> >
> > I have 3 servers and a total of 4 gluster volumes (engine, iso, data,
> and data-hdd).  The first 3 are replica 2+arbiter; the 4th (data-hdd) is
> replica 3.  The first 3 are backed by an LVM partition (some thin
> provisioned) on an SSD; the 4th is on a Seagate hybrid disk (HDD plus some
> internal flash for acceleration).  data-hdd is the only thing on that
> disk.  Servers are Dell R610s with the PERC 6/i RAID card, with the disks
> individually passed through to the OS (no RAID enabled).
> >
> > The above RAM usage issue came from the data-hdd volume.  Yesterday, I
> caught one of the glusterfsd processes at high RAM usage before the OOM
> killer had to run.  I was able to migrate the VMs off the machine and, for
> good measure, reboot the entire machine (after taking the opportunity to
> run the software updates that ovirt said were pending).  Upon booting back
> up, the necessary volume healing began.  However, this time the healing
> caused all three servers to go to very, very high load averages (I saw
> just under 200 on one server; typically they've been 40-70), with top
> reporting I/O wait at 7-20%.  The network for this volume is a dedicated
> gigabit network.  According to bwm-ng, the network bandwidth would
> initially hit 50MB/s (yes, bytes), but tailed off to mostly kB/s for a
> while.  All machines' load averages were still 40+, and "gluster volume
> heal data-hdd info" reported 5 items needing healing.  Servers were
> intermittently experiencing I/O issues, even on the 3 gluster volumes that
> appeared largely unaffected.  Even OS activities on the hosts themselves
> (logging in, running commands) would often be very delayed.  The ovirt
> engine was seemingly randomly throwing engine down / engine up / engine
> failed notifications.  Responsiveness on ANY VM was horrific most of the
> time, with random VMs being inaccessible.
> >
> > I let the gluster heal run overnight.  By morning, there were still 5
> items needing healing, all three servers were still experiencing high load,
> and servers were still largely unstable.
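> >
> > For reference, the heal state above is what I'm watching via commands
> > like these (the summary and split-brain variants may depend on the
> > gluster version):
> >
> >     gluster volume heal data-hdd info
> >     gluster volume heal data-hdd info summary
> >     gluster volume heal data-hdd info split-brain
> >     gluster volume status data-hdd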
> >
> > I've noticed that all of my ovirt outages (and I've had a lot, way more
> than is acceptable for a production cluster) have come from gluster.  I
> still have 3 VMs whose hard disk images were corrupted by my last gluster
> crash and that I haven't had time to repair / rebuild yet (I believe that
> crash was caused by the OOM issue mentioned above, but I didn't know it at
> the time).
> >
> > Is gluster really ready for production yet?  It seems so unstable to
> me....  I'm looking at replacing gluster with a dedicated NFS server,
> likely FreeNAS.  Any suggestions?  What is the "right" way to do
> production storage on this 3-node cluster?  Can I get this gluster volume
> stable enough to get my VMs running reliably again until I can deploy
> another storage solution?
> >
> > --Jim
>
>
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZATIAASOH3PSS2FBA4K5ANJKAV6AM476/
