Thanks again for the info. You're probably right about the testing method.
The reason I'm down this path in the first place, though, is that I'm seeing
the problem in real-world workloads. Many of my VMs are used in development
environments where working with small files is common, such as npm installs
churning through large node_modules folders, or CI/CD jobs doing lots of
mixed I/O and compute.

I started testing some of this by comparing two VMs side by side with the
same specs, the only difference being Gluster vs. NFS storage. The
NFS-backed storage is performing about 3x better in real-world use.
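
As a rough sketch, a small-file style test that's closer to those workloads
than the dd one-liner would be something like this (assuming fio is
installed in the VM; /var/tmp/fiotest is just a placeholder path on the
VM's disk):

# mkdir -p /var/tmp/fiotest
# fio --name=smallfile --directory=/var/tmp/fiotest --rw=randwrite \
      --bs=4k --size=128m --numjobs=4 --fsync=1 --group_reporting

Timing an actual npm install of one of our projects on both VMs is probably
the most honest comparison, though.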

The Gluster version is the stock one that ships with oVirt 4.3.7. I haven't
attempted to update it outside of official oVirt updates.
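
For reference, this is how I'd confirm the exact version on one of the
hosts (the package names may differ slightly on oVirt Node):

# gluster --version
# rpm -qa | grep -i gluster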

I'd like to see if I can tune it to handle my workloads better. I also
understand that replication adds overhead.

I do wonder how much of a performance difference there would be between
replica 3 and replica 3 with an arbiter. I'd assume the arbiter setup would
be faster, but perhaps not by a considerable margin.
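
For what it's worth, the current layout of a volume shows up in the volume
info output, e.g.:

# gluster volume info <volname> | grep -E 'Type|Number of Bricks'

If I recall correctly, a plain replica 3 reports "1 x 3 = 3" bricks, while
an arbiter volume reports "1 x (2 + 1) = 3".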

I will check into C-states as well.
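
In case it's useful to anyone else following the thread, here's roughly how
I plan to check them (cpupower comes from kernel-tools; the sysfs path
applies to Intel CPUs using the intel_idle driver):

# cpupower idle-info
# cat /sys/module/intel_idle/parameters/max_cstate

From what I've read, limiting C-states is usually done with kernel
parameters along the lines of 'intel_idle.max_cstate=0
processor.max_cstate=1', or by switching to a latency-performance style
tuned profile.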

On Sat, Mar 7, 2020 at 2:52 AM Strahil Nikolov <hunter86...@yahoo.com>
wrote:

> On March 7, 2020 1:09:37 AM GMT+02:00, Jayme <jay...@gmail.com> wrote:
> >Strahil,
> >
> >Thanks for your suggestions. The config is a pretty standard HCI setup
> >with cockpit, and the hosts are oVirt Node. XFS was handled by the
> >deployment automatically. The gluster volumes were optimized for virt
> >store.
> >
> >I tried noop on the SSDs; that made zero difference in the tests I was
> >running above. I took a look at the random-io profile and it looks like
> >it really only sets vm.dirty_background_ratio = 2 & vm.dirty_ratio = 5 --
> >my hosts already appear to have those sysctl values, and by default are
> >using the virtual-host tuned profile.
> >
> >I'm curious what results a test like "dd if=/dev/zero of=test2.img
> >bs=512 count=1000 oflag=dsync" would show on one of your VMs.
> >
> >I haven't done much with gluster profiling but will take a look and see
> >if I can make sense of it. Otherwise, the setup is a pretty stock oVirt
> >HCI deployment with SSD-backed storage and a 10GbE storage network. I'm
> >not coming anywhere close to maxing network throughput.
> >
> >The NFS export I was testing was from a local server exporting a single
> >SSD (same type as in the oVirt hosts).
> >
> >I might end up switching storage to NFS and ditching gluster if
> >performance is really this much better...
> >
> >
> >On Fri, Mar 6, 2020 at 5:06 PM Strahil Nikolov <hunter86...@yahoo.com>
> >wrote:
> >
> >> On March 6, 2020 6:02:03 PM GMT+02:00, Jayme <jay...@gmail.com> wrote:
> >> >I have a 3-server HCI setup with Gluster replica 3 storage (10GbE and
> >> >SSD disks). Small-file performance inside the VMs is pretty terrible
> >> >compared to a similarly spec'ed VM using an NFS mount (10GbE network,
> >> >SSD disk).
> >> >
> >> >VM with gluster storage:
> >> >
> >> ># dd if=/dev/zero of=test2.img bs=512 count=1000 oflag=dsync
> >> >1000+0 records in
> >> >1000+0 records out
> >> >512000 bytes (512 kB) copied, 53.9616 s, 9.5 kB/s
> >> >
> >> >VM with NFS:
> >> >
> >> ># dd if=/dev/zero of=test2.img bs=512 count=1000 oflag=dsync
> >> >1000+0 records in
> >> >1000+0 records out
> >> >512000 bytes (512 kB) copied, 2.20059 s, 233 kB/s
> >> >
> >> >This is a very big difference: 2 seconds to write 1000 blocks on the
> >> >NFS VM vs. 53 seconds on the other.
> >> >
> >> >Aside from enabling libgfapi, is there anything I can tune on the
> >> >gluster or VM side to improve small-file performance? I have seen
> >> >some guides by Red Hat regarding small-file performance, but I'm not
> >> >sure what, if any, of it applies to oVirt's implementation of gluster
> >> >in HCI.
> >>
> >> You can use the rhgs-random-io tuned profile from
> >> ftp://ftp.redhat.com/redhat/linux/enterprise/7Server/en/RHS/SRPMS/redhat-storage-server-3.4.2.0-1.el7rhgs.src.rpm
> >> and try with that on your hosts.
> >> In my case, I have modified it so it's a mixture between rhgs-random-io
> >> and the profile for Virtualization Host.
> >>
> >> Also, ensure that your bricks are using XFS with the relatime/noatime
> >> mount option and that your scheduler for the SSDs is either 'noop' or
> >> 'none'. The default I/O scheduler for RHEL7 is deadline, which gives
> >> preference to reads, and your workload is definitely 'write'.
> >>
> >> Ensure that the virt settings are enabled for your gluster volumes:
> >> 'gluster volume set <volname> group virt'
> >>
> >> Also, are you running on fully allocated disks for the VMs, or did
> >> you start thin? I'm asking because creation of new shards at the
> >> gluster level is a slow task.
> >>
> >> Have you checked gluster profiling on the volume? It can clarify what
> >> is going on.
> >>
> >>
> >> Also, are you comparing apples to apples? For example, one SSD
> >> mounted and exported as NFS versus a replica 3 volume on the same type
> >> of SSD? If not, the NFS side can have more IOPS due to multiple disks
> >> behind it, while Gluster has to write the same thing on all nodes.
> >>
> >> Best Regards,
> >> Strahil Nikolov
> >>
> >>
>
> Hi Jayme,
>
>
> My tests aren't a great comparison, as I have a different setup:
>
> NVMe - VDO - 4 thin LVs - XFS - 4 Gluster volumes (replica 2 arbiter 1)
> - 4 storage domains - striped LV in each VM
>
> RHEL7 VM (fully stock):
> [root@node1 ~]# dd if=/dev/zero of=test2.img bs=512 count=1000 oflag=dsync
> 1000+0 records in
> 1000+0 records out
> 512000 bytes (512 kB) copied, 19.8195 s, 25.8 kB/s
> [root@node1 ~]#
>
> Brick:
> [root@ovirt1 data_fast]# dd if=/dev/zero of=test2.img bs=512 count=1000
> oflag=dsync
> 1000+0 records in
> 1000+0 records out
> 512000 bytes (512 kB) copied, 1.41192 s, 363 kB/s
>
> As I use VDO with compression (on 1/4 of the NVMe), I can't expect much
> performance from it.
>
>
> Is your app really using dsync? I have seen many times that performance
> testing with the wrong tools/tests causes more trouble than it should.
>
> I would recommend testing with a real workload before deciding to change
> the architecture.
>
> I forgot to mention that you need to disable C-states on your systems if
> you are chasing performance.
> Run a gluster profile while you run a real workload in your VMs and then
> provide that for analysis.
>
> Which version of Gluster are you using ?
>
> Best Regards,
> Strahil Nikolov
>
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/YRGEX37SBEUWNONYSV4KVTOVARHG53LU/
