Re: [Gluster-devel] [Gluster-users] BoF - Gluster for VM store use case

2017-11-03 Thread Ramon Selga
Below you can find the three fio commands used to run each benchmark test:
sequential write, random 4k read, and random 4k write.


# fio --name=writefile --size=10G --filesize=10G --filename=fio_file --bs=1M 
--nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers 
--end_fsync=1 --iodepth=200 --ioengine=libaio


# fio --time_based --name=benchmark --size=10G --runtime=30 --filename=fio_file 
--ioengine=libaio --randrepeat=0 --iodepth=128 --direct=1 --invalidate=1 
--verify=0 --verify_fatal=0 --numjobs=4 --rw=randread --blocksize=4k 
--group_reporting


# fio --time_based --name=benchmark --size=10G --runtime=30 --filename=fio_file 
--ioengine=libaio --randrepeat=0 --iodepth=128 --direct=1 --invalidate=1 
--verify=0 --verify_fatal=0 --numjobs=4 --rw=randwrite --blocksize=4k 
--group_reporting
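
Note that all three commands point at the same --filename=fio_file, so the 10G
file laid down by the sequential write test is reused by the two random I/O
tests. To start from a clean slate, the file can simply be removed between runs:

# rm -f fio_file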


And here is the timed extraction of the kernel source. First run:

# time tar xf linux-4.13.11.tar.xz

real    0m8.180s
user    0m5.932s
sys     0m2.924s

Second run, after deleting the first extraction:

# rm -rf linux-4.13.11
# time tar xf linux-4.13.11.tar.xz

real    0m6.454s
user    0m6.012s
sys     0m2.440s
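
For what it's worth, if you want to rule out the page cache when comparing a
first and a second run like this, the usual trick is to drop caches in between
(a general tip, not something done for the numbers above):

# sync; echo 3 > /proc/sys/vm/drop_caches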


On 03/11/17 at 09:33, Gandalf Corvotempesta wrote:

Could you please share the fio command line used for this test?
Additionally, can you tell me the time needed to extract the kernel source?

On 2 Nov 2017, 11:24 PM, "Ramon Selga" wrote:


Hi,

Just for your reference, we got similar values in a customer setup with three
nodes, each with a single Xeon and 4x 8TB HDDs, connected through a dual 10GbE
backbone.

We did a simple benchmark with the fio tool on a 1 TiB virtual disk (virtio),
formatted directly with XFS (no partitions, no LVM), inside a VM (Debian
stretch, dual core, 4 GB RAM) deployed on a Gluster volume configured as
disperse 3, redundancy 1, distributed 2, with sharding enabled.
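
As a point of reference only, a volume with that layout could be created with
something like the following (host names, brick paths, volume name, and shard
size here are illustrative assumptions, not the actual setup):

# gluster volume create vmstore disperse 3 redundancy 1 \
    node1:/bricks/b1/vmstore node2:/bricks/b1/vmstore node3:/bricks/b1/vmstore \
    node1:/bricks/b2/vmstore node2:/bricks/b2/vmstore node3:/bricks/b2/vmstore
# gluster volume set vmstore features.shard on
# gluster volume set vmstore features.shard-block-size 64MB
# gluster volume start vmstore

Six bricks with disperse 3 redundancy 1 give two 2+1 disperse subvolumes
distributed together, which matches the layout described above.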

We ran a sequential write test (a 10 GB file written in 1024k blocks), a random
read test with 4k blocks, and a random write test also with 4k blocks, several
times each, with results very similar to the following:

writefile: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, 
iodepth=200
fio-2.16
Starting 1 process

writefile: (groupid=0, jobs=1): err= 0: pid=11515: Thu Nov  2 16:50:05 2017
  write: io=10240MB, bw=473868KB/s, iops=462, runt= 22128msec
    slat (usec): min=20, max=98830, avg=1972.11, stdev=6612.81
    clat (msec): min=150, max=2979, avg=428.49, stdev=189.96
 lat (msec): min=151, max=2979, avg=430.47, stdev=189.90
    clat percentiles (msec):
 |  1.00th=[  204],  5.00th=[  249], 10.00th=[  273], 20.00th=[  293],
 | 30.00th=[  306], 40.00th=[  318], 50.00th=[  351], 60.00th=[  502],
 | 70.00th=[  545], 80.00th=[  578], 90.00th=[  603], 95.00th=[  627],
 | 99.00th=[  717], 99.50th=[  775], 99.90th=[ 2966], 99.95th=[ 2966],
 | 99.99th=[ 2966]
    lat (msec) : 250=5.09%, 500=54.65%, 750=39.64%, 1000=0.31%, 2000=0.07%
    lat (msec) : >=2000=0.24%
  cpu  : usr=7.81%, sys=1.48%, ctx=1221, majf=0, minf=11
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.2%, 32=0.3%, >=64=99.4%
 submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
 issued    : total=r=0/w=10240/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
 latency   : target=0, window=0, percentile=100.00%, depth=200

Run status group 0 (all jobs):
  WRITE: io=10240MB, aggrb=473868KB/s, minb=473868KB/s, maxb=473868KB/s,
mint=22128msec, maxt=22128msec

Disk stats (read/write):
  vdg: ios=0/10243, merge=0/0, ticks=0/2745892, in_queue=2745884, util=99.18

benchmark: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio,
iodepth=128
...
fio-2.16
Starting 4 processes

benchmark: (groupid=0, jobs=4): err= 0: pid=11529: Thu Nov  2 16:52:40 2017
  read : io=1123.9MB, bw=38347KB/s, iops=9586, runt= 30011msec
    slat (usec): min=1, max=228886, avg=415.40, stdev=3975.72
    clat (usec): min=482, max=328648, avg=52664.65, stdev=30216.00
 lat (msec): min=9, max=527, avg=53.08, stdev=30.38
    clat percentiles (msec):
 |  1.00th=[   12],  5.00th=[   22], 10.00th=[   23], 20.00th=[   25],
 | 30.00th=[   33], 40.00th=[   38], 50.00th=[   47], 60.00th=[   55],
 | 70.00th=[   64], 80.00th=[   76], 90.00th=[   95], 95.00th=[  111],
 | 99.00th=[  151], 99.50th=[  163], 99.90th=[  192], 99.95th=[  196],
 | 99.99th=[  210]
    lat (usec) : 500=0.01%, 750=0.01%, 1000=0.01%
    lat (msec) : 10=0.03%, 20=3.59%, 50=52.41%, 100=36.01%, 250=7.96%
    lat (msec) : 500=0.01%
  cpu  : usr=0.29%, sys=1.10%, ctx=10157, majf=0, minf=549
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.9%
 submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 

Re: [Gluster-devel] [Gluster-users] BoF - Gluster for VM store use case

2017-11-03 Thread Gandalf Corvotempesta
Could you please share the fio command line used for this test?
Additionally, can you tell me the time needed to extract the kernel source?

On 2 Nov 2017, 11:24 PM, "Ramon Selga" wrote:

> Hi,
>
> Just for your reference we got some similar values in a customer setup
> with three nodes single Xeon and 4x8TB HDD each with a double 10GbE
> backbone.
>
> We did a simple benchmark with fio tool on a virtual disk (virtio) of a
> 1TiB of size, XFS formatted directly no partitions no LVM, inside a VM
> (debian stretch, dual core 4GB RAM) deployed in a gluster volume disperse 3
> redundancy 1 distributed 2, sharding enabled.
>
> We run a sequential write test 10GB file in 1024k blocks, a random read
> test with 4k blocks and a random write test also with 4k blocks several
> times with results very similar to the following:
>
> writefile: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio,
> iodepth=200
> fio-2.16
> Starting 1 process
>
> writefile: (groupid=0, jobs=1): err= 0: pid=11515: Thu Nov  2 16:50:05 2017
>   write: io=10240MB, bw=473868KB/s, iops=462, runt= 22128msec
> slat (usec): min=20, max=98830, avg=1972.11, stdev=6612.81
> clat (msec): min=150, max=2979, avg=428.49, stdev=189.96
>  lat (msec): min=151, max=2979, avg=430.47, stdev=189.90
> clat percentiles (msec):
>  |  1.00th=[  204],  5.00th=[  249], 10.00th=[  273], 20.00th=[  293],
>  | 30.00th=[  306], 40.00th=[  318], 50.00th=[  351], 60.00th=[  502],
>  | 70.00th=[  545], 80.00th=[  578], 90.00th=[  603], 95.00th=[  627],
>  | 99.00th=[  717], 99.50th=[  775], 99.90th=[ 2966], 99.95th=[ 2966],
>  | 99.99th=[ 2966]
> lat (msec) : 250=5.09%, 500=54.65%, 750=39.64%, 1000=0.31%, 2000=0.07%
> lat (msec) : >=2000=0.24%
>   cpu  : usr=7.81%, sys=1.48%, ctx=1221, majf=0, minf=11
>   IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.2%, 32=0.3%,
> >=64=99.4%
>  submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>  complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.1%
>  issued: total=r=0/w=10240/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>  latency   : target=0, window=0, percentile=100.00%, depth=200
>
> Run status group 0 (all jobs):
>   WRITE: io=10240MB, aggrb=473868KB/s, minb=473868KB/s, maxb=473868KB/s,
> mint=22128msec, maxt=22128msec
>
> Disk stats (read/write):
>   vdg: ios=0/10243, merge=0/0, ticks=0/2745892, in_queue=2745884,
> util=99.18
>
> benchmark: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio,
> iodepth=128
> ...
> fio-2.16
> Starting 4 processes
>
> benchmark: (groupid=0, jobs=4): err= 0: pid=11529: Thu Nov  2 16:52:40 2017
>   read : io=1123.9MB, bw=38347KB/s, iops=9586, runt= 30011msec
> slat (usec): min=1, max=228886, avg=415.40, stdev=3975.72
> clat (usec): min=482, max=328648, avg=52664.65, stdev=30216.00
>  lat (msec): min=9, max=527, avg=53.08, stdev=30.38
> clat percentiles (msec):
>  |  1.00th=[   12],  5.00th=[   22], 10.00th=[   23], 20.00th=[   25],
>  | 30.00th=[   33], 40.00th=[   38], 50.00th=[   47], 60.00th=[   55],
>  | 70.00th=[   64], 80.00th=[   76], 90.00th=[   95], 95.00th=[  111],
>  | 99.00th=[  151], 99.50th=[  163], 99.90th=[  192], 99.95th=[  196],
>  | 99.99th=[  210]
> lat (usec) : 500=0.01%, 750=0.01%, 1000=0.01%
> lat (msec) : 10=0.03%, 20=3.59%, 50=52.41%, 100=36.01%, 250=7.96%
> lat (msec) : 500=0.01%
>   cpu  : usr=0.29%, sys=1.10%, ctx=10157, majf=0, minf=549
>   IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%,
> >=64=99.9%
>  submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>  complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.1%
>  issued: total=r=287705/w=0/d=0, short=r=0/w=0/d=0,
> drop=r=0/w=0/d=0
>  latency   : target=0, window=0, percentile=100.00%, depth=128
>
> Run status group 0 (all jobs):
>READ: io=1123.9MB, aggrb=38346KB/s, minb=38346KB/s, maxb=38346KB/s,
> mint=30011msec, maxt=30011msec
>
> Disk stats (read/write):
>   vdg: ios=286499/2, merge=0/0, ticks=3707064/64, in_queue=3708680,
> util=99.83%
>
> benchmark: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio,
> iodepth=128
> ...
> fio-2.16
> Starting 4 processes
>
> benchmark: (groupid=0, jobs=4): err= 0: pid=11545: Thu Nov  2 16:55:54 2017
>   write: io=422464KB, bw=14079KB/s, iops=3519, runt= 30006msec
> slat (usec): min=1, max=230620, avg=1130.75, stdev=6744.31
> clat (usec): min=643, max=540987, avg=143999.57, stdev=66693.45
>  lat (msec): min=8, max=541, avg=145.13, stdev=67.01
> clat percentiles (msec):
>  |  1.00th=[   34],  5.00th=[   75], 10.00th=[   87], 20.00th=[  100],
>  | 30.00th=[  109], 40.00th=[  116], 50.00th=[  123], 60.00th=[  135],
>  | 70.00th=[  151], 80.00th=[  182], 90.00th=[  241], 95.00th=[  289],
>  | 99.00th=[  359], 99.50th=[  416], 99.90th=[  465], 99.95th=[  

Re: [Gluster-devel] [Gluster-users] BoF - Gluster for VM store use case

2017-11-02 Thread Alex K
Yes, I would be interested to hear more on the findings. Let us know once
you have them.

On Nov 1, 2017 13:10, "Shyam Ranganathan"  wrote:

> On 10/31/2017 08:36 PM, Ben Turner wrote:
>
>> * Erasure coded volumes with sharding - seen as a good fit for VM disk
>>> storage
>>>
>> I am working on this with a customer, we have been able to do 400-500 MB
>> / sec writes!  Normally things max out at ~150-250.  The trick is to use
>> multiple files, create the lvm stack and use native LVM striping.  We have
>> found that 4-6 files seems to give the best perf on our setup.  I don't
>> think we are using sharding on the EC vols, just multiple files and LVM
>> striping.  Sharding may be able to avoid the LVM striping, but I bet
>> dollars to doughnuts you won't see this level of perf:)   I am working on a
>> blog post for RHHI and RHEV + RHS performance where I am able to in some
>> cases get 2x+ the performance out of VMs / VM storage.  I'd be happy to
>> share my data / findings.
>>
>>
> Ben, we would like to hear more, so please do share your thoughts further.
> There are a fair number of users in the community who have this use-case
> and may have some interesting questions around the proposed method.
>
> Shyam

Re: [Gluster-devel] [Gluster-users] BoF - Gluster for VM store use case

2017-11-01 Thread Shyam Ranganathan

On 10/31/2017 08:36 PM, Ben Turner wrote:

* Erasure coded volumes with sharding - seen as a good fit for VM disk
storage

I am working on this with a customer, and we have been able to do 400-500 MB/sec
writes!  Normally things max out at ~150-250.  The trick is to use multiple
files: create the LVM stack and use native LVM striping.  We have found that
4-6 files seem to give the best perf on our setup.  I don't think we are using
sharding on the EC vols, just multiple files and LVM striping.  Sharding may be
able to avoid the LVM striping, but I bet dollars to doughnuts you won't see
this level of perf :)  I am working on a blog post on RHHI and RHEV + RHS
performance where I am able, in some cases, to get 2x+ the performance out of
VMs / VM storage.  I'd be happy to share my data / findings.



Ben, we would like to hear more, so please do share your thoughts 
further. There are a fair number of users in the community who have this 
use-case and may have some interesting questions around the proposed method.


Shyam


Re: [Gluster-devel] [Gluster-users] BoF - Gluster for VM store use case

2017-10-31 Thread Paul Cuzner
Just wanted to pick up on the EC for VM storage domains option...



> > * Erasure coded volumes with sharding - seen as a good fit for VM disk
> > storage
>
> I am working on this with a customer, we have been able to do 400-500 MB /
> sec writes!  Normally things max out at ~150-250.  The trick is to use
> multiple files, create the lvm stack and use native LVM striping.  We have
> found that 4-6 files seems to give the best perf on our setup.  I don't
> think we are using sharding on the EC vols, just multiple files and LVM
> striping.  Sharding may be able to avoid the LVM striping, but I bet
> dollars to doughnuts you won't see this level of perf :)  I am working on a
> blog post for RHHI and RHEV + RHS performance where I am able to in some
> cases get 2x+ the performance out of VMs / VM storage.  I'd be happy to
> share my data / findings.
>
>
The main reason for sharding, IIRC, was to break the vdisk image file down into
smaller chunks to improve self-heal efficiency. With EC the vdisk image is
already split, so do we really need sharding as well, especially given Ben's
findings?

Re: [Gluster-devel] [Gluster-users] BoF - Gluster for VM store use case

2017-10-31 Thread Ben Turner
- Original Message -
> From: "Sahina Bose" 
> To: gluster-us...@gluster.org
> Cc: "Gluster Devel" 
> Sent: Tuesday, October 31, 2017 11:46:57 AM
> Subject: [Gluster-users] BoF - Gluster for VM store use case
> 
> During Gluster Summit, we discussed gluster volumes as storage for VM images
> - feedback on the usecase and upcoming features that may benefit this
> usecase.
> 
> Some of the points discussed
> 
> * Need to ensure there are no issues when expanding a gluster volume when
> sharding is turned on.
> * Throttling feature for self-heal, rebalance process could be useful for
> this usecase
> * Erasure coded volumes with sharding - seen as a good fit for VM disk
> storage

I am working on this with a customer, and we have been able to do 400-500 MB/sec
writes!  Normally things max out at ~150-250.  The trick is to use multiple
files: create the LVM stack and use native LVM striping.  We have found that
4-6 files seem to give the best perf on our setup.  I don't think we are using
sharding on the EC vols, just multiple files and LVM striping.  Sharding may be
able to avoid the LVM striping, but I bet dollars to doughnuts you won't see
this level of perf :)  I am working on a blog post on RHHI and RHEV + RHS
performance where I am able, in some cases, to get 2x+ the performance out of
VMs / VM storage.  I'd be happy to share my data / findings.
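
To make the multi-file + LVM striping idea concrete, a rough sketch of striping
inside the guest across four virtio disks (device names, stripe count, and
stripe size are illustrative guesses, not the actual configuration) would be:

# pvcreate /dev/vdb /dev/vdc /dev/vdd /dev/vde
# vgcreate vg_data /dev/vdb /dev/vdc /dev/vdd /dev/vde
# lvcreate -n lv_data -i 4 -I 64k -l 100%FREE vg_data
# mkfs.xfs /dev/vg_data/lv_data

Each /dev/vdX corresponds to a separate image file on the Gluster volume, and
-i 4 stripes the logical volume across all four of them.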

> * Performance related
> ** accessing qemu images using gfapi driver does not perform as well as fuse
> access. Need to understand why.

+1  I have some ideas here that I have come up with in my research.  Happy to
share these as well.
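
For anyone comparing the two access paths, the difference on the QEMU side is
essentially just how the drive is specified; roughly (host, volume, and image
names below are placeholders):

FUSE mount:  -drive file=/mnt/vmstore/vm1.img,format=raw,if=virtio
libgfapi:    -drive file=gluster://node1/vmstore/vm1.img,format=raw,if=virtio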

> ** Using zfs with cache or lvmcache for xfs filesystem is seen to improve
> performance

I have done some interesting stuff with customers here too; nothing with VMs
IIRC, it was more for backing up bricks without geo-rep (it was too slow for them).
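
As a purely hypothetical illustration of the lvmcache option mentioned above,
attaching an SSD as a cache to an XFS brick LV would look roughly like this
(VG, LV, device names, and sizes are made up):

# pvcreate /dev/nvme0n1
# vgextend vg_bricks /dev/nvme0n1
# lvcreate --type cache-pool -L 200G -n brick1_cache vg_bricks /dev/nvme0n1
# lvconvert --type cache --cachepool vg_bricks/brick1_cache vg_bricks/brick1

The XFS filesystem on the brick LV itself is untouched; only the underlying LV
gains the cache.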

-b

> 
> If you have any further inputs on this topic, please add to thread.
> 
> thanks!
> sahina
> 