Hello,

We would like to properly configure DRBD9 using RAM-disk block devices as the underlying devices on both ends, for testing purposes.
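For reference, the RAM disks come from the kernel's brd module, loaded along these lines (rd_nr and rd_size are the standard brd parameters; the size shown here is illustrative, not our exact configuration):

```shell
# Load the brd module with one RAM disk of 16 GiB (rd_size is in KiB).
# The resulting backing device appears as /dev/ram0 on each node.
modprobe brd rd_nr=1 rd_size=16777216
```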
So far we have seen quite unsatisfactory results using the Linux kernel's BRD module for the RAM disks, with the following DRBD resource config:

resource no_cas_test {
    device /dev/drbd0;
    on lab_1 {
        disk /dev/ram0; # cache dev
        meta-disk internal;
        address XXXX
    }
    on lab_2 {
        disk /dev/ram0;
        meta-disk internal;
        address XXXX
    }
}

These are hardware machines running RHEL, and the raw performance we get from the RAM disks is over 16 GiB/s (over 4400k IOPS) using fio:

# fio --filename=/dev/ram0 --ioengine=libaio --direct=1 --rw=randwrite --bs=4k --iodepth=64 --numjobs=24 --name=4kiops_randwrite --group_reporting --size=4G
4kiops_randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
...
fio-3.19
Starting 24 processes
Jobs: 16 (f=14): [w(1),_(1),w(5),_(1),w(1),_(1),f(2),_(5),w(7)][85.7%][w=15.3GiB/s][w=4023k IOPS][eta 00m:01s]
4kiops_randwrite: (groupid=0, jobs=24): err= 0: pid=3976: Fri Dec 3 15:53:43 2021
  write: IOPS=4402k, BW=16.8GiB/s (18.0GB/s)(96.0GiB/5717msec); 0 zone resets

However, as soon as we start a single, primary DRBD node on lab_1, we only get about 1300 MiB/s (350k IOPS):

# fio --filename=/dev/drbd0 --ioengine=libaio --direct=1 --rw=randwrite --bs=4k --iodepth=64 --numjobs=24 --name=4kiops_randwrite --group_reporting --size=4G
4kiops_randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
...
fio-3.19
Starting 24 processes
Jobs: 24 (f=24): [w(24)][98.6%][w=1379MiB/s][w=353k IOPS][eta 00m:01s]
4kiops_randwrite: (groupid=0, jobs=24): err= 0: pid=4274: Fri Dec 3 15:59:22 2021
  write: IOPS=351k, BW=1371MiB/s (1438MB/s)(96.0GiB/71704msec); 0 zone resets

Now if we use physical NVMe drives, there is no difference in performance: both the direct hardware test (/dev/nvmeXXXXX) and the DRBD test yield the drive's maximum write performance.

Non-DRBD test:

# fio --filename=/dev/nvme1n1p1 --ioengine=libaio --direct=1 --rw=randwrite --bs=4k --iodepth=64 --numjobs=12 --name=4kiops_randwrite --group_reporting --size=4G
4kiops_randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
...
fio-3.19
Starting 12 processes
Jobs: 1 (f=1): [_(11),w(1)][98.7%][w=607MiB/s][w=155k IOPS][eta 00m:01s]
4kiops_randwrite: (groupid=0, jobs=12): err= 0: pid=5312: Fri Dec 3 12:34:05 2021
  write: IOPS=165k, BW=646MiB/s (677MB/s)(48.0GiB/76092msec); 0 zone resets
    slat (nsec): min=1419, max=310283, avg=3730.55, stdev=2879.91
    clat (usec): min=5, max=23109, avg=4231.60, stdev=2497.87
     lat (usec): min=9, max=23113, avg=4235.39, stdev=2497.50
    clat percentiles (usec):
     |  1.00th=[   86],  5.00th=[  545], 10.00th=[ 1074], 20.00th=[ 1991],
     | 30.00th=[ 2638], 40.00th=[ 3425], 50.00th=[ 4080], 60.00th=[ 4817],
     | 70.00th=[ 5538], 80.00th=[ 6063], 90.00th=[ 7177], 95.00th=[ 8848],
     | 99.00th=[11469], 99.50th=[12256], 99.90th=[14091], 99.95th=[14877],
     | 99.99th=[16909]
   bw (  KiB/s): min=351792, max=1961536, per=100.00%, avg=725184.65, stdev=26445.60, samples=1653
   iops        : min=87948, max=490384, avg=181296.12, stdev=6611.40, samples=1653
  lat (usec)   : 10=0.05%, 20=0.23%, 50=0.44%, 100=0.38%, 250=1.11%
  lat (usec)   : 500=2.36%, 750=2.48%, 1000=2.23%
  lat (msec)   : 2=10.84%, 4=28.80%, 10=48.27%, 20=2.80%, 50=0.01%
  cpu          : usr=3.19%, sys=7.10%, ctx=8562282, majf=0, minf=718
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,12582912,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=646MiB/s (677MB/s), 646MiB/s-646MiB/s (677MB/s-677MB/s), io=48.0GiB (51.5GB), run=76092-76092msec

Disk stats (read/write):
  nvme1n1: ios=205/12567629, merge=0/0, ticks=197/53105077, in_queue=46882592, util=100.00%

DRBD test:

# fio --filename=/dev/drbd0 --ioengine=libaio --direct=1 --rw=randwrite --bs=4k --iodepth=64 --numjobs=12 --name=4kiops_randwrite --group_reporting --size=4G
4kiops_randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
...
fio-3.19
Starting 12 processes
Jobs: 1 (f=1): [_(6),w(1),_(5)][95.1%][w=636MiB/s][w=163k IOPS][eta 00m:04s]
4kiops_randwrite: (groupid=0, jobs=12): err= 0: pid=5104: Fri Dec 3 12:28:03 2021
  write: IOPS=162k, BW=634MiB/s (665MB/s)(48.0GiB/77474msec); 0 zone resets
    slat (nsec): min=1951, max=198869k, avg=49104.65, stdev=79888.62
    clat (usec): min=7, max=207233, avg=4469.84, stdev=2174.11
     lat (usec): min=13, max=207304, avg=4519.00, stdev=2178.99
    clat percentiles (usec):
     |  1.00th=[  498],  5.00th=[ 1500], 10.00th=[ 1860], 20.00th=[ 2376],
     | 30.00th=[ 3064], 40.00th=[ 3818], 50.00th=[ 4555], 60.00th=[ 5211],
     | 70.00th=[ 5669], 80.00th=[ 6128], 90.00th=[ 6849], 95.00th=[ 7832],
     | 99.00th=[10159], 99.50th=[11076], 99.90th=[13435], 99.95th=[15270],
     | 99.99th=[27395]
   bw (  KiB/s): min=411961, max=1893680, per=100.00%, avg=677991.74, stdev=20345.92, samples=1765
   iops        : min=102987, max=473420, avg=169496.99, stdev=5086.49, samples=1765
  lat (usec)   : 10=0.01%, 20=0.02%, 50=0.04%, 100=0.06%, 250=0.19%
  lat (usec)   : 500=0.76%, 750=1.45%, 1000=0.74%
  lat (msec)   : 2=9.48%, 4=29.65%, 10=56.43%, 20=1.15%, 50=0.03%
  lat (msec)   : 250=0.01%
  cpu          : usr=1.90%, sys=62.47%, ctx=811802, majf=0, minf=1305
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,12582912,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=634MiB/s (665MB/s), 634MiB/s-634MiB/s (665MB/s-665MB/s), io=48.0GiB (51.5GB), run=77474-77474msec

Disk stats (read/write):
  drbd0: ios=0/12560419, merge=0/0, ticks=0/47859223, in_queue=47859223, util=100.00%, aggrios=0/12582912, aggrmerge=0/0, aggrticks=0/47319227, aggrin_queue=41223461, aggrutil=100.00%
  nvme1n1: ios=0/12582912, merge=0/0, ticks=0/47319227, in_queue=41223461, util=100.00%

We have seen even worse performance on VMs running DRBD9 on RAM disks.

Is this behavior expected?

Best regards,
Krzysztof Majzerowicz-Jaszcz

Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk
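P.S. To put numbers on the gap: the fio results amount to roughly a 12.5x slowdown for DRBD over the raw RAM disk, versus only about a 2% loss over the NVMe drive. A quick sanity-check script (the IOPS figures are copied from the runs above):

```python
# Compare raw-device vs. DRBD throughput from the reported fio results.
ram_raw_iops = 4402_000    # /dev/ram0 direct: IOPS=4402k
ram_drbd_iops = 351_000    # /dev/drbd0 on RAM disk: IOPS=351k
nvme_raw_iops = 165_000    # /dev/nvme1n1p1 direct: IOPS=165k
nvme_drbd_iops = 162_000   # /dev/drbd0 on NVMe: IOPS=162k

ram_slowdown = ram_raw_iops / ram_drbd_iops          # ~12.5x
nvme_overhead = 1 - nvme_drbd_iops / nvme_raw_iops   # ~1.8%

print(f"RAM disk slowdown under DRBD: {ram_slowdown:.1f}x")
print(f"NVMe throughput loss under DRBD: {nvme_overhead:.1%}")
```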
_______________________________________________
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
https://lists.linbit.com/mailman/listinfo/drbd-user