I've just added another server (same specs) with one OSD, and the behavior
is the same: bad performance, with cur MB/s dropping to 0.
I checked the network with iperf3 and found no issues (output below).

So it does not appear to be a server issue, since I am getting the same
behavior with two different servers.

And since iperf3 shows no problems, it does not look like a network issue either.

What can it be?

ceph osd df tree
ID CLASS WEIGHT  REWEIGHT SIZE  USE    AVAIL %USE  VAR  PGS TYPE NAME
-1       3.44714        -  588G 80693M  509G     0    0   - root default
-9       0.57458        -  588G 80693M  509G 13.39 1.13   -     host osd01
 5   hdd 0.57458  1.00000  588G 80693M  509G 13.39 1.13  64         osd.5
-7       1.14899        - 1176G   130G 1046G 11.06 0.94   -     host osd02
 0   hdd 0.57500  1.00000  588G 70061M  519G 11.63 0.98  50         osd.0
 1   hdd 0.57500  1.00000  588G 63200M  526G 10.49 0.89  41         osd.1
-3       1.14899        - 1176G   138G 1038G 11.76 1.00   -     host osd03
 2   hdd 0.57500  1.00000  588G 68581M  521G 11.38 0.96  48         osd.2
 3   hdd 0.57500  1.00000  588G 73185M  516G 12.15 1.03  53         osd.3
-4       0.57458        -     0      0     0     0    0   -     host osd04
 4   hdd 0.57458        0     0      0     0     0    0   0         osd.4

2018-04-10 15:11:58.542507 min lat: 0.0201432 max lat: 13.9308 avg lat: 0.466235
  sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
   40      16      1294      1278   127.785         0           -    0.466235
   41      16      1294      1278   124.668         0           -    0.466235
   42      16      1294      1278     121.7         0           -    0.466235
   43      16      1294      1278    118.87         0           -    0.466235
   44      16      1302      1286   116.896       6.4   0.0302793    0.469203
   45      16      1395      1379   122.564       372    0.312525     0.51994
   46      16      1458      1442   125.377       252   0.0387492    0.501892
   47      16      1458      1442   122.709         0           -    0.501892
   48      16      1458      1442   120.153         0           -    0.501892
   49      16      1458      1442   117.701         0           -    0.501892
   50      16      1522      1506   120.466        64    0.137913    0.516969
   51      16      1522      1506   118.104         0           -    0.516969
   52      16      1522      1506   115.833         0           -    0.516969
   53      16      1522      1506   113.648         0           -    0.516969
   54      16      1522      1506   111.543         0           -    0.516969
   55      16      1522      1506   109.515         0           -    0.516969
   56      16      1522      1506   107.559         0           -    0.516969
   57      16      1522      1506   105.672         0           -    0.516969
   58      16      1522      1506   103.851         0           -    0.516969
Total time run:       58.927431
Total reads made:     1522
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   103.314
Average IOPS:         25
Stddev IOPS:          35
Max IOPS:             106
Min IOPS:             0
Average Latency(s):   0.618812
Max latency(s):       13.9308
Min latency(s):       0.0201432
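
(For reference, the read numbers above are rados bench output; the exact
invocation isn't shown here, but it would have been something along these
lines, with "rbd" only a placeholder for the pool actually used:

  rados bench -p rbd 60 write --no-cleanup   # populate objects first
  rados bench -p rbd 60 seq                  # 4 MB sequential reads, 16 concurrent ops
)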


iperf3 -c 192.168.0.181 -i1 -t 10
Connecting to host 192.168.0.181, port 5201
[  4] local 192.168.0.182 port 57448 connected to 192.168.0.181 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  1.15 GBytes  9.92 Gbits/sec    0    830 KBytes
[  4]   1.00-2.00   sec  1.15 GBytes  9.90 Gbits/sec    0    830 KBytes
[  4]   2.00-3.00   sec  1.15 GBytes  9.91 Gbits/sec    0    918 KBytes
[  4]   3.00-4.00   sec  1.15 GBytes  9.90 Gbits/sec    0    918 KBytes
[  4]   4.00-5.00   sec  1.15 GBytes  9.90 Gbits/sec    0    918 KBytes
[  4]   5.00-6.00   sec  1.15 GBytes  9.90 Gbits/sec    0    918 KBytes
[  4]   6.00-7.00   sec  1.15 GBytes  9.90 Gbits/sec    0    918 KBytes
[  4]   7.00-8.00   sec  1.15 GBytes  9.90 Gbits/sec    0    918 KBytes
[  4]   8.00-9.00   sec  1.15 GBytes  9.90 Gbits/sec    0    918 KBytes
[  4]   9.00-10.00  sec  1.15 GBytes  9.91 Gbits/sec    0    918 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  11.5 GBytes  9.90 Gbits/sec    0             sender
[  4]   0.00-10.00  sec  11.5 GBytes  9.90 Gbits/sec                  receiver




On Tue, 10 Apr 2018 at 08:49, Steven Vacaroaia <ste...@gmail.com> wrote:

> Hi,
> Thanks for providing guidance
>
> VD0 is the SSD drive.
> Many people suggested not enabling write-back (WB) for the SSD, so that the
> controller cache can be used for the HDDs, where it is needed more.
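>
> (The cache policies shown in the megacli LDGetProp output further down would
> presumably have been set with something along these lines; treat this as a
> sketch, with the VD/adapter numbers taken from that output, not as the exact
> commands used:
>
>   megacli -LDSetProp WT -L0 -a0    # write-through for VD 0, the SSD
>   megacli -LDSetProp WB -L1 -a0    # write-back for VD 1, an HDD
> )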
>
> The setup is 3 identical Dell R620 servers (OSD01, OSD02, OSD04),
> separate 10 Gb networks, 600 GB enterprise HDDs, 320 GB enterprise SSDs,
> Bluestore with separate WAL/DB on SSD (1 GB partition for WAL, 30 GB for DB).
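>
> (An OSD with that layout would typically be created with something like the
> following; the device names here are placeholders, not the actual ones used:
>
>   ceph-volume lvm create --bluestore --data /dev/sdb \
>       --block.wal /dev/sda1 --block.db /dev/sda2
> )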
>
> With 2 OSDs per server and only OSD01 and OSD02 in the cluster, performance
> is as expected (no gaps in cur MB/s).
>
> Adding one OSD from OSD04 tanks performance (lots of gaps with cur MB/s at 0).
>
> See below
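>
> (The benchmark output below is from rados bench; the invocation isn't shown,
> but it would have been something like this, with "rbd" only a placeholder
> for the pool used:
>
>   rados bench -p rbd 120 write --no-cleanup   # default 16 threads, 4 MB objects
> )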
>
> ceph -s
>   cluster:
>     id:     1e98e57a-ef41-4327-b88a-dd2531912632
>     health: HEALTH_WARN
>             noscrub,nodeep-scrub flag(s) set
>
>
>
>
> WITH OSD04
>
> ceph osd tree
> ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
> -1       2.87256 root default
> -7       1.14899     host osd02
>  0   hdd 0.57500         osd.0      up  1.00000 1.00000
>  1   hdd 0.57500         osd.1      up  1.00000 1.00000
> -3       1.14899     host osd03
>  2   hdd 0.57500         osd.2      up  1.00000 1.00000
>  3   hdd 0.57500         osd.3      up  1.00000 1.00000
> -4       0.57458     host osd04
>  4   hdd 0.57458         osd.4      up  1.00000 1.00000
>
>
> 2018-04-10 08:37:08.111037 min lat: 0.0128562 max lat: 13.1623 avg lat: 0.528273
>   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
>   100      16      3001      2985   119.388        90   0.0169507    0.528273
>   101      16      3029      3013   119.315       112   0.0410565    0.524325
>   102      16      3029      3013   118.145         0           -    0.524325
>   103      16      3029      3013   116.998         0           -    0.524325
>   104      16      3029      3013   115.873         0           -    0.524325
>   105      16      3071      3055    116.37        42   0.0888923     0.54832
>   106      16      3156      3140   118.479       340   0.0162464    0.535244
>   107      16      3156      3140   117.372         0           -    0.535244
>   108      16      3156      3140   116.285         0           -    0.535244
>   109      16      3156      3140   115.218         0           -    0.535244
>   110      16      3156      3140   114.171         0           -    0.535244
>   111      16      3156      3140   113.142         0           -    0.535244
>   112      16      3156      3140   112.132         0           -    0.535244
>   113      16      3156      3140    111.14         0           -    0.535244
>   114      16      3156      3140   110.165         0           -    0.535244
>   115      16      3156      3140   109.207         0           -    0.535244
>   116      16      3230      3214   110.817      29.6   0.0169969    0.574856
>   117      16      3311      3295   112.639       324   0.0704851    0.565529
>   118      16      3311      3295   111.684         0           -    0.565529
>   119      16      3311      3295   110.746         0           -    0.565529
> 2018-04-10 08:37:28.112886 min lat: 0.0128562 max lat: 14.7293 avg lat: 0.565529
>   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
>   120      16      3311      3295   109.823         0           -    0.565529
>   121      16      3311      3295   108.915         0           -    0.565529
>   122      16      3311      3295   108.022         0           -    0.565529
> Total time run:         122.568983
> Total writes made:      3312
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     108.086
> Stddev Bandwidth:       121.191
> Max bandwidth (MB/sec): 520
> Min bandwidth (MB/sec): 0
> Average IOPS:           27
> Stddev IOPS:            30
> Max IOPS:               130
> Min IOPS:               0
> Average Latency(s):     0.591771
> Stddev Latency(s):      1.74753
> Max latency(s):         14.7293
> Min latency(s):         0.0128562
>
>
> AFTER ceph osd down osd.4; ceph osd out osd.4
>
> ceph osd tree
> ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
> -1       2.87256 root default
> -7       1.14899     host osd02
>  0   hdd 0.57500         osd.0      up  1.00000 1.00000
>  1   hdd 0.57500         osd.1      up  1.00000 1.00000
> -3       1.14899     host osd03
>  2   hdd 0.57500         osd.2      up  1.00000 1.00000
>  3   hdd 0.57500         osd.3      up  1.00000 1.00000
> -4       0.57458     host osd04
>  4   hdd 0.57458         osd.4      up        0 1.00000
>
>
> 2018-04-10 08:46:55.193642 min lat: 0.0156532 max lat: 2.5884 avg lat: 0.310681
>   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
>   100      16      5144      5128   205.097       220   0.0372222    0.310681
>   101      16      5196      5180   205.126       208    0.421245    0.310908
>   102      16      5232      5216   204.526       144    0.543723    0.311544
>   103      16      5271      5255   204.055       156    0.465998    0.312394
>   104      16      5310      5294   203.593       156    0.483188    0.313355
>   105      16      5357      5341   203.444       188   0.0313209    0.313267
>   106      16      5402      5386   203.223       180    0.517098    0.313714
>   107      16      5457      5441   203.379       220   0.0277359    0.313288
>   108      16      5515      5499   203.644       232    0.470556    0.313203
>   109      16      5565      5549   203.611       200    0.564713    0.313173
>   110      16      5606      5590    203.25       164   0.0223166    0.313596
>   111      16      5659      5643   203.329       212   0.0231103    0.313597
>   112      16      5703      5687   203.085       176    0.033348    0.314018
>   113      16      5757      5741   203.199       216     1.53862    0.313991
>   114      16      5798      5782   202.855       164      0.4711    0.314511
>   115      16      5852      5836   202.969       216   0.0350226     0.31424
>   116      16      5912      5896   203.288       240   0.0253188    0.313657
>   117      16      5964      5948   203.328       208   0.0223623    0.313562
>   118      16      6024      6008   203.639       240    0.174245    0.313531
>   119      16      6070      6054   203.473       184    0.712498    0.313582
> 2018-04-10 08:47:15.195873 min lat: 0.0154679 max lat: 2.5884 avg lat: 0.313564
>   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
>   120      16      6120      6104   203.444       200   0.0351212    0.313564
> Total time run:         120.551897
> Total writes made:      6120
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     203.066
> Stddev Bandwidth:       43.8329
> Max bandwidth (MB/sec): 480
> Min bandwidth (MB/sec): 128
> Average IOPS:           50
> Stddev IOPS:            10
> Max IOPS:               120
> Min IOPS:               32
> Average Latency(s):     0.314959
> Stddev Latency(s):      0.379298
> Max latency(s):         2.5884
> Min latency(s):         0.0154679
>
>
>
> On Tue, 10 Apr 2018 at 07:58, Kai Wagner <kwag...@suse.com> wrote:
>
>> Is this just from one server or from all servers? Just wondering why VD
>> 0 is using WriteThrough compared to the others. If that's the setup for
>> the OSDs, you already have a cache setup problem.
>>
>>
>> On 10.04.2018 13:44, Mohamad Gebai wrote:
>> > megacli -LDGetProp -cache -Lall -a0
>> >
>> > Adapter 0-VD 0(target id: 0): Cache Policy:WriteThrough,
>> > ReadAheadNone, Direct, Write Cache OK if bad BBU
>> > Adapter 0-VD 1(target id: 1): Cache Policy:WriteBack, ReadAdaptive,
>> > Cached, No Write Cache if bad BBU
>> > Adapter 0-VD 2(target id: 2): Cache Policy:WriteBack, ReadAdaptive,
>> > Cached, No Write Cache if bad BBU
>> > Adapter 0-VD 3(target id: 3): Cache Policy:WriteBack, ReadAdaptive,
>> > Cached, No Write Cache if bad BBU
>>
>> --
>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB
>> 21284 (AG Nürnberg)
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
