Hi

With debug_osd increased to 5/5, I am seeing lots of these entries in
ceph-osd.5.log (the newly added OSD).

Does anyone know what they mean?
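For reference, a debug level like this can be set at runtime along the
following lines; this is just a sketch (how it was actually raised here may
differ), and putting "debug osd = 5/5" under [osd] in ceph.conf and
restarting the daemon works as well:

# bump OSD debug logging to 5/5 on the running daemon
ceph tell osd.5 injectargs '--debug_osd 5/5'
# or, on the OSD host itself, via the admin socket
ceph daemon osd.5 config set debug_osd 5/5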

2018-04-10 16:05:33.317451 7f33610be700  5 osd.5 300 heartbeat:
osd_stat(43897 MB used, 545 GB avail, 588 GB total, peers [0,1,2,3,4] op
hist [0,0,0,0,0,1,0,3,0,0,2])
2018-04-10 16:05:33.455358 7f33630c2700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'42, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:33.463539 7f33638c3700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'46, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:33.470797 7f33628c1700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'40, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:33.507682 7f33628c1700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'38, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:36.995387 7f33640c4700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'36, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.000745 7f33648c5700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'42, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.005926 7f33638c3700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'44, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.011209 7f33640c4700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'44, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.016410 7f33640c4700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'49, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.021478 7f33640c4700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'50, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.038838 7f33640c4700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'49, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.057197 7f33628c1700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'43, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.084913 7f33630c2700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'44, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.094249 7f33638c3700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.095475 7f33628c1700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'44, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.129598 7f33628c1700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.148653 7f33638c3700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.320251 7f33630c2700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.327982 7f33640c4700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'50, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.334373 7f33638c3700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'46, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.344398 7f33638c3700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.357411 7f33638c3700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'46, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.374542 7f33640c4700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'42, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.385454 7f33640c4700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.393238 7f33638c3700  5 write_log_and_missing with:
dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from:
300'44, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.617921 7f33610be700  5 osd.5 300 heartbeat:
osd_stat(43997 MB used, 545 GB avail, 588 GB total, peers [0,1,2,3,4] op
hist [0,0,0,0,0,0,0,0,1,3,0,2])



On Tue, 10 Apr 2018 at 15:31, Steven Vacaroaia <ste...@gmail.com> wrote:

> I've just added another server (same specs) with one OSD, and the behavior
> is the same: bad performance, with cur MB/s dropping to 0.
> I checked the network with iperf3 and found no issues.
>
> So it is not a server issue, since I am getting the same behavior with two
> different servers, and the network checks out fine.
>
> What can it be?
>
> ceph osd df tree
> ID CLASS WEIGHT  REWEIGHT SIZE  USE    AVAIL %USE  VAR  PGS TYPE NAME
> -1       3.44714        -  588G 80693M  509G     0    0   - root default
> -9       0.57458        -  588G 80693M  509G 13.39 1.13   -     host osd01
>  5   hdd 0.57458  1.00000  588G 80693M  509G 13.39 1.13  64         osd.5
> -7       1.14899        - 1176G   130G 1046G 11.06 0.94   -     host osd02
>  0   hdd 0.57500  1.00000  588G 70061M  519G 11.63 0.98  50         osd.0
>  1   hdd 0.57500  1.00000  588G 63200M  526G 10.49 0.89  41         osd.1
> -3       1.14899        - 1176G   138G 1038G 11.76 1.00   -     host osd03
>  2   hdd 0.57500  1.00000  588G 68581M  521G 11.38 0.96  48         osd.2
>  3   hdd 0.57500  1.00000  588G 73185M  516G 12.15 1.03  53         osd.3
> -4       0.57458        -     0      0     0     0    0   -     host osd04
>  4   hdd 0.57458        0     0      0     0     0    0   0         osd.4
>
> 2018-04-10 15:11:58.542507 min lat: 0.0201432 max lat: 13.9308 avg lat:
> 0.466235
>   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg
> lat(s)
>    40      16      1294      1278   127.785         0           -
> 0.466235
>    41      16      1294      1278   124.668         0           -
> 0.466235
>    42      16      1294      1278     121.7         0           -
> 0.466235
>    43      16      1294      1278    118.87         0           -
> 0.466235
>    44      16      1302      1286   116.896       6.4   0.0302793
> 0.469203
>    45      16      1395      1379   122.564       372    0.312525
>  0.51994
>    46      16      1458      1442   125.377       252   0.0387492
> 0.501892
>    47      16      1458      1442   122.709         0           -
> 0.501892
>    48      16      1458      1442   120.153         0           -
> 0.501892
>    49      16      1458      1442   117.701         0           -
> 0.501892
>    50      16      1522      1506   120.466        64    0.137913
> 0.516969
>    51      16      1522      1506   118.104         0           -
> 0.516969
>    52      16      1522      1506   115.833         0           -
> 0.516969
>    53      16      1522      1506   113.648         0           -
> 0.516969
>    54      16      1522      1506   111.543         0           -
> 0.516969
>    55      16      1522      1506   109.515         0           -
> 0.516969
>    56      16      1522      1506   107.559         0           -
> 0.516969
>    57      16      1522      1506   105.672         0           -
> 0.516969
>    58      16      1522      1506   103.851         0           -
> 0.516969
> Total time run:       58.927431
> Total reads made:     1522
> Read size:            4194304
> Object size:          4194304
> Bandwidth (MB/sec):   103.314
> Average IOPS:         25
> Stddev IOPS:          35
> Max IOPS:             106
> Min IOPS:             0
> Average Latency(s):   0.618812
> Max latency(s):       13.9308
> Min latency(s):       0.0201432
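>
> For reference, a seq read run like the one above is typically produced
> with rados bench along these lines; the pool name is an assumption, and
> the 16 concurrent ops / ~60 s duration are read off the output rather
> than taken from the exact command used:
>
> # populate the pool first, then read the objects back sequentially
> rados bench -p rbd 60 write -t 16 --no-cleanup
> rados bench -p rbd 60 seq -t 16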
>
>
> iperf3 -c 192.168.0.181 -i1 -t 10
> Connecting to host 192.168.0.181, port 5201
> [  4] local 192.168.0.182 port 57448 connected to 192.168.0.181 port 5201
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-1.00   sec  1.15 GBytes  9.92 Gbits/sec    0    830 KBytes
> [  4]   1.00-2.00   sec  1.15 GBytes  9.90 Gbits/sec    0    830 KBytes
> [  4]   2.00-3.00   sec  1.15 GBytes  9.91 Gbits/sec    0    918 KBytes
> [  4]   3.00-4.00   sec  1.15 GBytes  9.90 Gbits/sec    0    918 KBytes
> [  4]   4.00-5.00   sec  1.15 GBytes  9.90 Gbits/sec    0    918 KBytes
> [  4]   5.00-6.00   sec  1.15 GBytes  9.90 Gbits/sec    0    918 KBytes
> [  4]   6.00-7.00   sec  1.15 GBytes  9.90 Gbits/sec    0    918 KBytes
> [  4]   7.00-8.00   sec  1.15 GBytes  9.90 Gbits/sec    0    918 KBytes
> [  4]   8.00-9.00   sec  1.15 GBytes  9.90 Gbits/sec    0    918 KBytes
> [  4]   9.00-10.00  sec  1.15 GBytes  9.91 Gbits/sec    0    918 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-10.00  sec  11.5 GBytes  9.90 Gbits/sec    0
>  sender
> [  4]   0.00-10.00  sec  11.5 GBytes  9.90 Gbits/sec
> receiver
>
>
>
>
> On Tue, 10 Apr 2018 at 08:49, Steven Vacaroaia <ste...@gmail.com> wrote:
>
>> Hi,
>> Thanks for providing guidance.
>>
>> VD0 is the SSD drive.
>> Many people suggested not enabling WB for the SSD so that the cache can
>> be used for the HDDs, where it is needed more.
>>
>> The setup is 3 identical Dell R620 servers (OSD01, OSD02, OSD04):
>> separate 10 Gb networks, 600 GB enterprise HDDs, 320 GB enterprise SSDs,
>> BlueStore with separate WAL/DB on the SSD (1 GB partition for WAL, 30 GB
>> for DB).
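>>
>> A layout like that would typically be created with ceph-volume, for
>> example as below; the device names and the lvm method are assumptions,
>> not the exact commands used here:
>>
>> # one BlueStore OSD: data on the HDD, WAL and DB on SSD partitions
>> ceph-volume lvm create --bluestore --data /dev/sdb \
>>     --block.wal /dev/sdc1 --block.db /dev/sdc2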
>>
>> With 2 OSDs per server and only OSD01 and OSD02 in, performance is as
>> expected (no gaps in cur MB/s).
>>
>> Adding one OSD from OSD04 tanks performance (lots of gaps with cur MB/s 0).
>>
>> See below
>>
>> ceph -s
>>   cluster:
>>     id:     1e98e57a-ef41-4327-b88a-dd2531912632
>>     health: HEALTH_WARN
>>             noscrub,nodeep-scrub flag(s) set
>>
>>
>>
>>
>> WITH OSD04
>>
>> ceph osd tree
>> ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
>> -1       2.87256 root default
>> -7       1.14899     host osd02
>>  0   hdd 0.57500         osd.0      up  1.00000 1.00000
>>  1   hdd 0.57500         osd.1      up  1.00000 1.00000
>> -3       1.14899     host osd03
>>  2   hdd 0.57500         osd.2      up  1.00000 1.00000
>>  3   hdd 0.57500         osd.3      up  1.00000 1.00000
>> -4       0.57458     host osd04
>>  4   hdd 0.57458         osd.4      up  1.00000 1.00000
>>
>>
>> 2018-04-10 08:37:08.111037 min lat: 0.0128562 max lat: 13.1623 avg lat:
>> 0.528273
>>   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg
>> lat(s)
>>   100      16      3001      2985   119.388        90   0.0169507
>> 0.528273
>>   101      16      3029      3013   119.315       112   0.0410565
>> 0.524325
>>   102      16      3029      3013   118.145         0           -
>> 0.524325
>>   103      16      3029      3013   116.998         0           -
>> 0.524325
>>   104      16      3029      3013   115.873         0           -
>> 0.524325
>>   105      16      3071      3055    116.37        42   0.0888923
>>  0.54832
>>   106      16      3156      3140   118.479       340   0.0162464
>> 0.535244
>>   107      16      3156      3140   117.372         0           -
>> 0.535244
>>   108      16      3156      3140   116.285         0           -
>> 0.535244
>>   109      16      3156      3140   115.218         0           -
>> 0.535244
>>   110      16      3156      3140   114.171         0           -
>> 0.535244
>>   111      16      3156      3140   113.142         0           -
>> 0.535244
>>   112      16      3156      3140   112.132         0           -
>> 0.535244
>>   113      16      3156      3140    111.14         0           -
>> 0.535244
>>   114      16      3156      3140   110.165         0           -
>> 0.535244
>>   115      16      3156      3140   109.207         0           -
>> 0.535244
>>   116      16      3230      3214   110.817      29.6   0.0169969
>> 0.574856
>>   117      16      3311      3295   112.639       324   0.0704851
>> 0.565529
>>   118      16      3311      3295   111.684         0           -
>> 0.565529
>>   119      16      3311      3295   110.746         0           -
>> 0.565529
>> 2018-04-10 08:37:28.112886 min lat: 0.0128562 max lat: 14.7293 avg lat:
>> 0.565529
>>   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg
>> lat(s)
>>   120      16      3311      3295   109.823         0           -
>> 0.565529
>>   121      16      3311      3295   108.915         0           -
>> 0.565529
>>   122      16      3311      3295   108.022         0           -
>> 0.565529
>> Total time run:         122.568983
>> Total writes made:      3312
>> Write size:             4194304
>> Object size:            4194304
>> Bandwidth (MB/sec):     108.086
>> Stddev Bandwidth:       121.191
>> Max bandwidth (MB/sec): 520
>> Min bandwidth (MB/sec): 0
>> Average IOPS:           27
>> Stddev IOPS:            30
>> Max IOPS:               130
>> Min IOPS:               0
>> Average Latency(s):     0.591771
>> Stddev Latency(s):      1.74753
>> Max latency(s):         14.7293
>> Min latency(s):         0.0128562
>>
>>
>> AFTER ceph osd down osd.4; ceph osd out osd.4
>>
>> ceph osd tree
>> ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
>> -1       2.87256 root default
>> -7       1.14899     host osd02
>>  0   hdd 0.57500         osd.0      up  1.00000 1.00000
>>  1   hdd 0.57500         osd.1      up  1.00000 1.00000
>> -3       1.14899     host osd03
>>  2   hdd 0.57500         osd.2      up  1.00000 1.00000
>>  3   hdd 0.57500         osd.3      up  1.00000 1.00000
>> -4       0.57458     host osd04
>>  4   hdd 0.57458         osd.4      up        0 1.00000
>>
>>
>> 2018-04-10 08:46:55.193642 min lat: 0.0156532 max lat: 2.5884 avg lat:
>> 0.310681
>>   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg
>> lat(s)
>>   100      16      5144      5128   205.097       220   0.0372222
>> 0.310681
>>   101      16      5196      5180   205.126       208    0.421245
>> 0.310908
>>   102      16      5232      5216   204.526       144    0.543723
>> 0.311544
>>   103      16      5271      5255   204.055       156    0.465998
>> 0.312394
>>   104      16      5310      5294   203.593       156    0.483188
>> 0.313355
>>   105      16      5357      5341   203.444       188   0.0313209
>> 0.313267
>>   106      16      5402      5386   203.223       180    0.517098
>> 0.313714
>>   107      16      5457      5441   203.379       220   0.0277359
>> 0.313288
>>   108      16      5515      5499   203.644       232    0.470556
>> 0.313203
>>   109      16      5565      5549   203.611       200    0.564713
>> 0.313173
>>   110      16      5606      5590    203.25       164   0.0223166
>> 0.313596
>>   111      16      5659      5643   203.329       212   0.0231103
>> 0.313597
>>   112      16      5703      5687   203.085       176    0.033348
>> 0.314018
>>   113      16      5757      5741   203.199       216     1.53862
>> 0.313991
>>   114      16      5798      5782   202.855       164      0.4711
>> 0.314511
>>   115      16      5852      5836   202.969       216   0.0350226
>>  0.31424
>>   116      16      5912      5896   203.288       240   0.0253188
>> 0.313657
>>   117      16      5964      5948   203.328       208   0.0223623
>> 0.313562
>>   118      16      6024      6008   203.639       240    0.174245
>> 0.313531
>>   119      16      6070      6054   203.473       184    0.712498
>> 0.313582
>> 2018-04-10 08:47:15.195873 min lat: 0.0154679 max lat: 2.5884 avg lat:
>> 0.313564
>>   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg
>> lat(s)
>>   120      16      6120      6104   203.444       200   0.0351212
>> 0.313564
>> Total time run:         120.551897
>> Total writes made:      6120
>> Write size:             4194304
>> Object size:            4194304
>> Bandwidth (MB/sec):     203.066
>> Stddev Bandwidth:       43.8329
>> Max bandwidth (MB/sec): 480
>> Min bandwidth (MB/sec): 128
>> Average IOPS:           50
>> Stddev IOPS:            10
>> Max IOPS:               120
>> Min IOPS:               32
>> Average Latency(s):     0.314959
>> Stddev Latency(s):      0.379298
>> Max latency(s):         2.5884
>> Min latency(s):         0.0154679
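>>
>> For reference, output like the two runs above typically comes from a
>> rados bench write test along these lines; the pool name is an assumption,
>> and the 120 s duration / 16 concurrent ops are read off the output rather
>> than taken from the exact command used:
>>
>> # 4 MB object writes for 120 s with 16 concurrent ops
>> rados bench -p rbd 120 write -t 16 --no-cleanup
>> # remove the benchmark objects afterwards
>> rados -p rbd cleanup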
>>
>>
>>
>> On Tue, 10 Apr 2018 at 07:58, Kai Wagner <kwag...@suse.com> wrote:
>>
>>> Is this just from one server or from all servers? Just wondering why VD
>>> 0 is using WriteThrough compared to the others. If that's the setup for
>>> the OSDs, you already have a cache setup problem.
>>>
>>>
>>> On 10.04.2018 13:44, Mohamad Gebai wrote:
>>> > megacli -LDGetProp -cache -Lall -a0
>>> >
>>> > Adapter 0-VD 0(target id: 0): Cache Policy:WriteThrough,
>>> > ReadAheadNone, Direct, Write Cache OK if bad BBU
>>> > Adapter 0-VD 1(target id: 1): Cache Policy:WriteBack, ReadAdaptive,
>>> > Cached, No Write Cache if bad BBU
>>> > Adapter 0-VD 2(target id: 2): Cache Policy:WriteBack, ReadAdaptive,
>>> > Cached, No Write Cache if bad BBU
>>> > Adapter 0-VD 3(target id: 3): Cache Policy:WriteBack, ReadAdaptive,
>>> > Cached, No Write Cache if bad BBU
>>>
>>> --
>>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
>>> HRB 21284 (AG Nürnberg)
>>>
>>>
>>>
>>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
