Hi,

With osd debug increased to 5/5, I am seeing lots of these in ceph-osd.5.log (the newly added OSD).
Anyone know what they mean?

2018-04-10 16:05:33.317451 7f33610be700 5 osd.5 300 heartbeat: osd_stat(43897 MB used, 545 GB avail, 588 GB total, peers [0,1,2,3,4] op hist [0,0,0,0,0,1,0,3,0,0,2])
2018-04-10 16:05:33.455358 7f33630c2700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'42, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:33.463539 7f33638c3700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'46, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:33.470797 7f33628c1700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'40, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:33.507682 7f33628c1700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'38, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:36.995387 7f33640c4700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'36, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.000745 7f33648c5700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'42, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.005926 7f33638c3700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'44, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.011209 7f33640c4700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'44, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.016410 7f33640c4700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'49, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.021478 7f33640c4700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'50, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.038838 7f33640c4700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'49, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.057197 7f33628c1700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'43, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.084913 7f33630c2700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'44, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.094249 7f33638c3700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.095475 7f33628c1700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'44, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.129598 7f33628c1700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:37.148653 7f33638c3700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.320251 7f33630c2700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.327982 7f33640c4700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'50, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.334373 7f33638c3700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'46, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.344398 7f33638c3700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.357411 7f33638c3700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'46, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.374542 7f33640c4700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'42, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.385454 7f33640c4700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.393238 7f33638c3700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'44, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-04-10 16:05:38.617921 7f33610be700 5 osd.5 300 heartbeat: osd_stat(43997 MB used, 545 GB avail, 588 GB total, peers [0,1,2,3,4] op hist [0,0,0,0,0,0,0,0,1,3,0,2])

On Tue, 10 Apr 2018 at 15:31, Steven Vacaroaia <ste...@gmail.com> wrote:

> I've just added another server (same specs) with one OSD, and the behavior
> is the same - bad performance, cur MB/s 0.
>
> I checked the network with iperf3 - no issues. So it is not a server
> issue, since I am getting the same behavior with 2 different servers.
>
> What can it be?
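Side note on the log dump above: at debug_osd 5/5, the write_log_and_missing lines appear to be routine PG-log bookkeeping emitted on every write, not errors. To bring the log volume back down once done debugging, something like the following should work (a sketch; the OSD id is from this thread, and the exact syntax should be verified against your Ceph release):

```shell
# Lower OSD debug logging at runtime via injectargs.
# Revert osd.5 (the newly added OSD) to the default 1/5 level:
ceph tell osd.5 injectargs '--debug-osd 1/5'

# Or apply to every OSD in the cluster:
ceph tell osd.* injectargs '--debug-osd 1/5'
```

Runtime changes made this way do not survive a daemon restart; to make them permanent, set debug osd = 1/5 in the [osd] section of ceph.conf as well.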
>
> ceph osd df tree
> ID CLASS WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS TYPE NAME
> -1       3.44714        -  588G  80693M  509G     0    0    - root default
> -9       0.57458        -  588G  80693M  509G 13.39 1.13    -     host osd01
>  5   hdd 0.57458  1.00000  588G  80693M  509G 13.39 1.13   64         osd.5
> -7       1.14899        - 1176G    130G 1046G 11.06 0.94    -     host osd02
>  0   hdd 0.57500  1.00000  588G  70061M  519G 11.63 0.98   50         osd.0
>  1   hdd 0.57500  1.00000  588G  63200M  526G 10.49 0.89   41         osd.1
> -3       1.14899        - 1176G    138G 1038G 11.76 1.00    -     host osd03
>  2   hdd 0.57500  1.00000  588G  68581M  521G 11.38 0.96   48         osd.2
>  3   hdd 0.57500  1.00000  588G  73185M  516G 12.15 1.03   53         osd.3
> -4       0.57458        -     0       0     0     0    0    -     host osd04
>  4   hdd 0.57458        0     0       0     0     0    0    0         osd.4
>
> 2018-04-10 15:11:58.542507 min lat: 0.0201432 max lat: 13.9308 avg lat: 0.466235
> sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
>  40      16    1294     1278  127.785        0          -  0.466235
>  41      16    1294     1278  124.668        0          -  0.466235
>  42      16    1294     1278    121.7        0          -  0.466235
>  43      16    1294     1278   118.87        0          -  0.466235
>  44      16    1302     1286  116.896      6.4  0.0302793  0.469203
>  45      16    1395     1379  122.564      372   0.312525   0.51994
>  46      16    1458     1442  125.377      252  0.0387492  0.501892
>  47      16    1458     1442  122.709        0          -  0.501892
>  48      16    1458     1442  120.153        0          -  0.501892
>  49      16    1458     1442  117.701        0          -  0.501892
>  50      16    1522     1506  120.466       64   0.137913  0.516969
>  51      16    1522     1506  118.104        0          -  0.516969
>  52      16    1522     1506  115.833        0          -  0.516969
>  53      16    1522     1506  113.648        0          -  0.516969
>  54      16    1522     1506  111.543        0          -  0.516969
>  55      16    1522     1506  109.515        0          -  0.516969
>  56      16    1522     1506  107.559        0          -  0.516969
>  57      16    1522     1506  105.672        0          -  0.516969
>  58      16    1522     1506  103.851        0          -  0.516969
> Total time run:       58.927431
> Total reads made:     1522
> Read size:            4194304
> Object size:          4194304
> Bandwidth (MB/sec):   103.314
> Average IOPS:         25
> Stddev IOPS:          35
> Max IOPS:             106
> Min IOPS:             0
> Average Latency(s):   0.618812
> Max latency(s):       13.9308
> Min latency(s):       0.0201432
>
> iperf3 -c 192.168.0.181 -i1 -t 10
> Connecting to host 192.168.0.181, port 5201
> [  4] local 192.168.0.182 port 57448 connected to 192.168.0.181 port 5201
> [ ID] Interval        Transfer     Bandwidth       Retr  Cwnd
> [  4] 0.00-1.00  sec  1.15 GBytes  9.92 Gbits/sec    0   830 KBytes
> [  4] 1.00-2.00  sec  1.15 GBytes  9.90 Gbits/sec    0   830 KBytes
> [  4] 2.00-3.00  sec  1.15 GBytes  9.91 Gbits/sec    0   918 KBytes
> [  4] 3.00-4.00  sec  1.15 GBytes  9.90 Gbits/sec    0   918 KBytes
> [  4] 4.00-5.00  sec  1.15 GBytes  9.90 Gbits/sec    0   918 KBytes
> [  4] 5.00-6.00  sec  1.15 GBytes  9.90 Gbits/sec    0   918 KBytes
> [  4] 6.00-7.00  sec  1.15 GBytes  9.90 Gbits/sec    0   918 KBytes
> [  4] 7.00-8.00  sec  1.15 GBytes  9.90 Gbits/sec    0   918 KBytes
> [  4] 8.00-9.00  sec  1.15 GBytes  9.90 Gbits/sec    0   918 KBytes
> [  4] 9.00-10.00 sec  1.15 GBytes  9.91 Gbits/sec    0   918 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval        Transfer     Bandwidth       Retr
> [  4] 0.00-10.00 sec  11.5 GBytes  9.90 Gbits/sec    0   sender
> [  4] 0.00-10.00 sec  11.5 GBytes  9.90 Gbits/sec        receiver
>
> On Tue, 10 Apr 2018 at 08:49, Steven Vacaroaia <ste...@gmail.com> wrote:
>
>> Hi,
>> Thanks for providing guidance.
>>
>> VD0 is the SSD drive. Many people suggested not enabling WB for the SSD
>> so that the cache can be used for the HDDs, where it is needed more.
>>
>> Setup is 3 identical Dell R620 servers: OSD01, OSD02, OSD04.
>> 10 Gb separate networks, 600 GB enterprise HDD, 320 GB enterprise SSD.
>> Bluestore, separate WAL/DB on SSD (1 GB partition for WAL, 30 GB for DB).
>>
>> With 2 OSDs per server and only OSD01, OSD02, performance is as expected
>> (no gaps in cur MB/s).
>>
>> Adding one OSD from OSD04 tanks performance (lots of gaps, cur MB/s 0).
>>
>> See below.
>>
>> ceph -s
>>   cluster:
>>     id:     1e98e57a-ef41-4327-b88a-dd2531912632
>>     health: HEALTH_WARN
>>             noscrub,nodeep-scrub flag(s) set
>>
>> WITH OSD04
>>
>> ceph osd tree
>> ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
>> -1       2.87256 root default
>> -7       1.14899     host osd02
>>  0   hdd 0.57500         osd.0      up  1.00000 1.00000
>>  1   hdd 0.57500         osd.1      up  1.00000 1.00000
>> -3       1.14899     host osd03
>>  2   hdd 0.57500         osd.2      up  1.00000 1.00000
>>  3   hdd 0.57500         osd.3      up  1.00000 1.00000
>> -4       0.57458     host osd04
>>  4   hdd 0.57458         osd.4      up  1.00000 1.00000
>>
>> 2018-04-10 08:37:08.111037 min lat: 0.0128562 max lat: 13.1623 avg lat: 0.528273
>> sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
>> 100      16    3001     2985  119.388       90  0.0169507  0.528273
>> 101      16    3029     3013  119.315      112  0.0410565  0.524325
>> 102      16    3029     3013  118.145        0          -  0.524325
>> 103      16    3029     3013  116.998        0          -  0.524325
>> 104      16    3029     3013  115.873        0          -  0.524325
>> 105      16    3071     3055   116.37       42  0.0888923   0.54832
>> 106      16    3156     3140  118.479      340  0.0162464  0.535244
>> 107      16    3156     3140  117.372        0          -  0.535244
>> 108      16    3156     3140  116.285        0          -  0.535244
>> 109      16    3156     3140  115.218        0          -  0.535244
>> 110      16    3156     3140  114.171        0          -  0.535244
>> 111      16    3156     3140  113.142        0          -  0.535244
>> 112      16    3156     3140  112.132        0          -  0.535244
>> 113      16    3156     3140   111.14        0          -  0.535244
>> 114      16    3156     3140  110.165        0          -  0.535244
>> 115      16    3156     3140  109.207        0          -  0.535244
>> 116      16    3230     3214  110.817     29.6  0.0169969  0.574856
>> 117      16    3311     3295  112.639      324  0.0704851  0.565529
>> 118      16    3311     3295  111.684        0          -  0.565529
>> 119      16    3311     3295  110.746        0          -  0.565529
>> 2018-04-10 08:37:28.112886 min lat: 0.0128562 max lat: 14.7293 avg lat: 0.565529
>> sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
>> 120      16    3311     3295  109.823        0          -  0.565529
>> 121      16    3311     3295  108.915        0          -  0.565529
>> 122      16    3311     3295  108.022        0          -  0.565529
>> Total time run:         122.568983
>> Total writes made:      3312
>> Write size:             4194304
>> Object size:            4194304
>> Bandwidth (MB/sec):     108.086
>> Stddev Bandwidth:       121.191
>> Max bandwidth (MB/sec): 520
>> Min bandwidth (MB/sec): 0
>> Average IOPS:           27
>> Stddev IOPS:            30
>> Max IOPS:               130
>> Min IOPS:               0
>> Average Latency(s):     0.591771
>> Stddev Latency(s):      1.74753
>> Max latency(s):         14.7293
>> Min latency(s):         0.0128562
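To narrow down which OSD is stalling ops while the benchmark runs, per-OSD latency can be sampled from another terminal; a sketch (the OSD id is from this thread, and ceph daemon must be run on the host that carries that OSD):

```shell
# Commit/apply latency for every OSD; an OSD whose numbers sit far
# above its peers is the likely culprit.
ceph osd perf

# Inspect recent slow operations on the suspect OSD via its admin socket
# (run on the OSD's own host):
ceph daemon osd.4 dump_historic_ops
```

The dump_historic_ops output shows, per slow op, which stage (queueing, journal/WAL write, sub-op wait) consumed the time, which helps separate a disk problem from a network or peering problem.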
>>
>> AFTER ceph osd down osd.4; ceph osd out osd.4
>>
>> ceph osd tree
>> ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
>> -1       2.87256 root default
>> -7       1.14899     host osd02
>>  0   hdd 0.57500         osd.0      up  1.00000 1.00000
>>  1   hdd 0.57500         osd.1      up  1.00000 1.00000
>> -3       1.14899     host osd03
>>  2   hdd 0.57500         osd.2      up  1.00000 1.00000
>>  3   hdd 0.57500         osd.3      up  1.00000 1.00000
>> -4       0.57458     host osd04
>>  4   hdd 0.57458         osd.4      up        0 1.00000
>>
>> 2018-04-10 08:46:55.193642 min lat: 0.0156532 max lat: 2.5884 avg lat: 0.310681
>> sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
>> 100      16    5144     5128  205.097      220  0.0372222  0.310681
>> 101      16    5196     5180  205.126      208   0.421245  0.310908
>> 102      16    5232     5216  204.526      144   0.543723  0.311544
>> 103      16    5271     5255  204.055      156   0.465998  0.312394
>> 104      16    5310     5294  203.593      156   0.483188  0.313355
>> 105      16    5357     5341  203.444      188  0.0313209  0.313267
>> 106      16    5402     5386  203.223      180   0.517098  0.313714
>> 107      16    5457     5441  203.379      220  0.0277359  0.313288
>> 108      16    5515     5499  203.644      232   0.470556  0.313203
>> 109      16    5565     5549  203.611      200   0.564713  0.313173
>> 110      16    5606     5590   203.25      164  0.0223166  0.313596
>> 111      16    5659     5643  203.329      212  0.0231103  0.313597
>> 112      16    5703     5687  203.085      176   0.033348  0.314018
>> 113      16    5757     5741  203.199      216    1.53862  0.313991
>> 114      16    5798     5782  202.855      164     0.4711  0.314511
>> 115      16    5852     5836  202.969      216  0.0350226   0.31424
>> 116      16    5912     5896  203.288      240  0.0253188  0.313657
>> 117      16    5964     5948  203.328      208  0.0223623  0.313562
>> 118      16    6024     6008  203.639      240   0.174245  0.313531
>> 119      16    6070     6054  203.473      184   0.712498  0.313582
>> 2018-04-10 08:47:15.195873 min lat: 0.0154679 max lat: 2.5884 avg lat: 0.313564
>> sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
>> 120      16    6120     6104  203.444      200  0.0351212  0.313564
>> Total time run:         120.551897
>> Total writes made:      6120
>> Write size:             4194304
>> Object size:            4194304
>> Bandwidth (MB/sec):     203.066
>> Stddev Bandwidth:       43.8329
>> Max bandwidth (MB/sec): 480
>> Min bandwidth (MB/sec): 128
>> Average IOPS:           50
>> Stddev IOPS:            10
>> Max IOPS:               120
>> Min IOPS:               32
>> Average Latency(s):     0.314959
>> Stddev Latency(s):      0.379298
>> Max latency(s):         2.5884
>> Min latency(s):         0.0154679
>>
>> On Tue, 10 Apr 2018 at 07:58, Kai Wagner <kwag...@suse.com> wrote:
>>
>>> Is this just from one server or from all servers? Just wondering why VD
>>> 0 is using WriteThrough compared to the others. If that's the setup for
>>> the OSDs, you already have a cache setup problem.
>>>
>>> On 10.04.2018 13:44, Mohamad Gebai wrote:
>>> > megacli -LDGetProp -cache -Lall -a0
>>> >
>>> > Adapter 0-VD 0(target id: 0): Cache Policy:WriteThrough,
>>> > ReadAheadNone, Direct, Write Cache OK if bad BBU
>>> > Adapter 0-VD 1(target id: 1): Cache Policy:WriteBack, ReadAdaptive,
>>> > Cached, No Write Cache if bad BBU
>>> > Adapter 0-VD 2(target id: 2): Cache Policy:WriteBack, ReadAdaptive,
>>> > Cached, No Write Cache if bad BBU
>>> > Adapter 0-VD 3(target id: 3): Cache Policy:WriteBack, ReadAdaptive,
>>> > Cached, No Write Cache if bad BBU
>>>
>>> --
>>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
>>> HRB 21284 (AG Nürnberg)
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
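If the WriteThrough policy on VD 0 turns out to be unintended, it can be switched at runtime with MegaCli; a sketch based on common MegaCli usage (verify the flags against your controller's documentation before running):

```shell
# Show the current cache policy for all virtual drives on adapter 0
# (same command quoted earlier in the thread):
megacli -LDGetProp -cache -Lall -a0

# Switch VD 0 to WriteBack -- only safe with a healthy battery-backed cache:
megacli -LDSetProp WB -L0 -a0
```

Note the trade-off the thread touches on: enabling WriteBack on the SSD VD consumes controller cache that would otherwise accelerate the HDD VDs, so measure both configurations before settling on one.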