Re: [ceph-users] HEALTH_ERR, size and min_size
Is the "ceph osd tree" output before or after replacing the osds. I ask, because osd10 is not shown in the output? regards Bernhard On 26/12/2019 21:18, Ml Ml wrote: Hello List, i have size = 3 and min_size = 2 with 3 Nodes. My OSDs: ceph osd tree ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF -1 60.17775 root default -2 20.21155 host ceph01 0 hdd 1.71089 osd.0 up 1.0 1.0 8 hdd 1.71660 osd.8 up 1.0 1.0 9 hdd 2.67029 osd.9 up 1.0 1.0 11 hdd 1.71649 osd.11 up 1.0 1.0 12 hdd 2.67020 osd.12 up 1.0 1.0 14 hdd 2.67020 osd.14 up 1.0 1.0 18 hdd 1.71649 osd.18 up 1.0 1.0 22 hdd 2.67020 osd.22 up 1.0 1.0 23 hdd 2.67020 osd.23 up 1.0 1.0 -3 19.08154 host ceph02 2 hdd 2.67029 osd.2 up 1.0 1.0 3 hdd 2.7 osd.3 up 1.0 1.0 7 hdd 2.67029 osd.7 up 1.0 1.0 13 hdd 2.67020 osd.13 up 1.0 1.0 16 hdd 1.5 osd.16 up 1.0 1.0 19 hdd 2.38409 osd.19 up 1.0 1.0 24 hdd 2.67020 osd.24 up 1.0 1.0 25 hdd 1.71649 osd.25 up 1.0 1.0 -4 20.88466 host ceph03 1 hdd 1.71660 osd.1 up 1.0 1.0 4 hdd 2.67020 osd.4 up 1.0 1.0 5 hdd 1.71660 osd.5 up 1.0 1.0 6 hdd 1.71660 osd.6 up 1.0 1.0 15 hdd 2.67020 osd.15 up 1.0 1.0 17 hdd 1.62109 osd.17 up 1.0 1.0 20 hdd 1.71649 osd.20 up 1.0 1.0 21 hdd 2.67020 osd.21 up 1.0 1.0 27 hdd 1.71649 osd.27 up 1.0 1.0 32 hdd 2.67020 osd.32 up 1.0 1.0 I replaced two osds on node ceph01 and ran into "HEALTH_ERR". My problem: it waits for the backfilling process? Why did i run into HEALTH_ERR? I thought all data will be available on at least one more node. or even two: HEALTH_ERR 343351/10358292 objects misplaced (3.315%); Reduced data availability: 19 pgs inactive; Degraded data redundancy: 639455/10358292 objects degraded (6.173%), 208 pgs degraded, 204 pgs undersized; application not enabled on 1 pool(s); 29 slow requests are blocked > 32 sec. Implicated osds ; 29 stuck requests are blocked > 4096 sec. 
Implicated osds 2,19,24
OBJECT_MISPLACED 343351/10358292 objects misplaced (3.315%)
PG_AVAILABILITY Reduced data availability: 19 pgs inactive
    pg 0.4 is stuck inactive for 4227.236803, current state undersized+degraded+remapped+backfilling+peered, last acting [19]
    pg 0.12 is stuck inactive for 4227.267137, current state undersized+degraded+remapped+backfilling+peered, last acting [13]
    pg 0.1b is stuck inactive for 4198.153642, current state undersized+degraded+remapped+backfill_wait+peered, last acting [24]
    pg 0.1f is stuck inactive for 4226.574006, current state undersized+degraded+remapped+backfilling+peered, last acting [19]
    pg 0.61 is stuck inactive for 4227.316336, current state undersized+degraded+remapped+backfilling+peered, last acting [2]
    pg 0.85 is stuck inactive for 4227.287134, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 0.88 is stuck inactive for 4197.261935, current state undersized+degraded+remapped+backfill_wait+peered, last acting [24]
    pg 0.bd is stuck inactive for 4226.607646, current state undersized+degraded+remapped+backfilling+peered, last acting [2]
    pg 0.fc is stuck inactive for 4226.642664, current state undersized+degraded+remapped+backfill_wait+peered, last acting [13]
    pg 0.140 is stuck inactive for 4198.277165, current state undersized+degraded+remapped+backfilling+peered, last acting [2]
    pg 0.16c is stuck inactive for 4198.268985, current state undersized+degraded+remapped+backfilling+peered, last acting [7]
    pg 0.21f is stuck inactive for 4198.228206, current state undersized+degraded+remapped+backfilling+peered, last acting [2]
    pg 0.222 is stuck inactive for 4198.241280, current state undersized+degraded+remapped+backfilling+peered, last acting [2]
    pg 0.27f is stuck inactive for 4198.201034, current state undersized+degraded+remapped+backfill_wait+peered, last acting [19]
    pg 0.297 is stuck inactive for 4197.247869, current state undersized+degraded+remapped+backfilling+peered, last acting [24]
    pg 0.298 is stuck inactive for 4226.572652, current state undersized+degraded+remapped+backfilling+peered, last acting [19]
    pg 0.2cd is stuck inactive for 4226.643455, current state undersized+degraded+remapped+backfilling+peered, last acting [16]
    pg 0.314 is stuck inactive for 4227.339749, current state
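The "undersized+...+peered" state is the key detail here: with size = 3 and min_size = 2, any PG whose acting set has shrunk to a single OSD (last acting [19], [13], ...) stops serving I/O until backfill brings it back to at least two copies, and that is what turns the health status into HEALTH_ERR with inactive PGs. A minimal sketch for inspecting this (the pool name is a placeholder, the PG ids are taken from the report above):

ceph pg dump_stuck inactive           # list the PGs that are currently not serving I/O
ceph pg 0.4 query                     # look at the up/acting sets and recovery state of one affected PG
ceph osd pool get <poolname> size     # confirm the replication settings the PGs are judged against
ceph osd pool get <poolname> min_size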
[ceph-users] ceph log level
Hi all,

OSD servers generate a huge number of log messages. I configured 'debug_osd' to 1/5 or 1/20, but it does not seem to have any effect. Is there any other option which overrides this configuration?

Ceph version: mimic (13.2.5)

Thanks
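A rough sketch of how the level can be checked and changed on mimic; note that a debug_osd setting in the local ceph.conf on the OSD host, or one injected earlier at runtime, can take precedence over what is stored in the monitor config database, and that other subsystems (debug_ms, debug_bluestore, debug_rocksdb, ...) have their own levels, so lowering debug_osd alone may not shrink the log much:

ceph daemon osd.0 config get debug_osd        # run on the OSD host: what the running daemon is actually using
ceph config set osd debug_osd 1/5             # store the setting in the mon config database (mimic and later)
ceph tell osd.* injectargs '--debug_osd=1/5'  # apply it to all running OSD daemons immediately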
Re: [ceph-users] cephfs kernel client io performance decreases extremely
Hi, Stefan,
could you please provide further guidance?

Brs,
renjianxinlover
renjianxinlo...@163.com

On 12/28/2019 21:44, renjianxinlover wrote:
Sorry, what I said before was fuzzy. Currently, my MDS is running on the same node as certain OSDs, where an SSD drive serves as the cache device.

renjianxinlover
renjianxinlo...@163.com

On 12/28/2019 15:49, Stefan Kooman wrote:
Quoting renjianxinlover (renjianxinlo...@163.com):
> Hi, Nathan, thanks for your quick reply!
> The command 'ceph status' outputs a warning, including about ten clients failing to respond to cache pressure; in addition, on the MDS node, 'iostat -x 1' shows the drive IO usage of the MDS within five seconds as follows,

You should run this iostat -x 1 on the OSD nodes ... the MDS is not doing any IO in and of itself as far as Ceph is concerned.

Gr. Stefan

--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl
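A minimal sketch of where to look next, assuming the warning really is "clients failing to respond to cache pressure" (the mds name below is a placeholder): run iostat on the OSD nodes to see whether the data drives are saturated, check the per-OSD latencies the cluster itself reports, and compare the MDS cache usage against its memory limit, since an undersized MDS cache is a common cause of that warning:

iostat -x 1                                              # on each OSD node, watch the OSD data drives
ceph osd perf                                            # commit/apply latency per OSD as seen by the cluster
ceph daemon mds.<name> cache status                      # on the MDS host: current cache usage
ceph daemon mds.<name> config get mds_cache_memory_limit # the limit that usage is measured against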
Re: [ceph-users] Benchmark difference between rados bench and rbd bench
> rados bench -p scbench 60 seq --io-size 8192 --io-threads 256
> Read size:            4194304

rados bench doesn't have an --io-size option. Testing sequential reads with an 8K I/O size is a strange idea anyway, though.

--
With best regards,
Vitaliy Filippov
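A sketch of what the rados bench invocation could look like instead, assuming the goal really is small sequential I/O: the block size is only honoured by the write phase (-b), and the seq/rand phases read back whatever objects that write phase left behind, which is why the run above reported a 4 MiB read size:

rados bench -p scbench 60 write -b 8192 -t 256 --no-cleanup   # write 8 KiB objects and keep them
rados bench -p scbench 60 seq -t 256                          # sequentially read back what was just written
rados -p scbench cleanup                                      # remove the benchmark objects afterwards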
[ceph-users] Benchmark difference between rados bench and rbd bench
Hello,

is there a difference between those two benchmarks? I mean, with rados I get the expected results, and rbd bench is far away from them. I was not able to see any bottlenecks at all. It looked like with rados bench the disk did a lot more work than with rbd bench. Any ideas on this?

rados bench -p scbench 60 seq --io-size 8192 --io-threads 256
...
Total time run:       46.565262
Total reads made:     8021
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   689.011
Average IOPS:         172
Stddev IOPS:          14
Max IOPS:             190
Min IOPS:             114
Average Latency(s):   0.0921206
Max latency(s):       1.54236
Min latency(s):       0.0113188

rbd -p rbdbench bench bench1_image --io-type read --io-size 8192 --io-threads 256 --io-total 10G --io-pattern seq
...
elapsed: 571  ops: 1310720  ops/sec: 2295.01  bytes/sec: 18800700.54 (18 MB/sec)

I tried to use the same parameters in order to compare the two benchmarks. Why do I get such different IOPS and MB/sec results? Any ideas on this?

Thanks,
Mario
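The two runs are most likely not measuring the same workload: rados bench has no --io-size/--io-threads options, so it read back whole 4 MiB objects (689 MB/s at 172 IOPS is about 4 MiB per read), while rbd bench really issued 8 KiB requests (2295 ops/s x 8192 bytes matches the reported 18.8 MB/s). A sketch of a more comparable rbd run, assuming the same pool and image as above, is to match the request size to the 4 MiB the rados run effectively used:

rbd -p rbdbench bench bench1_image --io-type read --io-size 4M --io-threads 16 --io-total 10G --io-pattern seq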
Re: [ceph-users] HEALTH_ERR, size and min_size
Quoting Ml Ml (mliebher...@googlemail.com):
> Hello Stefan,
>
> The status was "HEALTH_OK" before i ran those commands.

\o/

> root@ceph01:~# ceph osd crush rule dump
> [
>     {
>         "rule_id": 0,
>         "rule_name": "replicated_ruleset",
>         "ruleset": 0,
>         "type": 1,
>         "min_size": 1,
>         "max_size": 10,
>         "steps": [
>             {
>                 "op": "take",
>                 "item": -1,
>                 "item_name": "default"
>             },
>             {
>                 "op": "chooseleaf_firstn",
>                 "num": 0,
>                 "type": "host"

^^ This is the important part: host as the failure domain (not osd), and that's fine in your case. Make sure you only remove OSDs within the same failure domain at a time, and you're safe.

Gr. Stefan

--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl
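A small sketch of how that check can be made explicit before pulling the next disk, assuming a reasonably recent release (luminous or later) and using an OSD id from this thread as the example: the cluster can tell you whether stopping or removing a given OSD would leave any PG without enough replicas:

ceph osd ok-to-stop osd.9          # would stopping this OSD make any PG unavailable right now?
ceph osd safe-to-destroy osd.9     # have all PGs that lived on it been fully re-replicated elsewhere?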
Re: [ceph-users] HEALTH_ERR, size and min_size
Hello Stefan,

The status was "HEALTH_OK" before i ran those commands.

I removed the OSDs with:

ceph osd out osd.10
ceph auth del osd.10
systemctl stop ceph-osd@10
ceph osd rm 10
umount /var/lib/ceph/osd/ceph-10
ceph osd crush remove osd.10
dd if=/dev/zero of=/dev/sdc

ceph osd out osd.9
ceph auth del osd.9
systemctl stop ceph-osd@9
ceph osd rm 9
umount /var/lib/ceph/osd/ceph-9
ceph osd crush remove osd.9

root@ceph01:~# ceph osd crush rule dump
[
    {
        "rule_id": 0,
        "rule_name": "replicated_ruleset",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]

Thanks,
Mario

On Sun, Dec 29, 2019 at 2:16 PM Stefan Kooman wrote:
>
> Quoting Ml Ml (mliebher...@googlemail.com):
> > Hello List,
> > i have size = 3 and min_size = 2 with 3 Nodes.
>
> That's good.
>
> > I replaced two osds on node ceph01 and ran into "HEALTH_ERR".
> > My problem: it waits for the backfilling process?
> > Why did I run into HEALTH_ERR? I thought all data would still be available
> > on at least one more node, or even two:
>
> How did you replace them? Did you first set them "out" and wait for
> the data to be replicated elsewhere before you removed them?
>
> It *might* be because your CRUSH rule set is replicating over "OSD" and
> not host. What does a "ceph osd crush rule dump" show?
>
> Gr. Stefan
>
> --
> | BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
> | GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl
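The sequence above takes both OSDs out, stops them and removes them straight away, without waiting for the data to drain off them first, so every PG that had replicas on both osd.9 and osd.10 dropped to a single copy, fell below min_size = 2 and went inactive until backfill restores a second replica. A rough sketch of a gentler replacement, one OSD at a time (shown for osd.10; "ceph osd purge" exists on luminous and later, older releases need the separate crush remove / auth del / osd rm steps):

ceph osd out osd.10                          # stop placing data on it, but keep the daemon running
ceph -s                                      # wait here until all PGs are active+clean again
systemctl stop ceph-osd@10                   # only then stop the daemon
ceph osd purge 10 --yes-i-really-mean-it     # remove it from the CRUSH map, auth keys and OSD map
umount /var/lib/ceph/osd/ceph-10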
Re: [ceph-users] HEALTH_ERR, size and min_size
Quoting Ml Ml (mliebher...@googlemail.com):
> Hello List,
> i have size = 3 and min_size = 2 with 3 Nodes.

That's good.

> I replaced two osds on node ceph01 and ran into "HEALTH_ERR".
> My problem: it waits for the backfilling process?
> Why did I run into HEALTH_ERR? I thought all data would still be available
> on at least one more node, or even two:

How did you replace them? Did you first set them "out" and wait for the data to be replicated elsewhere before you removed them?

It *might* be because your CRUSH rule set is replicating over "OSD" and not host. What does a "ceph osd crush rule dump" show?

Gr. Stefan

--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl