Hi Caspar,

Yes, the cluster was working fine with the "too many PGs per OSD" warning up until now. I am not sure how to recover from the stale/down/inactive PGs. If you happen to know how, could you let me know?
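For reference, I believe the individual stuck PGs can be listed and inspected with something like the commands below (standard Luminous CLI as far as I know; <pgid> is just a placeholder for one of the reported PG ids):

# list health detail and the PGs currently flagged stale or inactive
ceph health detail
ceph pg dump_stuck stale
ceph pg dump_stuck inactive
# query one PG to see its peering/recovery state
ceph pg <pgid> query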
Current State:

[root@fre101 ~]# ceph -s
2019-01-04 05:22:05.942349 7f314f613700 -1 asok(0x7f31480017a0) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph-guests/ceph-client.admin.1053724.139849638091088.asok': (2) No such file or directory
  cluster:
    id:     adb9ad8e-f458-4124-bf58-7963a8d1391f
    health: HEALTH_ERR
            3 pools have many more objects per pg than average
            505714/12392650 objects misplaced (4.081%)
            3883 PGs pending on creation
            Reduced data availability: 6519 pgs inactive, 1870 pgs down, 1 pg peering, 886 pgs stale
            Degraded data redundancy: 42987/12392650 objects degraded (0.347%), 634 pgs degraded, 16 pgs undersized
            125827 slow requests are blocked > 32 sec
            2 stuck requests are blocked > 4096 sec
            too many PGs per OSD (2758 > max 200)

  services:
    mon: 3 daemons, quorum ceph-mon01,ceph-mon02,ceph-mon03
    mgr: ceph-mon03(active), standbys: ceph-mon01, ceph-mon02
    osd: 39 osds: 39 up, 39 in; 76 remapped pgs
    rgw: 1 daemon active

  data:
    pools:   18 pools, 54656 pgs
    objects: 6051k objects, 10944 GB
    usage:   21933 GB used, 50688 GB / 72622 GB avail
    pgs:     11.927% pgs not active
             42987/12392650 objects degraded (0.347%)
             505714/12392650 objects misplaced (4.081%)
             48080 active+clean
             3885  activating
             1111  down
             759   stale+down
             614   activating+degraded
             74    activating+remapped
             46    stale+active+clean
             35    stale+activating
             21    stale+activating+remapped
             9     stale+active+undersized
             9     stale+activating+degraded
             5     stale+activating+undersized+degraded+remapped
             3     activating+degraded+remapped
             1     stale+activating+degraded+remapped
             1     stale+active+undersized+degraded
             1     remapped+peering
             1     active+clean+remapped
             1     activating+undersized+degraded+remapped

  io:
    client: 0 B/s rd, 25397 B/s wr, 4 op/s rd, 4 op/s wr

I will update the number of PGs per OSD once these inactive/stale PGs come back online. I am not able to access the VMs and images which are using Ceph.

Thanks
Arun

On Fri, Jan 4, 2019 at 4:53 AM Caspar Smit <caspars...@supernas.eu> wrote:

> Hi Arun,
>
> How did you end up with a 'working' cluster with so many PGs per OSD?
>
> "too many PGs per OSD (2968 > max 200)"
>
> To (temporarily) allow this many PGs per OSD you could try this:
>
> Change these values in the global section of your ceph.conf:
>
> mon max pg per osd = 200
> osd max pg per osd hard ratio = 2
>
> This allows 200*2 = 400 PGs per OSD before the creation of new PGs is
> disabled.
>
> The above are the defaults (for Luminous, maybe other versions too).
> You can check your current settings with:
>
> ceph daemon mon.ceph-mon01 config show | grep pg_per_osd
>
> Since your current PGs-per-OSD ratio is way higher than the default, you
> could set them to, for instance:
>
> mon max pg per osd = 1000
> osd max pg per osd hard ratio = 5
>
> which allows for 5000 PGs per OSD before the creation of new PGs is
> disabled.
>
> You'll need to inject the settings into the mons/osds and restart the
> mgrs to make them active:
>
> ceph tell mon.* injectargs '--mon_max_pg_per_osd 1000'
> ceph tell mon.* injectargs '--osd_max_pg_per_osd_hard_ratio 5'
> ceph tell osd.* injectargs '--mon_max_pg_per_osd 1000'
> ceph tell osd.* injectargs '--osd_max_pg_per_osd_hard_ratio 5'
> restart mgrs
>
> Kind regards,
> Caspar
>
>
> On Fri, Jan 4, 2019 at 04:28 Arun POONIA <
> arun.poo...@nuagenetworks.net> wrote:
>
>> Hi Chris,
>>
>> Indeed that's what happened. I didn't set the noout flag either, and I
>> zapped the disks on the new server every time. In my cluster status,
>> fre201 is the only new server.
>>
>> Current status after enabling the 3 OSDs on host fre201:
>>
>> [root@fre201 ~]# ceph osd tree
>> ID  CLASS WEIGHT   TYPE NAME       STATUS REWEIGHT PRI-AFF
>> -1        70.92137 root default
>> -2         5.45549     host fre101
>>  0    hdd  1.81850         osd.0       up  1.00000 1.00000
>>  1    hdd  1.81850         osd.1       up  1.00000 1.00000
>>  2    hdd  1.81850         osd.2       up  1.00000 1.00000
>> -9         5.45549     host fre103
>>  3    hdd  1.81850         osd.3       up  1.00000 1.00000
>>  4    hdd  1.81850         osd.4       up  1.00000 1.00000
>>  5    hdd  1.81850         osd.5       up  1.00000 1.00000
>> -3         5.45549     host fre105
>>  6    hdd  1.81850         osd.6       up  1.00000 1.00000
>>  7    hdd  1.81850         osd.7       up  1.00000 1.00000
>>  8    hdd  1.81850         osd.8       up  1.00000 1.00000
>> -4         5.45549     host fre107
>>  9    hdd  1.81850         osd.9       up  1.00000 1.00000
>> 10    hdd  1.81850         osd.10      up  1.00000 1.00000
>> 11    hdd  1.81850         osd.11      up  1.00000 1.00000
>> -5         5.45549     host fre109
>> 12    hdd  1.81850         osd.12      up  1.00000 1.00000
>> 13    hdd  1.81850         osd.13      up  1.00000 1.00000
>> 14    hdd  1.81850         osd.14      up  1.00000 1.00000
>> -6         5.45549     host fre111
>> 15    hdd  1.81850         osd.15      up  1.00000 1.00000
>> 16    hdd  1.81850         osd.16      up  1.00000 1.00000
>> 17    hdd  1.81850         osd.17      up  0.79999 1.00000
>> -7         5.45549     host fre113
>> 18    hdd  1.81850         osd.18      up  1.00000 1.00000
>> 19    hdd  1.81850         osd.19      up  1.00000 1.00000
>> 20    hdd  1.81850         osd.20      up  1.00000 1.00000
>> -8         5.45549     host fre115
>> 21    hdd  1.81850         osd.21      up  1.00000 1.00000
>> 22    hdd  1.81850         osd.22      up  1.00000 1.00000
>> 23    hdd  1.81850         osd.23      up  1.00000 1.00000
>> -10        5.45549     host fre117
>> 24    hdd  1.81850         osd.24      up  1.00000 1.00000
>> 25    hdd  1.81850         osd.25      up  1.00000 1.00000
>> 26    hdd  1.81850         osd.26      up  1.00000 1.00000
>> -11        5.45549     host fre119
>> 27    hdd  1.81850         osd.27      up  1.00000 1.00000
>> 28    hdd  1.81850         osd.28      up  1.00000 1.00000
>> 29    hdd  1.81850         osd.29      up  1.00000 1.00000
>> -12        5.45549     host fre121
>> 30    hdd  1.81850         osd.30      up  1.00000 1.00000
>> 31    hdd  1.81850         osd.31      up  1.00000 1.00000
>> 32    hdd  1.81850         osd.32      up  1.00000 1.00000
>> -13        5.45549     host fre123
>> 33    hdd  1.81850         osd.33      up  1.00000 1.00000
>> 34    hdd  1.81850         osd.34      up  1.00000 1.00000
>> 35    hdd  1.81850         osd.35      up  1.00000 1.00000
>> -27        5.45549     host fre201
>> 36    hdd  1.81850         osd.36      up  1.00000 1.00000
>> 37    hdd  1.81850         osd.37      up  1.00000 1.00000
>> 38    hdd  1.81850         osd.38      up  1.00000 1.00000
>>
>> [root@fre201 ~]# ceph -s
>>   cluster:
>>     id:     adb9ad8e-f458-4124-bf58-7963a8d1391f
>>     health: HEALTH_ERR
>>             3 pools have many more objects per pg than average
>>             585791/12391450 objects misplaced (4.727%)
>>             2 scrub errors
>>             2374 PGs pending on creation
>>             Reduced data availability: 6578 pgs inactive, 2025 pgs down, 74 pgs peering, 1234 pgs stale
>>             Possible data damage: 2 pgs inconsistent
>>             Degraded data redundancy: 64969/12391450 objects degraded (0.524%), 616 pgs degraded, 20 pgs undersized
>>             96242 slow requests are blocked > 32 sec
>>             228 stuck requests are blocked > 4096 sec
>>             too many PGs per OSD (2768 > max 200)
>>
>>   services:
>>     mon: 3 daemons, quorum ceph-mon01,ceph-mon02,ceph-mon03
>>     mgr: ceph-mon03(active), standbys: ceph-mon01, ceph-mon02
>>     osd: 39 osds: 39 up, 39 in; 96 remapped pgs
>>     rgw: 1 daemon active
>>
>>   data:
>>     pools:   18 pools, 54656 pgs
>>     objects: 6050k objects, 10942 GB
>>     usage:   21900 GB used, 50721 GB / 72622 GB avail
>>     pgs:     0.002% pgs unknown
>>              12.050% pgs not active
>>              64969/12391450 objects degraded (0.524%)
>>              585791/12391450 objects misplaced (4.727%)
>>              47489 active+clean
>>              3670  activating
>>              1098  stale+down
>>              923   down
>>              575   activating+degraded
>>              563   stale+active+clean
>>              105   stale+activating
>>              78    activating+remapped
>>              72    peering
>>              25    stale+activating+degraded
>>              23    stale+activating+remapped
>>              9     stale+active+undersized
>>              6     stale+activating+undersized+degraded+remapped
>>              5     stale+active+undersized+degraded
>>              4     down+remapped
>>              4     activating+degraded+remapped
>>              2     active+clean+inconsistent
>>              1     stale+activating+degraded+remapped
>>              1     stale+active+clean+remapped
>>              1     stale+remapped+peering
>>              1     remapped+peering
>>              1     unknown
>>
>>   io:
>>     client: 0 B/s rd, 208 kB/s wr, 22 op/s rd, 22 op/s wr
>>
>> Thanks
>> Arun
>>
>> On Thu, Jan 3, 2019 at 7:19 PM Chris <bitskr...@bitskrieg.net> wrote:
>>
>>> If you added OSDs and then deleted them repeatedly without waiting for
>>> replication to finish as the cluster attempted to rebalance across
>>> them, it is highly likely that you are permanently missing PGs
>>> (especially if the disks were zapped each time).
>>>
>>> If those 3 down OSDs can be revived there is a (small) chance that you
>>> can right the ship, but 1400 PGs/OSD is pretty extreme. I'm surprised
>>> the cluster even let you do that - this sounds like a data loss event.
>>>
>>> Bring back the 3 OSDs and see what those 2 inconsistent PGs look like
>>> with ceph pg query.
>>>
>>> On January 3, 2019 21:59:38 Arun POONIA <arun.poo...@nuagenetworks.net>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I recently tried adding a new node (OSD) to the ceph cluster using the
>>>> ceph-deploy tool. Since I was experimenting with the tool, I ended up
>>>> deleting the OSDs on the new server a couple of times.
>>>>
>>>> Now that ceph OSDs are running on the new server, cluster PGs seem to
>>>> be inactive (10-15%) and they are not recovering or rebalancing. Not
>>>> sure what to do. I tried shutting down the OSDs on the new server.
>>>>
>>>> Status:
>>>> [root@fre105 ~]# ceph -s
>>>> 2019-01-03 18:56:42.867081 7fa0bf573700 -1 asok(0x7fa0b80017a0) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph-guests/ceph-client.admin.4018644.140328258509136.asok': (2) No such file or directory
>>>>   cluster:
>>>>     id:     adb9ad8e-f458-4124-bf58-7963a8d1391f
>>>>     health: HEALTH_ERR
>>>>             3 pools have many more objects per pg than average
>>>>             373907/12391198 objects misplaced (3.018%)
>>>>             2 scrub errors
>>>>             9677 PGs pending on creation
>>>>             Reduced data availability: 7145 pgs inactive, 6228 pgs down, 1 pg peering, 2717 pgs stale
>>>>             Possible data damage: 2 pgs inconsistent
>>>>             Degraded data redundancy: 178350/12391198 objects degraded (1.439%), 346 pgs degraded, 1297 pgs undersized
>>>>             52486 slow requests are blocked > 32 sec
>>>>             9287 stuck requests are blocked > 4096 sec
>>>>             too many PGs per OSD (2968 > max 200)
>>>>
>>>>   services:
>>>>     mon: 3 daemons, quorum ceph-mon01,ceph-mon02,ceph-mon03
>>>>     mgr: ceph-mon03(active), standbys: ceph-mon01, ceph-mon02
>>>>     osd: 39 osds: 36 up, 36 in; 51 remapped pgs
>>>>     rgw: 1 daemon active
>>>>
>>>>   data:
>>>>     pools:   18 pools, 54656 pgs
>>>>     objects: 6050k objects, 10941 GB
>>>>     usage:   21727 GB used, 45308 GB / 67035 GB avail
>>>>     pgs:     13.073% pgs not active
>>>>              178350/12391198 objects degraded (1.439%)
>>>>              373907/12391198 objects misplaced (3.018%)
>>>>              46177 active+clean
>>>>              5054  down
>>>>              1173  stale+down
>>>>              1084  stale+active+undersized
>>>>              547   activating
>>>>              201   stale+active+undersized+degraded
>>>>              158   stale+activating
>>>>              96    activating+degraded
>>>>              46    stale+active+clean
>>>>              42    activating+remapped
>>>>              34    stale+activating+degraded
>>>>              23    stale+activating+remapped
>>>>              6     stale+activating+undersized+degraded+remapped
>>>>              6     activating+undersized+degraded+remapped
>>>>              2     activating+degraded+remapped
>>>>              2     active+clean+inconsistent
>>>>              1     stale+activating+degraded+remapped
>>>>              1     stale+active+clean+remapped
>>>>              1     stale+remapped
>>>>              1     down+remapped
>>>>              1     remapped+peering
>>>>
>>>>   io:
>>>>     client: 0 B/s rd, 208 kB/s wr, 28 op/s rd, 28 op/s wr
>>>>
>>>> Thanks
>>>> --
>>>> Arun Poonia
>>>
>>
>> --
>> Arun Poonia
>

--
Arun Poonia
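PS: for the two PGs reported as inconsistent above, I believe the usual way to look at them before attempting any repair is something along these lines (<pgid> is a placeholder for the inconsistent PG id):

# find which PGs are inconsistent, then inspect the objects and peering state
ceph health detail | grep inconsistent
rados list-inconsistent-obj <pgid> --format=json-pretty
ceph pg <pgid> query
# only once the cause is understood, scrub-repair the PG
ceph pg repair <pgid>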
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com