Hi/Hej Magnus,

We had a similar issue going from the latest Hammer to Jewel (so it might not be applicable for you), with PGs stuck peering / data misplaced, right after updating all mons to the latest Jewel at the time (10.2.10). Finally setting the require_jewel_osds flag put everything back in place (we were going to do this after restarting all OSDs, following the docs/changelogs).

What does your ceph health detail look like? Did you perform any other commands after starting your mon upgrade? Any commands that might change the crush map can cause issues AFAIK if your mons and OSDs are on different versions (correct me if I'm wrong, but I think we ran into this once).

// david

On Jul 12 2018, at 11:45 am, Magnus Grönlund <mag...@gronlund.se> wrote:
>
> Hi list,
>
> Things went from bad to worse: I tried to upgrade some OSDs to Luminous to see if that could help, but it didn't appear to make any difference.
> Worse, each restarted OSD seemed to "forget" a few PGs, and the number of undersized PGs grew until some PGs had been "forgotten" by all 3 acting OSDs and became stale, even though all the OSDs (and their disks) were available.
> Then the OSDs grew so big that the servers ran out of memory (48 GB per server, with 10 2 TB disks each) and started killing the OSDs...
> All OSDs were then shut down to try to preserve at least some data on the disks, but maybe it is too late?
>
> /Magnus
>
> 2018-07-11 21:10 GMT+02:00 Magnus Grönlund <mag...@gronlund.se>:
> > Hi Paul,
> >
> > No, all OSDs are still on Jewel; the issue started before I had even started to upgrade the first OSD, and they don't appear to be flapping.
> > ceph -w shows a lot of slow requests etc., but nothing unexpected as far as I can tell, considering the state the cluster is in.
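For reference, the require_jewel_osds step mentioned at the top of this mail looks roughly like the following on a Jewel cluster. This is a sketch, not the exact commands we ran; check the release notes for your versions before setting required-feature flags:

```shell
# Show which flags the OSD map currently requires
# (look for require_jewel_osds in the flags line)
ceph osd dump | grep -i -E 'flags|require'

# Once every OSD is actually running Jewel or newer, record that in the map.
# Mixed-version clusters can misbehave until this flag matches reality.
ceph osd set require_jewel_osds

# Then watch whether peering recovers
ceph health detail
ceph -w
```

In our case peering settled within minutes of setting the flag; your mileage may vary.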
> >
> > 2018-07-11 20:40:09.396642 osd.37 [WRN] 100 slow requests, 2 included below; oldest blocked for > 25402.278824 secs
> > 2018-07-11 20:40:09.396652 osd.37 [WRN] slow request 1920.957326 seconds old, received at 2018-07-11 20:08:08.439214: osd_op(client.73540057.0:8289463 2.e57b3e32 (undecoded) ack+ondisk+retry+write+known_if_redirected e160294) currently waiting for peered
> > 2018-07-11 20:40:09.396660 osd.37 [WRN] slow request 1920.048094 seconds old, received at 2018-07-11 20:08:09.348446: osd_op(client.671628641.0:998704 2.42f88232 (undecoded) ack+ondisk+retry+write+known_if_redirected e160475) currently waiting for peered
> > 2018-07-11 20:40:10.397008 osd.37 [WRN] 100 slow requests, 2 included below; oldest blocked for > 25403.279204 secs
> > 2018-07-11 20:40:10.397017 osd.37 [WRN] slow request 1920.043860 seconds old, received at 2018-07-11 20:08:10.353060: osd_op(client.231731103.0:1007729 3.e0ff5786 (undecoded) ondisk+write+known_if_redirected e137428) currently waiting for peered
> > 2018-07-11 20:40:10.397023 osd.37 [WRN] slow request 1920.034101 seconds old, received at 2018-07-11 20:08:10.362819: osd_op(client.207458703.0:2000292 3.a8143b86 (undecoded) ondisk+write+known_if_redirected e137428) currently waiting for peered
> > 2018-07-11 20:40:10.790573 mon.0 [INF] pgmap 4104 pgs: 5 down+peering, 1142 peering, 210 remapped+peering, 5 active+recovery_wait+degraded, 1551 active+clean, 2 activating+undersized+degraded+remapped, 15 active+remapped+backfilling, 178 unknown, 1 active+remapped, 3 activating+remapped, 78 active+undersized+degraded+remapped+backfill_wait, 6 active+recovery_wait+degraded+remapped, 3 undersized+degraded+remapped+backfill_wait+peered, 5 active+undersized+degraded+remapped+backfilling, 295 active+remapped+backfill_wait, 3 active+recovery_wait+undersized+degraded, 21 activating+undersized+degraded, 559 active+undersized+degraded, 4 remapped, 17 undersized+degraded+peered, 1 active+recovery_wait+undersized+degraded+remapped; 13439 GB data, 42395 GB used, 160 TB / 201 TB avail; 4069 B/s rd, 746 kB/s wr, 5 op/s; 534753/10756032 objects degraded (4.972%); 779027/10756032 objects misplaced (7.243%); 256 MB/s, 65 objects/s recovering
> >
> > There are a lot of things in the OSD log files that I'm unfamiliar with, but so far I haven't found anything that has given me a clue on how to fix the issue.
> > BTW, restarting an OSD doesn't seem to help; on the contrary, it sometimes results in PGs being stuck undersized!
> > I have attached the log from one restarted OSD, covering its startup.
> >
> > Best regards
> > /Magnus
> >
> > 2018-07-11 20:39 GMT+02:00 Paul Emmerich <paul.emmer...@croit.io>:
> > > Did you finish the upgrade of the OSDs? Are OSDs flapping? (ceph -w) Is there anything weird in the OSDs' log files?
> > >
> > > Paul
> > >
> > > 2018-07-11 20:30 GMT+02:00 Magnus Grönlund <mag...@gronlund.se>:
> > > > Hi,
> > > >
> > > > Started to upgrade a Ceph cluster from Jewel (10.2.10) to Luminous (12.2.6).
> > > >
> > > > After upgrading and restarting the mons everything looked OK: the mons had quorum, all OSDs were up and in, and all the PGs were active+clean.
> > > > But before I had time to start upgrading the OSDs it became obvious that something had gone terribly wrong.
> > > > All of a sudden 1600 out of 4100 PGs were inactive and 40% of the data was misplaced!
> > > >
> > > > The mons appear OK and all OSDs are still up and in, but a few hours later there were still 1483 PGs stuck inactive, essentially all of them in peering!
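A note on chasing this kind of stuck-peering state: the loop can be made visible by snapshotting one PG's query output twice and diffing. The PG id below is a placeholder; pick a real one from the dump_stuck output.

```shell
# List PGs stuck inactive and pick one
ceph pg dump_stuck inactive | head -20

# Snapshot its full peering state twice and compare; if the epoch keeps
# climbing while the state cycles through peering/remapped+peering,
# the PG is looping rather than making progress.
ceph pg 2.32 query > /tmp/pg_query_1.json   # 2.32 is a placeholder PG id
sleep 30
ceph pg 2.32 query > /tmp/pg_query_2.json
diff /tmp/pg_query_1.json /tmp/pg_query_2.json | grep -E 'epoch|state' | head
```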
> > > > Investigating one of the stuck PGs, it appears to be looping between "inactive", "remapped+peering" and "peering", and the epoch number is rising fast; see the attached pg query outputs.
> > > >
> > > > We really can't afford to lose the cluster or the data, so any help or suggestions on how to debug or fix this issue would be very, very appreciated!
> > > >
> > > > health: HEALTH_ERR
> > > >         1483 pgs are stuck inactive for more than 60 seconds
> > > >         542 pgs backfill_wait
> > > >         14 pgs backfilling
> > > >         11 pgs degraded
> > > >         1402 pgs peering
> > > >         3 pgs recovery_wait
> > > >         11 pgs stuck degraded
> > > >         1483 pgs stuck inactive
> > > >         2042 pgs stuck unclean
> > > >         7 pgs stuck undersized
> > > >         7 pgs undersized
> > > >         111 requests are blocked > 32 sec
> > > >         10586 requests are blocked > 4096 sec
> > > >         recovery 9472/11120724 objects degraded (0.085%)
> > > >         recovery 1181567/11120724 objects misplaced (10.625%)
> > > >         noout flag(s) set
> > > >         mon.eselde02u32 low disk space
> > > >
> > > > services:
> > > >   mon: 3 daemons, quorum eselde02u32,eselde02u33,eselde02u34
> > > >   mgr: eselde02u32(active), standbys: eselde02u33, eselde02u34
> > > >   osd: 111 osds: 111 up, 111 in; 800 remapped pgs
> > > >        flags noout
> > > >
> > > > data:
> > > >   pools:   18 pools, 4104 pgs
> > > >   objects: 3620k objects, 13875 GB
> > > >   usage:   42254 GB used, 160 TB / 201 TB avail
> > > >   pgs:     1.876% pgs unknown
> > > >            34.259% pgs not active
> > > >            9472/11120724 objects degraded (0.085%)
> > > >            1181567/11120724 objects misplaced (10.625%)
> > > >            2062 active+clean
> > > >            1221 peering
> > > >            535  active+remapped+backfill_wait
> > > >            181  remapped+peering
> > > >            77   unknown
> > > >            13   active+remapped+backfilling
> > > >            7    active+undersized+degraded+remapped+backfill_wait
> > > >            4    remapped
> > > >            3    active+recovery_wait+degraded+remapped
> > > >            1    active+degraded+remapped+backfilling
> > > >
> > > > io:
> > > >   recovery: 298 MB/s, 77 objects/s
> > > >
> > > > _______________________________________________
> > > > ceph-users mailing list
> > > > ceph-users@lists.ceph.com
> > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
> > > --
> > > Paul Emmerich
> > >
> > > Looking for help with your Ceph cluster? Contact us at https://croit.io
> > >
> > > croit GmbH
> > > Freseniusstr. 31h
> > > 81247 München
> > > www.croit.io
> > > Tel: +49 89 1896585 90
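As a general checklist for mixed-version situations like the one in this thread, it helps to confirm what every daemon is actually running before and after setting any required-feature flag. A sketch (note that "ceph versions" only exists from Luminous on, and the mon id below assumes it matches the short hostname):

```shell
# Ask every OSD which version it is running (can be slow on large clusters)
ceph tell osd.* version

# Ask a mon via its local admin socket, on the mon host itself
ceph daemon mon.$(hostname -s) version

# On Luminous and later mons, one summary for all daemons
ceph versions

# Confirm which feature flags the OSD map currently requires
ceph osd dump | grep -i require
```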