Excuse the top-posting.
When looking at the logs, it helps to filter by the actual thread that crashed.
$ grep 7f08af3b6700 ceph-osd.27.log.last.error.txt | tail -15
-1001> 2019-10-30 12:55:41.498823 7f08af3b6700  1 -- 129.20.199.93:6803/977508 --> 129.20.199.7:0/2975967502 --
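(For reference, a rough sketch of that filtering; the file name and thread id
are the ones already mentioned in this thread, adjust them to your own log:)

# the "in thread ..." line that usually follows "*** Caught signal" gives
# the id of the thread that crashed
$ grep -A1 'Caught signal' ceph-osd.27.log.last.error.txt
# then keep only that thread's events and look at the last ones before the crash
$ grep 7f08af3b6700 ceph-osd.27.log.last.error.txt | tail -50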
The "best" health i was able to get was :
HEALTH_ERR norecover flag(s) set; 1733/37482459 objects misplaced (0.005%);
5 scrub errors; Possible data damage: 2 pgs inconsistent;
Degraded data redundancy: 7461/37482459 objects degraded (0.020%),
24 pgs degraded, 2 pgs undersized
OSDMAP_FLAGS
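(Side note: a minimal sketch of how to see what is behind those summary lines,
using only standard Ceph CLI commands:)

$ ceph osd dump | grep flags   # confirms which osdmap flags (norecover, ...) are set
$ ceph health detail           # lists the individual degraded/undersized/inconsistent pgs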
Thus spake Brad Hubbard (bhubb...@redhat.com) on Wednesday, 30 October 2019 at 12:50:50:
> Maybe you should set nodown and noout while you do these maneuvers?
> That will minimise peering and recovery (data movement).
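(A minimal sketch of that suggestion, nothing cluster-specific assumed:)

$ ceph osd set nodown   # flapping OSDs are not marked down during the work
$ ceph osd set noout    # OSDs are not marked out, so no rebalancing starts
# ... restart / re-provision the OSDs ...
$ ceph osd unset nodown
$ ceph osd unset noout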
As the commands don't take too long, I just had a few slow requests before the
On Tue, Oct 29, 2019 at 9:09 PM Jérémy Gardais wrote:
>
> Thus spake Brad Hubbard (bhubb...@redhat.com) on Tuesday, 29 October 2019 at 08:20:31:
> > Yes, try and get the pgs healthy, then you can just re-provision the down
> > OSDs.
> >
> > Run a scrub on each of these pgs and then use the
Thus spake Brad Hubbard (bhubb...@redhat.com) on Tuesday, 29 October 2019 at 08:20:31:
> Yes, try and get the pgs healthy, then you can just re-provision the down
> OSDs.
>
> Run a scrub on each of these pgs and then use the commands on the
> following page to find out more information for each
Yes, try and get the pgs healthy, then you can just re-provision the down OSDs.
Run a scrub on each of these pgs and then use the commands on the
following page to find out more information for each case.
https://docs.ceph.com/docs/luminous/rados/troubleshooting/troubleshooting-pg/
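(A minimal sketch of those steps; <pgid> is a placeholder for each pg reported
inconsistent, and any repair should only be run once the output is understood:)

$ ceph pg deep-scrub <pgid>                                # (re)run a deep scrub on the pg
$ rados list-inconsistent-obj <pgid> --format=json-pretty  # show which objects/shards disagree
$ ceph pg <pgid> query                                     # peering and acting-set details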
Focus on the
Hello,
For several weeks, I have had some OSDs flapping before being marked out of the
cluster by Ceph…
I was hoping for some of Ceph's magic and just gave it some time to auto-heal
(and be able to do all the side work…), but it was a bad idea (what a
surprise :D). I also got some inconsistent PGs, but I was