BTW -- I've created https://tracker.ceph.com/issues/55169 to ask that
we add some input validation. Injecting such a crush map would ideally
not be possible.
-- dan
On Mon, Apr 4, 2022 at 11:02 AM Dan van der Ster wrote:
Excellent news!
After everything is back to active+clean, don't forget to set min_size to 4 :)
have a nice day
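Restoring min_size once everything is active+clean is a one-liner. A sketch -- the pool name below is a placeholder, since the thread doesn't name the EC pool:

```shell
# Restore min_size on the EC pool after recovery completes.
# "<ec-pool>" is a placeholder -- substitute the real pool name.
ceph osd pool set <ec-pool> min_size 4

# Verify the new value:
ceph osd pool get <ec-pool> min_size
```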
On Mon, Apr 4, 2022 at 10:59 AM Fulvio Galeazzi wrote:
Yesss! Fixing the choose/chooseleaf thing did make the magic. :-)
Thanks a lot for your support Dan. Lots of lessons learned from my
side, I'm really grateful.
All PGs are now active, will let Ceph rebalance.
Ciao ciao
Fulvio
On 4/4/22 10:50, Dan van der Ster wrote:
Could you share the output of `ceph pg 85.25 query`.
Then increase the crush weights of those three osds to 0.1, then check
if the PG goes active.
(It is possible that the OSDs are not registering as active while they
have weight zero).
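As a rough sketch of those steps (the OSD ids are placeholders for the three zero-weight OSDs in the PG's acting set):

```shell
# Query the stuck PG first:
ceph pg 85.25 query

# Give the three zero-weight OSDs a small CRUSH weight so they can
# register as active (osd.<a>, <b>, <c> are placeholder ids):
ceph osd crush reweight osd.<a> 0.1
ceph osd crush reweight osd.<b> 0.1
ceph osd crush reweight osd.<c> 0.1

# Then re-check whether the PG goes active:
ceph pg 85.25 query | grep '"state"'
</imports>
```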
-- dan
On Mon, Apr 4, 2022 at 10:01 AM Fulvio Galeazzi wrote:
Hi Fulvio,
Yes -- that choose/chooseleaf thing is definitely a problem.. Good catch!
I suggest fixing it, injecting the new crush map, and seeing how it goes.
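A possible offline workflow for that fix, using the standard crushtool round-trip (the exact rule line to change depends on your map; the choose/chooseleaf edit below is illustrative):

```shell
# Decompile the current crush map, edit the rule, recompile, test, inject.
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt

# In crush.txt, fix the offending rule step, e.g. change
#   step choose indep 0 type host
# to
#   step chooseleaf indep 0 type host

crushtool -c crush.txt -o crush.new
crushtool -i crush.new --test --show-bad-mappings   # sanity-check mappings first
ceph osd setcrushmap -i crush.new
```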
Next, in your crush map for the storage type, you have an error:
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5
Hi again Dan!
Things are improving, all OSDs are up, but still that one PG is down.
More info below.
On 4/1/22 19:26, Dan van der Ster wrote:
Here is the output of "pg 85.12 query":
https://pastebin.ubuntu.com/p/ww3JdwDXVd/
and its status (also showing the other 85.XX, for reference):
We're on the right track!
On Fri, Apr 1, 2022 at 6:57 PM Fulvio Galeazzi wrote:
Ciao Dan, thanks for your messages!
On 4/1/22 11:25, Dan van der Ster wrote:
The PGs are stale, down, inactive *because* the OSDs don't start.
Your main efforts should be to bring OSDs up, without purging or
zapping or anything like that.
(Currently your cluster is down, but there are hopes to recover. If
you start purging things, that can result in permanent data loss.)
Don't purge anything!
On Fri, Apr 1, 2022 at 9:38 AM Fulvio Galeazzi wrote:
Ciao Dan,
thanks for your time!
So you are suggesting that my problems with PG 85.25 may somehow resolve
if I manage to bring up the three OSDs currently "down" (possibly due to
PG 85.12, and other PGs)?
Looking for the string 'start interval does not contain the required
bound' I found
Hi Fulvio,
I'm not sure why that PG doesn't register.
But let's look into your log. The relevant lines are:
-635> 2022-03-30 14:49:57.810 7ff904970700 -1 log_channel(cluster)
log [ERR] : 85.12s0 past_intervals [616435,616454) start interval does
not contain the required bound [605868,616454) st
Ciao Dan,
this is what I did with chunk s3, copying it from osd.121 to
osd.176 (which is managed by the same host).
But still
pg 85.25 is stuck stale for 85029.707069, current state
stale+down+remapped, last acting
[2147483647,2147483647,96,2147483647,2147483647]
So "health detail" appa
Hi Fulvio,
I don't think upmap will help -- that is used to remap where data
should be "up", but your problem is more that the PG chunks are not
going active due to the bug.
What happens if you export one of the PG chunks then import it to
another OSD -- does that chunk become active?
-- dan
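For reference, an export/import round-trip with ceph-objectstore-tool looks roughly like this. The OSD ids, shard name, and data paths follow the ones mentioned in this thread; both OSDs must be stopped before touching their object stores:

```shell
# Stop both OSDs first -- the tool requires offline object stores.
systemctl stop ceph-osd@121 ceph-osd@176

# Export the PG shard from the source OSD...
ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-121 \
  --pgid 85.25s3 --op export --file /tmp/85.25s3.export

# ...and import it on the destination OSD.
ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-176 \
  --op import --file /tmp/85.25s3.export

systemctl start ceph-osd@121 ceph-osd@176
```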
Hallo again Dan, I am afraid I'd need a little more help, please...
Current status is as follows.
This is where I moved the chunk which was on osd.121:
~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-176
--no-mon-config --op list-pgs | grep ^85\.25
85.25s3
while other chunks
Thanks a lot, Dan!
> The EC pgs have a naming convention like 85.25s1 etc.. for the various
> k/m EC shards.
That was the bit of information I was missing... I was looking for the
wrong object.
I can now go on and export/import that one PGid chunk.
Thanks again!
Fulvio
Hi Fulvio,
You can check (offline) which PGs are on an OSD with the list-pgs op, e.g.
ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-158/ --op list-pgs
The EC pgs have a naming convention like 85.25s1 etc.. for the various
k/m EC shards.
-- dan
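For example, to list just the shards of one EC PG on a stopped OSD (data path as used earlier in this thread):

```shell
ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-158/ \
  --no-mon-config --op list-pgs | grep '^85\.25'
# EC shards appear as 85.25s0, 85.25s1, ... one per k/m shard.
```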
On Mon, Mar 28, 2022 at 2:29 PM Fulvio Galeazzi wrote: