On 19. okt. 2016 13:00, Ronny Aasen wrote:
On 06. okt. 2016 13:41, Ronny Aasen wrote:
hello

I have a few OSDs in my cluster that are regularly crashing.

[snip]


Of course, having 3 OSDs dying regularly is not good for my health, so I
have set noout to avoid heavy recoveries.

Googling this error message gives exactly 1 hit:
https://github.com/ceph/ceph/pull/6946

where it says: "the shard must be removed so it can be reconstructed".
But with my 3 OSDs failing, I am not certain which of them contains the
broken shard (or perhaps all 3 of them?).

I am a bit reluctant to delete on all 3. I have 4+2 erasure coding
(erasure size 6, min_size 4), so finding out which one is bad would be
nice.

I hope someone has an idea how to proceed.

kind regards
Ronny Aasen

I again have this problem with crashing OSDs. A more detailed log is at
the tail of this mail.

Does anyone have any suggestions on how I can identify which shard
needs to be removed to allow the EC pool to recover?

And more importantly, how can I stop the OSDs from crashing?


kind regards
Ronny Aasen


Answering my own question for googleability.

Using this one-liner:

for dir in $(find /var/lib/ceph/osd/ceph-* -maxdepth 2 -type d -name '5.26*' | sort | uniq) ; do find "$dir" -name '*3a3938238e1f29.00000000002d80ca*' -type f -ls ; done

I got a list of all shards of the problematic object.
One of the shards had size 0 but was otherwise readable without any I/O errors. I guess this explains the inconsistent size, but it does not explain why Ceph decides it is better to crash 3 OSDs rather than move a 0-byte file into a "LOST+FOUND"-style directory structure, or just delete it, since it will not have any useful data anyway.
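
For reference, this is roughly the kind of check I mean: the same loop as above, but printing each shard's size and doing a full read to catch I/O errors. Adjust the pg name ('5.26') and the object pattern to your own case.

for dir in $(find /var/lib/ceph/osd/ceph-* -maxdepth 2 -type d -name '5.26*' | sort | uniq) ; do
    find "$dir" -name '*3a3938238e1f29.00000000002d80ca*' -type f | while read -r f ; do
        # shard size; the suspect shard showed up as 0 bytes
        stat -c '%s bytes  %n' "$f"
        # full read of the shard; a failing disk would throw an I/O error here
        dd if="$f" of=/dev/null bs=4M 2>&1 | tail -n 1
    done
done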

Deleting this file (actually an mv to /tmp) allowed the 3 broken OSDs to start, and they have been running for >24h now, while usually they would crash within 10 minutes. Yay!
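
A minimal sketch of that step, assuming systemd-managed OSDs; the osd id is only an example value, and the pg/object pattern must match your own case:

OSD=11        # example value: the osd that holds the zero-size shard
systemctl stop ceph-osd@$OSD
# move the broken shard out of the pg directory instead of deleting it outright
for dir in $(find /var/lib/ceph/osd/ceph-$OSD -maxdepth 2 -type d -name '5.26*') ; do
    find "$dir" -name '*3a3938238e1f29.00000000002d80ca*' -type f -size 0 -exec mv -v {} /tmp/ \;
done
systemctl start ceph-osd@$OSD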

Generally you need to check _all_ shards of the given pg, not just the 3 crashing OSDs. This is what confused me, since I only focused on the crashing OSDs.

I used the one-liner to check every OSD for the pg, since due to backfilling the pg was spread all over the place, and I could run it from ansible to reduce the tedious work (see the sketch below).
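
For example as an ad-hoc run; the inventory group name "osds" is just a placeholder for whatever group holds your OSD hosts, and the command is the same one-liner with the $'s escaped so the remote shell expands them:

ansible osds -m shell -a "for dir in \$(find /var/lib/ceph/osd/ceph-* -maxdepth 2 -type d -name '5.26*' | sort | uniq) ; do find \$dir -name '*3a3938238e1f29.00000000002d80ca*' -type f -ls ; done"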

Also, it would be convenient to be able to manually mark a broken/inconsistent pg "inactive", instead of having it crash 3 OSDs and take lots of other pgs down with them. One could set the pg inactive while troubleshooting and unset it when done, without OSDs crashing and all the high-load rebalancing that follows.

Also, I ran a find for 0-size files on that pg and there are multiple other such files. Is a 0-byte rbd_data file in a pg a normal occurrence, or can I have more similar problems in the future due to the other 0-size files?
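
That check was roughly the same loop as before, just dropping the object name and adding -size 0, so it lists every zero-byte file in any local copy of pg 5.26:

for dir in $(find /var/lib/ceph/osd/ceph-* -maxdepth 2 -type d -name '5.26*' | sort | uniq) ; do find "$dir" -type f -size 0 -ls ; done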


kind regards
Ronny Aasen


