Yay! After almost exactly one month, I am finally able to mount the drive! Now it's time to see how my data is doing. =P It doesn't look too bad, though.
Got to love open source. =) I downloaded the Ceph source code, built it, and tried to run the ceph-objectstore-tool export on that osd.4, then started debugging it. Obviously I have no idea what everything does... but I was able to trace my way to the error message. The corruption appears to be in the mount region. When it tries to decode a buffer, most buffers showed very regular values (going by the printfs I put in), but a few of them had huge numbers. Oh, and that "1" that didn't make sense earlier came from the corruption: the struct_v portion of the data had been changed to the ASCII value of 1, which happily printed as 1. =P Since it was in the mount portion... and hoping it doesn't impact the data much... I went ahead and allowed those corrupted values. I was able to export osd.4 with the journal!
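For reference, the export step with ceph-objectstore-tool looks roughly like this on Jewel (a sketch, not my exact command line; the OSD has to be stopped first, and the PG id and file name here are just examples):

systemctl stop ceph-osd@4
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 \
  --journal-path /var/lib/ceph/osd/ceph-4/journal \
  --pgid 1.28 --op export --file /mnt/backup/pg1.28.export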
Then I imported that PG... but the OSDs wouldn't take it, as the cluster had decided to create an empty PG 1.28 and mark it active. So, just as the "Incomplete PGs Oh My!" page suggested, I pulled those OSDs down, removed those empty heads, and started them back up. At that point, no more incomplete PGs!
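In case it helps anyone, the import and the empty-head cleanup from that page go roughly like this (again a sketch; OSD numbers, paths, and the export file name are examples, and the target OSDs must be stopped first):

# import the exported PG into one (stopped) OSD
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-6 \
  --journal-path /var/lib/ceph/osd/ceph-6/journal \
  --op import --file /mnt/backup/pg1.28.export

# remove the empty 1.28 head the cluster created on the other OSDs
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 \
  --journal-path /var/lib/ceph/osd/ceph-2/journal \
  --pgid 1.28 --op remove

systemctl start ceph-osd@2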
Then I worked on the inconsistent data. It looks like this issue is somewhat new in the 10.2.x releases. I was able to clear it with rados get and put plus a deep-scrub: https://www.spinics.net/lists/ceph-users/msg39063.html
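The get/put trick from that thread essentially rewrites the object so the stored digest gets recalculated; roughly like this (the pool and PG are from the list further down, the object name is a placeholder, not the exact one I touched):

rados -p metadata get <object-name> /tmp/obj    # read the object out
rados -p metadata put <object-name> /tmp/obj    # write it back so the digest is refreshed
ceph pg deep-scrub 1.d                          # re-scrub the PG to clear the inconsistent flag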

At this point, everything was active+clean, but the MDS wasn't happy. It seems to suggest the journal is broken:
HEALTH_ERR mds rank 0 is damaged; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set

Found this and did everything down to "cephfs-table-tool all reset session": http://docs.ceph.com/docs/jewel/cephfs/disaster-recovery/
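For anyone following along, "everything down to cephfs-table-tool all reset session" on that page is roughly this sequence (copying it as I remember it from the doc; back up the journal first and read the warnings there before running any of it):

cephfs-journal-tool journal export backup.bin        # save a copy of the damaged journal
cephfs-journal-tool event recover_dentries summary   # salvage what can be salvaged from it
cephfs-journal-tool journal reset                    # throw away the damaged journal
cephfs-table-tool all reset session                  # drop stale client session records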

Restarted the MDS. Now the only complaint left is:
HEALTH_WARN no legacy OSD present but 'sortbitwise' flag is not set
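If I understand that warning right, it is just the flag itself; since the message says there are no legacy OSDs left, I believe it can be cleared with:

ceph osd set sortbitwise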
Mounted! Thank you everyone for the help! Learned a lot!
Regards,
Hong
 

    On Friday, September 22, 2017 1:01 AM, hjcho616 <hjcho...@yahoo.com> wrote:
 

 Ronny,
Could you help me with this log? I captured it with debug osd=20, filestore=20, ms=20 while running "ceph pg repair 2.7". This is one of the smaller PGs, so the log is smaller; the others have similar errors. I can see the lines with ERR, but other than that, is there something I should be paying attention to?
https://drive.google.com/file/d/0By7YztAJNGUWNkpCV090dHBmOWc/view?usp=sharing
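(For reference, debug levels like that can be raised on a running OSD with injectargs; something like this for the two OSDs that hold the PG, osd.2 and osd.7 per the log below:)

ceph tell osd.2 injectargs '--debug_osd 20 --debug_filestore 20 --debug_ms 20'
ceph tell osd.7 injectargs '--debug_osd 20 --debug_filestore 20 --debug_ms 20'
ceph pg repair 2.7
grep ERR /var/log/ceph/ceph-osd.2.log    # the scrub/repair errors show up as [ERR] lines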
The error messages look like this:
2017-09-21 23:53:31.545510 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 shard 2: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.0000000000bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.0000000000bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od ffffffff alloc_hint [0 0])
2017-09-21 23:53:31.545520 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 shard 7: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.0000000000bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.0000000000bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od ffffffff alloc_hint [0 0])
2017-09-21 23:53:31.545531 7f51682df700 -1 log_channel(cluster) log [ERR] : 2.7 soid 2:e17dbaf6:::rb.0.145d.2ae8944a.0000000000bb:head: failed to pick suitable auth object
I did try moving that object to a different location, as suggested by this page: http://ceph.com/geen-categorie/ceph-manually-repair-object/

This is what I ran:
systemctl stop ceph-osd@7
ceph-osd -i 7 --flush-journal
cd /var/lib/ceph/osd/ceph-7
cd current/2.7_head/
mv rb.0.145d.2ae8944a.0000000000bb__head_6F5DBE87__2 ~/
ceph osd tree
systemctl start ceph-osd@7
ceph pg repair 2.7
Then I just get this:
2017-09-22 00:41:06.495399 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 shard 2: soid 2:e17dbaf6:::rb.0.145d.2ae8944a.0000000000bb:head data_digest 0x62b74a1f != data_digest 0x43d61c5d from auth oi 2:e17dbaf6:::rb.0.145d.2ae8944a.0000000000bb:head(12962'694 osd.2.0:90545 dirty|data_digest|omap_digest s 4194304 uv 484 dd 43d61c5d od ffffffff alloc_hint [0 0])
2017-09-22 00:41:06.495417 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 shard 7 missing 2:e17dbaf6:::rb.0.145d.2ae8944a.0000000000bb:head
2017-09-22 00:41:06.495424 7f22ac3bd700 -1 log_channel(cluster) log [ERR] : 2.7 soid 2:e17dbaf6:::rb.0.145d.2ae8944a.0000000000bb:head: failed to pick suitable auth object
Moving the copy on osd.2 results in a similar error message; it just reports the missing shard on the top line instead. =P

I was hoping this would give me a different result, since I let one more OSD get a copy from OSD1 by taking osd.7 down with noout set. But it doesn't appear to care about that extra copy. Maybe that only helps when size is 3? Basically, since I still had most of the OSDs on OSD1 alive, I was trying to favor the data from OSD1. =P
What can I do in this case? According to http://ceph.com/geen-categorie/incomplete-pgs-oh-my/, inconsistent data can be expected with --skip-journal-replay, and I had to use it since the export crashed without it. =P But it doesn't say much about what to do in that case:
"If all went well, then your cluster is now back to 100% active+clean / HEALTH_OK state. Note that you may still have inconsistent or stale data stored inside the PG. This is because the state of the data on the OSD that failed is a bit unknown, especially if you had to use the '--skip-journal-replay' option on the export. For RBD data, the client which utilizes the RBD should run a filesystem check against the RBD."
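If I read that last part right, for an RBD image that holds a filesystem directly it would be something like this on the client (pool/image names are made up, and the image must not be in use while fsck runs):

rbd map rbd/my-image     # map the image on the client
fsck -f /dev/rbd0        # check the filesystem stored on it
rbd unmap /dev/rbd0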

Regards,
Hong

    On Thursday, September 21, 2017 1:46 AM, Ronny Aasen 
<ronny+ceph-us...@aasen.cx> wrote:
 

 On 21. sep. 2017 00:35, hjcho616 wrote:
> # rados list-inconsistent-pg data
> ["0.0","0.5","0.a","0.e","0.1c","0.29","0.2c"]
> # rados list-inconsistent-pg metadata
> ["1.d","1.3d"]
> # rados list-inconsistent-pg rbd
> ["2.7"]
> # rados list-inconsistent-obj 0.0 --format=json-pretty
> {
>      "epoch": 23112,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.5 --format=json-pretty
> {
>      "epoch": 23078,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.a --format=json-pretty
> {
>      "epoch": 22954,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.e --format=json-pretty
> {
>      "epoch": 23068,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.1c --format=json-pretty
> {
>      "epoch": 22954,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.29 --format=json-pretty
> {
>      "epoch": 22974,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 0.2c --format=json-pretty
> {
>      "epoch": 23194,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 1.d --format=json-pretty
> {
>      "epoch": 23072,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 1.3d --format=json-pretty
> {
>      "epoch": 23221,
>      "inconsistents": []
> }
> # rados list-inconsistent-obj 2.7 --format=json-pretty
> {
>      "epoch": 23032,
>      "inconsistents": []
> }
> 
> Looks like there is not much information there.  Could you elaborate on the
> items you mentioned under "find the object"?  How do I check the metadata?
> What are we looking for with md5sum?
> 
> - find the object  :: manually check the objects, check the object 
> metadata, run md5sum on them all and compare. check objects on the 
> nonrunning osd's and compare there as well. anything to try to determine 
> what object is ok and what is bad.
> 
> I tried the methods from "Ceph: manually repair object"
> <http://ceph.com/geen-categorie/ceph-manually-repair-object/> on
> PG 2.7 before.  In the 3-replica case it would result in a missing shard,
> regardless of which copy I moved.  In the 2-replica case, hmm... I
> guess I don't know how long "wait a bit" is; I just turned it back on
> after a minute or so, and it just returns to the same inconsistent message..
> =P  Are we waiting for the entire stopped OSD to get remapped to a different
> OSD, so there are 3 replicas again, before starting the stopped OSD?
> 
> Regards,
> Hong


Since your list-inconsistent-obj output is empty, you need to turn up 
debugging on all OSDs and grep the logs to find the objects with issues; this 
is explained in the link.  "ceph pg map [pg]" tells you which OSDs to look at, 
and the log will have hints about the reason for the error.  Keep in mind 
that it can have been a while since the scrub errored out, so you may need 
to look at older logs, or trigger a scrub and wait for it to finish so 
you can check the current log.

Once you have the object names, you can locate the files on disk with the find command.

After removing/fixing the broken object and restarting the OSD, you issue 
the repair and wait for the repair and scrub of that PG to finish.  You 
can probably follow along by tailing the log.
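To make that concrete, a rough sketch using the PG and object name from the 
earlier mail (adjust the ids, and run the find/md5sum on every host that 
still has a copy, including the stopped OSDs):

ceph pg map 2.7     # lists the OSDs that hold the PG
# on each OSD host that has a copy of the PG:
find /var/lib/ceph/osd/ceph-2/current/2.7_head -name '*rb.0.145d.2ae8944a.0000000000bb*'
md5sum $(find /var/lib/ceph/osd/ceph-2/current/2.7_head -name '*rb.0.145d.2ae8944a.0000000000bb*')
# compare sizes and checksums between the copies to decide which object is good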

good luck


   

   