Re: Replacing a failed disk/OSD: unfound object

2011-07-13 Thread Meng Zhao
On Wed, 13 Jul 2011 09:19:40 -0700, Tommi Virtanen wrote: > On Wed, Jul 13, 2011 at 03:15, Meng Zhao wrote: >> active+clean; 349 MB data, 1394 MB used, 408 MB / 2046 MB avail; 49/224 >> degraded (21.875%) >> =>for some reason osd2 failed during object replication > If you lose osds while in degraded mode

Re: Replacing a failed disk/OSD: unfound object

2011-07-13 Thread Tommi Virtanen
On Wed, Jul 13, 2011 at 03:15, Meng Zhao wrote: > active+clean; 349 MB data, 1394 MB used, 408 MB / 2046 MB avail; 49/224 > degraded (21.875%) > =>for some reason osd2 failed during object replication If you lose osds while in degraded mode, you very much can lose objects permanently. Degraded me

Re: Replacing a failed disk/OSD: unfound object

2011-07-13 Thread Meng Zhao
The test setup is like this: I built a ceph3.0 system with 3 mon, 3 mds, 3 osd on 3 machines. Then I copied some files around on a ceph client and waited until ceph -w showed regular healthy states. Now I turn one machine off. Here are the logs for this situation of unfound object: please look for

Re: Replacing a failed disk/OSD: unfound object

2011-07-12 Thread Sage Weil
On Tue, 12 Jul 2011, Meng Zhao wrote: > Thanks Tommi. I rebuilt the ceph cluster a few times just to reproduce the > situation. The result seems mixed, more likely btrfs failed (after power > reset). But it does happen anyway. > > The big question is: However rare, unfound object situation makes

Re: Replacing a failed disk/OSD: unfound object

2011-07-12 Thread Meng Zhao
Thanks Tommi. I rebuilt the ceph cluster a few times just to reproduce the situation. The result seems mixed, more likely btrfs failed (after power reset). But it does happen anyway. The big question is: However rare, unfound object situation makes the *entire* ceph file system not mountable,

Re: Replacing a failed disk/OSD: unfound object

2011-07-08 Thread Tommi Virtanen
[It seems I dropped the Cc: to ceph-devel, added it back.. Please reply to this message instead, and sorry about that. I'm starting to dislike Google Apps for mailing list traffic :( ] On Fri, Jul 8, 2011 at 10:07, Tommi Virtanen wrote: > On Fri, Jul 8, 2011 at 01:23, Meng Zhao wrote: >> I was t

Replacing a failed disk/OSD: unfound object

2011-07-08 Thread Meng Zhao
Hi, I was trying to replace a disk for an osd by following the instructions at: http://ceph.newdream.net/wiki/Replacing_a_failed_disk/OSD Now ceph -w shows: 2011-07-08 15:52:39.702881 pg v1602: 602 pgs: 49 active+degraded, 553 active+clean+degraded; 349 MB data, 333 MB used, 566 MB / 1023 MB
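
For readers following the status line quoted in the original post, here is a minimal sketch of how the per-state PG counts could be pulled out and checked against the total. It assumes the summary keeps the "N pgs: count state, count state; ..." shape seen in that snippet. The function name parse_pg_summary is hypothetical and is not part of any Ceph tool.

```python
import re

def parse_pg_summary(line):
    """Return (total_pgs, {state: count}) from a pg summary line.

    Assumes the layout "<total> pgs: <count> <state>, <count> <state>; ..."
    as seen in the quoted 'ceph -w' output; other layouts are not handled.
    """
    m = re.search(r"(\d+) pgs: ([^;]+)", line)
    if m is None:
        raise ValueError("no pg summary found")
    total = int(m.group(1))
    states = {}
    for part in m.group(2).split(","):
        count, state = part.split()
        states[state] = int(count)
    return total, states

# Sample taken from the status line in the original post.
line = ("pg v1602: 602 pgs: 49 active+degraded, 553 active+clean+degraded; "
        "349 MB data, 333 MB used, 566 MB / 1023 MB")
total, states = parse_pg_summary(line)

# Count every PG whose state string includes "degraded".
degraded = sum(n for s, n in states.items() if "degraded" in s)
print(total, states, degraded)
```

In this sample the two state counts add up to the reported total, and every PG carries the degraded flag, which matches a cluster running with one OSD missing.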