No, you wouldn’t need to re-replicate the whole disk for a single bad sector.  
If the damaged object is on the primary, remove that object’s file manually 
from the OSD’s filesystem, then run a repair on the PG that holds the object.  
The repair copies the object back from one of the replicas.
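
As a rough sketch of the CLI side of that procedure (not an official tool), 
the Python below only builds the two ceph commands involved: `ceph osd map` to 
find the PG that holds an object, and `ceph pg repair` to repair that PG.  The 
helper names, the pool/object names and the PG id are illustrative 
assumptions.  Removing the object file from the OSD's filesystem stays a 
manual step, because the on-disk path depends on the deployment.

```python
import shlex
import subprocess


def locate_object_cmd(pool, obj):
    # `ceph osd map <pool> <object>` reports the PG and the acting OSD set.
    return ["ceph", "osd", "map", pool, obj]


def repair_pg_cmd(pgid):
    # `ceph pg repair <pgid>` asks the cluster to repair that placement group.
    return ["ceph", "pg", "repair", pgid]


def render(cmd):
    # Shell-quoted form of the command, for printing or logging.
    return " ".join(shlex.quote(part) for part in cmd)


def run(cmd, dry_run=True):
    # Dry-run by default: return the command line without executing it.
    line = render(cmd)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return line


if __name__ == "__main__":
    # Hypothetical pool, object and PG id, for illustration only.
    print(run(locate_object_cmd("rbd", "some-object")))
    # (manual step: remove the bad object file from the primary OSD)
    print(run(repair_pg_cmd("2.5")))
```

Pass `dry_run=False` only on a host with a working `ceph` client and after 
checking the printed commands.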

David

On Nov 17, 2013, at 10:46 PM, Chris Dunlop <ch...@onthe.net.au> wrote:

> Hi David,
> 
> On Fri, Nov 15, 2013 at 10:00:37AM -0800, David Zafman wrote:
>> 
>> Replication does not occur until the OSD is “out.”  This creates a new 
>> mapping in the cluster of where the PGs should be and thus data begins to 
>> move and/or create sufficient copies.  This scheme lets you control how and 
>> when you want the replication to occur.  If you have plenty of space and you 
>> aren’t going to replace the drive immediately, just mark the OSD “down” AND 
>> “out.”  If you are going to replace the drive immediately, set the “noout” 
>> flag.  Take the OSD “down” and replace the drive.  Assuming it is mounted in 
>> the same place as the bad drive, bring the OSD back up.  This will replicate 
>> exactly the same PGs the bad drive held back to the replacement drive.  As 
>> was stated before, don’t forget to “ceph osd unset noout”.
>> 
>> Keep in mind that when a machine has a hardware failure that takes OSD(s) 
>> down, there is an automatic timeout which will mark them “out” for 
>> unattended operation.  Unless you are monitoring the cluster 24/7, you 
>> should have enough disk space available to handle failures.
>> 
>> Related info in:
>> 
>> http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
>> 
>> David Zafman
>> Senior Developer
>> http://www.inktank.com
> 
> 
> Are you saying, if a disk suffers from a bad sector in an object
> for which it's primary, and for which good data exists on other
> replica PGs, there's no way for ceph to recover other than by
> (re-)replicating the whole disk?
> 
> I.e., even if the disk is able to remap the bad sector using a
> spare, so the disk is ok (albeit missing a sector's worth of
> object data), the only way to recover is to basically blow away
> all the data on that disk and start again, replicating
> everything back to the disk (or to other disks)?
> 
> Cheers,
> 
> Chris.
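
The two drive-replacement paths in the quoted message above can be sketched 
as a small planner.  This is only an illustration under stated assumptions: 
the function name and the OSD id are hypothetical, and the step that stops 
the daemon and swaps the disk is left as a comment line because it depends 
on the init system and hardware.  The ceph commands it lists (`ceph osd set 
noout`, `ceph osd unset noout`, `ceph osd down`, `ceph osd out`) are the 
standard CLI forms.

```python
def replacement_plan(osd_id, immediate):
    """Return the ordered steps for handling a failed drive behind one OSD.

    immediate=True  -> drive is swapped right away; suppress re-replication.
    immediate=False -> drive stays out; let the cluster re-replicate its PGs.
    """
    if immediate:
        return [
            # Prevent the OSD from being marked "out" while it is down.
            "ceph osd set noout",
            # Manual, deployment-specific step (not a ceph command).
            "# stop osd.%d, replace the drive, mount it in the same place, "
            "start osd.%d" % (osd_id, osd_id),
            # Restore normal behaviour once the OSD is back up.
            "ceph osd unset noout",
        ]
    # No immediate replacement: mark the OSD down and out so data moves.
    return [
        "ceph osd down %d" % osd_id,
        "ceph osd out %d" % osd_id,
    ]


if __name__ == "__main__":
    # Hypothetical OSD id, for illustration only.
    for step in replacement_plan(3, immediate=True):
        print(step)
```

The plan is returned as plain strings so it can be reviewed before anything 
is run by hand.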






_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
