Hi,

>> My thoughts on the subject are that even though checksums do allow finding 
>> which replica is corrupt without having to figure out which 2 out of 3 copies 
>> are the same, this is not the only reason min_size=2 was required.

AFAIK, comparing copies (i.e. checking that 2 out of 3 copies agree) has never 
been implemented.
pg repair, for example, still copies the primary's data to the replicas (even if 
the primary is the one that is corrupt).


An old thread about this:
http://ceph-users.ceph.narkive.com/zS2yZ2FL/how-safe-is-ceph-pg-repair-these-days
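
To make the distinction concrete, here is a toy sketch in Python (this is not 
Ceph code; the byte-string "objects" and both functions are invented purely for 
illustration) contrasting the two strategies: blindly copying the primary versus 
a 2-out-of-3 majority vote:

    # Toy illustration only -- NOT Ceph code. It just contrasts the two repair
    # strategies discussed above: copying the primary's data over the replicas
    # versus keeping whatever the majority of copies agree on.
    from collections import Counter

    def repair_copy_primary(copies):
        """Overwrite every replica with the primary's copy (index 0),
        even if the primary itself is the corrupt one."""
        primary = copies[0]
        return [primary for _ in copies]

    def repair_majority_vote(copies):
        """Keep the value that most replicas agree on (2 out of 3, etc.).
        This 'compare copies' approach is the one that has not been implemented."""
        winner, _count = Counter(copies).most_common(1)[0]
        return [winner for _ in copies]

    # Primary (index 0) holds corrupt data; the two replicas agree on good data.
    copies = [b"CORRUPT", b"good", b"good"]
    print(repair_copy_primary(copies))   # corruption is propagated to all copies
    print(repair_majority_vote(copies))  # the good data wins 2-to-1

With the primary corrupt, the first strategy spreads the corruption, while the 
vote would have recovered the good data.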

----- Original Message -----
From: "Janne Johansson" <icepic...@gmail.com>
To: c...@jack.fr.eu.org
Cc: "ceph-users" <ceph-users@lists.ceph.com>
Sent: Thursday, 24 May 2018 08:33:32
Subject: Re: [ceph-users] Ceph replication factor of 2

On Thu, 24 May 2018 at 00:20, Jack <c...@jack.fr.eu.org> wrote: 


> Hi, 
> 
> I have to say, this is a common yet worthless argument. 
> If I have 3000 OSDs, using 2 or 3 replicas will not change much: the 
> probability of losing 2 devices is still "high". 
> On the other hand, if I have a small cluster, less than a hundred OSDs, 
> that same probability becomes "low". 



It's about losing the 2 or 3 OSDs that any particular PG is on that matters, 
not whether there are 1000 other OSDs in the next rack. 
Losing data is rather binary; it's not a scale from 0.0 to 1.0. Either a piece 
of data is lost because its storage units are not there, 
or it's not. Murphy's law will make it so that this lost piece of data is rather 
important to you. And Murphy will of course pick the 
2-3 OSDs that are the worst case for you. 
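
A rough back-of-the-envelope illustrates that point. The numbers below (per-OSD 
failure probability within a recovery window, PG count) are made up, and it 
assumes independent failures with no correlated ones (same host, same PSU, the 
same tired admin), but it shows what the extra replica buys for the PGs you 
actually care about:

    # Back-of-the-envelope only: independent OSD failures within one recovery
    # window, no correlated failures. p and num_pgs are invented for illustration.
    p = 0.001        # chance a given OSD dies before recovery completes (assumed)
    num_pgs = 4096   # PGs in the pool (assumed)

    for size in (2, 3):
        p_one_pg = p ** size                      # all replicas of ONE PG gone
        p_any_pg = 1 - (1 - p_one_pg) ** num_pgs  # at least one PG in the pool gone
        print(f"size={size}: per-PG loss ~ {p_one_pg:.2e}, "
              f"pool-wide ~ {p_any_pg:.2e}")

With these made-up numbers the pool-wide loss probability drops by roughly three 
orders of magnitude going from 2 to 3 replicas, regardless of how many other 
OSDs sit in the next rack.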

> I do not buy the "if someone is doing maintenance and a device fails" 
> argument either: this is a no-limit goal. What if X servers burn at the same 
> time? What if an admin makes a mistake and drops 5 OSDs? What if some 
> network ToR switches or routers blow up? 
> Should we do one replica per OSD? 

From my viewpoint, maintenance must happen. Unplanned maintenance will happen 
even if I wish it not to. 
So 2-vs-3 is about what situation you end up in when one replica is under 
(planned or not) maintenance. 
Is it an "any surprise makes me lose data now" mode, or is it "many surprises 
need to occur"? 
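
As a deliberately simplified model of that question (it ignores peering, 
backfill and the finer points of min_size semantics, and just counts surviving 
copies), assuming the usual size=2/min_size=1 versus size=3/min_size=2 pool 
settings:

    # Simplified model of "what state am I in during maintenance?".
    # Real Ceph behaviour is more subtle; this only counts surviving copies.
    def state(size, min_size, replicas_down):
        up = size - replicas_down
        if up == 0:
            return "data lost"
        if up < min_size:
            return f"I/O blocked ({up} cop{'y' if up == 1 else 'ies'} left)"
        return f"serving I/O on {up} cop{'y' if up == 1 else 'ies'}"

    # One replica under (planned or not) maintenance, then one surprise failure:
    for size, min_size in ((2, 1), (3, 2)):
        print(f"size={size}/min_size={min_size}: "
              f"maintenance -> {state(size, min_size, 1)}, "
              f"+1 surprise -> {state(size, min_size, 2)}")

With 2 replicas, the first surprise during maintenance means data loss; with 3, 
it "only" means blocked I/O until a copy comes back.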

> I would like people, especially the Ceph devs and other people who know 
> how it works deeply (read the code!), to give us their advice. 

How about listening to people who have lost data during 20+ year careers 
in storage? 
They will know a lot more about how the very improbable or "impossible" still 
happened to them 
at the most unfortunate moment, regardless of what the code readers say. 

This is all about weighing risks. If the risk for you is "OK, then I have to 
re-download that lost Ubuntu ISO again", it's fine 
to keep data in only one place. 

If the company goes out of business, or at least faces a 2-day total stop while 
some sleep-deprived admin tries 
bare-metal restores for the first time in her life, then the price of SATA disks 
to cover 3 replicas will be literally 
nothing compared to that. 

To me it sounds like you are chasing some kind of validation of an answer you 
already have while asking the questions, 
so if you want to go 2-replicas, then just do it. But you don't get to complain 
to Ceph or ceph-users when you also figure out 
that the Mean-Time-Between-Failure ratings on the stickers of the disks are 
bogus and what you really needed was 
"mean time between surprises", and that's always less than MTBF. 

-- 
May the most significant bit of your life be positive. 


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
