Re: OSDMap checksums

2014-08-22 Thread Sage Weil
On Fri, 22 Aug 2014, Gregory Farnum wrote:
> On Thu, Aug 21, 2014 at 5:38 PM, Sage Weil wrote:
> > On Tue, 19 Aug 2014, Gregory Farnum wrote:
> >> As far as I can tell, checksumming incrementals are good for two
> >> things besides detecting bit flips:
> >> 1) It's easy to extend to signing the In

Re: OSDMap checksums

2014-08-22 Thread Gregory Farnum
On Thu, Aug 21, 2014 at 5:38 PM, Sage Weil wrote:
> On Tue, 19 Aug 2014, Gregory Farnum wrote:
>> As far as I can tell, checksumming incrementals are good for two
>> things besides detecting bit flips:
>> 1) It's easy to extend to signing the Incremental, which is more secure
>> 2) It protects aga

Re: OSDMap checksums

2014-08-21 Thread Sage Weil
On Tue, 19 Aug 2014, Gregory Farnum wrote:
> As far as I can tell, checksumming incrementals are good for two
> things besides detecting bit flips:
> 1) It's easy to extend to signing the Incremental, which is more secure
> 2) It protects against accidental divergence like we saw when we added
> th

Re: OSDMap checksums

2014-08-19 Thread Gregory Farnum
On Tue, Aug 19, 2014 at 9:49 PM, Sage Weil wrote:
>> Right, so let's talk about how we get into that situation:
>> 1) Our existing OSDMap is "bad."
>>    a) We were never "correct"
>>    b) ...we went bad and didn't notice?
>> 2) The Incremental we got is "bad".
>>    a) It's not the original Increme

Re: OSDMap checksums

2014-08-19 Thread Sage Weil
> Right, so let's talk about how we get into that situation:
> 1) Our existing OSDMap is "bad."
>    a) We were never "correct"
>    b) ...we went bad and didn't notice?
> 2) The Incremental we got is "bad".
>    a) It's not the original Incremental generated by the mon cluster
>    b) ...it got corrup

Re: OSDMap checksums

2014-08-19 Thread Gregory Farnum
On Tue, Aug 19, 2014 at 5:32 PM, Sage Weil wrote:
> On Tue, 19 Aug 2014, Gregory Farnum wrote:
>> On Tue, Aug 19, 2014 at 3:43 PM, Sage Weil wrote:
>> > We have had a range of bugs come up in the past because OSDs or mons have
>> > been running different versions of the code and have encoded diff

Re: OSDMap checksums

2014-08-19 Thread Sage Weil
On Tue, 19 Aug 2014, Gregory Farnum wrote:
> On Tue, Aug 19, 2014 at 3:43 PM, Sage Weil wrote:
> > We have had a range of bugs come up in the past because OSDs or mons have
> > been running different versions of the code and have encoded different
> > variations of the same OSDMap epoch. When two

Re: OSDMap checksums

2014-08-19 Thread Gregory Farnum
On Tue, Aug 19, 2014 at 3:43 PM, Sage Weil wrote:
> We have had a range of bugs come up in the past because OSDs or mons have
> been running different versions of the code and have encoded different
> variations of the same OSDMap epoch. When two nodes in the system
> disagree about what the data

OSDMap checksums

2014-08-19 Thread Sage Weil
We have had a range of bugs come up in the past because OSDs or mons have been running different versions of the code and have encoded different variations of the same OSDMap epoch. When two nodes in the system disagree about what the data distribution is, all manner of things can go wrong and
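
For readers skimming the thread, here is a rough sketch of the failure mode described above and how a checksum over the encoded map would catch it. This is not Ceph code: the dict-based map, the `encode_v1`/`encode_v2` functions, and the `new_field` key are all hypothetical stand-ins, and `zlib.crc32` is used only because it ships with Python (the thread does not say which checksum would be used).

```python
import json
import zlib


def encode_v1(osdmap: dict) -> bytes:
    """Hypothetical 'old' encoder: only serializes the fields it knows about."""
    known = {key: osdmap[key] for key in ("epoch", "osds")}
    return json.dumps(known, sort_keys=True).encode()


def encode_v2(osdmap: dict) -> bytes:
    """Hypothetical 'new' encoder: also serializes a field added later."""
    return json.dumps(osdmap, sort_keys=True).encode()


def checksum(blob: bytes) -> int:
    """CRC32 over the encoded bytes (a stand-in for whatever checksum is chosen)."""
    return zlib.crc32(blob) & 0xFFFFFFFF


def maps_agree(blob_a: bytes, blob_b: bytes) -> bool:
    """Two nodes agree on an epoch only if their encoded maps checksum the same."""
    return checksum(blob_a) == checksum(blob_b)


# Same logical map, same epoch, but encoded by two code versions.
osdmap = {"epoch": 42, "osds": [0, 1, 2], "new_field": "x"}

same = maps_agree(encode_v2(osdmap), encode_v2(osdmap))
diverged = not maps_agree(encode_v1(osdmap), encode_v2(osdmap))
```

Here `same` is true because both nodes run the same encoder, while `diverged` flags the case where one node dropped a field the other encoded, which is the kind of silent disagreement the checksum is meant to surface.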