On 2019/3/20 7:41 AM, Anand Jain wrote:
>
>>> But csum verification is a point-in-time verification; it's not a
>>> tree-based transid verification. Which means if there is stale data
>>> with a matching csum, we may return junk data silently.
>>
>> Then the normal idea is to use a stronger but slower csum in the first
>> place, to avoid the csum-match case.
>
> This is just a general observational comment; it's OK, let's assume the
> current point-in-time csum verification works (as opposed to tree-based
> parent transid verification).
>
>>> This problem is easily reproducible when csum is disabled, but it is
>>> not impossible to achieve even when csum is enabled.
>>
>> Under this case, the user is to blame for the decision to disable
>> the csum in the first place.
>
> The point here is: the logic isn't aware of the write hole on the other
> disk, on which the metadata is not verified. I disagree that nocsum or
> the user is to blame.
>
>>
>>> A tree-based integrity verification is important for all data, which
>>> is missing.
>>> Fix:
>>> In this RFC patch it proposes to use the same disk from which the
>>> metadata is read to read the data.
>>
>> The obvious problem I found is, the idea only works for RAID1/10.
>>
>> For striped profiles it makes no sense, or even has a worse chance of
>> getting stale data.
>>
>> To me, the idea of using a possibly better mirror makes some sense,
>> but it is very profile-limited.
>
> Yep. This problem and fix are only for the mirror-based profiles
> such as raid1/raid10.
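To make the proposed fix concrete, below is a minimal user-space model of
the idea, assuming a mirror-based profile (RAID1/RAID10): pin the data
read to the mirror the verified metadata came from, instead of the default
mirror pick. All names here (read_request, pick_mirror_default,
submit_data_read) are hypothetical illustrations, not the actual btrfs
kernel API; only eb->read_mirror is taken from the patch discussion.

/* Sketch: reuse the mirror the metadata was read from, so a data
 * block whose staleness csum cannot catch (nodatasum) is at least
 * read from the device whose metadata passed transid verification. */
#include <stdio.h>

#define NUM_MIRRORS 2

struct extent_buffer {
	int read_mirror;	/* mirror the verified metadata came from */
};

struct read_request {
	unsigned long long logical;
	int mirror_num;		/* 0 = let the chunk layer pick */
};

/* Stand-in for the default mirror selection policy (btrfs uses a
 * PID-based pick for mirrored profiles; modeled here as modulo). */
static int pick_mirror_default(unsigned long long logical)
{
	return (int)(logical % NUM_MIRRORS) + 1;	/* mirrors are 1-based */
}

static void submit_data_read(struct read_request *req,
			     const struct extent_buffer *eb)
{
	if (eb && eb->read_mirror)
		req->mirror_num = eb->read_mirror;	/* the RFC idea */
	else
		req->mirror_num = pick_mirror_default(req->logical);
	printf("read logical %llu from mirror %d\n",
	       req->logical, req->mirror_num);
}

int main(void)
{
	struct extent_buffer eb = { .read_mirror = 2 };
	struct read_request req = { .logical = 4096, .mirror_num = 0 };

	submit_data_read(&req, &eb);	/* pinned to mirror 2 */
	submit_data_read(&req, NULL);	/* falls back to default pick */
	return 0;
}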
Then the current implementation lacks such a check.

Furthermore, data and metadata can lie in different chunks and have
different chunk types.

>
>>
>> Another idea I got, inspired by this one, is to make it more generic,
>> so that a bad/stale device gets a lower priority.
>
> When it comes to reading junk data, it's not about priority, it's
> about elimination. When the problem is only a few blocks, I am
> against marking the whole disk as bad.
>
>> Although it suffers the same problem as I described.
>>
>> To make the point short, the use case looks very limited.
>
> It applies to raid1/raid10 with nodatacow (which implies nodatasum).
> In my understanding that's not rare.
>
> Any comments on the fix offered here?

The implementation question is: is eb->read_mirror reliable? E.g. what if
the data and the eb are in different chunks, and the staleness happens in
the chunk of the eb but not in the data chunk?

Thanks,
Qu

>
> Thanks, Anand
>
>
>> Thanks,
>> Qu
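To illustrate the eb->read_mirror concern: mirror_num is an index into a
per-chunk stripe list, so the same mirror number can resolve to different
physical devices in different chunks. A minimal user-space sketch, with a
made-up chunk layout (the struct and devid values are illustrative only,
not taken from real chunk mappings):

/* Sketch: the same mirror_num names different devices in the
 * metadata chunk and the data chunk, because each chunk orders
 * its stripes independently. */
#include <stdio.h>

struct chunk {
	const char *type;
	int devid[2];	/* device backing mirror 1 and mirror 2 */
};

int main(void)
{
	/* Hypothetical layout: both chunks span the same two devices,
	 * but list them in a different stripe order. */
	struct chunk metadata_chunk = { "metadata", { 1, 2 } };
	struct chunk data_chunk     = { "data",     { 2, 1 } };
	int read_mirror = 2;	/* mirror the eb was verified from */

	printf("eb   mirror %d -> devid %d\n",
	       read_mirror, metadata_chunk.devid[read_mirror - 1]);
	printf("data mirror %d -> devid %d\n",
	       read_mirror, data_chunk.devid[read_mirror - 1]);
	return 0;
}

If the two chunks order their stripes differently, pinning data reads by
mirror number alone does not guarantee the read hits the device whose
metadata was actually verified, which is exactly the reliability question
raised above.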