On Thu, Jun 28, 2018 at 11:16 AM, Andrei Borzenkov <arvidj...@gmail.com> wrote:
> On Thu, Jun 28, 2018 at 8:39 AM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>
>> On 2018-06-28 11:14, r...@georgianit.com wrote:
>>>
>>> On Wed, Jun 27, 2018, at 10:55 PM, Qu Wenruo wrote:
>>>
>>>> Please get clear on what other RAID1 implementations actually do.
>>>
>>> A drive failure, where the drive is still present when the computer
>>> reboots, is a situation that *any* RAID1 (or for that matter RAID5,
>>> RAID6, anything but RAID0) will recover from without breaking a
>>> sweat. Some will rebuild the array automatically,
>>
>> Wow, that's black magic, at least for RAID1.
>> Plain RAID1 has no idea which copy is correct, unlike btrfs, which
>> has data checksums.
>>
>> Never mind anything else: just tell me how it determines which copy
>> is correct.
>
> When one drive fails, that fact is recorded in the metadata on the
> remaining drives; typically a configuration generation number is
> increased. The next time the array is assembled, the drive with the
> older generation is not incorporated. Hardware controllers also keep
> this information in NVRAM, so they do not even depend on scanning the
> other disks.
>
>> The only possibility is that the misbehaving device missed several
>> superblock updates, so we have a chance to detect that it is out of
>> date. But that doesn't always work.
>
> Why should it not work, as long as any write to the array is
> suspended until the superblocks on the remaining devices have been
> updated?
>
>> If you're talking about the missing generation check in btrfs,
>> that's a valid point, but it's far from a "major design flaw", as
>> there are plenty of cases where other RAID1 implementations (mdraid
>> or mirrored LVM) are also affected (the split-brain case).
>
> That's different. Yes, with software-based RAID there is usually no
> way to detect an outdated copy if no other copies are present. But
> having older valid data is still very different from corrupting newer
> data.
>
>>> others will automatically kick out the misbehaving drive.
>>> *none* of them will take back the drive with old data and start
>>> commingling that data with the good copy. This behaviour from btrfs
>>> is completely abnormal and defeats even the most basic expectations
>>> of RAID.
>>
>> RAID1 can only tolerate one missing device; it has nothing to do
>> with error detection. And it's impossible to detect such a case
>> without extra help.
>>
>> Your expectation is completely wrong.
>
> Well ... somehow it is my experience as well ... :)
s/experience/expectation/ sorry.

>
>>>
>>> I'm not the one who has to clear his expectations here.
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
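[Editor's note: the generation-number check Andrei describes above can be sketched roughly as follows. This is a hypothetical illustration of the idea only, not mdraid's actual on-disk format or assembly code; the `Superblock` type and `assemble` function are invented names for the sketch. Real md superblocks use a similar per-device event counter for the same purpose.]

```python
# Sketch of stale-mirror detection via a per-device generation number.
# Each device's superblock records the array generation as of its last
# successful write; a device whose generation lags behind the maximum
# missed writes while it was "failed" and is excluded from assembly.

from dataclasses import dataclass

@dataclass
class Superblock:
    device: str
    generation: int  # bumped on every committed array-wide update

def assemble(superblocks):
    """Split devices into (active, stale) lists by generation number."""
    newest = max(sb.generation for sb in superblocks)
    active = [sb.device for sb in superblocks if sb.generation == newest]
    stale = [sb.device for sb in superblocks if sb.generation < newest]
    return active, stale

# A mirror that disappeared and came back reports an older generation,
# so it is kicked out rather than commingled with the good copy:
active, stale = assemble([Superblock("sda", 1042), Superblock("sdb", 997)])
print(active, stale)
```

The key invariant, per the thread above, is that writes to the array are suspended until the surviving devices' superblocks record the new generation; only then can the returning device be reliably identified as outdated rather than current.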