15.09.2017 08:50, Goffredo Baroncelli wrote:
> On 09/15/2017 05:55 AM, Andrei Borzenkov wrote:
>> 15.09.2017 01:00, Goffredo Baroncelli wrote:
>>>
>>> 2) The second bug, is a more severe bug. If during a writing of a buffer
>>> with O_DIRECT, the buffer is updated at the same time by a second process,
>>> the checksum may be incorrect.
>>>
>>
>> Is it btrfs specific ? If buffer is updated before it was actually
>> consumed by kernel, this likely means data corruption on any filesystem.
>
> I don't see any corruption in other FS. The fact that application push to
> filesystem garbage, doesn't allow the filesystem to be corrupted.
I did not say "filesystem corruption", I said "data corruption".

> In this case the filesystem became corrupted, because another application
> which try to read the data (without O_DIRECT) may got -EIO.
>

No. *Data* on this filesystem was corrupted, and luckily btrfs makes you
aware of it. On a different filesystem you may still have the same data
corruption, but silently.

> I repeat, the problem is a data race when the data is in the FS camp, and the
> kernel does wrong checksum.
>

Of course it is a race. But again, I expect that when pwrite() returns, the
data buffer can be reused. Otherwise I cannot see how O_DIRECT can be
sensibly used at all. So you need to demonstrate that the data corruption
happens after pwrite() returns; that would indeed make it a btrfs issue. If
the data corruption happens while the thread is still waiting for pwrite() to
return, I say this is expected behavior and an application fault: the
application needs to protect the buffer against concurrent write and
modification.

>
> IMHO, BTRFS should disallow O_DIRECT (which is the same thing that does ZFS
> on linux); I think that it could be allowed only for nodatasum files.
>
>> I.e. there should be clear indication from kernel that buffer can be
>> reused by application, in your example - when pwrite returns. So when
>> data corruption happens - during pwrite or after? If data is corrupted
>> during pwrite, it is arguably application fault - it should disallow
>> concurrent access.
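To make the "application must protect the buffer" point concrete, here is a
rough sketch of what I mean, in Python for brevity. It is only an
illustration under assumptions: the names (open_maybe_direct, run_demo,
BLOCK) are made up, it uses a second thread rather than a second process,
4096 is an assumed alignment, and it falls back to buffered I/O where
O_DIRECT is not accepted. It does not reproduce the btrfs checksum problem;
it only shows the locking discipline that keeps the buffer stable until
pwrite() has returned.

```python
import mmap
import os
import tempfile
import threading
import time

BLOCK = 4096  # assumed O_DIRECT alignment; real code should query the device


def open_maybe_direct(path, probe):
    """Open path with O_DIRECT if a probe write works, else use buffered I/O."""
    flags = os.O_RDWR | os.O_CREAT
    direct = getattr(os, "O_DIRECT", 0)  # not defined on every platform
    if direct:
        try:
            fd = os.open(path, flags | direct, 0o600)
        except OSError:
            fd = -1
        if fd >= 0:
            try:
                os.pwrite(fd, probe, 0)
                return fd, True
            except OSError:
                os.close(fd)
    return os.open(path, flags, 0o600), False


def run_demo(rounds=20):
    """Return (torn_blocks, used_direct); a torn block mixes two buffer states."""
    buf = mmap.mmap(-1, BLOCK)  # anonymous mapping, hence page-aligned
    lock = threading.Lock()
    stop = threading.Event()

    def mutator():
        value = 0
        while not stop.is_set():
            value = (value + 1) % 256
            with lock:  # never modify the buffer while pwrite() is in flight
                buf[:] = bytes([value]) * BLOCK
            time.sleep(0.0005)

    torn = 0
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        fd, used_direct = open_maybe_direct(path, buf)
        worker = threading.Thread(target=mutator)
        worker.start()
        try:
            for _ in range(rounds):
                with lock:  # held until pwrite() has returned
                    os.pwrite(fd, buf, 0)
                with open(path, "rb") as f:  # buffered read-back
                    data = f.read(BLOCK)
                if len(set(data)) != 1:  # more than one byte value: torn write
                    torn += 1
        finally:
            stop.set()
            worker.join()
            os.close(fd)
    buf.close()
    return torn, used_direct


if __name__ == "__main__":
    torn, used_direct = run_demo()
    print("O_DIRECT used:", used_direct, "torn blocks:", torn)
```

With the lock in place every block read back should be uniform. Dropping the
lock around the buffer update is the situation discussed above: the buffer
can then change while pwrite() is still in progress, and whatever lands on
disk is the application's problem.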