Not-so-gentle ping. IMHO this fix alone is worth a minor release.
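For anyone skimming the thread, here is a minimal sketch in C of what "a barrier between metadata and superblock writes" means. It is not the code from the patchset: commit_with_barrier(), write_all() and demo_commit() are hypothetical names, the offsets are arbitrary, and fsync() only stands in for whatever flush the real fix uses.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer at the given offset, retrying on short writes. */
static int write_all(int fd, const void *buf, size_t len, off_t off)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = pwrite(fd, p, len, off);

		if (n <= 0)
			return -1;
		p += n;
		off += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Hypothetical helper, not btrfs-progs code: commit one metadata block
 * plus one superblock, with a flush between the two writes.
 */
int commit_with_barrier(int fd, const void *meta, size_t meta_len, off_t meta_off,
			const void *sb, size_t sb_len, off_t sb_off)
{
	assert(meta != NULL && sb != NULL);

	if (write_all(fd, meta, meta_len, meta_off) < 0)
		return -1;
	/* Barrier: metadata must be stable before a superblock points at it. */
	if (fsync(fd) < 0)
		return -1;
	if (write_all(fd, sb, sb_len, sb_off) < 0)
		return -1;
	/* Second flush so the new superblock itself is durable. */
	return fsync(fd);
}

/* Usage example on a scratch file; returns 0 if the metadata reads back. */
int demo_commit(const char *path)
{
	char meta[16] = "tree block gen2";
	char sb[16] = "super gen2";
	char back[16];
	int ret = -1;
	int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

	if (fd < 0)
		return -1;
	if (commit_with_barrier(fd, meta, sizeof(meta), 4096, sb, sizeof(sb), 0) == 0 &&
	    pread(fd, back, sizeof(back), 4096) == (ssize_t)sizeof(back) &&
	    memcmp(back, meta, sizeof(meta)) == 0)
		ret = 0;
	close(fd);
	unlink(path);
	return ret;
}
```

Without the first fsync() the kernel is free to write the superblock back before the metadata, which is exactly the window the quoted patch description complains about.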
Thanks,
Qu

On 2019/3/27 10:48 PM, Qu Wenruo wrote:
>
>
> On 2019/3/27 10:07 PM, Adam Borowski wrote:
>> On Wed, Mar 27, 2019 at 05:46:50PM +0800, Qu Wenruo wrote:
>>> This urgent patchset can be fetched from github:
>>> https://github.com/adam900710/btrfs-progs/tree/flush_super
>>> Which is based on v4.20.2.
>>>
>>> Before this patch, btrfs-progs writes to the fs have no barrier at all.
>>> All metadata and superblock writes are just buffered writes, with no
>>> barrier between super blocks and metadata writes at all.
>>>
>>> No wonder even clear space cache can cause serious transid
>>> corruption to an originally good fs.
>>>
>>> Please merge this fix as soon as possible as I really don't want to see
>>> btrfs-progs corrupting any fs any more.
>>
>> How often does this happen in practice?  I'm slightly incredulous about
>> btrfs-progs crashing often.  Especially that pwrite() is buffered on the
>> kernel side, so we'd need a _kernel_ crash (usually a power loss) to break
>> consistency.  Obviously, a potential data loss bug is always something that
>> needs fixing, I'm just wondering about severity.
>
> Here is a valid case where a crash could cause a transid error:
>
> - transaction 1
>   new eb at 16K (fs root, gen = 1)
>   new eb at 32K (extent root, gen = 1)
>   new eb at 48K (tree root, gen = 1)
>   sb->fs root = gen 1
>   sb->extent root = gen 1
>   sb->tree root = gen 1
>
> - transaction 2
>   new eb at 64K (extent root, gen = 2)
>   new eb at 80K (tree root, gen = 2)
>   sb->fs root = gen 1 at 16K
>   sb->extent root = gen 2
>   sb->tree root = gen 2
>
> - transaction 3, half-baked due to an error during commit transaction
>   new eb at 16K (tree root, gen = 3) submitted
>
> In the above case, we will write the newest eb at 16K to disk, but with
> the sb from transaction 2.
>
> Then the sb expects to read out a tree with gen 1, but gets a tree with
> gen 3.
> Furthermore, even if we ignore the generation mismatch, the content of
> the eb at 16K is completely wrong: the super block of gen 2 expects fs
> root content from the eb at 16K, but its content is tree root.
>
> This should explain the severity much better.
>
> Thanks,
> Qu
>
>>
>> Or do I understand this wrong?
>>
>> Asking because Dimitri John Ledkov stepped down as Debian's maintainer of
>> this package, and I'm taking up the mantle (with Nicholas D Steeves being
>> around) -- modulo any updates other than important bug fixes being on hold
>> because of Debian's freeze.  Thus, I wonder if this is important enough to
>> ask for a freeze exception.
>>
>>
>> Meow!
>>
>
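The three-transaction case quoted above can be replayed as a toy model. This is only an illustration under stated assumptions, not btrfs code: the structs, SLOT(), replay_scenario() and count_broken_roots() are hypothetical, a "disk" is just an array of 16K slots, and a root pointer is considered broken when the block it points at has a different owner or generation than the superblock recorded.

```c
#include <string.h>

enum owner { OWNER_NONE, OWNER_FS_ROOT, OWNER_EXTENT_ROOT, OWNER_TREE_ROOT };

struct block { enum owner owner; unsigned gen; };
struct root_ptr { unsigned slot; unsigned gen; };
struct super { struct root_ptr fs_root, extent_root, tree_root; };

#define SLOT(kib) ((kib) / 16)	/* 16K -> slot 1, 32K -> slot 2, ... */
#define NR_SLOTS 8

struct disk { struct block blocks[NR_SLOTS]; struct super sb; };

/* Model a tree block landing on disk. */
static void write_eb(struct disk *d, unsigned slot, enum owner owner, unsigned gen)
{
	d->blocks[slot].owner = owner;
	d->blocks[slot].gen = gen;
}

/* A root pointer is fine if the block has the expected owner and gen. */
static int root_ok(const struct disk *d, struct root_ptr p, enum owner want)
{
	const struct block *b = &d->blocks[p.slot];

	return b->owner == want && b->gen == p.gen;
}

/* Count how many of the three superblock root pointers are broken. */
int count_broken_roots(const struct disk *d)
{
	return !root_ok(d, d->sb.fs_root, OWNER_FS_ROOT) +
	       !root_ok(d, d->sb.extent_root, OWNER_EXTENT_ROOT) +
	       !root_ok(d, d->sb.tree_root, OWNER_TREE_ROOT);
}

/* Replay transactions 1 and 2, then optionally the half-done transaction 3. */
void replay_scenario(struct disk *d, int crash_in_tx3)
{
	memset(d, 0, sizeof(*d));

	/* transaction 1: three new tree blocks, then the superblock */
	write_eb(d, SLOT(16), OWNER_FS_ROOT, 1);
	write_eb(d, SLOT(32), OWNER_EXTENT_ROOT, 1);
	write_eb(d, SLOT(48), OWNER_TREE_ROOT, 1);
	d->sb = (struct super){ { SLOT(16), 1 }, { SLOT(32), 1 }, { SLOT(48), 1 } };

	/* transaction 2: extent root and tree root move, fs root stays at 16K */
	write_eb(d, SLOT(64), OWNER_EXTENT_ROOT, 2);
	write_eb(d, SLOT(80), OWNER_TREE_ROOT, 2);
	d->sb = (struct super){ { SLOT(16), 1 }, { SLOT(64), 2 }, { SLOT(80), 2 } };

	/* transaction 3: the block at 16K reaches disk, the superblock never does */
	if (crash_in_tx3)
		write_eb(d, SLOT(16), OWNER_TREE_ROOT, 3);
}
```

Calling replay_scenario() with crash_in_tx3 set leaves the transaction 2 superblock pointing at 16K and expecting an fs root of gen 1, while the slot now holds a tree root of gen 3, so count_broken_roots() reports one broken pointer; without the crash it reports none.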