On 09/26, Eric Biggers wrote: > On Fri, Aug 16, 2019 at 11:03:34AM +0800, Chao Yu wrote: > > There is one case can cause data corruption. > > > > - write 4k to fileA > > - fsync fileA, 4k data is writebacked to lbaA > > - write 4k to fileA > > - kworker flushs 4k to lbaB; dnode contain lbaB didn't be persisted yet > > - write 4k to fileB > > - kworker flush 4k to lbaA due to SSR > > - SPOR -> dnode with lbaA will be recovered, however lbaA contains fileB's > > data > > > > One solution is tracking all fsynced file's block history, and disallow > > SSR overwrite on newly invalidated block on that file. > > > > However, during recovery, no matter the dnode is flushed or fsynced, all > > previous dnodes until last fsynced one in node chain can be recovered, > > that means we need to record all block change in flushed dnode, which > > will cause heavy cost, so let's just use simple fix by forbidding SSR > > overwrite directly. > > > > Signed-off-by: Chao Yu <yuch...@huawei.com> > > --- > > fs/f2fs/segment.c | 8 +++++--- > > 1 file changed, 5 insertions(+), 3 deletions(-) > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > index 9d9d9a050d59..69b3b553ee6b 100644 > > --- a/fs/f2fs/segment.c > > +++ b/fs/f2fs/segment.c > > @@ -2205,9 +2205,11 @@ static void update_sit_entry(struct f2fs_sb_info > > *sbi, block_t blkaddr, int del) > > if (!f2fs_test_and_set_bit(offset, se->discard_map)) > > sbi->discard_blks--; > > > > - /* don't overwrite by SSR to keep node chain */ > > - if (IS_NODESEG(se->type) && > > - !is_sbi_flag_set(sbi, SBI_CP_DISABLED)) { > > + /* > > + * SSR should never reuse block which is checkpointed > > + * or newly invalidated. > > + */ > > + if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) { > > if (!f2fs_test_and_set_bit(offset, se->ckpt_valid_map)) > > se->ckpt_valid_blocks++; > > } > > -- > > FYI, this commit caused xfstests generic/064 to start failing:
Yup, I was looking at this. > > $ kvm-xfstests -c f2fs generic/064 > ... > generic/064 3s ... [13:36:37][ 5.946293] run fstests generic/064 at > 2019-09-26 13:36:37 > [13:36:41]- output mismatch (see > /results/f2fs/results-default/generic/064.out.bad) > --- tests/generic/064.out 2019-09-18 04:53:46.000000000 -0700 > +++ /results/f2fs/results-default/generic/064.out.bad 2019-09-26 > 13:36:41.533018683 -0700 > @@ -1,2 +1,3 @@ > QA output created by 064 > Extent count after inserts is in range > +extents mismatched before = 1 after = 50 > ... > (Run 'diff -u /root/xfstests/tests/generic/064.out > /results/f2fs/results-default/generic/064.out.bad' to see the entire diff) > Ran: generic/064 > Failures: generic/064 > Failed 1 of 1 tests