On 2018/4/27 0:36, Jaegeuk Kim wrote: > On 04/26, Chao Yu wrote: >> On 2018/4/26 23:48, Jaegeuk Kim wrote: >>> On 04/26, Chao Yu wrote: >>>> Thread A Thread B >>>> - f2fs_ioc_commit_atomic_write >>>> - commit_inmem_pages >>>> - f2fs_submit_merged_write_cond >>>> : write data >>>> - write_checkpoint >>>> - do_checkpoint >>>> : commit all node within CP >>>> -> SPO >>>> - f2fs_do_sync_file >>>> - file_write_and_wait_range >>>> : wait data writeback >>>> >>>> In above race condition, data/node can be flushed in reversed order when >>>> coming a checkpoint before f2fs_do_sync_file, after SPOR, it results in >>>> atomic written data being corrupted. >>> >>> Wait, what is the problem here? Thread B could succeed checkpoint, there is >>> no problem. If it fails, there is no fsync mark where we can recover it, so >> >> Node is flushed by checkpoint before data, with reversed order, that's the >> problem. > > What do you mean? Data should be in disk, in order to proceed checkpoint.
1. thread A: commit_inmem_pages submit data into block layer, but haven't waited it writeback. 2. thread A: commit_inmem_pages update related node. 3. thread B: do checkpoint, flush all nodes to disk 4. SPOR Then, atomic file becomes corrupted since nodes is flushed before data. Thanks, > >> >> Thanks, >> >>> we can just ignore the last written data as nothing. >>> >>>> >>>> This patch adds f2fs_wait_on_page_writeback in __revoke_inmem_pages() to >>>> keep data and node of atomic file being flushed orderly. >>>> >>>> Signed-off-by: Chao Yu <yuch...@huawei.com> >>>> --- >>>> fs/f2fs/file.c | 4 ++++ >>>> fs/f2fs/segment.c | 3 +++ >>>> 2 files changed, 7 insertions(+) >>>> >>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c >>>> index be7578774a47..a352804af244 100644 >>>> --- a/fs/f2fs/file.c >>>> +++ b/fs/f2fs/file.c >>>> @@ -217,6 +217,9 @@ static int f2fs_do_sync_file(struct file *file, loff_t >>>> start, loff_t end, >>>> >>>> trace_f2fs_sync_file_enter(inode); >>>> >>>> + if (atomic) >>>> + goto write_done; >>>> + >>>> /* if fdatasync is triggered, let's do in-place-update */ >>>> if (datasync || get_dirty_pages(inode) <= SM_I(sbi)->min_fsync_blocks) >>>> set_inode_flag(inode, FI_NEED_IPU); >>>> @@ -228,6 +231,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t >>>> start, loff_t end, >>>> return ret; >>>> } >>>> >>>> +write_done: >>>> /* if the inode is dirty, let's recover all the time */ >>>> if (!f2fs_skip_inode_update(inode, datasync)) { >>>> f2fs_write_inode(inode, NULL); >>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>> index 584483426584..9ca3d0a43d93 100644 >>>> --- a/fs/f2fs/segment.c >>>> +++ b/fs/f2fs/segment.c >>>> @@ -230,6 +230,8 @@ static int __revoke_inmem_pages(struct inode *inode, >>>> >>>> lock_page(page); >>>> >>>> + f2fs_wait_on_page_writeback(page, DATA, true); >>>> + >>>> if (recover) { >>>> struct dnode_of_data dn; >>>> struct node_info ni; >>>> @@ -415,6 +417,7 @@ static int __commit_inmem_pages(struct inode *inode) >>>> /* drop all uncommitted pages */ >>>> __revoke_inmem_pages(inode, &fi->inmem_pages, true, false); >>>> } else { >>>> + /* wait all committed IOs writeback and release them from list >>>> */ >>>> __revoke_inmem_pages(inode, &revoke_list, false, false); >>>> } >>>> >>>> -- >>>> 2.15.0.55.gc2ece9dc4de6 > > . > ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel