Hi Jaegeuk,

On Thu, Jun 4, 2020 at 1:19 AM Jaegeuk Kim <jaeg...@kernel.org> wrote:
>
> Hi Hongwei,
>
> On 05/29, Hongwei wrote:
> > Hi,
> > >On 05/28, Hongwei wrote:
> > >> Hi F2FS experts,
> > >> As written in f2fs_do_sync_file():
> > >> "Both of fdatasync() and fsync() are able to be recovered from
> > >> sudden-power-off."
> > >>
> > >> Please consider this workflow:
> > >> 1. Start atomic write
> > >> 2. Multiple file writes
> > >> 3. Commit atomic write
> > >> 4. fdatasync()
> > >> 5. Powerloss.
> > >>
> > >> In the 4th step, the fdatasync() doesn't wait for node writeback.
> > >> So we may loss node blocks after powerloss.
> > >>
> > >> If the data blocks are persisted but node blocks aren't, can the
> > >> recovery program recover the transaction?
> > >
> > >#3 will guarantee the blocks written by #2. So, if there's no written
> > >between #3 and #4, I think we have nothing to recover.
> > >Does this make sense to you?
> >
> > Thanks for your reply. Please consider this:
> > f2fs_do_sync_file() doesn't wait for node writeback if atomic==1. So it is
> > possible that after #3, node is still writing back.
> > #4 fdatasync() doesn't wait for node write back either.
> > Considering node writeback BIO is flagged with PREFLUSH and FUA, it may
> > take a long time to complete.
> > Therefore, when #5 power failure happens, it is possible that the node
> > block is not persisted?
> > If I was correct about this, can the recovery program recover the
> > transaction?
>
> I see. That can be the issue tho, is there a real usecase for this? I mean,
> given atomic writes by sqlite, next transaction will be also serialized with
> another atomic writes, which we could bypass waiting node writes.
>
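For concreteness, steps 1 to 4 of the workflow quoted above could look roughly like this from user space. This is only a sketch under some assumptions: atomic_write_and_sync() and write_all() are hypothetical helper names, the ioctl numbers are redefined locally from the values in the kernel's F2FS header, and the fallback to plain writes when the ioctl is refused (e.g. on a non-F2FS file system) is there purely for illustration. It does not claim that step 4 makes the data durable; that is the question being asked.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Values as defined in the kernel's F2FS header; redefined here so the
 * sketch builds on toolchains that do not ship <linux/f2fs.h>. */
#ifndef F2FS_IOCTL_MAGIC
#define F2FS_IOCTL_MAGIC 0xf5
#endif
#ifndef F2FS_IOC_START_ATOMIC_WRITE
#define F2FS_IOC_START_ATOMIC_WRITE  _IO(F2FS_IOCTL_MAGIC, 1)
#endif
#ifndef F2FS_IOC_COMMIT_ATOMIC_WRITE
#define F2FS_IOC_COMMIT_ATOMIC_WRITE _IO(F2FS_IOCTL_MAGIC, 2)
#endif

/* Hypothetical helper: keep calling pwrite() until all bytes are written. */
static int write_all(int fd, const char *buf, size_t len, off_t off)
{
    while (len > 0) {
        ssize_t n = pwrite(fd, buf, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            return -1;
        }
        if (n == 0) {               /* no progress: report as I/O error */
            errno = EIO;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
        off += n;
    }
    return 0;
}

/* Hypothetical helper covering steps 1-4 of the workflow.
 * use_fsync != 0 selects fsync() for step 4, otherwise fdatasync().
 * Assumption for illustration: if the start ioctl is refused, the data
 * is written non-atomically instead of failing.
 * Returns 0 on success, -1 on error with errno set. */
int atomic_write_and_sync(int fd, const void *buf, size_t len, off_t off,
                          int use_fsync)
{
    /* Step 1: start atomic write (only succeeds on F2FS). */
    int atomic = (ioctl(fd, F2FS_IOC_START_ATOMIC_WRITE) == 0);

    /* Step 2: file writes belonging to the transaction. */
    if (write_all(fd, buf, len, off) < 0)
        return -1;

    /* Step 3: commit atomic write. */
    if (atomic && ioctl(fd, F2FS_IOC_COMMIT_ATOMIC_WRITE) < 0)
        return -1;

    /* Step 4: the sync call whose guarantees are in question. */
    return use_fsync ? fsync(fd) : fdatasync(fd);
}
```

Calling atomic_write_and_sync(fd, buf, len, 0, 0) exercises the fdatasync() variant and passing 1 as the last argument exercises the fsync() variant, so a test program can compare the two across a simulated power cut (step 5).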
Thanks for your reply. I think the use case comes from SQLite. I'm writing an
SQLite test program and need to decide whether to call fdatasync() or fsync()
after the F2FS transaction to ensure durability. E.g., when SQLite receives an
INSERT, it needs to ensure the data is persistent before it returns from the
SQL call.

My guess is that in this case SQLite needs to use fsync(). This makes me
wonder whether we can optimize F2FS so that fdatasync() is sufficient here
instead of fsync().

My concern is that under the current implementation, the data may still be
volatile after #4: data BIOs are not flagged with FUA, so waiting for data
page writeback can't guarantee persistence. If we add the FUA flag to data
BIOs, maybe we can at least guarantee that the data blocks are durable after
fdatasync()?

If my understanding is correct, can F2FS roll forward the transaction when all
of its data blocks are persisted but the node blocks are missing? (My guess is
no, because in that case we don't know the file offsets of the data blocks.)

Or maybe this just doesn't happen in reality?

> Thanks,
>
> > >>
> > >> Thanks!
> > >>
> > >> Hongwei
> > >> _______________________________________________
> > >> Linux-f2fs-devel mailing list
> > >> Linux-f2fs-devel@lists.sourceforge.net
> > >> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel