Hi Jaegeuk,

On Thu, Jun 4, 2020 at 1:19 AM Jaegeuk Kim <jaeg...@kernel.org> wrote:
>
> Hi Hongwei,
>
> On 05/29, Hongwei wrote:
> > Hi,
> > >On 05/28, Hongwei wrote:
> > >> Hi F2FS experts,
> > >> As written in f2fs_do_sync_file():
> > >> "Both of fdatasync() and fsync() are able to be recovered from 
> > >> sudden-power-off."
> > >>
> > >> Please consider this workflow:
> > >> 1. Start atomic write
> > >> 2. Multiple file writes
> > >> 3. Commit atomic write
> > >> 4. fdatasync()
> > >> 5. Powerloss.
> > >>
> > >> In the 4th step, the fdatasync() doesn't wait for node writeback.
> > >> So we may lose node blocks after power loss.
> > >>
> > >> If the data blocks are persisted but node blocks aren't, can the 
> > >> recovery program recover the transaction?
> > >
> > >#3 will guarantee the blocks written by #2. So, if there are no writes
> > >between #3 and #4, I think we have nothing to recover.
> > >Does this make sense to you?
> >
> > Thanks for your reply. Please consider this:
> > f2fs_do_sync_file() doesn't wait for node writeback if atomic==1. So it is 
> > possible that after #3, node is still writing back.
> > #4 fdatasync() doesn't wait for node write back either.
> > Considering node writeback BIO is flagged with PREFLUSH and FUA, it may 
> > take a long time to complete.
> > Therefore, when #5 power failure happens, it is possible that the node 
> > block is not persisted?
> > If I was correct about this, can the recovery program recover the 
> > transaction?
>
> I see. That can be the issue tho, is there a real usecase for this? I mean,
> given atomic writes by sqlite, next transaction will be also serialized with
> another atomic writes, which we could bypass waiting node writes.
>

Thanks for your reply. The use case comes from SQLite.
I'm writing an SQLite test program and need to decide whether to call
fdatasync() or fsync() after the F2FS atomic write is committed, to
ensure durability.
For example, when SQLite receives an INSERT, it must make the data
persistent before returning to the caller.
My guess is that in this case SQLite needs to use fsync().
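
To make the question concrete, here is a rough sketch of steps 1-4 as
my test program might issue them. It is written in Python only for
brevity. atomic_update() and its parameters are names I made up for
illustration, and the ioctl numbers assume the usual _IO() encoding on
x86/ARM (F2FS magic 0xf5, commands 1 and 2). The choice between fsync()
and fdatasync() at the end is exactly the open question here, so the
default should not be read as a verified answer.

```python
import fcntl
import os

# Assumption: _IO(0xf5, 1) and _IO(0xf5, 2) as defined in the F2FS headers,
# encoded the way x86/ARM encode _IO() (direction bits are zero).
F2FS_IOCTL_MAGIC = 0xF5
F2FS_IOC_START_ATOMIC_WRITE = (F2FS_IOCTL_MAGIC << 8) | 1
F2FS_IOC_COMMIT_ATOMIC_WRITE = (F2FS_IOCTL_MAGIC << 8) | 2


def atomic_update(path, chunks, full_sync=True):
    """Hypothetical helper mirroring steps 1-4 of the workflow.

    chunks is a list of (offset, bytes) pairs. Returns True if the F2FS
    atomic-write ioctls were accepted, False if the file system rejected
    them (for example, when the file is not on F2FS).
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Step 1: start the atomic write; fall back to plain writes if
        # the ioctl is not supported on this file.
        atomic = True
        try:
            fcntl.ioctl(fd, F2FS_IOC_START_ATOMIC_WRITE)
        except OSError:
            atomic = False

        # Step 2: multiple file writes.
        for offset, data in chunks:
            os.pwrite(fd, data, offset)

        # Step 3: commit the atomic write.
        if atomic:
            fcntl.ioctl(fd, F2FS_IOC_COMMIT_ATOMIC_WRITE)

        # Step 4: the sync call under discussion.
        if full_sync:
            os.fsync(fd)
        else:
            os.fdatasync(fd)
        return atomic
    finally:
        os.close(fd)
```

The test program would call this once per transaction, e.g.
atomic_update("test.db", [(0, page0), (4096, page1)]), and flip
full_sync to compare the two sync calls.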

This leads me to wonder whether we can optimize F2FS so that
fdatasync() is sufficient in this case, instead of fsync().
My concern is that, under the current implementation, the data may
still be volatile after #4: data BIOs are not flagged with FUA, so
waiting for data page writeback cannot guarantee persistence.
Therefore, if we add the FUA flag to data BIOs, maybe we can at least
guarantee that the data blocks are durable after fdatasync()?

If my understanding is correct, can F2FS roll forward the transaction
when all of its data blocks are persisted but the node blocks are
missing? (My guess is no, because in that case we don't know the file
offsets of the data blocks.)

Or, maybe this just doesn't happen in reality?

> Thanks,
>
> >
> > >
> > >>
> > >> Thanks!
> > >>
> > >> Hongwei
> > >> _______________________________________________
> > >> Linux-f2fs-devel mailing list
> > >> Linux-f2fs-devel@lists.sourceforge.net
> > >> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>
>

