On Aug 29, 2019 / 22:42, Chao Yu wrote:
> On 2019-8-28 19:53, Shinichiro Kawasaki wrote:
> > On Aug 27, 2019 / 10:25, Chao Yu wrote:
> >> On 2019/8/21 12:48, Shin'ichiro Kawasaki wrote:
> >>> When sudden f2fs shutdown happens on zoned block devices, write
> >>> pointers can be inconsistent with valid blocks counts in meta data.
> >>> The failure scenario is as follows:
> >>>
> >>> - Just before a sudden shutdown, a new segment in a new zone is selected
> >>>   for a current segment. Write commands were executed to the segment.
> >>>   and the zone has a write pointer not at zone start.
> >>> - Before the write commands complete, shutdown happens. Meta data is
> >>>   not updated and still keeps zero valid blocks count for the zone.
> >>> - After next mount of the file system, the zone is selected for the next
> >>>   write target because it has zero valid blocks count. However, it has
> >>>   the write pointer not at zone start. Then "Unaligned write command"
> >>>   error happens.
> >>>
> >>> To avoid this potential error path, reset write pointers if the zone
> >>> does not have a current segment, the write pointer is not at the zone
> >>> start and the zone has no valid blocks.
> >>>
> >>> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawas...@wdc.com>
> >>> ---
> >>>  fsck/fsck.c | 30 +++++++++++++++++++++++++++++-
> >>>  1 file changed, 29 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/fsck/fsck.c b/fsck/fsck.c
> >>> index 21a06ac..cc9bbc0 100644
> >>> --- a/fsck/fsck.c
> >>> +++ b/fsck/fsck.c
> >>> @@ -2595,6 +2595,7 @@ static int fsck_chk_write_pointer(int i, struct 
> >>> blk_zone *blkz, void *opaque)
> >>>   int log_sectors_per_block = sbi->log_blocksize - SECTOR_SHIFT;
> >>>   unsigned int segs_per_zone = sbi->segs_per_sec * sbi->secs_per_zone;
> >>>   void *zero_blk;
> >>> + block_t zone_valid_blocks = 0;
> >>>  
> >>>   if (blk_zone_conv(blkz))
> >>>           return 0;
> >>> @@ -2615,8 +2616,35 @@ static int fsck_chk_write_pointer(int i, struct 
> >>> blk_zone *blkz, void *opaque)
> >>>                   break;
> >>>   }
> >>>  
> >>> - if (cs_index >= NR_CURSEG_TYPE)
> >>> + if (cs_index >= NR_CURSEG_TYPE) {
> >>> +         for (b = zone_block; b < zone_block + c.zone_blocks &&
> >>> +                      IS_VALID_BLK_ADDR(sbi, b); b += c.blks_per_seg) {
> >>> +                 se = get_seg_entry(sbi, GET_SEGNO(sbi, b));
> >>> +                 zone_valid_blocks += se->valid_blocks;
> >>> +         }
> >>> +         if (wp_block == zone_block || zone_valid_blocks)
> >>> +                 return 0;
> >>> +
> >>> +         /*
> >>> +          * The write pointer is not at zone start but there is no valid
> >>> +          * block in the zone. Segments in the zone can be selected for
> >>> +          * next write. Need to reset the write pointer to avoid
> >>> +          * unaligned write command error.
> >>
> >> In SPOR (sudden power-off recovery) of kernel side, we may revalidate 
> >> blocks
> >> belong to fsynced file in such zone within range of [0, write pointer], if 
> >> we
> >> just reset zone, will we lose those data for ever?
> > 
> > Yes. This patch resets zone and the data will be lost. I walked through
> > fs/f2fs/recovery.c and learned that nodes with fsync mark are recovered at
> > remount. Such fsync recovery cannot be done after zone reset. To avoid the
> > data loss, I would like to drop this fourth patch at this moment.
> > 
> > Later on, I will consider safer approach not to reset the zone, but to set 
> > next
> > write target block at the write pointer. I guess this approach will need 
> > kernel
> > side patch to change block selection logic.
> 
> I guess below commit can help to recognize fsynced data in unclean umounted
> image, maybe we can skip invalidating those data during zone write pointer 
> recovery.
> 
> f2fs-tools: fix to skip block allocation for fsynced data
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-tools.git/commit/?h=dev-test&id=a50cfc89e56ce8c022e295bf4de619af070fabe9

Thank you for the idea! The find_fsync_inode() function in the commit looks
useful. Will try to create a separated patch on top of the commit.

--
Best Regards,
Shin'ichiro Kawasaki

_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

Reply via email to