Re: [f2fs-dev] [External Mail] Re: [External Mail] Re: [PATCH] f2fs: prevent the current section from being selected as a victim during garbage collection

2025-04-05 Thread yohan.jo...@sk.com
> From: Chao Yu 
> Sent: Thursday, March 27, 2025 4:30 PM
> To: 정요한(JOUNG YOHAN) Mobile AE ; Yohan Joung
> ; jaeg...@kernel.org; daeh...@gmail.com
> Cc: c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; linux-
> ker...@vger.kernel.org; 김필현(KIM PILHYUN) Mobile AE 
> Subject: [External Mail] Re: [External Mail] Re: [PATCH] f2fs: prevent the
> current section from being selected as a victim during garbage collection
> 
> On 3/27/25 14:43, yohan.jo...@sk.com wrote:
> >> From: Chao Yu 
> >> Sent: Thursday, March 27, 2025 3:02 PM
> >> To: Yohan Joung ; jaeg...@kernel.org;
> >> daeh...@gmail.com
> >> Cc: c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; linux-
> >> ker...@vger.kernel.org; 정요한(JOUNG YOHAN) Mobile AE
> >> 
> >> Subject: [External Mail] Re: [PATCH] f2fs: prevent the current
> >> section from being selected as a victim during garbage collection
> >>
> >> On 3/26/25 22:14, Yohan Joung wrote:
> >>> When selecting a victim using next_victim_seg in a large section,
> >>> the selected section might already have been cleared and designated
> >>> as the new current section, making it actively in use.
> >>> This behavior causes inconsistency between the SIT and SSA.
> >>
> >> Hi, does this fix your issue?
> >
> > This is an issue that arises when dividing a large section into
> > segments for garbage collection.
> > caused by the background GC (garbage collection) thread in large
> > section
> > f2fs_gc(victim_section) ->
> > f2fs_clear_prefree_segments(victim_section)->
> > cursec(victim_section) -> f2fs_gc(victim_section by next_victim_seg)
> 
> I didn't get it, why f2fs_get_victim() will return section which is used
> by curseg? It should be avoided by checking w/ sec_usage_check().
> 
> Or we missed to check gcing section which next_victim_seg points to during
> get_new_segment()?
> 
> Can this happen?
> 
> e.g.
> - bggc selects sec #0
> - next_victim_seg: seg #0
> - migrate seg #0 and stop
> - next_victim_seg: seg #1
> - checkpoint, set sec #0 free if sec #0 has no valid blocks
> - allocate seg #0 in sec #0 for curseg
> - curseg moves to seg #1 after allocation
> - bggc tries to migrate seg #1
> 
> Thanks,
That's correct
In f2fs_get_victim, we use next_victim_seg to 
directly jump to got_result, thereby bypassing sec_usage_check
What do you think about this change?

@@ -850,15 +850,20 @@ int f2fs_get_victim(struct f2fs_sb_info *sbi, unsigned 
int *result,
p.min_segno = sbi->next_victim_seg[BG_GC];
*result = p.min_segno;
sbi->next_victim_seg[BG_GC] = NULL_SEGNO;
-   goto got_result;
}
if (gc_type == FG_GC &&
sbi->next_victim_seg[FG_GC] != NULL_SEGNO) {
p.min_segno = sbi->next_victim_seg[FG_GC];
*result = p.min_segno;
sbi->next_victim_seg[FG_GC] = NULL_SEGNO;
-   goto got_result;
}
+
+   secno = GET_SEC_FROM_SEG(sbi, segno);
+
+   if (sec_usage_check(sbi, secno))
+   goto next;
+
+   goto got_result;
}
> 
> >
> > Because the call stack is different,
> > I think that in order to handle everything at once, we need to address
> > it within do_garbage_collect, or otherwise include it on both sides.
> > What do you think?
> >
> > [30146.337471][ T1300] F2FS-fs (dm-54): Inconsistent segment (70961)
> > type [0, 1] in SSA and SIT [30146.346151][ T1300] Call trace:
> > [30146.346152][ T1300]  dump_backtrace+0xe8/0x10c [30146.346157][
> > T1300]  show_stack+0x18/0x28 [30146.346158][ T1300]
> > dump_stack_lvl+0x50/0x6c [30146.346161][ T1300]  dump_stack+0x18/0x28
> > [30146.346162][ T1300]  f2fs_stop_checkpoint+0x1c/0x3c [30146.346165][
> > T1300]  do_garbage_collect+0x41c/0x271c [30146.346167][ T1300]
> > f2fs_gc+0x27c/0x828 [30146.346168][ T1300]  gc_thread_func+0x290/0x88c
> > [30146.346169][ T1300]  kthread+0x11c/0x164 [30146.346172][ T1300]
> > ret_from_fork+0x10/0x20
> >
> > struct curseg_info : 0xff803f95e800 {
> > segno: 0x11531 : 70961
> > }
> >
> > struct f2fs_sb_info : 0xff8811d12000 {
> > next_victim_seg[0] : 0x11531 : 70961
> > }
> >
> >>
> >> https://lore.kernel.org/linux-f2fs-devel/20250325080646.3291947-2-
> >> c...@kernel.org
> >>
> >> Thanks,
> >>
> >>>
> >>> Signed-off-b

Re: [f2fs-dev] [External Mail] Re: [PATCH] f2fs: prevent the current section from being selected as a victim during garbage collection

2025-03-26 Thread yohan.jo...@sk.com
> From: Chao Yu 
> Sent: Thursday, March 27, 2025 3:02 PM
> To: Yohan Joung ; jaeg...@kernel.org; daeh...@gmail.com
> Cc: c...@kernel.org; linux-f2fs-devel@lists.sourceforge.net; linux-
> ker...@vger.kernel.org; 정요한(JOUNG YOHAN) Mobile AE 
> Subject: [External Mail] Re: [PATCH] f2fs: prevent the current section
> from being selected as a victim during garbage collection
> 
> On 3/26/25 22:14, Yohan Joung wrote:
> > When selecting a victim using next_victim_seg in a large section, the
> > selected section might already have been cleared and designated as the
> > new current section, making it actively in use.
> > This behavior causes inconsistency between the SIT and SSA.
> 
> Hi, does this fix your issue?

This is an issue that arises when dividing 
a large section into segments for garbage collection.
caused by the background GC (garbage collection) thread in large section
f2fs_gc(victim_section) -> f2fs_clear_prefree_segments(victim_section)-> 
cursec(victim_section) -> f2fs_gc(victim_section by next_victim_seg)

Because the call stack is different, 
I think that in order to handle everything at once, 
we need to address it within do_garbage_collect, 
or otherwise include it on both sides. What do you think?

[30146.337471][ T1300] F2FS-fs (dm-54): Inconsistent segment (70961) type [0, 
1] in SSA and SIT
[30146.346151][ T1300] Call trace:
[30146.346152][ T1300]  dump_backtrace+0xe8/0x10c
[30146.346157][ T1300]  show_stack+0x18/0x28
[30146.346158][ T1300]  dump_stack_lvl+0x50/0x6c
[30146.346161][ T1300]  dump_stack+0x18/0x28
[30146.346162][ T1300]  f2fs_stop_checkpoint+0x1c/0x3c
[30146.346165][ T1300]  do_garbage_collect+0x41c/0x271c
[30146.346167][ T1300]  f2fs_gc+0x27c/0x828
[30146.346168][ T1300]  gc_thread_func+0x290/0x88c
[30146.346169][ T1300]  kthread+0x11c/0x164
[30146.346172][ T1300]  ret_from_fork+0x10/0x20

struct curseg_info : 0xff803f95e800 {
segno: 0x11531 : 70961
}

struct f2fs_sb_info : 0xff8811d12000 {
next_victim_seg[0] : 0x11531 : 70961
}

> 
> https://lore.kernel.org/linux-f2fs-devel/20250325080646.3291947-2-
> c...@kernel.org
> 
> Thanks,
> 
> >
> > Signed-off-by: Yohan Joung 
> > ---
> >  fs/f2fs/gc.c | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index
> > 2b8f9239bede..4b5d18e395eb 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -1926,6 +1926,10 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct
> f2fs_gc_control *gc_control)
> > goto stop;
> > }
> >
> > +   if (__is_large_section(sbi) &&
> > +   IS_CURSEC(sbi, GET_SEC_FROM_SEG(sbi, segno)))
> > +   goto stop;
> > +
> > seg_freed = do_garbage_collect(sbi, segno, &gc_list, gc_type,
> > gc_control->should_migrate_blocks,
> > gc_control->one_time);


___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [External Mail] Re: [PATCH] f2fs: optimize f2fs DIO overwrites

2025-03-11 Thread yohan.jo...@sk.com
> From: Daeho Jeong 
> Sent: Wednesday, March 12, 2025 6:14 AM
> To: Chao Yu 
> Cc: Yohan Joung ; jaeg...@kernel.org; linux-f2fs-
> de...@lists.sourceforge.net; linux-ker...@vger.kernel.org; 정요한(JOUNG
> YOHAN) Mobile AE 
> Subject: [External Mail] Re: [PATCH] f2fs: optimize f2fs DIO overwrites
> 
> On Tue, Mar 11, 2025 at 5:00 AM Chao Yu  wrote:
> >
> > On 3/7/25 22:56, Yohan Joung wrote:
> > > this is unnecessary when we know we are overwriting already
> > > allocated blocks and the overhead of starting a transaction can be
> > > significant especially for multithreaded workloads doing small writes.
> >
> > Hi Yohan,
> >
> > So you're trying to avoid f2fs_map_lock() in dio write path, right?
> >
> > Thanks,
> >
> > >
> > > Signed-off-by: Yohan Joung 
> > > ---
> > >  fs/f2fs/data.c | 20   fs/f2fs/f2fs.h |  1 +
> > > fs/f2fs/file.c |  5 -
> > >  3 files changed, 25 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index
> > > 09437dbd1b42..728630037b74 100644
> > > --- a/fs/f2fs/data.c
> > > +++ b/fs/f2fs/data.c
> > > @@ -4267,6 +4267,26 @@ static int f2fs_iomap_begin(struct inode *inode,
> loff_t offset, loff_t length,
> > >   return 0;
> > >  }
> > >
> > > +static int f2fs_iomap_overwrite_begin(struct inode *inode, loff_t
> offset,
> > > + loff_t length, unsigned flags, struct iomap *iomap,
> > > + struct iomap *srcmap)
> > > +{
> > > + int ret;
> > > +
> > > + /*
> > > +  * Even for writes we don't need to allocate blocks, so just
> pretend
> > > +  * we are reading to save overhead of starting a transaction.
> > > +  */
> > > + flags &= ~IOMAP_WRITE;
> > > + ret = f2fs_iomap_begin(inode, offset, length, flags, iomap,
> srcmap);
> > > + WARN_ON_ONCE(!ret && iomap->type != IOMAP_MAPPED);
> > > + return ret;
> > > +}
> > > +
> > >  const struct iomap_ops f2fs_iomap_ops = {
> > >   .iomap_begin= f2fs_iomap_begin,
> > >  };
> > > +
> > > +const struct iomap_ops f2fs_iomap_overwrite_ops = {
> > > + .iomap_begin= f2fs_iomap_overwrite_begin,
> > > +};
> > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index
> > > c6cc2694f9ac..0511ab5ed42a 100644
> > > --- a/fs/f2fs/f2fs.h
> > > +++ b/fs/f2fs/f2fs.h
> > > @@ -3936,6 +3936,7 @@ void f2fs_destroy_post_read_processing(void);
> > >  int f2fs_init_post_read_wq(struct f2fs_sb_info *sbi);  void
> > > f2fs_destroy_post_read_wq(struct f2fs_sb_info *sbi);  extern const
> > > struct iomap_ops f2fs_iomap_ops;
> > > +extern const struct iomap_ops f2fs_iomap_overwrite_ops;
> > >
> > >  static inline struct page *f2fs_find_data_page(struct inode *inode,
> > >   pgoff_t index, pgoff_t *next_pgofs) diff --git
> > > a/fs/f2fs/file.c b/fs/f2fs/file.c index 82b21baf5628..bb2fe6dac9b6
> > > 100644
> > > --- a/fs/f2fs/file.c
> > > +++ b/fs/f2fs/file.c
> > > @@ -4985,6 +4985,7 @@ static ssize_t f2fs_dio_write_iter(struct kiocb
> *iocb, struct iov_iter *from,
> > >   const ssize_t count = iov_iter_count(from);
> > >   unsigned int dio_flags;
> > >   struct iomap_dio *dio;
> > > + const struct iomap_ops *iomap_ops = &f2fs_iomap_ops;
> > >   ssize_t ret;
> > >
> > >   trace_f2fs_direct_IO_enter(inode, iocb, count, WRITE); @@
> > > -5025,7 +5026,9 @@ static ssize_t f2fs_dio_write_iter(struct kiocb
> *iocb, struct iov_iter *from,
> > >   dio_flags = 0;
> > >   if (pos + count > inode->i_size)
> > >   dio_flags |= IOMAP_DIO_FORCE_WAIT;
> > > - dio = __iomap_dio_rw(iocb, from, &f2fs_iomap_ops,
> > > + else if (f2fs_overwrite_io(inode, pos, count))
> > > + iomap_ops = &f2fs_iomap_overwrite_ops;
> > > + dio = __iomap_dio_rw(iocb, from, iomap_ops,
> > >&f2fs_iomap_dio_write_ops, dio_flags, NULL, 0);
> > >   if (IS_ERR_OR_NULL(dio)) {
> > >   ret = PTR_ERR_OR_ZERO(dio);
> 
> I think we can add the overwrite check in f2fs_iomap_begin() before
> setting the map.m_may_create, instead of adding a new function
> f2fs_iomap_overwrite_begin().
> What do you think?
Daeho, Is this the way you want it changed? If so, I'll upload it like this 
static int f2fs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
unsigned int flags, struct iomap *iomap,
struct iomap *srcmap)
{
struct f2fs_map_blocks map = {};
pgoff_t next_pgofs = 0;
int err;

map.m_lblk = F2FS_BYTES_TO_BLK(offset);
map.m_len = F2FS_BYTES_TO_BLK(offset + length - 1) - map.m_lblk + 1;
map.m_next_pgofs = &next_pgofs;
map.m_seg_type = f2fs_rw_hint_to_seg_type(F2FS_I_SB(inode),
inode->i_write_hint);
if ((flags & IOMAP_WRITE) && !f2fs_overwrite_io(inode, offset, length))
map.m_may_create = true;

> 
> >

___
Linux-f2fs-devel mailing list
Linux-f2fs-deve