> On 12/08, Yonggil Song wrote:
> > Overview
> > ========
> > 
> > This patch introduces a way to prefer data sections when selecting GC
> > victims. Migration of data blocks causes invalidation of node blocks.
> > Therefore, in situations where GC is frequent, selecting data sections as
> > victims first can reduce unnecessary block migration, because node blocks
> > are invalidated instead of being moved. In the exceptional case where free
> > sections are insufficient, node sections are selected as victims instead
> > of data sections to get extra free sections.
> > 
> > Problem
> > =======
> > 
> > If the total amount of nodes is larger than the size of one section, nodes
> > occupy multiple sections. Data block migration in GC invalidates node
> > blocks, which lowers the GC cost of those node sections, so node victims
> > are often selected. Since moving data sections causes frequent node victim
> > selection, victim thrashing occurs in the node sections. This results in
> > an increase in WAF.
> > 
> > Experiment
> > ==========
> > 
> > Test environment is as follows.
> > 
> >         System info
> >           - 3.6GHz, 16 core CPU
> >           - 36GiB Memory
> >         Device info
> >           - a conventional null_blk with 228MiB
> >           - a sequential null_blk with 4068 zones of 8MiB
> >         Format
> >           - mkfs.f2fs <conv null_blk> -c <seq null_blk> -m -Z 8 -o 3.89
> >         Mount
> >           - mount <conv null_blk> <mount point>
> >         Fio script
> >           - fio --rw=randwrite --bs=4k --ba=4k --filesize=31187m --norandommap --overwrite=1 --name=job1 --filename=./mnt/sustain --io_size=128g
> >         WAF calculation
> >           - (IOs on conv. null_blk + IOs on seq. null_blk) / random write IOs
> > 
> > Conclusion
> > ==========
> > 
> > This experiment showed that WAF was reduced by 29% (18.75 -> 13.3) when
> > data sections were preferred during GC victim selection. This was
> > achieved by reducing the migration of node blocks by 69.4%
> > (253,131,743 blks -> 77,463,278 blks). This GC victim selection method
> > can achieve a lower WAF in environments where the section size is
> > relatively small.
> > 
> > Signed-off-by: Yonggil Song <[email protected]>
> > ---
> >  fs/f2fs/f2fs.h |  1 +
> >  fs/f2fs/gc.c   | 98 ++++++++++++++++++++++++++++++++++++++------------
> >  2 files changed, 77 insertions(+), 22 deletions(-)
> > 
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index 9043cedfa12b..578d57f6022f 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -1649,6 +1649,7 @@ struct f2fs_sb_info {
> >     struct f2fs_mount_info mount_opt;       /* mount options */
> >  
> >     /* for cleaning operations */
> > +   bool need_node_clean;                   /* need to clean dirty nodes */
> >     struct f2fs_rwsem gc_lock;              /*
> >                                              * semaphore for GC, avoid
> >                                              * race between GC and GC or CP
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index f550cdeaa663..682dcf0de59e 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -368,6 +368,14 @@ static inline unsigned int get_gc_cost(struct f2fs_sb_info *sbi,
> >     if (p->alloc_mode == SSR)
> >             return get_seg_entry(sbi, segno)->ckpt_valid_blocks;
> >  
> > +   /*
> > +    * If we don't need to clean dirty nodes,
> > +    * we just skip node victims.
> > +    */
> > +   if (IS_NODESEG(get_seg_entry(sbi, segno)->type) &&
> > +           !sbi->need_node_clean)
> > +           return get_max_cost(sbi, p);
> 
> How about differentiating the gc cost between data vs. node by adding some
> weights? By default, data is preferred, while node is better in the worst 
> case?
> 

Okay, I will address your comments in v3.
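
A rough sketch of what such a weighting could look like is below. It is only an illustration of the idea in the comment above, not code from this patch or from f2fs: the function name, the weight macros, and their values are all hypothetical, and the raw cost stands in for whatever get_gc_cost() would otherwise return.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical weights in percent; the values are placeholders, not tuned. */
#define DATA_COST_WEIGHT         100	/* data cost is used as-is */
#define NODE_COST_WEIGHT         150	/* node victims look more expensive */
#define NODE_COST_WEIGHT_URGENT   50	/* node victims look cheaper when space is short */

/*
 * Scale a raw GC cost (e.g. a valid block count) by segment type.
 * is_node:       the candidate is a node segment
 * low_free_secs: free sections are insufficient (the "worst case")
 */
static unsigned int weighted_gc_cost(unsigned int raw_cost, bool is_node,
				     bool low_free_secs)
{
	unsigned int weight = DATA_COST_WEIGHT;

	if (is_node)
		weight = low_free_secs ? NODE_COST_WEIGHT_URGENT :
					 NODE_COST_WEIGHT;

	assert(weight > 0);

	/* Widen before multiplying so large costs do not overflow. */
	return (unsigned int)(((unsigned long long)raw_cost * weight) / 100);
}
```

With these placeholder weights, a data candidate with a raw cost of 120 beats a node candidate with a raw cost of 100 in the default case (120 vs. 150), while the node candidate wins once free sections run short (50 vs. 120).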

> > +
> >     /* alloc_mode == LFS */
> >     if (p->gc_mode == GC_GREEDY)
> >             return get_valid_blocks(sbi, segno, true);
> > @@ -557,6 +565,14 @@ static void atgc_lookup_victim(struct f2fs_sb_info *sbi,
> >     if (ve->mtime >= max_mtime || ve->mtime < min_mtime)
> >             goto skip;
> >  
> > +   /*
> > +    * If we don't need to clean dirty nodes,
> > +    * we just skip node victims.
> > +    */
> > +   if (IS_NODESEG(get_seg_entry(sbi, ve->segno)->type) &&
> > +       !sbi->need_node_clean)
> > +           goto skip;
> > +
> >     /* age = 10000 * x% * 60 */
> >     age = div64_u64(accu * (max_mtime - ve->mtime), total_time) *
> >                                                             age_weight;
> > @@ -913,7 +929,21 @@ int f2fs_get_victim(struct f2fs_sb_info *sbi, unsigned int *result,
> >             goto retry;
> >     }
> >  
> > +
> >     if (p.min_segno != NULL_SEGNO) {
> > +           if (sbi->need_node_clean &&
> > +               IS_DATASEG(get_seg_entry(sbi, p.min_segno)->type)) {
> > +                    /*
> > +                     * we need to clean node sections.
> > +                     * but, data victim cost is the lowest.
> > +                     * if free sections are enough, stop cleaning node victim.
> > +                     * if not, it goes on by GCing data victims.
> > +                     */
> > +                   if (has_enough_free_secs(sbi, prefree_segments(sbi), 0)) {
> > +                           p.min_segno = NULL_SEGNO;
> > +                           goto out;
> > +                   }
> > +           }
> >  got_it:
> >             *result = (p.min_segno / p.ofs_unit) * p.ofs_unit;
> >  got_result:
> > @@ -1830,8 +1860,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
> >             goto stop;
> >     }
> >  
> > +   __get_secs_required(sbi, NULL, &upper_secs, NULL);
> > +
> > +   /*
> > +    * Write checkpoint to reclaim prefree segments.
> > +    * We need more three extra sections for writer's data/node/dentry.
> > +    */
> > +   if (free_sections(sbi) <= upper_secs + NR_GC_CHECKPOINT_SECS) {
> > +           sbi->need_node_clean = true;
> > +
> > +           if (prefree_segments(sbi)) {
> > +                   stat_inc_cp_call_count(sbi, TOTAL_CALL);
> > +                   ret = f2fs_write_checkpoint(sbi, &cpc);
> > +                   if (ret)
> > +                           goto stop;
> > +                   /* Reset due to checkpoint */
> > +                   sec_freed = 0;
> > +           }
> > +   }
> > +
> >     /* Let's run FG_GC, if we don't have enough space. */
> > -   if (has_not_enough_free_secs(sbi, 0, 0)) {
> > +   if (gc_type == BG_GC && has_not_enough_free_secs(sbi, 0, 0)) {
> >             gc_type = FG_GC;
> >  
> >             /*
> > @@ -1858,10 +1907,22 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
> >     ret = __get_victim(sbi, &segno, gc_type);
> >     if (ret) {
> >             /* allow to search victim from sections has pinned data */
> > -           if (ret == -ENODATA && gc_type == FG_GC &&
> > -                           f2fs_pinned_section_exists(DIRTY_I(sbi))) {
> > -                   f2fs_unpin_all_sections(sbi, false);
> > -                   goto retry;
> > +           if (ret == -ENODATA && gc_type == FG_GC) {
> > +                   if (f2fs_pinned_section_exists(DIRTY_I(sbi))) {
> > +                           f2fs_unpin_all_sections(sbi, false);
> > +                           goto retry;
> > +                   }
> > +                   /*
> > +                    * If we have no more data victims, let's start to
> > +                    * clean dirty nodes.
> > +                    */
> > +                   if (!sbi->need_node_clean) {
> > +                           sbi->need_node_clean = true;
> > +                           goto retry;
> > +                   }
> > +                   /* node cleaning is over */
> > +                   else if (sbi->need_node_clean)
> > +                           goto stop;
> >             }
> >             goto stop;
> >     }
> > @@ -1882,7 +1943,13 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
> >                     if (!gc_control->no_bg_gc &&
> >                         total_sec_freed < gc_control->nr_free_secs)
> >                             goto go_gc_more;
> > -                   goto stop;
> > +                   /*
> > +                    * If the need_node_clean flag is set
> > +                    * even though there are enough free
> > +                    * sections, node cleaning will continue.
> > +                    */
> > +                   if (!sbi->need_node_clean)
> > +                           goto stop;
> >             }
> >             if (sbi->skipped_gc_rwsem)
> >                     skipped_round++;
> > @@ -1897,21 +1964,6 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
> >             goto stop;
> >     }
> >  
> > -   __get_secs_required(sbi, NULL, &upper_secs, NULL);
> > -
> > -   /*
> > -    * Write checkpoint to reclaim prefree segments.
> > -    * We need more three extra sections for writer's data/node/dentry.
> > -    */
> > -   if (free_sections(sbi) <= upper_secs + NR_GC_CHECKPOINT_SECS &&
> > -                           prefree_segments(sbi)) {
> > -           stat_inc_cp_call_count(sbi, TOTAL_CALL);
> > -           ret = f2fs_write_checkpoint(sbi, &cpc);
> > -           if (ret)
> > -                   goto stop;
> > -           /* Reset due to checkpoint */
> > -           sec_freed = 0;
> > -   }
> >  go_gc_more:
> >     segno = NULL_SEGNO;
> >     goto gc_more;
> > @@ -1920,8 +1972,10 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
> >     SIT_I(sbi)->last_victim[ALLOC_NEXT] = 0;
> >     SIT_I(sbi)->last_victim[FLUSH_DEVICE] = gc_control->victim_segno;
> >  
> > -   if (gc_type == FG_GC)
> > +   if (gc_type == FG_GC) {
> >             f2fs_unpin_all_sections(sbi, true);
> > +           sbi->need_node_clean = false;
> > +   }
> >  
> >     trace_f2fs_gc_end(sbi->sb, ret, total_freed, total_sec_freed,
> >                             get_pages(sbi, F2FS_DIRTY_NODES),
> > -- 
> > 2.34.1
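
As a cross-check of the figures in the commit message, the WAF formula and the two quoted reductions can be re-derived from the raw numbers. The helpers below are a small sketch for that arithmetic only; the function and parameter names are hypothetical and nothing here comes from f2fs itself.

```c
/*
 * WAF as defined in the Experiment section:
 * (IOs on conv. null_blk + IOs on seq. null_blk) / random write IOs.
 * Parameter names are hypothetical.
 */
static double waf(double conv_ios, double seq_ios, double rand_write_ios)
{
	return (conv_ios + seq_ios) / rand_write_ios;
}

/* Percentage reduction when a value drops from `before` to `after`. */
static double reduction_pct(double before, double after)
{
	return (before - after) / before * 100.0;
}
```

reduction_pct(18.75, 13.3) comes out slightly above 29, and reduction_pct(253131743, 77463278) comes out close to 69.4, which matches the 29% and 69.4% quoted in the Conclusion.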

