Re: [f2fs-dev] [PATCH v2 1/2] f2fs: introduce checkpoint=merge mount option

2021-01-15 Thread Chao Yu

On 2021/1/15 22:00, Daeho Jeong wrote:

ktime_get() returns time in nanoseconds, so in an extreme scenario I suspect
the average checkpoint cost could overflow a 32-bit variable.


sum_diff has already been converted to milliseconds by ktime_ms_delta() above.


Yup, I missed ktime_ms_delta().
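(For reference, a minimal sketch of the conversion being discussed; it is not
taken from the patch, and the helper name elapsed_ms is illustrative only.)

#include <linux/ktime.h>

/*
 * ktime_get() has nanosecond resolution, but ktime_ms_delta() returns the
 * difference already scaled to milliseconds, so the averaged value stays far
 * below the 32-bit limit in practice.
 */
static inline u64 elapsed_ms(ktime_t queue_time)
{
        return (u64)ktime_ms_delta(ktime_get(), queue_time);
}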


On 2021/1/15 22:23, Daeho Jeong wrote:
>>> How about updating queued_ckpt and total_ckpt in batch? Updating the
>>> atomic variables one by one is inefficient.
>>>
>> You mean like using spin_lock()?
>>
> Ah, you mean updating these values once, by the total count accumulated in
> the loop?

Correct. :)

Thanks,
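(Below is a minimal sketch of the batched update agreed on above, written
against the dispatch loop quoted later in the thread; the local counter
"done" is illustrative and this is not the posted follow-up patch.)

        int done = 0;

        llist_for_each_entry_safe(req, next, dispatch_list, llnode) {
                diff = (u64)ktime_ms_delta(ktime_get(), req->queue_time);
                req->ret = ret;
                complete(&req->wait);

                sum_diff += diff;
                done++;
        }

        /* touch the shared counters once instead of once per request */
        atomic_sub(done, &cprc->queued_ckpt);
        atomic_add(done, &cprc->total_ckpt);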


Re: [f2fs-dev] [PATCH v2 1/2] f2fs: introduce checkpoint=merge mount option

2021-01-15 Thread Chao Yu

On 2021/1/14 14:23, Daeho Jeong wrote:

From: Daeho Jeong 

We've added a new mount option "checkpoint=merge", which creates a
kernel daemon and makes it merge concurrent checkpoint requests as
much as possible, so that redundant checkpoints are not issued. Plus,
it avoids the sluggishness caused by a slow checkpoint operation when
the checkpoint is done in the context of a process in a cgroup with a
low I/O budget and CPU shares; the verification result below
demonstrates this.
The basic idea comes from https://opensource.samsung.com.

[Verification]
Android Pixel device (ARM64, 7GB RAM, 256GB UFS)
Create two I/O cgroups (fg w/ weight 100, bg w/ weight 20)
Set "strict_guarantees" to "1" in BFQ tunables

In "fg" cgroup,
- thread A => trigger 1000 checkpoint operations
   "for i in `seq 1 1000`; do touch test_dir1/file; fsync test_dir1;
done"
- thread B => generating async I/O
   "fio --rw=write --numjobs=1 --bs=128k --runtime=3600 --time_based=1
--filename=test_img --name=test"

In "bg" cgroup,
- thread C => trigger repeated checkpoint operations
   "echo $$ > /dev/blkio/bg/tasks; while true; do touch test_dir2/file;
fsync test_dir2; done"

We've measured thread A's execution time.

[ w/o patch ]
Elapsed Time: Avg. 68 seconds
[ w/  patch ]
Elapsed Time: Avg. 48 seconds

Signed-off-by: Daeho Jeong 
Signed-off-by: Sungjong Seo 
---
v2:
- inlined ckpt_req_control into f2fs_sb_info and collected statistics
   of checkpoint merge operations
---
  Documentation/filesystems/f2fs.rst |   6 ++
  fs/f2fs/checkpoint.c   | 163 +
  fs/f2fs/debug.c|  12 +++
  fs/f2fs/f2fs.h |  27 +
  fs/f2fs/super.c|  56 +-
  5 files changed, 260 insertions(+), 4 deletions(-)

diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
index dae15c96e659..bccc021bf31a 100644
--- a/Documentation/filesystems/f2fs.rst
+++ b/Documentation/filesystems/f2fs.rst
@@ -247,6 +247,12 @@ checkpoint=%s[:%u[%]]   Set to "disable" to turn off checkpointing. Set to "enable"
                         hide up to all remaining free space. The actual space that
                         would be unusable can be viewed at /sys/fs/f2fs/<disk>/unusable
                         This space is reclaimed once checkpoint=enable.
+                        Here is another option, "merge", which creates a kernel daemon
+                        and makes it merge concurrent checkpoint requests as much as
+                        possible, so that redundant checkpoints are not issued. Plus,
+                        it avoids the sluggishness caused by a slow checkpoint operation
+                        when the checkpoint is done in the context of a process in a
+                        cgroup with a low I/O budget and CPU shares.
 compress_algorithm=%s   Control compress algorithm, currently f2fs supports "lzo",
                         "lz4", "zstd" and "lzo-rle" algorithm.
 compress_log_size=%u    Support configuring compress cluster size, the size will
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 897edb7c951a..e0668cec3b80 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -13,6 +13,7 @@
 #include <linux/f2fs_fs.h>
 #include <linux/pagevec.h>
 #include <linux/swap.h>
+#include <linux/kthread.h>
 
 #include "f2fs.h"
 #include "node.h"
@@ -20,6 +21,8 @@
 #include "trace.h"
 #include <trace/events/f2fs.h>
 
+#define DEFAULT_CHECKPOINT_IOPRIO (IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 3))
+
 static struct kmem_cache *ino_entry_slab;
 struct kmem_cache *f2fs_inode_entry_slab;
 
@@ -1707,3 +1710,163 @@ void f2fs_destroy_checkpoint_caches(void)
        kmem_cache_destroy(ino_entry_slab);
        kmem_cache_destroy(f2fs_inode_entry_slab);
 }
+
+static int __write_checkpoint_sync(struct f2fs_sb_info *sbi)
+{
+   struct cp_control cpc = { .reason = CP_SYNC, };
+   int err;
+
+   down_write(&sbi->gc_lock);
+   err = f2fs_write_checkpoint(sbi, &cpc);
+   up_write(&sbi->gc_lock);
+
+   return err;
+}
+
+static void __checkpoint_and_complete_reqs(struct f2fs_sb_info *sbi)
+{
+   struct ckpt_req_control *cprc = &sbi->cprc_info;
+   struct ckpt_req *req, *next;
+   struct llist_node *dispatch_list;
+   u64 sum_diff = 0, diff, count = 0;
+   int ret;
+
+   dispatch_list = llist_del_all(&cprc->issue_list);
+   if (!dispatch_list)
+           return;
+   dispatch_list = llist_reverse_order(dispatch_list);
+
+   ret = __write_checkpoint_sync(sbi);
+   atomic_inc(&cprc->issued_ckpt);
+
+   llist_for_each_entry_safe(req, next, dispatch_list, llnode) {
+           atomic_dec(&cprc->queued_ckpt);
+           atomic_inc(&cprc->total_ckpt);
+           diff = (u64)ktime_ms_delta(ktime_get(), req->queue_time);
+           req->ret = ret;
+           complete(&req->wait);
+
+           sum_diff += diff;
+           count++;
+   }


How about updating queued_ckpt and total_ckpt in batch? Updating the
atomic variables one by one is inefficient.
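(For context, here is a rough sketch of the issuing side that the dispatch
loop above completes, reconstructed only from the fields it uses: llnode,
wait, queue_time and ret. The wake-up of the checkpoint daemon is left as a
comment because that part of the patch is not shown in this excerpt, and the
function name is illustrative.)

static int issue_checkpoint_sketch(struct f2fs_sb_info *sbi)
{
        struct ckpt_req_control *cprc = &sbi->cprc_info;
        struct ckpt_req req;

        init_completion(&req.wait);
        req.queue_time = ktime_get();

        atomic_inc(&cprc->queued_ckpt);
        llist_add(&req.llnode, &cprc->issue_list);
        /* wake up the checkpoint daemon here (not shown in this excerpt) */

        wait_for_completion(&req.wait);
        return req.ret;
}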

[PATCH v2 1/2] f2fs: introduce checkpoint=merge mount option

2021-01-13 Thread Daeho Jeong
From: Daeho Jeong 

We've added a new mount option "checkpoint=merge", which creates a
kernel daemon and makes it merge concurrent checkpoint requests as
much as possible, so that redundant checkpoints are not issued. Plus,
it avoids the sluggishness caused by a slow checkpoint operation when
the checkpoint is done in the context of a process in a cgroup with a
low I/O budget and CPU shares; the verification result below
demonstrates this.
The basic idea comes from https://opensource.samsung.com.

[Verification]
Android Pixel device (ARM64, 7GB RAM, 256GB UFS)
Create two I/O cgroups (fg w/ weight 100, bg w/ weight 20)
Set "strict_guarantees" to "1" in BFQ tunables

In "fg" cgroup,
- thread A => trigger 1000 checkpoint operations
  "for i in `seq 1 1000`; do touch test_dir1/file; fsync test_dir1;
   done"
- thread B => generating async I/O
  "fio --rw=write --numjobs=1 --bs=128k --runtime=3600 --time_based=1
   --filename=test_img --name=test"

In "bg" cgroup,
- thread C => trigger repeated checkpoint operations
  "echo $$ > /dev/blkio/bg/tasks; while true; do touch test_dir2/file;
   fsync test_dir2; done"

We've measured thread A's execution time.

[ w/o patch ]
Elapsed Time: Avg. 68 seconds
[ w/  patch ]
Elapsed Time: Avg. 48 seconds

Signed-off-by: Daeho Jeong 
Signed-off-by: Sungjong Seo 
---
v2:
- inlined ckpt_req_control into f2fs_sb_info and collected statistics
  of checkpoint merge operations
---
 Documentation/filesystems/f2fs.rst |   6 ++
 fs/f2fs/checkpoint.c   | 163 +
 fs/f2fs/debug.c|  12 +++
 fs/f2fs/f2fs.h |  27 +
 fs/f2fs/super.c|  56 +-
 5 files changed, 260 insertions(+), 4 deletions(-)

diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
index dae15c96e659..bccc021bf31a 100644
--- a/Documentation/filesystems/f2fs.rst
+++ b/Documentation/filesystems/f2fs.rst
@@ -247,6 +247,12 @@ checkpoint=%s[:%u[%]]   Set to "disable" to turn off checkpointing. Set to "enable"
                         hide up to all remaining free space. The actual space that
                         would be unusable can be viewed at /sys/fs/f2fs/<disk>/unusable
                         This space is reclaimed once checkpoint=enable.
+                        Here is another option, "merge", which creates a kernel daemon
+                        and makes it merge concurrent checkpoint requests as much as
+                        possible, so that redundant checkpoints are not issued. Plus,
+                        it avoids the sluggishness caused by a slow checkpoint operation
+                        when the checkpoint is done in the context of a process in a
+                        cgroup with a low I/O budget and CPU shares.
 compress_algorithm=%s   Control compress algorithm, currently f2fs supports "lzo",
                         "lz4", "zstd" and "lzo-rle" algorithm.
 compress_log_size=%u    Support configuring compress cluster size, the size will
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 897edb7c951a..e0668cec3b80 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -13,6 +13,7 @@
 #include <linux/f2fs_fs.h>
 #include <linux/pagevec.h>
 #include <linux/swap.h>
+#include <linux/kthread.h>
 
 #include "f2fs.h"
 #include "node.h"
@@ -20,6 +21,8 @@
 #include "trace.h"
 #include <trace/events/f2fs.h>
 
+#define DEFAULT_CHECKPOINT_IOPRIO (IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 3))
+
 static struct kmem_cache *ino_entry_slab;
 struct kmem_cache *f2fs_inode_entry_slab;
 
@@ -1707,3 +1710,163 @@ void f2fs_destroy_checkpoint_caches(void)
kmem_cache_destroy(ino_entry_slab);
kmem_cache_destroy(f2fs_inode_entry_slab);
 }
+
+static int __write_checkpoint_sync(struct f2fs_sb_info *sbi)
+{
+   struct cp_control cpc = { .reason = CP_SYNC, };
+   int err;
+
+   down_write(&sbi->gc_lock);
+   err = f2fs_write_checkpoint(sbi, &cpc);
+   up_write(&sbi->gc_lock);
+
+   return err;
+}
+
+static void __checkpoint_and_complete_reqs(struct f2fs_sb_info *sbi)
+{
+   struct ckpt_req_control *cprc = &sbi->cprc_info;
+   struct ckpt_req *req, *next;
+   struct llist_node *dispatch_list;
+   u64 sum_diff = 0, diff, count = 0;
+   int ret;
+
+   dispatch_list = llist_del_all(&cprc->issue_list);
+   if (!dispatch_list)
+           return;
+   dispatch_list = llist_reverse_order(dispatch_list);
+
+   ret = __write_checkpoint_sync(sbi);
+   atomic_inc(&cprc->issued_ckpt);
+
+   llist_for_each_entry_safe(req, next, dispatch_list, llnode) {
+           atomic_dec(&cprc->queued_ckpt);
+           atomic_inc(&cprc->total_ckpt);
+           diff = (u64)ktime_ms_delta(ktime_get(), req->queue_time);
+           req->ret = ret;
+           complete(&req->wait);
+
+           sum_diff += diff;
+           count++;
+   }
+
+   spin_lock(&cprc->stat_lock);
+   cprc->cur_time = (unsigned int)div64_u64(sum_diff, count);
+   if (cprc->peak_time < cprc->cur_time)
+