Hello,

On Thu, May 12, 2016 at 09:11:33AM +0800, Miao Xie wrote:
> >My box has 48 cores and 188GB memory, but I set
> >vm.dirty_background_bytes = 268435456
> >vm.dirty_bytes = 536870912
> >
> >if I set vm.dirty_background_bytes and vm.dirty_bytes to be a large 
> >number(vm.dirty_background_bytes = 3GB,
> >vm.dirty_bytes = 4GB), then fio thoughput would be more than 1500MB/s. and 
> >then if I reset them to the original
> >value(the above ones), the thoughout would be down to 500MB/s.
> >
> >And according my debug, I found fio sleeped for 1ms every time we dirty a 
> >page(balance dirty pages) when
> >the thoughput was down to 4MB/s, it might be a bug of dirty throttle when we 
> >open write back cgroup, I think.

Heh, so, for cgroups, the absolute byte limits can't applied directly
and converted to percentage value before being applied.  You're
specifying 0.27% for threshold.  Unfortunately, the ratio is
translated into a percentage number and 0.27% becomes 0, so your
cgroups are always over limit and being throttled.

Can you please see whether the following patch fixes the issue?

Thanks.

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 999792d..a455a21 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -369,8 +369,9 @@ static void domain_dirty_limits(struct 
dirty_throttle_control *dtc)
        struct dirty_throttle_control *gdtc = mdtc_gdtc(dtc);
        unsigned long bytes = vm_dirty_bytes;
        unsigned long bg_bytes = dirty_background_bytes;
-       unsigned long ratio = vm_dirty_ratio;
-       unsigned long bg_ratio = dirty_background_ratio;
+       /* convert ratios to per-PAGE_SIZE for higher precision */
+       unsigned long ratio = (vm_dirty_ratio * PAGE_SIZE) / 100;
+       unsigned long bg_ratio = (dirty_background_ratio * PAGE_SIZE) / 100;
        unsigned long thresh;
        unsigned long bg_thresh;
        struct task_struct *tsk;
@@ -382,26 +383,28 @@ static void domain_dirty_limits(struct 
dirty_throttle_control *dtc)
                /*
                 * The byte settings can't be applied directly to memcg
                 * domains.  Convert them to ratios by scaling against
-                * globally available memory.
+                * globally available memory.  As the ratios are in
+                * per-PAGE_SIZE, they can be obtained by dividing bytes by
+                * pages.
                 */
                if (bytes)
-                       ratio = min(DIV_ROUND_UP(bytes, PAGE_SIZE) * 100 /
-                                   global_avail, 100UL);
+                       ratio = min(DIV_ROUND_UP(bytes, global_avail),
+                                   PAGE_SIZE);
                if (bg_bytes)
-                       bg_ratio = min(DIV_ROUND_UP(bg_bytes, PAGE_SIZE) * 100 /
-                                      global_avail, 100UL);
+                       bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
+                                      PAGE_SIZE);
                bytes = bg_bytes = 0;
        }
 
        if (bytes)
                thresh = DIV_ROUND_UP(bytes, PAGE_SIZE);
        else
-               thresh = (ratio * available_memory) / 100;
+               thresh = (ratio * available_memory) / PAGE_SIZE;
 
        if (bg_bytes)
                bg_thresh = DIV_ROUND_UP(bg_bytes, PAGE_SIZE);
        else
-               bg_thresh = (bg_ratio * available_memory) / 100;
+               bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
 
        if (bg_thresh >= thresh)
                bg_thresh = thresh / 2;

Reply via email to