On 9/19/19 11:49, Matthew Wilcox wrote:
On Thu, Sep 19, 2019 at 10:33:10AM +0800, Lin Feng wrote:
On 9/18/19 20:33, Michal Hocko wrote:
I absolutely agree here. From you changelog it is also not clear what is
the underlying problem. Both congestion_wait and wait_iff_congested
should wake up early if the congestion is handled. Is this not the case?

For now I don't know why, codes seem should work as you said, maybe I need to
trace more of the internals.
But weird thing is that once I set the people-disliked-tunable iowait
drop down instantly, this is contradictory to the code design.

Yes, this is quite strange.  If setting a smaller timeout makes a
difference, that indicates we're not waking up soon enough.  I see
two possibilities; one is that a wakeup is missing somewhere -- ie the
conditions under which we call clear_wb_congested() are wrong.  Or we
need to wake up sooner.

Umm.  We have clear_wb_congested() called from exactly one spot --
clear_bdi_congested().  That is only called from:

drivers/block/pktcdvd.c
fs/ceph/addr.c
fs/fuse/control.c
fs/fuse/dev.c
fs/nfs/write.c

Jens, is something supposed to be calling clear_bdi_congested() in the
block layer?  blk_clear_congested() used to exist until October 29th
last year.  Or is something else supposed to be waking up tasks that
are sleeping on congestion?


IIUC it looks like after commit a1ce35fa49852db60fc6e268038530be533c5b15,
besides those *.c places as you mentioned above, vmscan codes will always
wait as long as 100ms and nobody wakes them up.

here:
1964         while (unlikely(too_many_isolated(pgdat, file, sc))) {
1965                 if (stalled)
1966                         return 0;
1967
1968                 /* wait a bit for the reclaimer. */
>1969                 msleep(100);
1970                 stalled = true;
1971
1972                 /* We are about to die and free our memory. Return now. */
1973                 if (fatal_signal_pending(current))
1974                         return SWAP_CLUSTER_MAX;
1975         }

and here:
2784                         /*
2785                          * If kswapd scans pages marked marked for 
immediate
2786                          * reclaim and under writeback (nr_immediate), it
2787                          * implies that pages are cycling through the LRU
2788                          * faster than they are written so also forcibly 
stall.
2789                          */
2790                         if (sc->nr.immediate)
>2791                                 congestion_wait(BLK_RW_ASYNC, HZ/10);
2792                 }

except here, codes where set_bdi_congested will clear_bdi_congested at proper 
time,
exactly the source files you mentioned above, so it's OK.
2808                 if (!sc->hibernation_mode && !current_is_kswapd() &&
2809                    current_may_throttle() && pgdat_memcg_congested(pgdat, 
root))
2810                         wait_iff_congested(BLK_RW_ASYNC, HZ/10);


Reply via email to