On Wed 23-01-19 12:24:38, Yang Shi wrote:
> 
> On 1/23/19 1:59 AM, Michal Hocko wrote:
> > On Wed 23-01-19 04:09:42, Yang Shi wrote:
> > > In the current implementation, both kswapd and direct reclaim have to
> > > iterate all mem cgroups. This was not a problem before offline mem
> > > cgroups were iterated as well. But now, with offline mem cgroups
> > > included in the iteration, it can be very time consuming. In our
> > > workloads, we saw over 400K mem cgroups accumulated in some cases, of
> > > which only a few hundred were online. Although kswapd could help
> > > reduce the number of memcgs, direct reclaim still gets hit by
> > > iterating a large number of offline memcgs in some cases. We
> > > experienced responsiveness problems due to this occasionally.
> > 
> > Can you provide some numbers?
> 
> What numbers do you mean? How long did it take to iterate all the memcgs?
> For now I don't have the exact number for the production environment, but
> the unresponsiveness is visible.

Yeah, I would be interested in the worst case direct reclaim latencies.
You can get that from our vmscan tracepoints quite easily.

> I have some test numbers from triggering direct reclaim with 8k memcgs
> artificially, each with just one clean page charged, so the reclaim is
> cheaper than in a real production environment.
> 
> perf shows it took around 220ms to iterate 8k memcgs:
> 
>            dd 13873 [011]   578.542919: vmscan:mm_vmscan_direct_reclaim_begin
>            dd 13873 [011]   578.758689: vmscan:mm_vmscan_direct_reclaim_end
> 
> So, iterating 400K would take at least 11s in this artificial case. The
> production environment is much more complicated, so it would take much
> longer in fact.

Having real world numbers would definitely help with the justification.

> > > Here, just break the iteration once it reclaims enough pages, as
> > > memcg direct reclaim does. This may hurt the fairness among memcgs
> > > since direct reclaim may always reclaim from the same memcgs. But it
> > > sounds ok since direct reclaim just tries to reclaim SWAP_CLUSTER_MAX
> > > pages and memcgs can be protected by min/low.
> > 
> > OK, this makes some sense to me. The purpose of the direct reclaim is
> > to reclaim some memory and throttle the allocation pace. The iterator is
> > cached so the next reclaimer on the same hierarchy will simply continue
> > so the fairness should be more or less achieved.
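For reference, the idea would look roughly like this in shrink_node()'s
memcg walk (an untested sketch of the approach, not the exact patch):

	memcg = mem_cgroup_iter(root, NULL, &reclaim);
	do {
		/* ... shrink this memcg's LRUs and slab ... */

		/*
		 * Global direct reclaim only wants SWAP_CLUSTER_MAX pages,
		 * so bail out of the hierarchy walk once enough has been
		 * reclaimed. The position is cached in the reclaim cookie,
		 * so the next reclaimer continues where this one stopped
		 * and fairness is more or less preserved over time. kswapd
		 * keeps iterating everything to help reap offline memcgs.
		 */
		if (!current_is_kswapd() &&
		    sc->nr_reclaimed >= sc->nr_to_reclaim) {
			mem_cgroup_iter_break(root, memcg);
			break;
		}
	} while ((memcg = mem_cgroup_iter(root, memcg, &reclaim)));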
> Yes, you are right. I missed this point.
> 
> > Btw. is there any reason to keep the !global_reclaim() check in place?
> > Why is it not sufficient to exclude kswapd?
> 
> Iterating all memcgs in kswapd is still useful to help to reduce those
> zombie memcgs.

Yes, but for that you do not need to check for global_reclaim(), right?
-- 
Michal Hocko
SUSE Labs