Michal Hocko wrote:
> On Wed 21-10-15 09:49:07, Christoph Lameter wrote:
> > On Wed, 21 Oct 2015, Michal Hocko wrote:
> >
> > > Because all the WQ workers are stuck somewhere, maybe in the memory
> > > allocation which cannot make any progress and the vmstat update work is
> > > queued behind them.

After the OOM killer is invoked, we can easily observe that vmstat_update
cannot be processed because the memory allocation in disk_events_workfn
stalls.

  http://lkml.kernel.org/r/201509120019.bji48986.oosvmjtolfq...@i-love.sakura.ne.jp

I was worried that blocking forever inside a work item amounts to exclusive
occupation of the workqueue. In fact, changing the allocation to GFP_ATOMIC
avoids this problem.

  http://lkml.kernel.org/r/201503012017.ead00571.hoojvostmfl...@i-love.sakura.ne.jp

Now we have realized that we are hitting this problem even before the OOM
killer is invoked. The situation is similar to the case after the OOM killer
is invoked: there are no reclaimable pages, but vmstat_update cannot be
processed. We are caught out by a small discrepancy in the vmstat counter
values.

> > >
> > > At least this is my current understanding.
> >
> > Eww. Maybe need a queue that does not do such evil things as memory
> > allocation?
>
> I am not sure how to achieve that. Requiring non-sleeping worker would
> work out but do we have enough users to add such an API?

If a queue does not need to sleep, can't that queue be processed from timer
context (e.g. mod_timer())?
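
The stall discussed above can be pictured with a toy user-space model. This
is only a sketch under stated assumptions, not kernel code: it assumes a
FIFO pool with a fixed number of workers, and the names Work, runs_via_pool,
runs_via_timer and "another_allocating_work" are hypothetical. A work item
with blocks=True stands for a handler stuck in an allocation that cannot make
progress. The timer-style path needs no free worker, but is only usable for
handlers that never sleep.

```python
from collections import namedtuple

# Hypothetical work item: `blocks` models a handler stuck in a memory
# allocation that can never make progress.
Work = namedtuple("Work", ["name", "blocks"])


def runs_via_pool(queue, target, nr_workers):
    """Return True if `target` gets a worker in a FIFO pool of fixed size."""
    stuck = 0
    for item in queue:
        if stuck == nr_workers:
            return False  # every worker is stuck; nothing queued behind runs
        if item.name == target:
            return True
        if item.blocks:
            stuck += 1  # this worker never returns to the pool
    return False


def runs_via_timer(queue, target):
    """Timer-style dispatch: needs no worker, but the handler must not sleep."""
    return any(item.name == target and not item.blocks for item in queue)


queue = [
    Work("disk_events_workfn", blocks=True),
    Work("another_allocating_work", blocks=True),  # hypothetical name
    Work("vmstat_update", blocks=False),
]

print(runs_via_pool(queue, "vmstat_update", nr_workers=2))  # False
print(runs_via_timer(queue, "vmstat_update"))                # True
```

In this model, vmstat_update is starved as soon as the number of stuck items
ahead of it reaches the pool size, while the timer-style path is unaffected
because it does not wait for a worker.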