Hi Mel,

On Wed, Jul 20, 2016 at 04:21:46PM +0100, Mel Gorman wrote:
> Both Joonsoo Kim and Minchan Kim have reported premature OOM kills on
> a 32-bit platform. The common element is a zone-constrained high-order
> allocation failing. Two factors appear to be at fault -- pgdat being

Strictly speaking, my case is an order-0 allocation failing, not a
high-order one. ;)

> considered unreclaimable prematurely and insufficient rotation of the
> active list.
> 
> Unfortunately to date I have been unable to reproduce this with a variety
> of stress workloads on a 2G 32-bit KVM instance. It's not clear why as
> the steps are similar to what was described. It means I've been unable to
> determine if this series addresses the problem or not. I'm hoping they can
> test and report back before these are merged to mmotm. What I have checked
> is that a basic parallel DD workload completed successfully on the same
> machine I used for the node-lru performance tests. I'll leave the other
> tests running just in case anything interesting falls out.
> 
> The series is in three basic parts;
> 
> Patch 1 does not account for skipped pages as scanned. This avoids the pgdat
>       being prematurely marked unreclaimable
> 
> Patches 2-4 add per-zone stats back in. The actual stats patch is different
>       to Minchan's as the original patch did not account for unevictable
>       LRU which would corrupt counters. The second two patches remove
>       approximations based on pgdat statistics. It's effectively a
>       revert of "mm, vmstat: remove zone and node double accounting by
>       approximating retries" but different LRU stats are used. This
>       is better than a full revert or a reworking of the series as
>       it preserves history of why the zone stats are necessary.
> 
>       If this work out, we may have to leave the double accounting in
>       place for now until an alternative cheap solution presents itself.
> 
> Patch 5 rotates inactive/active lists for lowmem allocations. This is also
>       quite different to Minchan's patch as the original patch did not
>       account for memcg and would rotate if *any* eligible zone needed
>       rotation which may rotate excessively. The new patch considers
>       the ratio for all eligible zones which is more in line with
>       node-lru in general.
> 

I have now tested the series and confirmed it works for me from the OOM
point of view. IOW, I no longer see any OOM kills. But note that I tested
it without [1/5], which has a problem I mentioned in that thread.

If you want to merge [1/5], please resend an updated version, but
I doubt we need it at this moment.
