On 2015/08/10 17:14, Vladimir Davydov wrote:
On Sun, Aug 09, 2015 at 11:12:25PM +0900, Kamezawa Hiroyuki wrote:
On 2015/08/08 22:05, Vladimir Davydov wrote:
On Fri, Aug 07, 2015 at 10:38:16AM +0900, Kamezawa Hiroyuki wrote:
...
All? Hmm. Mixing records of global memory pressure with records of local
memory pressure seems just wrong.
What makes you think so? An example of misbehavior caused by this would
be nice to have.
By design, memcg's LRU aging logic is independent from global memory
allocation/pressure.
Assume there are 4 containers (each using a lot of page cache) with a 1GB
limit each on a 4GB server:
# container A workingset=600M limit=1G (sleepy)
# container B workingset=300M limit=1G (work often)
# container C workingset=500M limit=1G (work slowly)
# container D workingset=1.2G limit=1G (work hard)
Container D can drive the zone's distance counter because of local memory
reclaim.
If active/inactive = 1:1, container D's pages can be activated.
When kswapd (global reclaim) runs, every container's LRU will rotate.
The possibility of refault in A, B, C is reduced by container D's counter
updates.
This does not necessarily mean we have to use different inactive_age
counter for global and local memory pressure. In your example, having
inactive_age per lruvec and using it for evictions on both global and
local memory pressure would work just fine.
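To make the four-container example concrete, here is a toy model in Python (not kernel code) that compares one shared counter with one counter per lruvec. The class `AgeCounter`, the function `refault_activates`, and all page counts are assumptions made up for illustration; only the rule "activate when the refault distance fits within the active list" follows the discussion.

```python
from dataclasses import dataclass


@dataclass
class AgeCounter:
    """Toy stand-in for an inactive_age counter (hypothetical name)."""
    value: int = 0

    def record_eviction(self) -> int:
        """Advance the counter by one eviction and return the snapshot
        that would be stored for the evicted page."""
        self.value += 1
        return self.value

    def distance_since(self, snapshot: int) -> int:
        """Evictions counted by this counter since the snapshot."""
        return self.value - snapshot


def refault_activates(shared_counter: bool,
                      evictions_by_d: int = 1000,
                      active_pages_a: int = 100) -> bool:
    """Return True if container A's refaulting page would be activated.

    shared_counter=True models one counter for the whole zone;
    False models one counter per lruvec. The numbers are arbitrary.
    """
    zone = AgeCounter()
    counter_a = zone if shared_counter else AgeCounter()
    counter_d = zone if shared_counter else AgeCounter()

    # Sleepy container A loses one page.
    snapshot = counter_a.record_eviction()

    # Hard-working container D thrashes against its own limit.
    for _ in range(evictions_by_d):
        counter_d.record_eviction()

    # A touches the page again: activate only if the distance fits
    # within A's active list.
    return counter_a.distance_since(snapshot) <= active_pages_a


if __name__ == "__main__":
    print("shared counter:    ", refault_activates(shared_counter=True))
    print("per-lruvec counter:", refault_activates(shared_counter=False))
```

With these made-up numbers, the shared counter absorbs D's evictions, so A's refault distance looks too large to activate the page, while the per-lruvec counter is unaffected by D.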
you're right.
if (current memcg == recorded memcg && eviction distance is okay)
        activate page.
else
        inactivate
At page-out
if (global memory pressure)
        record eviction id using zone's counter.
else if (memcg local memory pressure)
        record eviction id using memcg's counter.
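The pseudocode above can be sketched as a small Python toy model (not kernel code). The names `Shadow`, `Counters`, `page_out`, and `should_activate` are hypothetical, and the sketch assumes that the refault check reads the same counter that recorded the eviction, which the pseudocode does not state.

```python
from dataclasses import dataclass, field
from typing import Dict, NamedTuple


class Shadow(NamedTuple):
    """What is remembered about an evicted page (hypothetical layout)."""
    memcg: str          # memcg that owned the page at eviction
    from_global: bool   # True if evicted under global pressure
    snapshot: int       # counter value at eviction time


@dataclass
class Counters:
    zone: int = 0
    per_memcg: Dict[str, int] = field(default_factory=dict)


def page_out(c: Counters, memcg: str, global_pressure: bool) -> Shadow:
    """Record an eviction with the zone's or the memcg's counter."""
    if global_pressure:
        c.zone += 1
        return Shadow(memcg, True, c.zone)
    c.per_memcg[memcg] = c.per_memcg.get(memcg, 0) + 1
    return Shadow(memcg, False, c.per_memcg[memcg])


def should_activate(c: Counters, shadow: Shadow,
                    current_memcg: str, active_pages: int) -> bool:
    """Activate only if the memcg matches and the distance is okay."""
    if current_memcg != shadow.memcg:
        return False
    # Assumption: compare against the counter that took the snapshot.
    if shadow.from_global:
        now = c.zone
    else:
        now = c.per_memcg.get(shadow.memcg, 0)
    return now - shadow.snapshot <= active_pages
```

In this sketch, evictions of a memcg's pages under global pressure do not advance that memcg's local counter, so a locally recorded distance can look shorter than it really is when both kinds of pressure occur together.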
I don't understand how this is supposed to work when a memory cgroup
experiences both local and global pressure simultaneously.
I think that updating the global distance counter from local reclaim may
advance the counter too much.
But if the inactive_age counter was per lruvec, then we wouldn't need to
bother about it.
yes.
Anyway, what I understand now is that we need to reduce the influence of one
memcg's behavior on other memcgs. Your way is to divide the counter
completely; my idea was to implement a different counter. Doing it by
calculation would be good because we can't have enough record space.
Thanks,
-Kame