On Fri, Mar 20, 2026 at 04:42:36PM -0400, Waiman Long wrote:
> For a system with 4k page size, each percpu memcg_stock can hide up
> to 256 kbytes of memory with the current MEMCG_CHARGE_BATCH value of
> 64. For another system with 64k page size, that becomes 4 Mbytes. These
> hidden charges affect the accuracy of the memory.current value.
> 
> This MEMCG_CHARGE_BATCH value also controls how often the memcg
> vmstat values need flushing. As a result, the values reported
> in memory.stat cgroup control files are less indicative of the actual
> memory consumption of a particular memory cgroup when the page size
> increases from 4k.
> 
> This problem can be illustrated by running the test_memcontrol
> selftest. Running a 4k page size kernel on a 128-core arm64 system,
> the test_memcg_current_peak test, which allocates 50M of anonymous memory,
> passed. With a 64k page size kernel on the same system, however, the
> same test failed because the "anon" attribute of memory.stat file might
> report a size of 0 depending on the number of CPUs the system has.
> 
> To solve this inaccurate memory stats problem, we need to scale down
> the amount of memory that can be hidden by reducing MEMCG_CHARGE_BATCH
> when the page size increases. The same user application will likely
> consume more memory on systems with larger page size and it is also
> less efficient if we scale down MEMCG_CHARGE_BATCH by too much.  So I
> believe a good compromise is to scale down MEMCG_CHARGE_BATCH by 2 for
> 16k page size and by 4 with 64k page size.
> 
> With that change, the test_memcg_current_peak test passed again with
> the modified 64k page size kernel.
> 
> Signed-off-by: Waiman Long <[email protected]>
> ---
>  include/linux/memcontrol.h | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 70b685a85bf4..748cfd75d998 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -328,8 +328,14 @@ struct mem_cgroup {
>   * size of first charge trial.
>   * TODO: maybe necessary to use big numbers in big irons or dynamic based of the
>   * workload.
> + *
> + * There are 3 common base page sizes - 4k, 16k & 64k. In order to limit the
> + * amount of memory that can be hidden in each percpu memcg_stock for a given
> + * memcg, we scale down MEMCG_CHARGE_BATCH by 2 for 16k and 4 for 64k.
>   */
> -#define MEMCG_CHARGE_BATCH 64U
> +#define MEMCG_CHARGE_BATCH_BASE  64U
> +#define MEMCG_CHARGE_BATCH_SHIFT ((PAGE_SHIFT <= 16) ? (PAGE_SHIFT - 12)/2 : 2)
> +#define MEMCG_CHARGE_BATCH    (MEMCG_CHARGE_BATCH_BASE >> MEMCG_CHARGE_BATCH_SHIFT)
>  
>  extern struct mem_cgroup *root_mem_cgroup;
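
For reference, a small userspace sketch of what the new macros evaluate to
for the three page sizes named in the comment. It is not kernel code: the
helper names (charge_batch_shift, charge_batch, stock_kbytes,
print_batch_table) are made up for illustration, and PAGE_SHIFT is passed
in as a parameter instead of being a build-time constant. Only the shift
expression is copied from the patch.

```c
#include <assert.h>
#include <stdio.h>

#define MEMCG_CHARGE_BATCH_BASE 64U

/* Same expression as the patch; page_shift stands in for PAGE_SHIFT. */
static unsigned int charge_batch_shift(unsigned int page_shift)
{
	/* Illustration only: assumes 4k or larger base pages. */
	assert(page_shift >= 12);
	return (page_shift <= 16) ? (page_shift - 12) / 2 : 2;
}

/* Resulting MEMCG_CHARGE_BATCH, in pages. */
static unsigned int charge_batch(unsigned int page_shift)
{
	return MEMCG_CHARGE_BATCH_BASE >> charge_batch_shift(page_shift);
}

/* Upper bound on what one percpu stock can hold, in KiB. */
static unsigned long stock_kbytes(unsigned int page_shift)
{
	return ((unsigned long)charge_batch(page_shift) << page_shift) >> 10;
}

/* Hypothetical helper: print the table for 4k, 16k and 64k pages. */
static void print_batch_table(void)
{
	static const unsigned int shifts[] = { 12, 14, 16 };
	unsigned int i;

	for (i = 0; i < sizeof(shifts) / sizeof(shifts[0]); i++)
		printf("PAGE_SHIFT=%u batch=%u pages, up to %lu KiB per CPU\n",
		       shifts[i], charge_batch(shifts[i]),
		       stock_kbytes(shifts[i]));
}
```

Under these assumptions the batch comes out as 64, 32 and 16 pages for 4k,
16k and 64k pages, so the per-CPU bound grows from 256 KiB to 512 KiB and
1 MiB, compared with 4 MiB for 64k pages with the unscaled value of 64.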

Reviewed-by: Li Wang <[email protected]>

-- 
Regards,
Li Wang

