On Fri, Mar 20, 2026 at 04:42:38PM -0400, Waiman Long wrote:
> It was found that some of the tests in test_memcontrol can fail more
> readily if system page size is larger than 4k. It is because the
> actual memory.current value deviates more from the expected value with
> larger page size. This is likely due to the fact there may be up to
> MEMCG_CHARGE_BATCH pages of charge hidden in each one of the percpu
> memcg_stock.
>
> To avoid this failure, the error tolerance is now increased in accordance
> to the current system page size value. The page size scale factor is
> set to 2 for 64k page and 1 for 16k page.
>
> Changes are made in alloc_pagecache_max_30M(), test_memcg_protection()
> and alloc_anon_50M_check_swap() to increase the error tolerance for
> memory.current for larger page size. The current set of values are
> chosen to ensure that the relevant test_memcontrol tests no longer
> have any test failure in a 100 repeated run of test_memcontrol with a
> 4k/16k/64k page size kernels on an arm64 system.
>
> Signed-off-by: Waiman Long <[email protected]>
> ---
> .../cgroup/lib/include/cgroup_util.h | 3 ++-
> .../selftests/cgroup/test_memcontrol.c | 23 ++++++++++++++-----
> 2 files changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
> index 77f386dab5e8..2293e770e9b4 100644
> --- a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
> +++ b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
> @@ -6,7 +6,8 @@
> #define PAGE_SIZE 4096
> #endif
>
> -#define MB(x) (x << 20)
> +#define KB(x) ((x) << 10)
> +#define MB(x) ((x) << 20)
>
> #define USEC_PER_SEC 1000000L
> #define NSEC_PER_SEC 1000000000L
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index babbfad10aaf..c078fc458def 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -26,6 +26,7 @@
> static bool has_localevents;
> static bool has_recursiveprot;
> static int page_size;
> +static int pscale_factor; /* Page size scale factor */
>
> int get_temp_fd(void)
> {
> @@ -571,16 +572,17 @@ static int test_memcg_protection(const char *root, bool min)
> if (cg_run(parent[2], alloc_anon, (void *)MB(148)))
> goto cleanup;
>
> - if (!values_close(cg_read_long(parent[1], "memory.current"), MB(50), 3))
> + if (!values_close(cg_read_long(parent[1], "memory.current"), MB(50),
> + 3 + (min ? 0 : 4) * pscale_factor))
> goto cleanup;
>
> for (i = 0; i < ARRAY_SIZE(children); i++)
> c[i] = cg_read_long(children[i], "memory.current");
>
> - if (!values_close(c[0], MB(29), 15))
> + if (!values_close(c[0], MB(29), 15 + 3 * pscale_factor))
> goto cleanup;
>
> - if (!values_close(c[1], MB(21), 20))
> + if (!values_close(c[1], MB(21), 20 + pscale_factor))
> goto cleanup;
>
> if (c[3] != 0)
> @@ -596,7 +598,8 @@ static int test_memcg_protection(const char *root, bool min)
> }
>
> current = min ? MB(50) : MB(30);
> - if (!values_close(cg_read_long(parent[1], "memory.current"), current, 3))
> + if (!values_close(cg_read_long(parent[1], "memory.current"), current,
> + 9 + (min ? 0 : 6) * pscale_factor))
> goto cleanup;
>
> if (!reclaim_until(children[0], MB(10)))
> @@ -684,7 +687,7 @@ static int alloc_pagecache_max_30M(const char *cgroup, void *arg)
> goto cleanup;
>
> current = cg_read_long(cgroup, "memory.current");
> - if (!values_close(current, MB(30), 5))
> + if (!values_close(current, MB(30), 5 + (pscale_factor ? 2 : 0)))
> goto cleanup;
>
> ret = 0;
> @@ -1004,7 +1007,7 @@ static int alloc_anon_50M_check_swap(const char *cgroup, void *arg)
> *ptr = 0;
>
> mem_current = cg_read_long(cgroup, "memory.current");
> - if (!mem_current || !values_close(mem_current, mem_max, 3))
> + if (!mem_current || !values_close(mem_current, mem_max, 6 + pscale_factor))
> goto cleanup;
>
> swap_current = cg_read_long(cgroup, "memory.swap.current");
> @@ -1684,6 +1687,14 @@ int main(int argc, char **argv)
> if (page_size <= 0)
> page_size = PAGE_SIZE;
>
> + /*
> + * It is found that the actual memory.current value can deviate more
> + * from the expected value with larger page size. So error tolerance
> + * will have to be increased a bit more for larger page size.
> + */
> + if (page_size > KB(4))
> + pscale_factor = (page_size >= KB(64)) ? 2 : 1;
This is a good improvement, but I still think the pscale_factor adjustments
are a bit fragile: each call site needs its own hand-tuned formula, and only
three page sizes (4K/16K/64K) are handled. If a new page size shows up,
every call site needs revisiting.
How about centralizing the page size adjustment inside values_close()
itself? Something like:
static inline int values_close(long a, long b, int err)
{
	ssize_t page_adjusted_err = ffs(page_size >> 13) + err;

	return 100 * labs(a - b) <= (a + b) * page_adjusted_err;
}
This adds one extra percent of tolerance per doubling above 4K (ffs()
returns 0 for 4K, 2 for 16K, 4 for 64K), scales continuously for any
power-of-two page size, and also fixes an integer truncation issue in the
original: (a + b) / 100 * err truncates the division before multiplying,
so the tolerance collapses to zero whenever (a + b) < 100.
With this, the callers wouldn't need any changes at all.
This approach is inspired by LTP:
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/controllers/memcg/memcontrol_common.h#L27
--
Regards,
Li Wang