On Fri, Mar 20, 2026 at 04:42:41PM -0400, Waiman Long wrote:
> Although there is supposed to be a periodic and asynchronous flush of
> stats every 2 seconds, the actual time lag between successive runs can
> actually vary quite a bit. In fact, I have seen time lag of up to 10s
> of seconds in some cases.
> 
> At the end of test_memcg_sock, it waits up to 3 seconds for the
> "sock" attribute of memory.stat to go back down to 0. Obviously it
> may occasionally fail especially when the kernel has large page size
> (e.g. 64k). Treat this failure as an expected failure (XFAIL) to
> distinguish it from the other failure cases.
> 
> Signed-off-by: Waiman Long <[email protected]>
> ---
>  tools/testing/selftests/cgroup/test_memcontrol.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index 5336be5ed2f5..af3e8fe4e50e 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -1486,12 +1486,21 @@ static int test_memcg_sock(const char *root)
>        * Poll memory.stat for up to 3 seconds (~FLUSH_TIME plus some
>        * scheduling slack) and require that the "sock " counter
>        * eventually drops to zero.
> +      *
> +      * The actual run-to-run elapse time between consecutive run
> +      * of asynchronous memcg rstat flush may varies quite a bit.
> +      * So the 3 seconds wait time may not be enough for the "sock"
> +      * counter to go down to 0. Treat it as a XFAIL instead of
> +      * a FAIL.
>        */
>       sock_post = cg_read_key_long_poll(memcg, "memory.stat", "sock ", 0,
>                                        MEMCG_SOCKSTAT_WAIT_RETRIES,
>                                        DEFAULT_WAIT_INTERVAL_US);
> -     if (sock_post)
> +     if (sock_post) {
> +             if (sock_post > 0)
> +                     ret = KSFT_XFAIL;

XFAIL means "expected failure" and is intended for known kernel bugs or
unsupported features. A timing issue where the test simply doesn't wait
long enough is probably not an expected failure; it's a test that needs
a longer timeout.

I wonder whether we can just enlarge the MEMCG_SOCKSTAT_WAIT_RETRIES
value, e.g. from 30 to 150?


-- 
Regards,
Li Wang

