Re: [PATCH v3 1/1] selftests/mm/uffd: Refactor non-composite global vars into struct

2025-05-30 Thread Ujwal Kundur
> Sorry I don't have an opinion on which of these is the best (I can try
> to find some time to form an opionion on this later!), but:
>
> Fixing the flakiness sounds great, but I would suggest decoupling that
> from the refactoring. If it's practical, focus on removing the globals
> first, while leaving the fundamental logic the same, even if it's bad.
> Then as a separate series, fix the logic.

Thanks, much appreciated. I'll send a patch with the refactoring + a comment
regarding this variable left-in and follow-up with a fix.



Re: [PATCH v3 1/1] selftests/mm/uffd: Refactor non-composite global vars into struct

2025-05-26 Thread Brendan Jackman
On Sun May 25, 2025 at 7:19 PM UTC, Ujwal Kundur wrote:
>> I'm afraid I'm too ignorant of this code to be able to suggest something
>> good here. But, can we just remove the comment and plumb the gopts
>> through to uffd_poll_thread()->uffd_handle_page_fault()->__copy_page()?
>>
>> This is not pretty but it lets us remove the global vars which is
>> clearly a step in the right direction.
>
> Perhaps Andrew can weigh in? If I understood this correctly, we're
> trying to assert that retrying a successful UFFDIO_COPY operation
> always results in EEXIST. This is being done in a somewhat racy
> fashion where a flag (test_uffdio_copy_eexist) is set every 10 seconds
> using alarm(2). IMO this is a flaky test, we should either:
> - remove this variable and associated logic entirely (preferred)
> - use a probability function to set this a % of the time instead of
> every 10 seconds
> - use an async library that can replace the implementation without the
> use of global vars

Sorry I don't have an opinion on which of these is the best (I can try
to find some time to form an opionion on this later!), but:

Fixing the flakiness sounds great, but I would suggest decoupling that
from the refactoring. If it's practical, focus on removing the globals
first, while leaving the fundamental logic the same, even if it's bad.
Then as a separate series, fix the logic. 



Re: [PATCH v3 1/1] selftests/mm/uffd: Refactor non-composite global vars into struct

2025-05-25 Thread Ujwal Kundur
> Sounds like that's your issue - for the kernel, tab is supposed to be
> as wide as 8 spaces, not 4.

That fixed it, thanks! I've gone through the diff and made sure there
are no longer any inconsistent indents.

> I'm afraid I'm too ignorant of this code to be able to suggest something
> good here. But, can we just remove the comment and plumb the gopts
> through to uffd_poll_thread()->uffd_handle_page_fault()->__copy_page()?
>
> This is not pretty but it lets us remove the global vars which is
> clearly a step in the right direction.

Perhaps Andrew can weigh in? If I understood this correctly, we're
trying to assert that retrying a successful UFFDIO_COPY operation
always results in EEXIST. This is being done in a somewhat racy
fashion where a flag (test_uffdio_copy_eexist) is set every 10 seconds
using alarm(2). IMO this is a flaky test, we should either:
- remove this variable and associated logic entirely (preferred)
- use a probability function to set this a % of the time instead of
every 10 seconds
- use an async library that can replace the implementation without the
use of global vars



Re: [PATCH v3 1/1] selftests/mm/uffd: Refactor non-composite global vars into struct

2025-05-20 Thread Brendan Jackman
On Mon May 19, 2025 at 1:50 PM UTC, Ujwal Kundur wrote:
> Thanks for the review and testing!
>
>>> -static void retry_copy_page(int ufd, struct uffdio_copy *uffdio_copy,
>>> - unsigned long offset)
>>> +static void retry_copy_page(uffd_global_test_opts_t *gopts, struct 
>>> uffdio_copy *uffdio_copy,
>>> + unsigned long offset)
>>>  {
>>> - uffd_test_ops->alias_mapping(&uffdio_copy->dst,
>>> -  uffdio_copy->len,
>>> -  offset);
>>> - if (ioctl(ufd, UFFDIO_COPY, uffdio_copy)) {
>>> + uffd_test_ops->alias_mapping(gopts,
>>> + 
>>> &uffdio_copy->dst,
>>> + 
>>> uffdio_copy->len,
>>> + offset);
>
>> Looks like your editor got a bit excited here :D
>>
>> There are a few other places where this happened too, e.g. the
>> are_count() declaration and there's a pthread_create_call() that's quite
>> messed up.
>
> I use vim with `:set list` turned on to display control characters and
> it looked fine to me when I submitted the patch :(
> Here's the output of $ cat -A uffd-common.c | grep -A 15 retry_copy_page:
> (where ^I represents a tab equivalent to 4 spaces)

Sounds like that's your issue - for the kernel, tab is supposed to be
as wide as 8 spaces, not 4.

>> Unfortunately I don't know of any tool that can find/fix these issues
>> automatically without also messing up the whole file. Could you just
>> do a visual skim and fix what you can spot?
>
> Yeah, I ran clang-format and it's playing by its own rules -- do we
> have a standard .clang-format file?
> Please let me know if there's a better way to find/fix whitespace
> formatting issues, I found a few inconsistencies which I will fix.

There is a .clang-format in the tree. The issue is that (AFAIK) there's
no way to retrofit clang-format really, it has to be applied to the
entire source file so if there are pre-existing deviations from its
configuration then it will "fix" those too, creating a bunch of diff
noise.

I think we just have to do it manually. checkpatch.pl can help with some
formatting issues (and it is diff-aware) but I don't think it knows
anything about this kind of hanging indentation stuff.

>
>>  static void sigalrm(int sig)
>>  {
>>   if (sig != SIGALRM)
>>   abort();
>> - test_uffdio_copy_eexist = true;
>> + // TODO: Set this without access to global vars
>> + // gopts->test_uffdio_copy_eexist = true;
>>
>> Did you mean to leave this like that?
>
> Nice catch! I was hoping someone would suggest a better way to handle
> this. As far as I can tell, this was written the way it was as a
> consequence of using global variables.
> I think this sets up retries for copies in a racy way using alarm(2) /
> signal(2), not sure how effective that has proven to be. I believe the
> only way to keep this functionality would be to introduce some async
> action (libev, libuv, etc.), but is that required?

I'm afraid I'm too ignorant of this code to be able to suggest something
good here. But, can we just remove the comment and plumb the gopts
through to uffd_poll_thread()->uffd_handle_page_fault()->__copy_page()?

This is not pretty but it lets us remove the global vars which is
clearly a step in the right direciton.



Re: [PATCH v3 1/1] selftests/mm/uffd: Refactor non-composite global vars into struct

2025-05-19 Thread Andrew Morton
On Mon, 19 May 2025 19:20:31 +0530 Ujwal Kundur  wrote:

> I'm not sure about the protocol here, should I roll a PATCH v4 or a
> new patch entirely?

Those two are the same thing.

I dropped the v3 patch, my inbox eagerly awaits v4, thanks.



Re: [PATCH v3 1/1] selftests/mm/uffd: Refactor non-composite global vars into struct

2025-05-19 Thread Ujwal Kundur
Thanks for the review and testing!

>> -static void retry_copy_page(int ufd, struct uffdio_copy *uffdio_copy,
>> - unsigned long offset)
>> +static void retry_copy_page(uffd_global_test_opts_t *gopts, struct 
>> uffdio_copy *uffdio_copy,
>> + unsigned long offset)
>>  {
>> - uffd_test_ops->alias_mapping(&uffdio_copy->dst,
>> -  uffdio_copy->len,
>> -  offset);
>> - if (ioctl(ufd, UFFDIO_COPY, uffdio_copy)) {
>> + uffd_test_ops->alias_mapping(gopts,
>> + 
>> &uffdio_copy->dst,
>> + 
>> uffdio_copy->len,
>> + offset);

> Looks like your editor got a bit excited here :D
>
> There are a few other places where this happened too, e.g. the
> are_count() declaration and there's a pthread_create_call() that's quite
> messed up.

I use vim with `:set list` turned on to display control characters and
it looked fine to me when I submitted the patch :(
Here's the output of $ cat -A uffd-common.c | grep -A 15 retry_copy_page:
(where ^I represents a tab equivalent to 4 spaces)
static void retry_copy_page(uffd_global_test_opts_t *gopts, struct
uffdio_copy *uffdio_copy,$
^I^I^I^I^I^I^Iunsigned long offset)$
{$
^Iuffd_test_ops->alias_mapping(gopts,$
^I^I^I^I^I^I^I^I&uffdio_copy->dst,$
^I^I^I^I^I^I^I^Iuffdio_copy->len,$
^I^I^I^I^I^I^I^Ioffset);$
^Iif (ioctl(gopts->uffd, UFFDIO_COPY, uffdio_copy)) {$
^I^I/* real retval in ufdio_copy.copy */$
^I^Iif (uffdio_copy->copy != -EEXIST)$
^I^I^Ierr("UFFDIO_COPY retry error: %"PRId64,$
^I^I^I(int64_t)uffdio_copy->copy);$
^I} else {$
^I^Ierr("UFFDIO_COPY retry unexpected: %"PRId64,$
^I^I(int64_t)uffdio_copy->copy);$
^I}$

I checked this against master:
$ cat -A uffd-common.c | grep -A 15 retry_copy_page
static void retry_copy_page(int ufd, struct uffdio_copy *uffdio_copy,$
^I^I^Iunsigned long offset)$
{$
^Iuffd_test_ops->alias_mapping(&uffdio_copy->dst,$
^I^I^I^I uffdio_copy->len,$
^I^I^I^I offset);$
^Iif (ioctl(ufd, UFFDIO_COPY, uffdio_copy)) {$
^I^I/* real retval in ufdio_copy.copy */$
^I^Iif (uffdio_copy->copy != -EEXIST)$
^I^I^Ierr("UFFDIO_COPY retry error: %"PRId64,$
^I^I^I(int64_t)uffdio_copy->copy);$
^I} else {$
^I^Ierr("UFFDIO_COPY retry unexpected: %"PRId64,$
^I^I(int64_t)uffdio_copy->copy);$
^I}$
}$

and there seem to be spaces mixed in earlier causing the formatting
issues. It looks okay to me after I pulled in the patch to my local
repo.

> Unfortunately I don't know of any tool that can find/fix these issues
> automatically without also messing up the whole file. Could you just
> do a visual skim and fix what you can spot?

Yeah, I ran clang-format and it's playing by its own rules -- do we
have a standard .clang-format file?
Please let me know if there's a better way to find/fix whitespace
formatting issues, I found a few inconsistencies which I will fix.

>  static void sigalrm(int sig)
>  {
>   if (sig != SIGALRM)
>   abort();
> - test_uffdio_copy_eexist = true;
> + // TODO: Set this without access to global vars
> + // gopts->test_uffdio_copy_eexist = true;
>
> Did you mean to leave this like that?

Nice catch! I was hoping someone would suggest a better way to handle
this. As far as I can tell, this was written the way it was as a
consequence of using global variables.
I think this sets up retries for copies in a racy way using alarm(2) /
signal(2), not sure how effective that has proven to be. I believe the
only way to keep this functionality would be to introduce some async
action (libev, libuv, etc.), but is that required?

> + /* TODO: remove this global var.. it's so ugly */
>
> That's done :)

Oh I misunderstood that as "remove nr_parallel" which is not the case, will fix.

> @@ -1734,6 +1737,27 @@ int main(int argc, char *argv[])
>   }
>   for (j = 0; j < n_mems; j++) {
>   mem_type = &mem_types[j];
> +
> + // Initialize global test options
>
> Wrong comment style here

Will fix

I'm not sure about the protocol here, should I roll a PATCH v4 or a
new patch entirely?
Planning on addressing another TODO for dynamically setting the
BASE_PMD_ADDR, I can roll the fixes as part of that patch if required.



Re: [PATCH v3 1/1] selftests/mm/uffd: Refactor non-composite global vars into struct

2025-05-13 Thread Brendan Jackman
On Sat May 10, 2025 at 4:03 PM UTC, Ujwal Kundur wrote:
> Refactor macros and non-composite global variable definitions into a
> struct that is defined at the start of a test and is passed around
> instead of relying on global vars.
>
> Signed-off-by: Ujwal Kundur 

Tested-by: Brendan Jackman https://github.com/bjackman/linux/blob/github-base/.github/scripts/run_local.sh

> ---
>  Changes since v2:
>  - redo patch on mm-new branch
>  Changes since v1:
>  - indentation fixes
>  - squash into single patch to assist bisections

Thanks for this.

> -static void retry_copy_page(int ufd, struct uffdio_copy *uffdio_copy,
> - unsigned long offset)
> +static void retry_copy_page(uffd_global_test_opts_t *gopts, struct 
> uffdio_copy *uffdio_copy,
> + unsigned long offset)
>  {
> - uffd_test_ops->alias_mapping(&uffdio_copy->dst,
> -  uffdio_copy->len,
> -  offset);
> - if (ioctl(ufd, UFFDIO_COPY, uffdio_copy)) {
> + uffd_test_ops->alias_mapping(gopts,
> + 
> &uffdio_copy->dst,
> + 
> uffdio_copy->len,
> + offset);

Looks like your editor got a bit excited here :D

There are a few other places where this happened too, e.g. the
are_count() declaration and there's a pthread_create_call() that's quite
messed up.

Unfortunately I don't know of any tool that can find/fix these issues
automatically without also messing up the whole file. Could you just
do a visual skim and fix what you can spot?

>  static void sigalrm(int sig)
>  {
>   if (sig != SIGALRM)
>   abort();
> - test_uffdio_copy_eexist = true;
> + // TODO: Set this without access to global vars
> + // gopts->test_uffdio_copy_eexist = true;

Did you mean to leave this like that?

> @@ -1734,6 +1737,27 @@ int main(int argc, char *argv[])
>   }
>   for (j = 0; j < n_mems; j++) {
>   mem_type = &mem_types[j];
> +
> + // Initialize global test options

Wrong comment style here

> + uffd_global_test_opts_t gopts;
> +
> + gopts.map_shared = mem_type->shared;
> + uffd_test_ops = mem_type->mem_ops;
> + uffd_test_case_ops = test->test_case_ops;
> +
> + if (mem_type->mem_flag & (MEM_HUGETLB_PRIVATE | 
> MEM_HUGETLB))
> + gopts.page_size = default_huge_page_size();
> + else
> + gopts.page_size = psize();
> +
> + /* Ensure we have at least 2 pages */
> + gopts.nr_pages = MAX(UFFD_TEST_MEM_SIZE, 
> gopts.page_size * 2) / gopts.page_size;
> + /* TODO: remove this global var.. it's so ugly */

That's done :)

> + gopts.nr_parallel = 1;
> +
> + /* Initialize test arguments */

(This comment seems like noise? I could be wrong, not a big deal).


Thanks for these improvements. Bit of a hasty review and I'm not really
qualified to comment on the test logic itself, but aside from that and
my nits:

Reviewed-by: Brendan Jackman