On Thu, Aug 07, 2025 at 02:56:28PM +0530, Donet Tom wrote: > >On 8/6/25 8:24 PM, Wei Yang wrote: >> On Wed, Aug 06, 2025 at 06:30:37PM +0530, Donet Tom wrote: >> [...] >> > > Child process inherit the ksm_merging_pages from parent, which is >> > > reasonable >> > > to me. But I am confused why ksm_unmerge() would just reset >> > > ksm_merging_pages >> > > for parent and leave ksm_merging_pages in child process unchanged. >> > > >> > > ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide >> > > sysfs >> > > interface. I expect it applies to both parent and child. >> > I am not very familiar with the KSM code, but from what I understand: >> > >> > The ksm_merging_pages counter is maintained per mm_struct. When >> > we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the >> > counters are updated for all mm_structs present in the ksm_mm_slot list. >> > >> > A mm_struct gets added to this list when MADV_MERGEABLE is called. >> > In the case of the child process, since MADV_MERGEABLE has not been >> > invoked yet, its mm_struct is not part of the list. As a result, >> > its ksm_merging_pages counter is not reset. >> > >> Would this flag be inherited during fork? VM_MERGEABLE is saved in related >> vma >> I don't see it would be dropped during fork. Maybe missed. >> >> > > > value remained unchanged. That’s why get_my_merging_page() in the >> > > > child was >> > > > returning a non-zero value. >> > > > >> > > I guess you mean the get_my_merging_page() in __mmap_and_merge_range() >> > > return >> > > a non-zero value. But there is ksm_unmerge() before it. Why this >> > > ksm_unmerge() >> > > couldn't reset the value, but a ksm_unmerge() in parent could. >> > > >> > > > Initially, I fixed the issue by calling ksm_unmerge() before the >> > > > fork(), and >> > > > that >> > > > resolved the problem. Later, I decided it would be cleaner to move the >> > > > ksm_unmerge() call to the test cleanup phase. >> > > > >> > > Also all the tests before test_prctl_fork(), except test_prctl(), calls >> > > >> > > ksft_test_result(!range_maps_duplicates()); >> > > >> > > If the previous tests succeed, it means there is no duplicate pages. This >> > > means ksm_merging_pages should be 0 before test_prctl_fork() if other >> > > tests >> > > pass. And the child process would inherit a 0 ksm_merging_pages. (A >> > > quick test >> > > proves it.) >> > >> > If I understand correctly, all the tests are calling MADV_UNMERGEABLE, >> > which internally calls break_ksm() in the kernel. This function replaces >> > the >> > KSM page with an exclusive anonymous page. However, the >> > ksm_merging_pages counters are not updated at this point. >> > >> > The function range_maps_duplicates(map, size) checks whether the pages >> > have been unmerged. Since break_ksm() does perform the unmerge, this >> > function returns false, and the test passes. >> > >> > The ksm_merging_pages update happens later via the ksm_scan_thread(). >> > That’s why we observe that ksm_merging_pages values are not reset >> > immediately after the test finishes. >> > >> Not familiar with ksm internal. But the ksm_merging_pages counter still has >> non-zero value when all merged pages are unmerged makes me feel odd. >> >> > If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that >> > the ksm_merging_pages values are reset after the sleep. >> > >> > Once the test completes successfully, we can call ksm_unmerge(), which >> > will immediately reset the ksm_merging_pages value. This way, in the fork >> > test, the child process will also see the correct value. >> > > So which part of the story I missed? >> > > >> > So, during the cleanup phase after a successful test, we can call >> > ksm_unmerge() to reset the counter. Do you see any issue with >> > this approach? >> > >> It looks there is no issue with an extra ksm_unmerge(). >> >> But one more question. Why an extra ksm_unmerge() could help. >> >> Here is what we have during test: >> >> >> test_prot_none() >> !range_maps_duplicates() >> ksm_unmerge() 1) <--- newly add >> test_prctl_fork() >> >--- in child >> __mmap_and_merge_range() >> ksm_unmerge() 2) <--- already have >> >> As you mentioned above ksm_unmerge() would immediately reset >> ksm_merging_pages, why ksm_unmerge() at 2) still leave ksm_merging_pages >> non-zero? And the one at 1) could help. > > >From the debugging, what I understood is: > >When we perform fork(), MADV_MERGEABLE, or PR_SET_MEMORY_MERGE, the >mm_struct of the process gets added to the ksm_mm_slot list. As a >result, both the parent and child processes’ mm_struct structures >will be present in ksm_mm_slot. > >When KSM merges the pages, it creates a ksm_rmap_item for each page, >and the ksm_merging_pages counter is incremented accordingly. > >Since the parent process did the merge, its mm_struct is present in >ksm_mm_slot, and ksm_rmap_item entries are created for all the merged >pages. > >When a process is forked, the child’s mm_struct is also added to >ksm_mm_slot, and it inherits the ksm_merging_pages count. However, >no ksm_rmap_item entries are created for the child process because it >did not do any merge. > >When ksm_unmerge() is called, it iterates over all processes in >ksm_mm_slot. In our case, both the parent and child are present. It >first processes the parent, which has ksm_rmap_item entries, so it >unmerges the pages and resets the ksm_merging_pages counter. > >For the child, since it did not perform any actual merging, it does not >have any ksm_rmap_item entries. Therefore, there are no pages to unmerge, >and the counter remains unchanged. >
Thanks for the detailed analysis. So the key is child has no ksm_rmap_item which will not clear ksm_merging_page on ksm_unmerge(). >So, only processes that performed KSM merging will have their counters >updated during ksm_unmerge(). The child process, having not initiated any >merging, retains the inherited counter value without any update. > >So from a testing point of view, I think it is better to reset the >counters as part of the cleanup code to ensure that the next tests do >not get incorrect values. > Hmm... I agree from the test point of view based on current situation. While maybe this is also a check point for later version. >The question I have is: is it correct to keep the inherited >|ksm_merging_page| >value in the child or Should we reset it to 0 during |ksm_fork()|? > Very good question. There looks to be something wrong, but I am not sure this is the correct way. >> >> Or there is still some timing issue like sleep(1) you did? >> -- Wei Yang Help you, Help me

