On 9/17/20 6:52 AM, kan.li...@linux.intel.com wrote:
> +	mm = current->mm;
> +	if (!mm) {
> +		/*
> +		 * For kernel threads and the like, use init_mm so that
> +		 * we can find kernel memory.
> +		 */
> +		mm = &init_mm;
> +	}

I think it might be better to use current->active_mm instead of
current->mm. Kernel threads can "take over" the mm of the threads that
switched to them, so they're not actually using all of the page tables
from the init_mm all the time.

It's not _common_, though. The only time that I think they can diverge
is when vmalloc PGD sync'ing needs to be done, and there's even an
effort to remove that.

But it's probably more precise to just use ->active_mm, since that's
what will actually be more consistent with the values loaded into CR3.

I _think_ ->active_mm is always non-NULL, too.

One last concern as I look at this: I wish it was a bit more
future-proof. There are lots of weird things folks are trying to do
with the page tables, like Address Space Isolation. For instance, if
you get a perf NMI when running userspace, current->mm->pgd is
*different* than the PGD that was in use when userspace was running.
It's close enough today, but it might not stay that way. But I can't
think of any great ways to future-proof this code, other than spitting
out an error message if too many of the page table walks fail when they
shouldn't.
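
To make the two suggestions above concrete, here is a rough, untested
user-space sketch. It is not kernel code: the struct definitions are
mock stand-ins, and pick_walk_mm(), note_walk_failure(), struct
walk_stats and WALK_FAIL_LIMIT are hypothetical names made up for
illustration. It keeps an init_mm fallback only because ->active_mm
being always non-NULL is an unverified assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock stand-ins for the kernel types; only the fields used here. */
struct mm_struct {
	unsigned long pgd;		/* placeholder for the page table root */
};

struct task_struct {
	struct mm_struct *mm;		/* NULL for kernel threads */
	struct mm_struct *active_mm;	/* mm whose page tables are in use */
};

static struct mm_struct init_mm = { .pgd = 0x1000 };

/*
 * Hypothetical helper: prefer ->active_mm, which should match what is
 * loaded into CR3. Fall back to init_mm in case ->active_mm is ever
 * NULL (assumption: it normally is not).
 */
static struct mm_struct *pick_walk_mm(const struct task_struct *task)
{
	assert(task != NULL);

	if (task->active_mm)
		return task->active_mm;
	return &init_mm;
}

/* Arbitrary threshold, chosen only for this sketch. */
#define WALK_FAIL_LIMIT 16

struct walk_stats {
	unsigned int failures;	/* walks that failed but should not have */
	bool warned;		/* set once the caller was told to warn */
};

/*
 * Hypothetical helper: record one unexpected walk failure. Returns true
 * exactly once, when the count first reaches WALK_FAIL_LIMIT, so the
 * caller can print a single error message.
 */
static bool note_walk_failure(struct walk_stats *stats)
{
	stats->failures++;
	if (stats->failures >= WALK_FAIL_LIMIT && !stats->warned) {
		stats->warned = true;
		return true;
	}
	return false;
}
```

In the patch, the first helper would correspond to replacing the
current->mm / init_mm selection, and the second to a one-time message
from the walk's failure path. How a "shouldn't have failed" walk is
detected is left open here.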