On 9/17/20 6:52 AM, kan.li...@linux.intel.com wrote:
> +     mm = current->mm;
> +     if (!mm) {
> +             /*
> +              * For kernel threads and the like, use init_mm so that
> +              * we can find kernel memory.
> +              */
> +             mm = &init_mm;
> +     }

I think it might be better to use current->active_mm instead of
current->mm.  Kernel threads can "take over" the mm of the threads that
switched to them, so they're not actually using all of the page tables
from the init_mm all the time.

It's not _common_, though.  The only time that I think they can diverge
is when vmalloc PGD sync'ing needs to be done, and there's even an
effort to remove that.

But, it's probably more precise to just use ->active_mm since
that's what will actually be more consistent with the values loaded into
CR3.

I _think_ ->active_mm is always non-NULL, too.
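
To make the difference concrete, here is a minimal userspace sketch, not
kernel code.  The struct definitions are simplified stand-ins for the real
kernel types, and walk_mm_patch() / walk_mm_active() are hypothetical helper
names used only for illustration.  Since it is not certain that ->active_mm
is always non-NULL, the sketch keeps the init_mm fallback.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumption, not the real layout). */
struct mm_struct {
	unsigned long pgd;
};

struct task_struct {
	struct mm_struct *mm;        /* NULL for kernel threads */
	struct mm_struct *active_mm; /* mm whose page tables are actually loaded */
};

static struct mm_struct init_mm = { .pgd = 0x1000 };

/* What the quoted patch does: fall back to init_mm when ->mm is NULL. */
static struct mm_struct *walk_mm_patch(const struct task_struct *t)
{
	return t->mm ? t->mm : &init_mm;
}

/* The suggested alternative: prefer ->active_mm, keep init_mm as a safety net. */
static struct mm_struct *walk_mm_active(const struct task_struct *t)
{
	return t->active_mm ? t->active_mm : &init_mm;
}
```

For a kernel thread that has borrowed a user mm (->mm == NULL, ->active_mm
pointing at the borrowed mm), walk_mm_patch() returns &init_mm while
walk_mm_active() returns the borrowed mm, which is the one that matches
what is loaded into CR3.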

One last concern as I look at this: I wish it was a bit more
future-proof.  There are lots of weird things folks are trying to do
with the page tables, like Address Space Isolation.  For instance, if
you get a perf NMI when running userspace, current->mm->pgd is
*different* than the PGD that was in use when userspace was running.
It's close enough today, but it might not stay that way.  But I can't
think of any great ways to future proof this code, other than spitting
out an error message if too many of the page table walks fail when they
shouldn't.
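
One possible shape for that error message, as a userspace sketch only: count
unexpected walk failures and complain once after a threshold is crossed.
The names walk_stats, note_walk_failure() and WALK_FAIL_WARN_THRESHOLD, and
the threshold value itself, are all hypothetical.  In the kernel the
fprintf() would presumably become something like pr_warn_once().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical threshold: how many unexpected failures before we complain. */
#define WALK_FAIL_WARN_THRESHOLD 16

struct walk_stats {
	unsigned long failures; /* unexpected page table walk failures so far */
	bool warned;            /* true once the message has been printed */
};

/*
 * Record one unexpected walk failure.  Returns true only on the call
 * that actually prints the warning, so the message appears at most once.
 */
static bool note_walk_failure(struct walk_stats *s)
{
	s->failures++;
	if (s->warned || s->failures < WALK_FAIL_WARN_THRESHOLD)
		return false;

	s->warned = true;
	fprintf(stderr,
		"page table walk failed %lu times; walking the wrong PGD?\n",
		s->failures);
	return true;
}
```

The caller would invoke note_walk_failure() only for walks that were
expected to succeed, so that legitimately unmapped addresses do not trip
the warning.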
