* Dave Hansen <d...@sr71.net> wrote: > On 09/24/2015 02:30 AM, Ingo Molnar wrote: > >> To answer your question in the comment: it looks useful to have some sort > >> of > >> 'extended page fault error code' information here, which shows why the > >> page fault > >> happened. With the regular error_code it's easy - with protection keys > >> there's 16 > >> separate keys possible and user-space might not know the actual key value > >> in the > >> pte. > > > > Btw., alternatively we could also say that user-space should know what > > protection > > key it used when it created the mapping - there's no need to recover it for > > every > > page fault. > > That's true. We don't, for instance, tell userspace whether it was a > write that caused a fault.
I think we do put it into the signal frame, see setup_sigcontext(): put_user_ex(current->thread.error_code, &sc->err); and 'error_code & PF_WRITE' tells us whether it's a write fault. And I'm pretty sure applications like Valgrind rely on this. > But, other than smaps we don't have *any* way to tell userspace what > protection > key a page has. I think some mechanism is going to be required for this to > be > reasonably debuggable. I think it's a conceptual extension of sigcontext::err and we need it for similar reasons. > > OTOH, as long as we don't do a separate find_vma(), it looks cheap enough > > to > > look up the pkey value of that address and give it to user-space in the > > signal > > frame. > > I still think that find_vma() in this case is pretty darn cheap, definitely > if > you compare it to the cost of the entire fault path. So where's the problem? We have already looked up the vma and know whether there's any vma there or not. Why not pass in that pointer and be done with it? Why complicate the code by looking up a second time (and exposing us to various races)? > > Btw., how does pkey support interact with hugepages? > > Surprisingly little. I've made sure that everything works with huge pages > and > that the (huge) PTEs and VMAs get set up correctly, but I'm not sure I had to > touch the huge page code at all. I have test code to ensure that it works > the > same as with small pages, but everything worked pretty naturally. Yeah, so the reason I'm asking about expectations is that this code: + follow_ret = follow_pte(tsk->mm, address, &ptep, &ptl); + if (!follow_ret) { + /* + * On a successful follow, make sure to + * drop the lock. + */ + pte = *ptep; + pte_unmap_unlock(ptep, ptl); + ret = pte_pkey(pte); is visibly hugepage-unsafe: if a vma is hugepage mapped, there are no ptes, only pmds - and the protection key index lives in the pmd. We don't seem to recover that information properly. In any case, please put those hugepage tests into tools/tests/selftests/x86/ as well, as part of the pkey series. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/