Re: [RFC 0/9] Linear Address Masking enabling
On Sun, Feb 07, 2021 at 09:24:23AM +0100, Dmitry Vyukov wrote:
> On Fri, Feb 5, 2021 at 4:16 PM Kirill A. Shutemov wrote:
> >
> > Linear Address Masking[1] (LAM) modifies the checking that is applied to
> > 64-bit linear addresses, allowing software to use the untranslated
> > address bits for metadata.
> >
> > The patchset brings support for LAM for userspace addresses.
> >
> > The most sensitive part of the enabling is the change in tlb.c, where CR3
> > flags get set. Please check that what I'm doing there makes sense.
> >
> > The patchset is RFC quality and the code requires more testing before it
> > can be applied.
> >
> > The userspace API is not finalized yet. The patchset extends the API used
> > by ARM64: PR_GET/SET_TAGGED_ADDR_CTRL. The API is adjusted to not imply ARM
> > TBI: it now allows userspace to request the number of metadata bits needed
> > and reports where these bits are located in the address.
> >
> > There's an alternative proposal[2] for the API based on the Intel CET
> > interface. Please let us know if you prefer one over the other.
> >
> > The feature competes for bits with 5-level paging: LAM_U48 makes it
> > impossible to map anything above 47 bits. The patchset makes these
> > capabilities mutually exclusive: whichever is used first wins. LAM_U57 can
> > be combined with mappings above 47 bits.
> >
> > I include a QEMU patch in case somebody wants to play with the feature.
>
> Exciting! Do you plan to send the QEMU patch to QEMU?

Sure. After more testing, once I'm sure it conforms to the hardware.

-- 
 Kirill A. Shutemov
Re: [RFC 0/9] Linear Address Masking enabling
On Fri, Feb 5, 2021 at 4:16 PM Kirill A. Shutemov wrote:
>
> Linear Address Masking[1] (LAM) modifies the checking that is applied to
> 64-bit linear addresses, allowing software to use the untranslated
> address bits for metadata.
>
> The patchset brings support for LAM for userspace addresses.
>
> The most sensitive part of the enabling is the change in tlb.c, where CR3
> flags get set. Please check that what I'm doing there makes sense.
>
> The patchset is RFC quality and the code requires more testing before it
> can be applied.
>
> The userspace API is not finalized yet. The patchset extends the API used
> by ARM64: PR_GET/SET_TAGGED_ADDR_CTRL. The API is adjusted to not imply ARM
> TBI: it now allows userspace to request the number of metadata bits needed
> and reports where these bits are located in the address.
>
> There's an alternative proposal[2] for the API based on the Intel CET
> interface. Please let us know if you prefer one over the other.
>
> The feature competes for bits with 5-level paging: LAM_U48 makes it
> impossible to map anything above 47 bits. The patchset makes these
> capabilities mutually exclusive: whichever is used first wins. LAM_U57 can
> be combined with mappings above 47 bits.
>
> I include a QEMU patch in case somebody wants to play with the feature.

Exciting! Do you plan to send the QEMU patch to QEMU?

> The branch:
>
>  git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git lam
>
> Any comments are welcome.
>
> [1] ISE, Chapter 14.
> https://software.intel.com/content/dam/develop/external/us/en/documents-tps/architecture-instruction-set-extensions-programming-reference.pdf
> [2]
> https://github.com/hjl-tools/linux/commit/e85fa032e5b276ddf17edd056f92f599db9e8369
>
> Kirill A. Shutemov (9):
>   mm, arm64: Update PR_SET/GET_TAGGED_ADDR_CTRL interface
>   x86/mm: Fix CR3_ADDR_MASK
>   x86: CPUID and CR3/CR4 flags for Linear Address Masking
>   x86/mm: Introduce TIF_LAM_U57 and TIF_LAM_U48
>   x86/mm: Provide untagged_addr() helper
>   x86/uaccess: Remove tags from the address before checking
>   x86/mm: Handle tagged memory accesses from kernel threads
>   x86/mm: Make LAM_U48 and mappings above 47-bits mutually exclusive
>   x86/mm: Implement PR_SET/GET_TAGGED_ADDR_CTRL with LAM
>
>  arch/arm64/include/asm/processor.h            |  12 +-
>  arch/arm64/kernel/process.c                   |  45 +-
>  arch/arm64/kernel/ptrace.c                    |   4 +-
>  arch/x86/include/asm/cpufeatures.h            |   1 +
>  arch/x86/include/asm/elf.h                    |   3 +-
>  arch/x86/include/asm/mmu.h                    |   1 +
>  arch/x86/include/asm/mmu_context.h            |  13 ++
>  arch/x86/include/asm/page_32.h                |   3 +
>  arch/x86/include/asm/page_64.h                |  19 +++
>  arch/x86/include/asm/processor-flags.h        |   2 +-
>  arch/x86/include/asm/processor.h              |  10 ++
>  arch/x86/include/asm/thread_info.h            |   9 +-
>  arch/x86/include/asm/tlbflush.h               |   5 +
>  arch/x86/include/asm/uaccess.h                |  16 +-
>  arch/x86/include/uapi/asm/processor-flags.h   |   6 +
>  arch/x86/kernel/process_64.c                  | 145 ++
>  arch/x86/kernel/sys_x86_64.c                  |   5 +-
>  arch/x86/mm/hugetlbpage.c                     |   6 +-
>  arch/x86/mm/mmap.c                            |   9 +-
>  arch/x86/mm/tlb.c                             | 124 +--
>  kernel/sys.c                                  |  14 +-
>  .../testing/selftests/arm64/tags/tags_test.c  |  31
>  .../selftests/{arm64 => vm}/tags/.gitignore   |   0
>  .../selftests/{arm64 => vm}/tags/Makefile     |   0
>  .../{arm64 => vm}/tags/run_tags_test.sh       |   0
>  tools/testing/selftests/vm/tags/tags_test.c   |  57 +++
>  26 files changed, 464 insertions(+), 76 deletions(-)
>  delete mode 100644 tools/testing/selftests/arm64/tags/tags_test.c
>  rename tools/testing/selftests/{arm64 => vm}/tags/.gitignore (100%)
>  rename tools/testing/selftests/{arm64 => vm}/tags/Makefile (100%)
>  rename tools/testing/selftests/{arm64 => vm}/tags/run_tags_test.sh (100%)
>  create mode 100644 tools/testing/selftests/vm/tags/tags_test.c
>
> --
> 2.26.2
>
Re: [RFC 0/9] Linear Address Masking enabling
On Fri, Feb 05, 2021 at 06:16:20PM +0300, Kirill A. Shutemov wrote:
> The feature competes for bits with 5-level paging: LAM_U48 makes it
> impossible to map anything above 47 bits. The patchset makes these
> capabilities mutually exclusive: whichever is used first wins. LAM_U57 can
> be combined with mappings above 47 bits.

And I suppose we still can't switch between 4 and 5 level at runtime,
using a CR3 bit?
Re: [RFC 0/9] Linear Address Masking enabling
On Fri, Feb 05, 2021 at 04:49:05PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 05, 2021 at 06:16:20PM +0300, Kirill A. Shutemov wrote:
> > The feature competes for bits with 5-level paging: LAM_U48 makes it
> > impossible to map anything above 47 bits. The patchset makes these
> > capabilities mutually exclusive: whichever is used first wins. LAM_U57 can
> > be combined with mappings above 47 bits.
>
> And I suppose we still can't switch between 4 and 5 level at runtime,
> using a CR3 bit?

No. And I can't imagine how it would work with 5-level on the kernel side.

-- 
 Kirill A. Shutemov
Re: [RFC 0/9] Linear Address Masking enabling
On Fri, Feb 05, 2021 at 07:01:27PM +0300, Kirill A. Shutemov wrote:
> On Fri, Feb 05, 2021 at 04:49:05PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 05, 2021 at 06:16:20PM +0300, Kirill A. Shutemov wrote:
> > > The feature competes for bits with 5-level paging: LAM_U48 makes it
> > > impossible to map anything above 47 bits. The patchset makes these
> > > capabilities mutually exclusive: whichever is used first wins. LAM_U57 can
> > > be combined with mappings above 47 bits.
> >
> > And I suppose we still can't switch between 4 and 5 level at runtime,
> > using a CR3 bit?
>
> No. And I can't imagine how it would work with 5-level on the kernel side.

KPTI already switches CR3 on every entry and only maps a very limited
number of kernel pages in the user map. This means a 4 level user
page-table should be possible. The kernel page-tables would only need to
update their p5d[0] on every 4l user change.

Not as nice as actually having separate user and kernel page-tables in
hardware, but it would actually make 5l page-tables useful on machines
with less than stupid amounts of memory, I think.

One of the road-blocks to doing per-cpu kernel page-tables is having to
do 2k copies; only having to update a single P5D entry would be ideal.

Of course, once we get 5l user tables we're back to being stupid, but
maybe tasks with that much memory don't actually switch much, who knows.
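The arithmetic behind the p5d[0] remark can be sketched in C. This is only an illustration, not kernel code: the helper names are hypothetical, the 9-bits-per-level and 8-bytes-per-entry figures are the standard x86-64 layout, and reading "2k copies" as the 256-entry user half of a 4-level top table (256 * 8 bytes) is an assumption about what the message means.

```c
#include <assert.h>
#include <stdint.h>

#define PTRS_PER_TABLE 512u /* 9 index bits per paging level on x86-64 */
#define ENTRY_SIZE     8u   /* bytes per page-table entry */

/* Hypothetical helper: top-level index under 4-level paging (bits 47:39). */
static unsigned top_index_4level(uint64_t addr)
{
	return (unsigned)((addr >> 39) & (PTRS_PER_TABLE - 1));
}

/* Hypothetical helper: top-level index under 5-level paging (bits 56:48). */
static unsigned top_index_5level(uint64_t addr)
{
	return (unsigned)((addr >> 48) & (PTRS_PER_TABLE - 1));
}

/* Bytes moved when the user half (256 entries) of a top table is copied. */
static unsigned user_half_copy_bytes(void)
{
	return (PTRS_PER_TABLE / 2) * ENTRY_SIZE;
}

/* Self-check: every 47-bit user address lands in top-level slot 0
 * of a 5-level table, so one entry covers the whole 4-level user space. */
void paging_index_example(void)
{
	const uint64_t highest_user_4l = UINT64_C(0x00007fffffffffff);

	assert(top_index_5level(0) == 0);
	assert(top_index_5level(highest_user_4l) == 0);
	assert(top_index_4level(highest_user_4l) == 255);
	assert(user_half_copy_bytes() == 2048);
}
```

Under these assumptions a 4-level user address space spans 256 top-level slots in 4-level mode but only slot 0 in 5-level mode, which is why a single entry update would replace the 2 KiB copy.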
[RFC 0/9] Linear Address Masking enabling
Linear Address Masking[1] (LAM) modifies the checking that is applied to
64-bit linear addresses, allowing software to use the untranslated
address bits for metadata.

The patchset brings support for LAM for userspace addresses.

The most sensitive part of the enabling is the change in tlb.c, where CR3
flags get set. Please check that what I'm doing there makes sense.

The patchset is RFC quality and the code requires more testing before it
can be applied.

The userspace API is not finalized yet. The patchset extends the API used
by ARM64: PR_GET/SET_TAGGED_ADDR_CTRL. The API is adjusted to not imply ARM
TBI: it now allows userspace to request the number of metadata bits needed
and reports where these bits are located in the address.

There's an alternative proposal[2] for the API based on the Intel CET
interface. Please let us know if you prefer one over the other.

The feature competes for bits with 5-level paging: LAM_U48 makes it
impossible to map anything above 47 bits. The patchset makes these
capabilities mutually exclusive: whichever is used first wins. LAM_U57 can
be combined with mappings above 47 bits.

I include a QEMU patch in case somebody wants to play with the feature.

The branch:

 git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git lam

Any comments are welcome.

[1] ISE, Chapter 14.
https://software.intel.com/content/dam/develop/external/us/en/documents-tps/architecture-instruction-set-extensions-programming-reference.pdf
[2] https://github.com/hjl-tools/linux/commit/e85fa032e5b276ddf17edd056f92f599db9e8369

Kirill A. Shutemov (9):
  mm, arm64: Update PR_SET/GET_TAGGED_ADDR_CTRL interface
  x86/mm: Fix CR3_ADDR_MASK
  x86: CPUID and CR3/CR4 flags for Linear Address Masking
  x86/mm: Introduce TIF_LAM_U57 and TIF_LAM_U48
  x86/mm: Provide untagged_addr() helper
  x86/uaccess: Remove tags from the address before checking
  x86/mm: Handle tagged memory accesses from kernel threads
  x86/mm: Make LAM_U48 and mappings above 47-bits mutually exclusive
  x86/mm: Implement PR_SET/GET_TAGGED_ADDR_CTRL with LAM

 arch/arm64/include/asm/processor.h            |  12 +-
 arch/arm64/kernel/process.c                   |  45 +-
 arch/arm64/kernel/ptrace.c                    |   4 +-
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/elf.h                    |   3 +-
 arch/x86/include/asm/mmu.h                    |   1 +
 arch/x86/include/asm/mmu_context.h            |  13 ++
 arch/x86/include/asm/page_32.h                |   3 +
 arch/x86/include/asm/page_64.h                |  19 +++
 arch/x86/include/asm/processor-flags.h        |   2 +-
 arch/x86/include/asm/processor.h              |  10 ++
 arch/x86/include/asm/thread_info.h            |   9 +-
 arch/x86/include/asm/tlbflush.h               |   5 +
 arch/x86/include/asm/uaccess.h                |  16 +-
 arch/x86/include/uapi/asm/processor-flags.h   |   6 +
 arch/x86/kernel/process_64.c                  | 145 ++
 arch/x86/kernel/sys_x86_64.c                  |   5 +-
 arch/x86/mm/hugetlbpage.c                     |   6 +-
 arch/x86/mm/mmap.c                            |   9 +-
 arch/x86/mm/tlb.c                             | 124 +--
 kernel/sys.c                                  |  14 +-
 .../testing/selftests/arm64/tags/tags_test.c  |  31
 .../selftests/{arm64 => vm}/tags/.gitignore   |   0
 .../selftests/{arm64 => vm}/tags/Makefile     |   0
 .../{arm64 => vm}/tags/run_tags_test.sh       |   0
 tools/testing/selftests/vm/tags/tags_test.c   |  57 +++
 26 files changed, 464 insertions(+), 76 deletions(-)
 delete mode 100644 tools/testing/selftests/arm64/tags/tags_test.c
 rename tools/testing/selftests/{arm64 => vm}/tags/.gitignore (100%)
 rename tools/testing/selftests/{arm64 => vm}/tags/Makefile (100%)
 rename tools/testing/selftests/{arm64 => vm}/tags/run_tags_test.sh (100%)
 create mode 100644 tools/testing/selftests/vm/tags/tags_test.c

-- 
2.26.2
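To make the "untranslated address bits" concrete, here is a minimal userspace sketch in C of how metadata could be placed in and stripped from a pointer value. It is not the untagged_addr() helper from the series. The function names are hypothetical, and the bit ranges (62:57 for LAM_U57, 62:48 for LAM_U48, bit 63 never used) are taken from the ISE description rather than from this cover letter. The hardware check is a sign-extension test; for userspace addresses, whose upper bits are zero, simply clearing the metadata bits gives the same result, and that is all this sketch does.

```c
#include <assert.h>
#include <stdint.h>

#define LAM_U57_BITS 6u  /* metadata in bits 62:57 */
#define LAM_U48_BITS 15u /* metadata in bits 62:48 */

/* Hypothetical helper: mask of `nbits` metadata bits ending at bit 62. */
static uint64_t lam_tag_mask(unsigned nbits)
{
	return ((UINT64_C(1) << nbits) - 1) << (63 - nbits);
}

/* Hypothetical helper: store `tag` in the metadata bits of a user address. */
static uint64_t lam_set_tag(uint64_t addr, uint64_t tag, unsigned nbits)
{
	uint64_t mask = lam_tag_mask(nbits);

	return (addr & ~mask) | ((tag << (63 - nbits)) & mask);
}

/* Hypothetical helper: clear the metadata bits of a user address. */
static uint64_t lam_untag_user(uint64_t addr, unsigned nbits)
{
	return addr & ~lam_tag_mask(nbits);
}

/* Self-check: tagging and untagging round-trips a user address. */
void lam_example(void)
{
	const uint64_t addr = UINT64_C(0x00007f1234567000);
	uint64_t tagged = lam_set_tag(addr, 0x2a, LAM_U57_BITS);

	assert(tagged == UINT64_C(0x54007f1234567000));
	assert(lam_untag_user(tagged, LAM_U57_BITS) == addr);
}
```

The LAM_U48 mask covers bits 62:48, which overlaps the address bits that 5-level paging uses for mappings above 47 bits; that overlap is the conflict the cover letter describes.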