[uClinux-dev] [PATCH] NOMMU: Stub out vm_get_page_prot() if there's no MMU
Stub out vm_get_page_prot() if there's no MMU.

This was added by commit:

    commit 804af2cf6e7af31d2e664b54e6579b531dbd
    Author: Hugh Dickins h...@veritas.com
    Date:   Wed Jul 26 21:39:49 2006 +0100
    Subject: [AGPGART] remove private page protection map

and is used in commit:

    commit c07fbfd17e614a76b194f371c5331e21e6cffb54
    Author: Daniel De Graaf dgde...@tycho.nsa.gov
    Date:   Tue Aug 10 18:02:45 2010 -0700
    Subject: fbmem: VM_IO set, but not propagated

in the fbmem video driver, but the function doesn't exist on NOMMU, resulting in an undefined symbol at link time.

Signed-off-by: David Howells dhowe...@redhat.com
Reviewed-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com
---
 include/linux/mm.h |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 831c693..e6b1210 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1363,7 +1363,15 @@ static inline unsigned long vma_pages(struct vm_area_struct *vma)
 	return (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
 }
 
+#ifdef CONFIG_MMU
 pgprot_t vm_get_page_prot(unsigned long vm_flags);
+#else
+static inline pgprot_t vm_get_page_prot(unsigned long vm_flags)
+{
+	return __pgprot(0);
+}
+#endif
+
 struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr);
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
 			unsigned long pfn, unsigned long size, pgprot_t);
___
uClinux-dev mailing list
uClinux-dev@uclinux.org
http://mailman.uclinux.org/mailman/listinfo/uclinux-dev
This message was resent by uclinux-dev@uclinux.org
To unsubscribe see:
http://mailman.uclinux.org/mailman/options/uclinux-dev
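The stub pattern used by the patch can be tried outside the kernel. The following is a minimal userspace sketch, not kernel code: `pgprot_t`, `__pgprot()` and `pgprot_val()` are simplified stand-ins for the per-architecture definitions, and `driver_prot_bits()` is a hypothetical caller standing in for a driver such as fbmem. With `CONFIG_MMU` undefined, the inline stub is selected and the caller always has a definition to resolve against.

```c
/* Simplified stand-ins for the kernel's per-architecture types (assumption). */
typedef struct { unsigned long pgprot; } pgprot_t;
#define __pgprot(x)   ((pgprot_t){ (x) })
#define pgprot_val(p) ((p).pgprot)

#ifdef CONFIG_MMU
/* With an MMU, the real function is defined elsewhere in the kernel. */
pgprot_t vm_get_page_prot(unsigned long vm_flags);
#else
/* Without an MMU there is no page protection to compute, so callers
 * receive an empty protection value and the symbol always resolves. */
static inline pgprot_t vm_get_page_prot(unsigned long vm_flags)
{
	(void)vm_flags;
	return __pgprot(0);
}
#endif

/* Hypothetical caller: any code that asks for protection bits. */
static unsigned long driver_prot_bits(unsigned long vm_flags)
{
	return pgprot_val(vm_get_page_prot(vm_flags));
}
```

Because the stub is `static inline` in a header, it adds no object code of its own and callers need no `#ifdef` at the call site.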
Re: [uClinux-dev] [PATCH 1/3] MPU support
On 08/24/2010 03:43 PM, Mike Frysinger wrote:
>> Apparently the ARM MPU's are not nearly as capable as the blackfin MPU. The ARM MPU deals with whole regions, and typically only up to 8 memory regions can be controlled by the MPU at any one time, each region having one protection setting (r/w/x for kernel mode, r/w/x for user mode). Not nearly as fine grained as per-page.
>
> i dont quite understand what you mean by whole region. if you define a region as 4KiB, dont you get the granularity expected ? could you describe the flexibility/restrictions of this a little more (i'm not an ARM core guy) ?
>
> the Blackfin MPU has separate insn/data TLBs, and each TLB has 16 entries (PTEs i believe is the common naming). each PTE has supervisor rwx and usermode rwx permissions. further, each PTE has a size field which may be 1KiB, 4KiB, 1MiB, or 4MiB.

ok, sounds like the blackfin MPU has all the features of a true MMU but without the v->p address translation.

The ARM MPU, using MMU language, has an 8-entry TLB (some ARM MPUs have separate insn/data TLBs, others don't). But here's the kicker, the entire address space can only be described by 8 PTE's (aka MPU regions), total! So actually there is no need for a page table in main memory at all, since the TLB already has enough entries to cover the entire address space.

> i guess we cheat a little and we lock a PTE for the kernel itself so that it'll always be covered so we can process PTE misses without triggering a miss (nested exceptions). i'm not entirely familiar with the exact gory details of other arches, so i cant say how unique we are in this regard.

The ARM MPU can do something similar. MPU regions can overlap, and a simple priority scheme is used to decide which region's permissions apply to a memory access that overlaps (higher numbered regions have higher priority). So on ARM we can lock a PTE/region, by defining region 0 to cover the entire address space, and give kernel read/write access, user no access. And region 0 is never overwritten or disabled. So if an access is made to an address not described by any other region, region 0 permissions are applied to the access (and a protection fault is generated if the access was made in user mode).

Note that, with region 0 locked, that only leaves 7 PTEs/regions that can be swapped in and out for user processes. So with the ARM MPU, we can't create a region for every mmap(), we would run out of available entries. So we have to use a trade-off, only create an MPU region for XIP file mappings (text). All other mappings (non-XIP file mappings and anonymous mappings) allocate from a common user memory pool (which is another patch I plan to submit).

So another locked region is used (region 1) that covers this user memory pool. User mode has read/write access of course, as well as kernel. And so we actually now only have 6 regions that user processes can play with.

What this trade-off means is that we have process-to-kernel protection, but not process-to-process protection.

>> So ARM could use something higher-level than protect_page(), something like protect_region(start, end, flags), or just all of protect_vma() could be moved to include/asm/mmu_context.h. That way ARM can operate on the whole region, while blackfin would add protection for every page in the VMA as it is doing now.
>
> i think you could use the existing framework, and perhaps optionally extend it. maybe if i knew a little more about your regions, i could suggest something else.
>
>> I'll work on another patch that better merges my original ARM MPU work into the blackfin work, and resubmit.
>
> great, thanks
>
>> Btw, I probably should be working in whatever git tree people are submitting patches against, rather than the 20100628 release. Which git tree should I submit against?
>
> that's hard to say. if current mainline (2.6.36-rc2) has everything you need to boot a working system, then that is probably the place to base your work. i understand though that the arm/nommu work is taking a while to get into mainline, so that might not be feasible. in which case, you should find the very latest uclinux tree and use that. i know people like to base their work off a release, but in order to get merged, the focus has to be on the latest development tree.

ok. Greg says that the core non-MMU stuff is in mainline now, so I'll work from mainline.

Steve
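The overlap rule described in this message (higher-numbered regions take priority, with a locked region 0 as the background) can be modelled in a few lines. This is only a software sketch of the lookup behaviour, not MPU driver code: the `struct mpu_region` layout, the permission encoding, and all addresses in `mpu_setup_example()` are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define MPU_NREGIONS 8

/* Permission bits; the encoding is illustrative only. */
enum { PERM_R = 1, PERM_W = 2, PERM_X = 4 };

struct mpu_region {
	bool     enabled;
	uint32_t start;        /* first byte covered */
	uint32_t end;          /* last byte covered (inclusive) */
	uint8_t  kernel_perm;
	uint8_t  user_perm;
};

/* Permissions that apply to addr: the highest-numbered enabled region
 * containing the address wins. Returns 0 (no access) if none matches. */
static uint8_t mpu_lookup(const struct mpu_region r[MPU_NREGIONS],
			  uint32_t addr, bool user_mode)
{
	for (int i = MPU_NREGIONS - 1; i >= 0; i--) {
		if (r[i].enabled && addr >= r[i].start && addr <= r[i].end)
			return user_mode ? r[i].user_perm : r[i].kernel_perm;
	}
	return 0;
}

/* Layout from the thread: region 0 is the locked background region,
 * region 1 the shared user pool. Addresses are made up. */
static void mpu_setup_example(struct mpu_region r[MPU_NREGIONS])
{
	for (int i = 0; i < MPU_NREGIONS; i++)
		r[i] = (struct mpu_region){ 0 };
	/* region 0: whole address space, kernel r/w, user no access */
	r[0] = (struct mpu_region){ true, 0x00000000u, 0xFFFFFFFFu,
				    PERM_R | PERM_W, 0 };
	/* region 1: common user memory pool, r/w for both modes */
	r[1] = (struct mpu_region){ true, 0x20000000u, 0x200FFFFFu,
				    PERM_R | PERM_W, PERM_R | PERM_W };
	/* region 2: one XIP text mapping (permissions assumed) */
	r[2] = (struct mpu_region){ true, 0x10000000u, 0x1000FFFFu,
				    PERM_R | PERM_X, PERM_R | PERM_X };
}
```

In this model a user-mode access outside regions 1 and 2 falls through to region 0 and gets no permissions, which corresponds to the protection fault described above; regions 2 to 7 are the six slots left for per-process mappings.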
Re: [uClinux-dev] [PATCH 1/3] MPU support
On Thursday, August 26, 2010 14:19:41 Steve Longerbeam wrote:
> The ARM MPU can do something similar. MPU regions can overlap, and a simple priority scheme is used to decide which region's permissions apply to a memory access that overlaps (higher numbered regions have higher priority). So on ARM we can lock a PTE/region, by defining region 0 to cover the entire address space, and give kernel read/write access, user no access. And region 0 is never overwritten or disabled. So if an access is made to an address not described by any other region, region 0 permissions are applied to the access (and a protection fault is generated if the access was made in user mode).
>
> Note that, with region 0 locked, that only leaves 7 PTEs/regions that can be swapped in and out for user processes. So with the ARM MPU, we can't create a region for every mmap(), we would run out of available entries. So we have to use a trade-off, only create an MPU region for XIP file mappings (text). All other mappings (non-XIP file mappings and anonymous mappings) allocate from a common user memory pool (which is another patch I plan to submit).

i dont understand why running out of entries is a problem. we run out of entries too as you cant cover 512MiB of SDRAM with 16 entries. we simply take an exception when this occurs and in the exception handler, we use a basic round-robin replacement scheme to install a valid PTE (assuming of course the user has a valid mapping for the excepting address). then we return to the user process and it continues on.

why wont this scheme work for you too ?
-mike
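The miss-and-replace scheme described here can be sketched as a small state machine. This is an assumption-laden model rather than the Blackfin handler: `struct mpu_state`, `mpu_handle_miss()`, the slot count of 8 and the two locked slots are hypothetical, chosen to match the ARM numbers discussed in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSLOTS     8   /* hardware entries available (assumed) */
#define FIRST_FREE 2   /* slots below this are locked and never evicted */

struct mapping {
	uint32_t start;
	uint32_t end;      /* inclusive */
	uint8_t  perm;
};

struct mpu_state {
	struct mapping slot[NSLOTS];
	bool           valid[NSLOTS];
	unsigned       next_victim;   /* round-robin cursor */
};

static void mpu_state_init(struct mpu_state *s)
{
	*s = (struct mpu_state){ 0 };
	s->next_victim = FIRST_FREE;
}

/* Handle a protection miss at addr. maps[] lists the valid mappings of
 * the faulting process. Returns the slot that was (re)loaded, or -1 if
 * the process has no mapping for addr, i.e. a genuine fault. */
static int mpu_handle_miss(struct mpu_state *s, const struct mapping *maps,
			   size_t nmaps, uint32_t addr)
{
	for (size_t i = 0; i < nmaps; i++) {
		if (addr >= maps[i].start && addr <= maps[i].end) {
			unsigned v = s->next_victim;

			s->slot[v] = maps[i];      /* evict whatever was there */
			s->valid[v] = true;
			s->next_victim = (v + 1 < NSLOTS) ? v + 1 : FIRST_FREE;
			return (int)v;
		}
	}
	return -1;
}
```

With more mappings than free slots, the cursor wraps back to `FIRST_FREE` and older entries are overwritten; the cost is one exception per eviction rather than a hard limit on the number of mappings.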
Re: [uClinux-dev] [PATCH 1/3] MPU support
On 08/26/2010 12:04 PM, Mike Frysinger wrote:
> On Thursday, August 26, 2010 14:19:41 Steve Longerbeam wrote:
>> The ARM MPU can do something similar. MPU regions can overlap, and a simple priority scheme is used to decide which region's permissions apply to a memory access that overlaps (higher numbered regions have higher priority). So on ARM we can lock a PTE/region, by defining region 0 to cover the entire address space, and give kernel read/write access, user no access. And region 0 is never overwritten or disabled. So if an access is made to an address not described by any other region, region 0 permissions are applied to the access (and a protection fault is generated if the access was made in user mode).
>>
>> Note that, with region 0 locked, that only leaves 7 PTEs/regions that can be swapped in and out for user processes. So with the ARM MPU, we can't create a region for every mmap(), we would run out of available entries. So we have to use a trade-off, only create an MPU region for XIP file mappings (text). All other mappings (non-XIP file mappings and anonymous mappings) allocate from a common user memory pool (which is another patch I plan to submit).
>
> i dont understand why running out of entries is a problem. we run out of entries too as you cant cover 512MiB of SDRAM with 16 entries. we simply take an exception when this occurs and in the exception handler, we use a basic round-robin replacement scheme to install a valid PTE (assuming of course the user has a valid mapping for the excepting address). then we return to the user process and it continues on.
>
> why wont this scheme work for you too ?

no, you're right, that would work. Of course, it would have a bigger memory usage for the page tables, and a performance hit (with my implementation when a process is running there are no faults). But it is more in line with how MMU kernels work, and it adds process-to-process protection too.

Steve
Re: [uClinux-dev] [PATCH 1/3] MPU support
Mike Frysinger wrote:
> as it stands, this breaks all non-arm NOMMU ports. the patch will need to be broken up into arm-specific and arm-independent parts. the common code changes will need justification as to why they exist at all. we're doing MPU on Blackfin/nommu today without any of these. we support pretty much all the same features of a MMU system short of virtual memory -- 4k pages, RWX granularity, process to process protection, process to kernel protection (include kernel modules), kernel XIP, and userspace XIP.
>
> further, why did you go with CONFIG_CPU_CP15_MPU ? there is already a CONFIG_MPU option that is used in common nommu code.

While we're here, I'd better mention that I have a mostly ARM-compatible CPU here, with an MPU that isn't like the ARM ones - but it does use CP15 :-)

-- Jamie
Re: [uClinux-dev] [PATCH 1/3] MPU support
On Thursday, August 26, 2010 18:45:08 Steve Longerbeam wrote:
> On 08/26/2010 12:04 PM, Mike Frysinger wrote:
>> On Thursday, August 26, 2010 14:19:41 Steve Longerbeam wrote:
>>> The ARM MPU can do something similar. MPU regions can overlap, and a simple priority scheme is used to decide which region's permissions apply to a memory access that overlaps (higher numbered regions have higher priority). So on ARM we can lock a PTE/region, by defining region 0 to cover the entire address space, and give kernel read/write access, user no access. And region 0 is never overwritten or disabled. So if an access is made to an address not described by any other region, region 0 permissions are applied to the access (and a protection fault is generated if the access was made in user mode).
>>>
>>> Note that, with region 0 locked, that only leaves 7 PTEs/regions that can be swapped in and out for user processes. So with the ARM MPU, we can't create a region for every mmap(), we would run out of available entries. So we have to use a trade-off, only create an MPU region for XIP file mappings (text). All other mappings (non-XIP file mappings and anonymous mappings) allocate from a common user memory pool (which is another patch I plan to submit).
>>
>> i dont understand why running out of entries is a problem. we run out of entries too as you cant cover 512MiB of SDRAM with 16 entries. we simply take an exception when this occurs and in the exception handler, we use a basic round-robin replacement scheme to install a valid PTE (assuming of course the user has a valid mapping for the excepting address). then we return to the user process and it continues on.
>>
>> why wont this scheme work for you too ?
>
> no, you're right, that would work. Of course, it would have a bigger memory usage for the page tables, and a performance hit (with my implementation when a process is running there are no faults). But it is more inline with how MMU kernels work, and it adds process-to-process protection too.

we used a bitmap to save on memory and execution. each bit representing a 4k chunk. this is the page_rwx_mask and similar stuff that appears in the Blackfin asm/mmu*.h headers.

have you done performance measurements to see the overhead with the MPU turned on in your scheme compared to off ? doing something like a ffmpeg decode to another file. if the performance trade offs of your current scheme (per-mapping) is significant compared to the classic per-page, then it is worth while to extend the MPU Kconfig option so people can select per-page or per-mapping schema.

btw, i dont think it was mentioned earlier, but these ranges you're working with ... do they have alignment requirements ? the thing about Blackfin PTEs is that they must be aligned according to the size they represent. so if it is a 4KiB mapping, it must be aligned to 4KiB. if it's 1MiB, it must be aligned to 1MiB. it'd be nice if that alignment restriction wasnt there as we could then do a flexible range mapping similar to what you have.
-mike
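To make the bitmap idea concrete: with one bit per 4 KiB chunk, 512 MiB of SDRAM is 512 MiB / 4 KiB = 131072 chunks, so one mask needs 131072 bits, or 16 KiB. The sketch below shows such a mask in plain C. It is not the Blackfin `page_rwx_mask` code; the function names are hypothetical, and a real implementation would presumably keep one mask per permission type.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define CHUNK_SHIFT   12   /* one bit per 4 KiB chunk */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned long words needed to cover mem_bytes of memory. */
static size_t mask_words(unsigned long long mem_bytes)
{
	unsigned long long chunks = mem_bytes >> CHUNK_SHIFT;

	return (size_t)((chunks + BITS_PER_WORD - 1) / BITS_PER_WORD);
}

/* Set or clear the bits for every chunk touched by [start, start + len).
 * len must be nonzero. */
static void mask_set_range(unsigned long *mask, unsigned long start,
			   unsigned long len, bool allow)
{
	unsigned long first = start >> CHUNK_SHIFT;
	unsigned long last = (start + len - 1) >> CHUNK_SHIFT;

	for (unsigned long c = first; c <= last; c++) {
		unsigned long bit = 1UL << (c % BITS_PER_WORD);

		if (allow)
			mask[c / BITS_PER_WORD] |= bit;
		else
			mask[c / BITS_PER_WORD] &= ~bit;
	}
}

/* Is the chunk containing addr marked as allowed? */
static bool mask_test(const unsigned long *mask, unsigned long addr)
{
	unsigned long c = addr >> CHUNK_SHIFT;

	return (mask[c / BITS_PER_WORD] >> (c % BITS_PER_WORD)) & 1UL;
}
```

A miss handler can then decide whether an access is legal with one shift, one load and one bit test, which is where the savings in both memory and execution time come from compared with walking a list of mappings.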
Re: [uClinux-dev] [PATCH 1/3] MPU support
On 08/26/2010 06:07 PM, Mike Frysinger wrote:
> On Thursday, August 26, 2010 18:45:08 Steve Longerbeam wrote:
>> On 08/26/2010 12:04 PM, Mike Frysinger wrote:
>>> On Thursday, August 26, 2010 14:19:41 Steve Longerbeam wrote:
>>>> The ARM MPU can do something similar. MPU regions can overlap, and a simple priority scheme is used to decide which region's permissions apply to a memory access that overlaps (higher numbered regions have higher priority). So on ARM we can lock a PTE/region, by defining region 0 to cover the entire address space, and give kernel read/write access, user no access. And region 0 is never overwritten or disabled. So if an access is made to an address not described by any other region, region 0 permissions are applied to the access (and a protection fault is generated if the access was made in user mode).
>>>>
>>>> Note that, with region 0 locked, that only leaves 7 PTEs/regions that can be swapped in and out for user processes. So with the ARM MPU, we can't create a region for every mmap(), we would run out of available entries. So we have to use a trade-off, only create an MPU region for XIP file mappings (text). All other mappings (non-XIP file mappings and anonymous mappings) allocate from a common user memory pool (which is another patch I plan to submit).
>>>
>>> i dont understand why running out of entries is a problem. we run out of entries too as you cant cover 512MiB of SDRAM with 16 entries. we simply take an exception when this occurs and in the exception handler, we use a basic round-robin replacement scheme to install a valid PTE (assuming of course the user has a valid mapping for the excepting address). then we return to the user process and it continues on.
>>>
>>> why wont this scheme work for you too ?
>>
>> no, you're right, that would work. Of course, it would have a bigger memory usage for the page tables, and a performance hit (with my implementation when a process is running there are no faults). But it is more inline with how MMU kernels work, and it adds process-to-process protection too.
>
> we used a bitmap to save on memory and execution. each bit representing a 4k chunk. this is the page_rwx_mask and similar stuff that appears in the Blackfin asm/mmu*.h headers.

ok, I'll take a closer look.

> have you done performance measurements to see the overhead with the MPU turned on in your scheme compared to off ? doing something like a ffmpeg decode to another file. if the performance trade offs of your current scheme (per-mapping) is significant compared to the classic per-page, then it is worth while to extend the MPU Kconfig option so people can select per-page or per-mapping schema.

yes, if performance degrades a lot for per-page compared to my current scheme, that would be worthwhile to offer both options. OTOH, other people may have different requirements (better protection being more important than memory footprint or performance, or vice-versa). So it might make sense to offer both options anyway.

> btw, i dont think it was mentioned earlier, but these ranges you're working with ... do they have alignment requirements ?

yes, but it varies by ARM cores. For instance, on the SC100, the MPU regions must be 64-byte aligned, but the ARM940T has the same alignment requirement as blackfin (alignment = size).

> the thing about Blackfin PTEs is that they must be aligned according to the size they represent. so if it is a 4KiB mapping, it must be aligned to 4KiB. if it's 1MiB, it must be aligned to 1MiB. it'd be nice if that alignment restriction wasnt there as we could then do a flexible range mapping similar to what you have.
> -mike
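The practical effect of the "alignment = size" rule can be shown with a little arithmetic. The sketch below computes the smallest naturally aligned power-of-two region that covers an arbitrary range, and contrasts it with a fixed-granule rule like the 64-byte one mentioned for the SC100. Several things here are assumptions: the function names are hypothetical, every power-of-two size from 4 KiB upward is treated as available (more permissive than the four Blackfin sizes listed earlier in the thread), the fixed-granule check assumes the length must also be a multiple of the granule, and `start + len` is assumed not to exceed 4 GiB.

```c
#include <stdbool.h>
#include <stdint.h>

struct region {
	uint32_t base;
	uint64_t size;   /* 64-bit so that a full 4 GiB region is representable */
};

/* Smallest region with power-of-two size and base aligned to that size
 * that covers [start, start + len). Minimum size considered: 4 KiB. */
static struct region covering_region(uint32_t start, uint32_t len)
{
	uint64_t end = (uint64_t)start + len;   /* exclusive */
	uint64_t size = 4096;

	for (;;) {
		uint64_t base = start & ~(size - 1);   /* round down to size */

		if (base + size >= end)
			return (struct region){ (uint32_t)base, size };
		size <<= 1;
	}
}

/* Fixed-granule rule: the range can be described exactly whenever both
 * its start and its length are multiples of the granule. */
static bool fits_fixed_granule(uint32_t start, uint32_t len, uint32_t granule)
{
	return len != 0 && start % granule == 0 && len % granule == 0;
}
```

For example, an 8 KiB range starting at 0x3000 is not 8 KiB aligned, so under the size-aligned rule the smallest single region that covers it starts at 0 and spans 32 KiB, exposing 24 KiB that was never mapped. Under a 64-byte granule the same range can be described exactly, which is the flexibility Mike refers to.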
Re: [uClinux-dev] [PATCH 1/3] MPU support
On Thursday, August 26, 2010 21:40:13 Steve Longerbeam wrote:
> On 08/26/2010 06:07 PM, Mike Frysinger wrote:
>> have you done performance measurements to see the overhead with the MPU turned on in your scheme compared to off ? doing something like a ffmpeg decode to another file. if the performance trade offs of your current scheme (per-mapping) is significant compared to the classic per-page, then it is worth while to extend the MPU Kconfig option so people can select per-page or per-mapping schema.
>
> yes, if performance degrades a lot for per-page compared to my current scheme, that would be worthwhile to offer both options. OTOH, other people may have different requirements (better protection being more important than memory footprint or performance, or vice-versa). So it might make sense to offer both options anyway.

iirc, in the tests we did, doing a cpu intensive task didnt suffer all that much. but doing a memory intensive task (like ffmpeg decoding), we saw a ~10x slowdown :(. so we default it to off but keep it around for debugging since the performance is certainly good enough for that.

however, we also didnt profile the whole stack, so there might be some places we can squeeze a bit more performance out.
-mike