Re: ARM: vmsplit 4g/4g
Hi Linus,

On Mon, Jun 15, 2020 at 11:11:04AM +0200, Linus Walleij wrote:

> OK I would be very happy to look at it so I can learn a bit about the
> hands-on and general approach here. Just send it to this address
> directly and I will look!

Have sent it

> > For the next 3 weeks, right now, i cannot say whether i would be able
> > to spend time on it, perhaps might be possible, but only during that
> > time i will know.
>
> I'm going for vacation the next 2 weeks or so, but then it'd be great if
> we can start looking at this in-depth!

Yes for me too

Regards
afzal
Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()
Hi,

On Sun, Jun 14, 2020 at 06:51:43PM +0530, afzal mohammed wrote:

> It is MB/s for copying one file to another via user space buffer, i.e.
> the value coreutils 'dd' shows w/ status=progress (here it was busybox
> 'dd', so instead it was enabling a compile time option)

Just for correctness, status=progress is not required, it's there in
the default 3rd line of coreutils 'dd' o/p

Regards
afzal
Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()
Hi,

On Sat, Jun 13, 2020 at 10:45:33PM +0200, Arnd Bergmann wrote:

> 4% boot time increase sounds like a lot, especially if that is only for
> copy_from_user/copy_to_user. In the end it really depends on how well
> get_user()/put_user() and small copies can be optimized in the end.

i mentioned the worst case (happened only once), normally it was in the
range 2-3%

> From the numbers you measured, it seems the beaglebone currently needs
> an extra ~6µs or 3µs per copy_to/from_user() call with your patch,
> depending on what your benchmark was (MB/s for just reading or writing
> vs MB/s for copying from one file to another through a user space
> buffer).

It is MB/s for copying one file to another via user space buffer, i.e.
the value coreutils 'dd' shows w/ status=progress (here it was busybox
'dd', so instead it was enabling a compile time option)

> but if you want to test what the overhead is, you could try changing
> /dev/zero (or a different chardev like it) to use a series of
> put_user(0, u32uptr++) in place of whatever it has, and then replace the
> 'str' instruction with dummy writes to ttbr0 using the value it already
> has, like:
>
>	mcr	p15, 0, %0, c2, c0, 0	/* set_ttbr0() */
>	isb	/* prevent speculative access to kernel table */
>	str	%1, [%2], 0	/* write 32 bit to user space */
>	mcr	p15, 0, %0, c2, c0, 0	/* set_ttbr0() */
>	isb	/* prevent speculative access to user table */
>
> It would be interesting to compare it to the overhead of a
> get_user_page_fast() based implementation.

i have to relocate & be on quarantine couple of weeks, so i will
temporarily stop here, otherwise might end up in roadside. Reading
feedbacks from everyone, some of it i could grasp only bits & pieces,
familiarizing more w/ mm & vfs would help me add value better to the
goal/discussion.

Linus Walleij, if you wish to explore things, feel free, right now
don't know how my connectivity would be for next 3 weeks.

Regards
afzal
Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()
Hi,

On Sat, Jun 13, 2020 at 02:15:52PM +0100, Russell King - ARM Linux admin wrote:
> On Sat, Jun 13, 2020 at 05:34:32PM +0530, afzal mohammed wrote:

> > i think C library cuts any size read, write to page size (if it
> > exceeds) & invokes the system call.

> You can't make that assumption about read(2). stdio in the C library
> may read a page size of data at a time, but programs are allowed to
> call read(2) directly, and the C library will pass such a call straight
> through to the kernel. So, if userspace requests a 16k read via
> read(2), then read(2) will be invoked covering 16k.
>
> As an extreme case, for example:
>
> $ strace -e read dd if=/dev/zero of=/dev/null bs=1048576 count=1
> read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576

Okay. Yes, observed that dd is passing whatever is the 'bs' to Kernel
and from the 'dd' sources (of busybox), it is invoking read system call
directly passing 'bs', so it is the tmpfs read that is splitting it to
page size as mentioned by Arnd.

Regards
afzal
Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()
Hi,

On Sat, Jun 13, 2020 at 01:56:15PM +0100, Al Viro wrote:

> Incidentally, what about get_user()/put_user()? _That_ is where it's
> going to really hurt...

All other uaccess routines are also planned to be added, posting only
copy_{from,to}_user() was to get early feedback (mentioned in the
cover letter)

Regards
afzal
Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()
Hi,

On Sat, Jun 13, 2020 at 02:08:11PM +0300, Andy Shevchenko wrote:
> On Fri, Jun 12, 2020 at 1:20 PM afzal mohammed wrote:

> > +// Started from arch/um/kernel/skas/uaccess.c
>
> Does it mean you will deduplicate it there?

What i meant was, that file was taken as a template & nothing more, at
same time i wanted to give credit to that file, i will explicitly
mention it next time. It is not meant to deduplicate it. Functionality
here is completely different.

In the case here, there would be different virtual address mapping
that CPU will see once in Kernel as compared to user mode. Here a
facility is provided to access the user page, when the current virtual
address mapping of the CPU excludes it. This is for providing full 4G
virtual address to both user & kernel on 32bit ARM to avoid using
highmem or reduce the impact of highmem, i.e. so that Kernel can
address till 4GiB (almost) as lowmem.

Here assumption is that user mapping is not a subset of virtual
address mapped by CPU, but a separate one. Upon Kernel entry ttbr0 is
changed to Kernel lowmem, while upon Kernel exit is changed back to
user pages (ttbrx in ARM, iiuc, equivalent to cr3 in x86).

Now realize that i am unable to put coherently the problem being
attempted to solve here to a person not familiar w/ the issue w/o
taking considerable time. If above explanation is not enough, i will
try to explain later in a better way.

> > +#include
> > +#include
> > +#include
> > +#include
>
> Perhaps ordered?

will take care

> > +static int do_op_one_page(unsigned long addr, int len,
> > +		int (*op)(unsigned long addr, int len, void *arg), void *arg,
> > +		struct page *page)
>
> Maybe typedef for the func() ?

will take care

> > +{
> > +	int n;
> > +
> > +	addr = (unsigned long) kmap_atomic(page) + (addr & ~PAGE_MASK);
>
> I don't remember about this one...

i am not following you here, for my case !CONFIG_64BIT case in that
file was required, hence only it was picked (or rather not deleted)

> > +	size = min(PAGE_ALIGN(addr) - addr, (unsigned long) len);
>
> ...but here seems to me you can use helpers (offset_in_page() or how
> it's called).

i was not aware of it, will use it as required.

> Also consider to use macros like PFN_DOWN(), PFN_UP(), etc in your code.

Okay

> > +	remain = len;
> > +	if (size == 0)
> > +		goto page_boundary;
> > +
> > +	n = do_op_one_page(addr, size, op, arg, *pages);
> > +	if (n != 0) {
> > +		remain = (n < 0 ? remain : 0);
>
> Why duplicate three times (!) this line, if you can move it to under 'out'?

yes better to move there

> > +		goto out;
> > +	}
> > +
> > +	pages++;
> > +	addr += size;
> > +	remain -= size;
> > +
> > +page_boundary:
> > +	if (remain == 0)
> > +		goto out;
> > +	while (addr < ((addr + remain) & PAGE_MASK)) {
> > +		n = do_op_one_page(addr, PAGE_SIZE, op, arg, *pages);
> > +		if (n != 0) {
> > +			remain = (n < 0 ? remain : 0);
> > +			goto out;
> > +		}
> > +
> > +		pages++;
> > +		addr += PAGE_SIZE;
> > +		remain -= PAGE_SIZE;
> > +	}
>
> Sounds like this can be refactored to iterate over pages rather than
> addresses.

Okay, i will check

> > +static int copy_chunk_from_user(unsigned long from, int len, void *arg)
> > +{
> > +	unsigned long *to_ptr = arg, to = *to_ptr;
> > +
> > +	memcpy((void *) to, (void *) from, len);
>
> What is the point in the casting to void *?

The reason it was there was because of copy-paste :), passing unsigned
long as 'void *' or 'const void *' requires casting right ?, or you
meant something else ? now i checked removing the cast, compiler is
abusing me :), says 'makes pointer from integer without a cast'

> > +	num_pages = DIV_ROUND_UP((unsigned long)from + n, PAGE_SIZE) -
> > +		    (unsigned long)from / PAGE_SIZE;
>
> PFN_UP() ?

Okay

> I think you can clean up the code a bit after you will get the main
> functionality working.

Yes, surely, intention was to post proof-of-concept ASAP, perhaps
contents will change drastically in next version so that any
resemblance of arch/um/kernel/skas/uaccess.c might not be there.

Regards
afzal
Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()
Hi,

On Fri, Jun 12, 2020 at 10:07:28PM +0200, Arnd Bergmann wrote:

> I think a lot of usercopy calls are only for a few bytes, though this
> is of course highly workload dependent and you might only care about
> the large ones.

Observation is that max. pages reaching copy_{from,to}_user() is 2,
observed maximum of n (number of bytes) being 1 page size. i think C
library cuts any size read, write to page size (if it exceeds) &
invokes the system call. Max. pages reaching 2, happens when 'n'
crosses page boundary, this has been observed w/ small size request as
well w/ ones of exact page size (but not page aligned). Even w/ dd of
various size >4K, never is the number of pages required to be mapped
going greater than 2 (even w/ 'dd' 'bs=1M')

i have a worry (don't know whether it is an unnecessary one): even if
we improve performance w/ large copy sizes, it might end up in a
sluggishness w.r.t user experience due to most (hence a high amount)
of user copy calls being few bytes & there the penalty being higher.
And benchmark would not be able to detect anything abnormal since
usercopy are being tested on large sizes. Quickly comparing boot-time
on Beagle Bone White, boot time increases by only 4%, perhaps this
worry is irrelevant, but just thought will put it across.

> There is also still hope of optimizing small aligned copies like
>
>	set_ttbr0(user_ttbr);
>	ldm();
>	set_ttbr0(kernel_ttbr);
>	stm();

Hmm, more needs to be done to be in a position to test it.

Regards
afzal
Re: [RFC 0/3] ARM: copy_{from,to}_user() for vmsplit 4g/4g
Hi,

On Fri, Jun 12, 2020 at 09:31:12PM +0530, afzal mohammed wrote:

> >                   512   1K    4K    16K   32K   64K   1M
> >
> > normal            30    46    89    95    90    85    65
> >
> > uaccess_w_memcpy  28.5  45    85    92    91    85    65
> >
> > w/ series         22    36    72    79    78    75    61

For the sake of completeness all in MB/s, w/ various 'dd' 'bs' sizes.

Regards
afzal
Re: [RFC 0/3] ARM: copy_{from,to}_user() for vmsplit 4g/4g
Hi,

On Fri, Jun 12, 2020 at 11:19:23AM -0400, Nicolas Pitre wrote:
> On Fri, 12 Jun 2020, afzal mohammed wrote:

> > Performance wise, results are not encouraging, 'dd' on tmpfs results,

> Could you compare with CONFIG_UACCESS_WITH_MEMCPY as well?

                  512   1K    4K    16K   32K   64K   1M

normal            30    46    89    95    90    85    65

uaccess_w_memcpy  28.5  45    85    92    91    85    65

w/ series         22    36    72    79    78    75    61

There are variations in the range +/-2 in some readings when repeated,
not put above, to keep comparison simple.

Regards
afzal
Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()
Hi,

On Fri, Jun 12, 2020 at 02:02:13PM +0200, Arnd Bergmann wrote:
> On Fri, Jun 12, 2020 at 12:18 PM afzal mohammed wrote:

> > Roughly a one-third drop in performance. Disabling highmem improves
> > performance only slightly.

> There are probably some things that can be done to optimize it,
> but I guess most of the overhead is from the page table operations
> and cannot be avoided.

Ingo's series did a follow_page() first, then as a fallback did it
invoke get_user_pages(), i will try that way as well. Yes, i too feel
get_user_pages_fast() path is the most time consuming, will instrument
& check.

> What was the exact 'dd' command you used, in particular the block size?
> Note that by default, 'dd' will request 512 bytes at a time, so you usually
> only access a single page. It would be interesting to see the overhead with
> other typical or extreme block sizes, e.g. '1', '64', '4K', '64K' or '1M'.

It was the default (512), more test results follows (in MB/s),

            512   1K    4K    16K   32K   64K   1M

w/o series  30    46    89    95    90    85    65

w/ series   22    36    72    79    78    75    61

perf drop   26%   21%   19%   16%   13%   12%   6%

Hmm, results ain't that bad :)

> If you want to drill down into where exactly the overhead is (i.e.
> get_user_pages or kmap_atomic, or something different), using
> 'perf record dd ..', and 'perf report' may be helpful.

Let me dig deeper & try to find out where the major overhead and try
to figure out ways to reduce it. One reason to disable highmem & test
(results mentioned earlier) was to make kmap_atomic() very
lightweight, that was not making much difference, around 3% only.

> > +static int copy_chunk_from_user(unsigned long from, int len, void *arg)
> > +{
> > +	unsigned long *to_ptr = arg, to = *to_ptr;
> > +
> > +	memcpy((void *) to, (void *) from, len);
> > +	*to_ptr += len;
> > +	return 0;
> > +}
> > +
> > +static int copy_chunk_to_user(unsigned long to, int len, void *arg)
> > +{
> > +	unsigned long *from_ptr = arg, from = *from_ptr;
> > +
> > +	memcpy((void *) to, (void *) from, len);
> > +	*from_ptr += len;
> > +	return 0;
> > +}
>
> Will gcc optimize away the indirect function call and inline everything?
> If not, that would be a small part of the overhead.

i think not, based on objdump, i will make these & wherever other
places possible inline & see the difference.

> > +	num_pages = DIV_ROUND_UP((unsigned long)from + n, PAGE_SIZE) -
> > +		    (unsigned long)from / PAGE_SIZE;
>
> Make sure this doesn't turn into actual division operations but uses shifts.
> It might even be clearer here to open-code the shift operation so readers
> can see what this is meant to compile into.

Okay

> > +	pages = kmalloc_array(num_pages, sizeof(*pages), GFP_KERNEL | __GFP_ZERO);
> > +	if (!pages)
> > +		goto end;
>
> Another micro-optimization may be to avoid the kmalloc for the common case,
> e.g. anything with "num_pages <= 64", using an array on the stack.

Okay

> > +	ret = get_user_pages_fast((unsigned long)from, num_pages, 0, pages);
> > +	if (ret < 0)
> > +		goto free_pages;
> > +
> > +	if (ret != num_pages) {
> > +		num_pages = ret;
> > +		goto put_pages;
> > +	}
>
> I think this is technically incorrect: if get_user_pages_fast() only
> gets some of the pages, you should continue with the short buffer and
> return the number of remaining bytes rather than not copying anything.
> I think you did that correctly for a failed kmap_atomic(), but this
> has to use the same logic.

yes, will fix that.

Regards
afzal
Re: ARM: vmsplit 4g/4g
Hi,

On Wed, Jun 10, 2020 at 12:10:21PM +0200, Linus Walleij wrote:
> On Mon, Jun 8, 2020 at 1:09 PM afzal mohammed wrote:

> > Not yet. Yes, i will do the performance evaluation.
> >
> > i am also worried about the impact on performance as these
> > [ get_user_pages() or friends, kmap_atomic() ] are additionally
> > invoked in the copy_{from,to}_user() path now.
>
> I am happy to help!

Thanks Linus

> I am anyway working on MMU-related code (KASan) so I need to be on
> top of this stuff.

i earlier went thr' KASAN series secretly & did learn a thing or two
from that!

> What test is appropriate for this? I would intuitively think hackbench?

'dd', i think, as you mentioned 'hackbench' i will use that as well.

> > Note that this was done on a topic branch for user copy. Changes for
> > kernel static mapping to vmalloc has not been merged with these.
> > Also having kernel lowmem w/ a separate asid & switching at kernel
> > entry/exit b/n user & kernel lowmem by changing ttbr0 is yet to be
> > done. Quite a few things remaining to be done to achieve vmsplit 4g/4g
>
> I will be very excited to look at patches or a git branch once you have
> something you want to show. Also to just understand how you go about
> this.

Don't put too much expectation on me, this is more of a learning for
me. For user copy, the baby steps has been posted (To'ed you). On the
static kernel mapping on vmalloc front, i do not want to post the
patches in the current shape, though git-ized, will result in me
getting mercilessly thrashed in public :). Many of the other platforms
would fail and is not multi-platform friendly. i do not yet have a
public git branch, i can send you the (ugly) patches separately, just
let me know.

> I have several elder systems under my roof

i have only a few low RAM & CPU systems, so that is certainly helpful.

> so my contribution could hopefully be to help and debug any issues

If you would like, we can work together, at the same time keep in mind
that me spending time on it would be intermittent & erratic (though i
am trying to keep a consistent, but slow pace) perhaps making it
difficult to coordinate. Or else i will continue the same way &
request your help when required.

For the next 3 weeks, right now, i cannot say whether i would be able
to spend time on it, perhaps might be possible, but only during that
time i will know.

Regards
afzal
[RFC 3/3] ARM: provide CONFIG_VMSPLIT_4G_DEV for development
Select UACCESS_GUP_KMAP_MEMCPY initially.

Signed-off-by: afzal mohammed
---
 arch/arm/Kconfig | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index c77c93c485a08..ae2687679d7c8 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1326,6 +1326,15 @@ config PAGE_OFFSET
 	default 0xB000 if VMSPLIT_3G_OPT
 	default 0xC000
 
+config VMSPLIT_4G_DEV
+	bool "Experimental changes for 4G/4G user/kernel split"
+	depends on ARM_LPAE
+	select UACCESS_GUP_KMAP_MEMCPY
+	help
+	  Experimental changes during 4G/4G user/kernel split development.
+	  Existing vmsplit config option is used, once development is done,
+	  this would be put as a new choice & _DEV suffix removed.
+
 config NR_CPUS
 	int "Maximum number of CPUs (2-32)"
 	range 2 32
-- 
2.26.2
[RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()
copy_{from,to}_user() uaccess helpers are implemented by user page
pinning, followed by temporary kernel mapping & then memcpy(). This
helps to achieve user page copy when current virtual address mapping
of the CPU excludes user pages.

Performance wise, results are not encouraging, 'dd' on tmpfs results,

ARM Cortex-A8, BeagleBone White (256MiB RAM):
w/o series - ~29.5 MB/s
w/ series - ~20.5 MB/s
w/ series & highmem disabled - ~21.2 MB/s

On Cortex-A15(2GiB RAM) in QEMU:
w/o series - ~4 MB/s
w/ series - ~2.6 MB/s

Roughly a one-third drop in performance. Disabling highmem improves
performance only slightly. 'hackbench' also showed a similar pattern.

uaccess routines using page pinning & temporary kernel mapping is not
something new, it has been done long long ago by Ingo [1] as part of
4G/4G user/kernel mapping implementation on x86, though not merged in
mainline.

[1] https://lore.kernel.org/lkml/Pine.LNX.4.44.0307082332450.17252-10@localhost.localdomain/

Signed-off-by: afzal mohammed
---
 lib/Kconfig                   |   4 +
 lib/Makefile                  |   3 +
 lib/uaccess_gup_kmap_memcpy.c | 162 ++
 3 files changed, 169 insertions(+)
 create mode 100644 lib/uaccess_gup_kmap_memcpy.c

diff --git a/lib/Kconfig b/lib/Kconfig
index 5d53f9609c252..dadf4f6cc391d 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -622,6 +622,10 @@ config ARCH_HAS_MEMREMAP_COMPAT_ALIGN
 config UACCESS_MEMCPY
 	bool
 
+# pin page + kmap_atomic + memcpy for user copies, intended for vmsplit 4g/4g
+config UACCESS_GUP_KMAP_MEMCPY
+	bool
+
 config ARCH_HAS_UACCESS_FLUSHCACHE
 	bool

diff --git a/lib/Makefile b/lib/Makefile
index 685aee60de1d5..bc457f85e391a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -309,3 +309,6 @@ obj-$(CONFIG_OBJAGG) += objagg.o
 
 # KUnit tests
 obj-$(CONFIG_LIST_KUNIT_TEST) += list-test.o
+
+# uaccess
+obj-$(CONFIG_UACCESS_GUP_KMAP_MEMCPY) += uaccess_gup_kmap_memcpy.o

diff --git a/lib/uaccess_gup_kmap_memcpy.c b/lib/uaccess_gup_kmap_memcpy.c
new file mode 100644
index 0..1536762df1fd5
--- /dev/null
+++ b/lib/uaccess_gup_kmap_memcpy.c
@@ -0,0 +1,162 @@
+// SPDX-License-Identifier: GPL-2.0
+// Started from arch/um/kernel/skas/uaccess.c
+
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+static int do_op_one_page(unsigned long addr, int len,
+		int (*op)(unsigned long addr, int len, void *arg), void *arg,
+		struct page *page)
+{
+	int n;
+
+	addr = (unsigned long) kmap_atomic(page) + (addr & ~PAGE_MASK);
+	n = (*op)(addr, len, arg);
+	kunmap_atomic((void *)addr);
+
+	return n;
+}
+
+static long buffer_op(unsigned long addr, int len,
+		int (*op)(unsigned long, int, void *), void *arg,
+		struct page **pages)
+{
+	long size, remain, n;
+
+	size = min(PAGE_ALIGN(addr) - addr, (unsigned long) len);
+	remain = len;
+	if (size == 0)
+		goto page_boundary;
+
+	n = do_op_one_page(addr, size, op, arg, *pages);
+	if (n != 0) {
+		remain = (n < 0 ? remain : 0);
+		goto out;
+	}
+
+	pages++;
+	addr += size;
+	remain -= size;
+
+page_boundary:
+	if (remain == 0)
+		goto out;
+	while (addr < ((addr + remain) & PAGE_MASK)) {
+		n = do_op_one_page(addr, PAGE_SIZE, op, arg, *pages);
+		if (n != 0) {
+			remain = (n < 0 ? remain : 0);
+			goto out;
+		}
+
+		pages++;
+		addr += PAGE_SIZE;
+		remain -= PAGE_SIZE;
+	}
+	if (remain == 0)
+		goto out;
+
+	n = do_op_one_page(addr, remain, op, arg, *pages);
+	if (n != 0) {
+		remain = (n < 0 ? remain : 0);
+		goto out;
+	}
+
+	return 0;
+out:
+	return remain;
+}
+
+static int copy_chunk_from_user(unsigned long from, int len, void *arg)
+{
+	unsigned long *to_ptr = arg, to = *to_ptr;
+
+	memcpy((void *) to, (void *) from, len);
+	*to_ptr += len;
+	return 0;
+}
+
+static int copy_chunk_to_user(unsigned long to, int len, void *arg)
+{
+	unsigned long *from_ptr = arg, from = *from_ptr;
+
+	memcpy((void *) to, (void *) from, len);
+	*from_ptr += len;
+	return 0;
+}
+
+unsigned long gup_kmap_copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	struct page **pages;
+	int num_pages, ret, i;
+
+	if (uaccess_kernel()) {
+		memcpy(to, (__force void *)from, n);
+		return 0;
+	}
+
+	num_pages = DIV_ROUND_UP((unsigned long)from + n, PAGE_SIZE) -
+		    (unsigned long)from / PAGE_SIZE;
+	pages = kmalloc_array(num_pages, sizeof(*pages), GFP_KERNEL | __GFP_ZERO);
+
[RFC 2/3] ARM: uaccess: let UACCESS_GUP_KMAP_MEMCPY enabling
Turn off existing raw_copy_{from,to}_user() using
arm_copy_{from,to}_user() when CONFIG_UACCESS_GUP_KMAP_MEMCPY is
enabled.

Signed-off-by: afzal mohammed
---
 arch/arm/include/asm/uaccess.h | 20
 arch/arm/kernel/armksyms.c     |  2 ++
 arch/arm/lib/Makefile          |  7 +--
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 98c6b91be4a8a..4a16ae52d4978 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -512,6 +512,15 @@ do { \
 extern unsigned long __must_check
 arm_copy_from_user(void *to, const void __user *from, unsigned long n);
 
+#ifdef CONFIG_UACCESS_GUP_KMAP_MEMCPY
+extern unsigned long __must_check
+gup_kmap_copy_from_user(void *to, const void __user *from, unsigned long n);
+static inline __must_check unsigned long
+raw_copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	return gup_kmap_copy_from_user(to, from, n);
+}
+#else
 static inline unsigned long __must_check
 raw_copy_from_user(void *to, const void __user *from, unsigned long n)
 {
@@ -522,12 +531,22 @@ raw_copy_from_user(void *to, const void __user *from, unsigned long n)
 	uaccess_restore(__ua_flags);
 	return n;
 }
+#endif
 
 extern unsigned long __must_check
 arm_copy_to_user(void __user *to, const void *from, unsigned long n);
 extern unsigned long __must_check
 __copy_to_user_std(void __user *to, const void *from, unsigned long n);
 
+#ifdef CONFIG_UACCESS_GUP_KMAP_MEMCPY
+extern unsigned long __must_check
+gup_kmap_copy_to_user(void __user *to, const void *from, unsigned long n);
+static inline __must_check unsigned long
+raw_copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+	return gup_kmap_copy_to_user(to, from, n);
+}
+#else
 static inline unsigned long __must_check
 raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 {
@@ -541,6 +560,7 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 	return arm_copy_to_user(to, from, n);
 #endif
 }
+#endif
 
 extern unsigned long __must_check
 arm_clear_user(void __user *addr, unsigned long n);

diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index 98bdea51089d5..8c92fe30d1559 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -96,8 +96,10 @@ EXPORT_SYMBOL(mmiocpy);
 
 #ifdef CONFIG_MMU
 EXPORT_SYMBOL(copy_page);
+#ifndef CONFIG_UACCESS_GUP_KMAP_MEMCPY
 EXPORT_SYMBOL(arm_copy_from_user);
 EXPORT_SYMBOL(arm_copy_to_user);
+#endif
 EXPORT_SYMBOL(arm_clear_user);
 
 EXPORT_SYMBOL(__get_user_1);

diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 6d2ba454f25b6..1aeff2cd7b4b3 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -16,8 +16,11 @@ lib-y	:= changebit.o csumipv6.o csumpartial.o \
 	   io-readsb.o io-writesb.o io-readsl.o io-writesl.o \
 	   call_with_stack.o bswapsdi2.o
 
-mmu-y	:= clear_user.o copy_page.o getuser.o putuser.o \
-	   copy_from_user.o copy_to_user.o
+mmu-y	:= clear_user.o copy_page.o getuser.o putuser.o
+
+ifndef CONFIG_UACCESS_GUP_KMAP_MEMCPY
+  mmu-y	+= copy_from_user.o copy_to_user.o
+endif
 
 ifdef CONFIG_CC_IS_CLANG
   lib-y	+= backtrace-clang.o
-- 
2.26.2
[RFC 0/3] ARM: copy_{from,to}_user() for vmsplit 4g/4g
Hi,

copy_{from,to}_user() uaccess helpers are implemented by user page
pinning, followed by temporary kernel mapping & then memcpy(). This
helps to achieve user page copy when current virtual address mapping
of the CPU excludes user pages.

Other uaccess routines are also planned to be modified to make use of
pinning plus kmap_atomic() based on the feedback here.

This is done as one of the initial steps to achieve 4G virtual address
mapping for user as well as Kernel on ARMv7 w/ LPAE. Motive behind
this is to enable Kernel access till 4GiB (almost) as lowmem, thus
helping in removing highmem support for platforms having upto 4GiB
RAM. In the case of platforms having >4GiB, highmem is still required
for the Kernel to be able to access whole RAM.

Performance wise, results are not encouraging, 'dd' on tmpfs results,

ARM Cortex-A8, BeagleBone White (256MiB RAM):
w/o series - ~29.5 MB/s
w/ series - ~20.5 MB/s
w/ series & highmem disabled - ~21.2 MB/s

On Cortex-A15(2GiB RAM) in QEMU:
w/o series - ~4 MB/s
w/ series - ~2.6 MB/s

Roughly a one-third drop in performance. Disabling highmem improves
performance only slightly. 'hackbench' also showed a similar pattern.
Ways to improve the performance has to be explored, if any one has
thoughts on it, please share.

uaccess routines using page pinning & temporary kernel mapping is not
something new, it has been done by Ingo long long ago [1] as part of
4G/4G user/kernel mapping implementation on x86, though not merged in
mainline. Arnd has outlined basic design for vmsplit 4g/4g, uaccess
routines using user page pinning plus kmap_atomic() is one part of
that.

[1] https://lore.kernel.org/lkml/Pine.LNX.4.44.0307082332450.17252-10@localhost.localdomain/

Last 2 patches are only meant for testing first patch.

Regards
afzal

afzal mohammed (3):
  lib: copy_{from,to}_user using gup & kmap_atomic()
  ARM: uaccess: let UACCESS_GUP_KMAP_MEMCPY enabling
  ARM: provide CONFIG_VMSPLIT_4G_DEV for development

 arch/arm/Kconfig               |   9 ++
 arch/arm/include/asm/uaccess.h |  20
 arch/arm/kernel/armksyms.c     |   2 +
 arch/arm/lib/Makefile          |   7 +-
 lib/Kconfig                    |   4 +
 lib/Makefile                   |   3 +
 lib/uaccess_gup_kmap_memcpy.c  | 162 +
 7 files changed, 205 insertions(+), 2 deletions(-)
 create mode 100644 lib/uaccess_gup_kmap_memcpy.c

-- 
2.26.2
Re: ARM: vmsplit 4g/4g
Hi,

On Mon, Jun 08, 2020 at 08:47:27PM +0530, afzal mohammed wrote:
> On Mon, Jun 08, 2020 at 04:43:57PM +0200, Arnd Bergmann wrote:

> > There is another difference: get_user_pages_fast() does not return
> > a vm_area_struct pointer, which is where you would check the access
> > permissions. I suppose those pointers could not be returned to callers
> > that don't already hold the mmap_sem.
>
> Ok, thanks for the details, i need to familiarize better with mm.

i was & now more confused w.r.t checking access permission using
vm_area_struct to deny write on a read only user page.

i have been using get_user_pages_fast() w/ FOLL_WRITE in copy_to_user.
Isn't that sufficient ?, afaiu, get_user_pages_fast() will ensure that
w/ FOLL_WRITE, pte has write permission, else no struct page * is
handed back to the caller.

One of the simplified path which could be relevant in the majority of
the cases that i figured out as follows,

get_user_pages_fast
  internal_user_pages_fast
    gup_pgd_range			[ no mmap_sem acquire path ]
      gup_p4d_range
        gup_pud_range
          gup_pmd_range
            gup_pte_range
              if (!pte_access_permitted(pte, flags & FOLL_WRITE))
                [ causes to return NULL page if access violation ]

    __gup_longterm_unlocked		[ mmap_sem acquire path ]
      get_user_pages_unlocked
        __get_user_pages_locked
          __get_user_pages
            follow_page_mask
              follow_p4d_mask
                follow_pud_mask
                  follow_pmd_mask
                    follow_page_pte
                      if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, flags))
                        [ causes to return NULL page if access violation ]

As far as i could see none of the get_user_pages() callers are passing
struct vm_area_struct ** to get it populated. And Ingo's series eons
ago didn't either pass it or check permission using it (it was passing
a 'write' argument, which i believe corresponds to FOLL_WRITE)

Am i missing something or wrong in the analysis ?

Regards
afzal
Re: ARM: vmsplit 4g/4g
Hi,

On Mon, Jun 08, 2020 at 04:43:57PM +0200, Arnd Bergmann wrote:

> There is another difference: get_user_pages_fast() does not return
> a vm_area_struct pointer, which is where you would check the access
> permissions. I suppose those pointers could not be returned to callers
> that don't already hold the mmap_sem.

Ok, thanks for the details, i need to familiarize better with mm.

Regards
afzal
Re: ARM: vmsplit 4g/4g
Hi,

On Sun, Jun 07, 2020 at 09:26:26PM +0200, Arnd Bergmann wrote:

> I think you have to use get_user_pages() though instead of
> get_user_pages_fast(), in order to be able to check the permission
> bits to prevent doing a copy_to_user() into read-only mappings.

i was not aware of this, is it documented somewhere ?, afaiu,
difference b/n get_user_pages_fast() & get_user_pages() is that fast
version will try to pin pages w/o acquiring mmap_sem if possible.

> Do you want me to review the uaccess patch to look for any missing
> corner cases, or do you want to do the whole set of user access helpers
> first?

i will cleanup and probably post RFC initially for the changes
handling copy_{from,to}_user() to get feedback.

Regards
afzal
Re: ARM: vmsplit 4g/4g
Hi,

[ my previous mail did not make into linux-arm-kernel mailing list,
  got a mail saying it has a suspicious header and that it is waiting
  moderator approval ]

On Sun, Jun 07, 2020 at 05:11:16PM +0100, Russell King - ARM Linux admin wrote:
> On Sun, Jun 07, 2020 at 06:29:32PM +0530, afzal mohammed wrote:

> > get_user_pages_fast() followed by kmap_atomic() & then memcpy() seems
> > to work in principle for user copy.
>
> Have you done any performance evaluation of the changes yet? I think
> it would be a good idea to keep that in the picture. If there's any
> significant regression, then that will need addressing.

Not yet. Yes, i will do the performance evaluation.

i am also worried about the impact on performance as these
[ get_user_pages() or friends, kmap_atomic() ] are additionally
invoked in the copy_{from,to}_user() path now.

Note that this was done on a topic branch for user copy. Changes for
kernel static mapping to vmalloc has not been merged with these. Also
having kernel lowmem w/ a separate asid & switching at kernel
entry/exit b/n user & kernel lowmem by changing ttbr0 is yet to be
done. Quite a few things remaining to be done to achieve vmsplit 4g/4g

Regards
afzal
ARM: vmsplit 4g/4g
Hi,

On Sat, May 16, 2020 at 09:35:57AM +0200, Arnd Bergmann wrote:
> On Sat, May 16, 2020 at 8:06 AM afzal mohammed wrote:

> > Okay, so the conclusion i take is,
> > 1. VMSPLIT 4G/4G have to live alongside highmem
> > 2. For user space copy, do pinning followed by kmap
>
> Right, though kmap_atomic() should be sufficient here
> because it is always a short-lived mapping.

get_user_pages_fast() followed by kmap_atomic() & then memcpy() seems
to work in principle for user copy. Verified in a crude way by
pointing TTBR0 to a location that has user pgd's cleared upon entry to
copy_to_user() & restoring TTBR0 to earlier value after user copying
was done and ensuring boot.

Meanwhile more testing w/ kernel static mapping in vmalloc space
revealed a major issue, w/ LPAE it was not booting. There were issues
related to pmd handling, w/ !LPAE those issues were not present as pmd
is in effect equivalent to pgd. The issues has been fixed, though now
LPAE boots, but feel a kind of fragile, will probably have to revisit
it.

Regards
afzal
Re: ARM: static kernel in vmalloc space
Hi, On Thu, May 14, 2020 at 05:32:41PM +0200, Arnd Bergmann wrote: > Typical distros currently offer two kernels, with and without LPAE, > and they probably don't want to add a third one for LPAE with > either highmem or vmsplit-4g-4g. Having extra user address > space and more lowmem is both going to help users that > still have 8GB configurations. Okay, so the conclusion i take is, 1. VMSPLIT 4G/4G have to live alongside highmem 2. For user space copy, do pinning followed by kmap Regards afzal
Re: ARM: static kernel in vmalloc space
Hi, On Thu, May 14, 2020 at 07:05:45PM +0530, afzal mohammed wrote: > So if we make VMSPLIT_4G_4G, depends on !HIGH_MEMORY (w/ mention of > caveat in Kconfig help that this is meant for platforms w/ <=4GB), then > we can do copy_{from,to}_user the same way currently do, and no need to > do the user page pinning & kmap, right ? i think user page pinning is still required, but kmap can be avoided by using lowmem corresponding to that page, right ?, or am i completely wrong ? Regards afzal
Re: ARM: static kernel in vmalloc space
Hi, On Thu, May 14, 2020 at 02:41:11PM +0200, Arnd Bergmann wrote: > On Thu, May 14, 2020 at 1:18 PM afzal mohammed > wrote: > > 1. SoC w/ LPAE > > 2. TTBR1 (top 256MB) for static kernel, modules, io mappings, vmalloc, > > kmap, fixmap & vectors > Right, these kind of go together because pre-LPAE cannot do the > same TTBR1 split, and they more frequently have conflicting > static mappings. > > It's clearly possible to do something very similar for older chips > (v6 or v7 without LPAE, possibly even v5), it just gets harder > while providing less benefit. Yes, lets have it only for LPAE > > 3. TTBR0 (low 3768MB) for user space & lowmem (kernel lowmem to have > hardcoded 3840/256 split is likely the best compromise of all the hmm,i swallowed 72MB ;) > > 4. for user space to/from copy > > a. pin user pages > > b. kmap user page (can't corresponding lowmem be used instead ?) > - In the long run, there is no need for kmap()/kmap_atomic() after > highmem gets removed from the kernel, but for the next few years > we should still assume that highmem can be used, in order to support > systems like the 8GB highbank, armadaxp, keystone2 or virtual > machines. For lowmem pages (i.e. all pages when highmem is > disabled), kmap_atomic() falls back to page_address() anyway, > so there is no much overhead. Here i have some confusion - iiuc, VMSPLIT_4G_4G is meant to help platforms having RAM > 768M and <= 4GB disable high memory and still be able to access full RAM, so high memory shouldn't come into picture, right ?. And for the above platforms it can continue current VMPSLIT option (the default 3G/1G), no ?, as VMSPLIT_4G_4G can't help complete 8G to be accessible from lowmem. So if we make VMSPLIT_4G_4G, depends on !HIGH_MEMORY (w/ mention of caveat in Kconfig help that this is meant for platforms w/ <=4GB), then we can do copy_{from,to}_user the same way currently do, and no need to do the user page pinning & kmap, right ? 
Only problem i see is that a kernel compiled w/ VMSPLIT_4G_4G is not suitable for >4GB machines, but anyway, iiuc, it was not meant for those machines. And it is not going to affect our current multiplatform setup, as LPAE is not defined in multi_v7.

Regards
afzal
Re: ARM: static kernel in vmalloc space
Hi,

On Tue, May 12, 2020 at 09:49:59PM +0200, Arnd Bergmann wrote:
> Any idea which bit you want to try next?

My plan has been to next post patches for the static kernel migration to vmalloc space (currently the code is rigid, taking the easy route wherever possible & not of high quality), as that feature has an independent existence & adds value by itself. And then start working on other steps towards VMSPLIT_4G_4G. Now that you mentioned other things, i will slowly start those as well.

> Creating a raw_copy_{from,to}_user()
> based on get_user_pages()/kmap_atomic()/memcpy() is probably a good
> next thing to do. I think it can be done one page at a time with only
> checking for get_fs(), access_ok(), and page permissions, while
> get_user()/put_user() need to handle a few more corner cases.

Before starting w/ other things, i would like to align on the high level design. My understanding (mostly based on your comments) is as follows (i currently do not have a firm grip over these things, hope to have it once started w/ the implementation):

1. SoC w/ LPAE
2. TTBR1 (top 256MB) for static kernel, modules, io mappings, vmalloc, kmap, fixmap & vectors
3. TTBR0 (low 3768MB) for user space & lowmem (kernel lowmem to have separate ASID)
4. for user space to/from copy
   a. pin user pages
   b. kmap user page (can't corresponding lowmem be used instead ?)
   c. copy

Main points are as above, right ?, anything missed ?, or anything more you want to add ?, let me know your opinion.

Regards
afzal
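[ As a side note, the page-at-a-time chunking that steps 4a-4c imply can be sketched in ordinary userspace C. Only the page-splitting arithmetic is real content here; the pinning/mapping steps are stand-in comments, and the PAGE_SIZE value, the helper name and the plain memcpy() are illustrative assumptions, not kernel code: ]

```c
#include <string.h>

#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))

/*
 * Userspace sketch of a gup()/kmap_atomic() based user copy: split
 * [src, src + n) on page boundaries and copy chunk by chunk, never
 * crossing a page in one step.  In the kernel, each iteration would
 * pin and map the user page before the copy.
 */
static size_t copy_chunked(void *dst, const void *src, size_t n)
{
	size_t done = 0;

	while (done < n) {
		unsigned long addr = (unsigned long)src + done;
		/* bytes left in the current page */
		size_t in_page = PAGE_SIZE - (addr & ~PAGE_MASK);
		size_t chunk = n - done < in_page ? n - done : in_page;

		/* kernel: get_user_pages_fast() + kmap_atomic() here */
		memcpy((char *)dst + done, (const char *)src + done, chunk);
		/* kernel: kunmap_atomic() + put_page() here */

		done += chunk;
	}
	return done;
}
```

[ Each iteration stays within one page, which is what lets the kernel side map just one user page at a time with kmap_atomic(). ]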
Re: ARM: static kernel in vmalloc space
Hi, On Mon, May 11, 2020 at 05:29:29PM +0200, Arnd Bergmann wrote: > What do you currently do with the module address space? In the current setup, module address space was untouched, i.e. virtual address difference b/n text & module space is far greater than 32MB, at least > (2+768+16)MB and modules can't be loaded unless ARM_MODULE_PLTS is enabled (this was checked now) > easiest way is to just always put modules into vmalloc space, as we already > do with CONFIG_ARM_MODULE_PLTS when the special area gets full, > but that could be optimized once the rest works. Okay Regards afzal
ARM: static kernel in vmalloc space (was Re: [PATCH 0/3] Highmem support for 32-bit RISC-V)
Hi,

Kernel now boots to prompt w/ static kernel mapping moved to vmalloc space. Changes currently done have a couple of platform specific things; these have to be modified to make it multiplatform friendly (also to be taken care of is the ARM_PATCH_PHYS_VIRT case). Module address space has to be taken care of as well. Logs follow.

Regards
afzal

[0.00] Booting Linux on physical CPU 0x0
[0.00] Linux version 5.7.0-rc1-00043-ge8ffd99475b9c (afzal@afzalpc) (gcc version 8.2.0 (GCC_MA), GNU ld (GCC_MA) 2.31.1) #277 SMP Mon May 11 18:16:51 IST 2020
[0.00] CPU: ARMv7 Processor [412fc0f1] revision 1 (ARMv7), cr=10c5387d
[0.00] CPU: div instructions available: patching division code
[0.00] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[0.00] OF: fdt: Machine model: V2P-CA15
[0.00] printk: bootconsole [earlycon0] enabled
[0.00] Memory policy: Data cache writealloc
[0.00] efi: UEFI not found.
[0.00] Reserved memory: created DMA memory pool at 0x1800, size 8 MiB
[0.00] OF: reserved mem: initialized node vram@1800, compatible id shared-dma-pool
[0.00] percpu: Embedded 20 pages/cpu s49164 r8192 d24564 u81920
[0.00] Built 1 zonelists, mobility grouping on. Total pages: 522751
[0.00] Kernel command line: console=ttyAMA0,115200 rootwait root=/dev/mmcblk0 earlyprintk
[0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
[0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
[0.00] mem auto-init: stack:off, heap alloc:off, heap free:off
[0.00] Memory: 2057032K/2097148K available (12288K kernel code, 1785K rwdata, 5188K rodata, 2048K init, 403K bss, 40116K reserved, 0K cma-reserved, 1310716K highmem)
[0.00] Virtual kernel memory layout:
[0.00]     vector  : 0x - 0x1000 (   4 kB)
[0.00]     fixmap  : 0xffc0 - 0xfff0 (3072 kB)
[0.00]     vmalloc : 0xf100 - 0xff80 ( 232 MB)
[0.00]     lowmem  : 0xc000 - 0xf000 ( 768 MB)
[0.00]     pkmap   : 0xbfe0 - 0xc000 (   2 MB)
[0.00]     modules : 0xbf00 - 0xbfe0 (  14 MB)
[0.00]     .text   : 0xf1208000 - 0xf1f0 (13280 kB)
[0.00]     .init   : 0xf250 - 0xf270 (2048 kB)
[0.00]     .data   : 0xf270 - 0xf28be558 (1786 kB)
[0.00]     .bss    : 0xf28be558 - 0xf29231a8 ( 404 kB)
[0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[0.00] rcu: Hierarchical RCU implementation.
[0.00] rcu: RCU event tracing is enabled.
[0.00] rcu: RCU restricting CPUs from NR_CPUS=16 to nr_cpu_ids=2.
[0.00] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[0.00] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
[0.00] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
[0.00] random: get_random_bytes called from start_kernel+0x304/0x49c with crng_init=0
[0.000311] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
[0.006788] clocksource: arm,sp804: mask: 0x max_cycles: 0x, max_idle_ns: 1911260446275 ns
[0.008479] Failed to initialize '/bus@800/motherboard/iofpga@3,/timer@12': -22
[0.013414] arch_timer: cp15 timer(s) running at 62.50MHz (virt).
[0.013875] clocksource: arch_sys_counter: mask: 0xff max_cycles: 0x1cd42e208c, max_idle_ns: 881590405314 ns
[0.014610] sched_clock: 56 bits at 62MHz, resolution 16ns, wraps every 4398046511096ns
[0.015199] Switching to timer-based delay loop, resolution 16ns
[0.020168] Console: colour dummy device 80x30
[0.022219] Calibrating delay loop (skipped), value calculated using timer frequency.. 125.00 BogoMIPS (lpj=625000)
[0.026998] pid_max: default: 32768 minimum: 301
[0.028835] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
[0.029319] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
[0.044484] CPU: Testing write buffer coherency: ok
[0.045452] CPU0: Spectre v2: firmware did not set auxiliary control register IBE bit, system vulnerable
[0.057536] /cpus/cpu@0 missing clock-frequency property
[0.058065] /cpus/cpu@1 missing clock-frequency property
[0.058538] CPU0: thread -1, cpu 0, socket 0, mpidr 8000
[0.066972] Setting up static identity map for 0x8030 - 0x803000ac
[0.074772] rcu: Hierarchical SRCU implementation.
[0.083336] EFI services will not be available.
[0.085605] smp: Bringing up secondary CPUs ...
[0.090454] CPU1: thread -1, cpu 1, socket 0, mpidr 8001
[0.090560] CPU1: Spectre v2: firmware did not set auxiliary control register IBE bit, system vulnerable
[0.096711] smp: Brought up 1 node, 2 CPUs
[0.097132] SMP: Total of 2 processors activa
Re: [PATCH] ARM: omap1: fix irq setup
Hi, On Tue, May 05, 2020 at 04:13:48PM +0200, Arnd Bergmann wrote: > A recent cleanup introduced a bug on any omap1 machine that has > no wakeup IRQ, i.e. omap15xx: > Move this code into a separate function to deal with it cleanly. > > Fixes: b75ca5217743 ("ARM: OMAP: replace setup_irq() by request_irq()") > Signed-off-by: Arnd Bergmann Sorry for the mistake and thanks for the fix, Acked-by: afzal mohammed Regards afzal
Re: [PATCH 0/3] Highmem support for 32-bit RISC-V
[ +linux-arm-kernel

Context: This is regarding VMSPLIT_4G_4G support for 32-bit ARM as a possible replacement for highmem. For that, initially, it is being attempted to move the static kernel mapping from lowmem to vmalloc space.

in next reply, i will remove everyone/list !ARM related ]

Hi,

On Sun, May 03, 2020 at 10:20:39PM +0200, Arnd Bergmann wrote:
> Which SoC platform are you running this on? Just making
> sure that this won't conflict with static mappings later.

Versatile Express V2P-CA15 on qemu, qemu options include --smp 2 & 2GB memory. BTW, i could not convince myself why static io mappings are used, except for DEBUG_LL.

> > One problem I see immediately in arm_memblock_init()

Earlier it went past arm_memblock_init(); the issue was clearing the page tables from VMALLOC_START in devicemaps_init() thr' paging_init(), which was like cutting the sitting branch of the tree. Now it is crashing at debug_ll_io_init() of devicemap_init(), and printascii/earlycon was & is being used to debug :). Things are going wrong when it tries to create the mapping for debug_ll. It looks like a conflict with static mapping, which you mentioned above; at the same time i am not seeing the kernel static mapping at the same virtual address, need to dig deeper. Also tried removing DEBUG_LL, there is a deafening silence in the console ;)

> is that it uses
> __pa() to convert from virtual address in the linear map to physical,
> but now you actually pass an address that is in vmalloc rather than
> the linear map.

__virt_to_phys_nodebug(), which does the actual work on __pa() invocation, has been modified to handle that case (ideas lifted from ARM64's implementation), though currently it is a hack as below (and applicable only for the ARM_PATCH_PHYS_VIRT disabled case), other hacks being VMALLOC_OFFSET set to 0 and adjusting vmalloc size.
static inline phys_addr_t __virt_to_phys_nodebug(unsigned long x)
{
	phys_addr_t __x = (phys_addr_t)x;

	if (__x >= 0xf000)
		return __x - KIMAGE_OFFSET + PHYS_OFFSET;
	else
		return __x - PAGE_OFFSET + PHYS_OFFSET;
}

Regards
afzal
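[ To see the two-range translation at work, here is a small runnable rendering of the same shape with made-up example constants — the PHYS_OFFSET, PAGE_OFFSET and KIMAGE_OFFSET values below are illustrative assumptions, not the real kernel layout: ]

```c
/* Illustrative, assumed values -- NOT the kernel's actual layout. */
#define PHYS_OFFSET   0x80000000UL  /* assumed RAM base */
#define PAGE_OFFSET   0xc0000000UL  /* assumed start of linear lowmem map */
#define KIMAGE_OFFSET 0xf1000000UL  /* assumed kernel-image base in vmalloc space */

/*
 * Same shape as the __virt_to_phys_nodebug() hack above: addresses in
 * the kernel-image region translate via KIMAGE_OFFSET, everything else
 * is assumed to be in the linear map and translates via PAGE_OFFSET.
 */
static unsigned long virt_to_phys_sketch(unsigned long x)
{
	if (x >= KIMAGE_OFFSET)
		return x - KIMAGE_OFFSET + PHYS_OFFSET;
	return x - PAGE_OFFSET + PHYS_OFFSET;
}
```

[ The point being that once the image lives in vmalloc space, a single linear offset no longer covers it, so __pa() has to pick an offset per range. ]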
Re: [PATCH 0/3] Highmem support for 32-bit RISC-V
Hi Arnd,

> On Tue, Apr 14, 2020 at 09:29:46PM +0200, Arnd Bergmann wrote:
> > Another thing to try early is to move the vmlinux virtual address
> > from the linear mapping into vmalloc space. This does not require
> > LPAE either, but it only works on relatively modern platforms that
> > don't have conflicting fixed mappings there.

i have started by attempting to move the static kernel mapping from lowmem to vmalloc space. At boot, execution so far has gone past assembly & reached C, to be specific, arm_memblock_init [in setup_arch()]; currently debugging the hang that happens after that point.

To make things easier in the beginning, ARM_PATCH_PHYS_VIRT is disabled & a platform specific PHYS_OFFSET is fed; this is planned to be fixed once it boots.

[ i will probably start a new thread or hopefully RFC on LAKML ]

Regards
afzal
Re: [PATCHv2 5/5] arm64: allwinner: a64: Add support for TERES-I laptop
Hi Maxime, On Sun, Mar 18, 2018 at 09:22:51PM +0100, Maxime Ripard wrote: > The first part is supposed to be the name of the boards. I did sed > s/leds/teres-i/, and applied, together with all the patches but the > PWM (so I had to drop the backlight node as well). > > Please coordinate with Andre about who should send the PWM support. Assuming that these patches were applied to your sunxi/dt64-for-4.17 branch, since PWM support patch is missing, there is a build error, arch/arm64/boot/dts/allwinner/sun50i-a64-teres-i.dts:129.1-5 Label or path pwm not found Diff at the end cures it. (there is another H6 pine 64 DT build error related to header file missing) afzal --->8--- diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-teres-i.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-teres-i.dts index b3c7ef6b6fe5..d9baab3dc96b 100644 --- a/arch/arm64/boot/dts/allwinner/sun50i-a64-teres-i.dts +++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-teres-i.dts @@ -126,12 +126,6 @@ status = "okay"; }; -&pwm { - pinctrl-names = "default"; - pinctrl-0 = <&pwm_pin>; - status = "okay"; -}; - &ohci1 { status = "okay"; };
Re: [PATCHv2 5/5] arm64: allwinner: a64: Add support for TERES-I laptop
Hi,

On Fri, Mar 16, 2018 at 12:07:53PM +0530, afzal mohammed wrote:
> Received only patch 4 & 5 in my inbox, receive path was via
> linux-kernel rather than linux-arm-kernel, but in both archives all
> patches are seen (though threading seems not right), probably missing
> patches are due to issue gmail have with LKML,

Cover letter plus patches 1-3 were swallowed by the spam filter; even your reply to me on the v1 cover letter subthread was. dunno whether it has something to do with your mail header contents.

afzal
Re: [PATCHv2 5/5] arm64: allwinner: a64: Add support for TERES-I laptop
Hi, On Thu, Mar 15, 2018 at 04:25:10PM +, Harald Geyer wrote: > The TERES-I is an open hardware laptop built by Olimex using the > Allwinner A64 SoC. > > Add the board specific .dts file, which includes the A64 .dtsi and > enables the peripherals that we support so far. > > Signed-off-by: Harald Geyer Received only patch 4 & 5 in my inbox, receive path was via linux-kernel rather than linux-arm-kernel, but in both archives all patches are seen (though threading seems not right), probably missing patches are due to issue gmail have with LKML, so had to pull the series from patchwork, for the series, Tested-by: afzal mohammed afzal
Re: arm64: allwinner: Add support for TERES I laptop
Hi, On Thu, Mar 15, 2018 at 10:36:06PM +0530, afzal mohammed wrote: > Thanks for the patches > > w/ defconfig could reach to prompt via serial console using audio > jack. > > And just by enabling PWM_SUN4I & FB_SIMPLE, laptop could function > standalone as well. > > Suggestions (feel free to ignore): > > 1. seems currently only review comment pending is on simple > framebuffer, perhaps you can proceed removing just that so that a > basic bootable system can be achieved at the earliest (iiuc, anyway > drm would be the final solution for display) > > 2. in next revision (if), may be you can put keywords DIY and/or Open > Hardware (irrespective of whatever exactly that means) Laptop in the > subject itself, that might bring more interest/eyeballs, especially at > this time of ME & so on. Realizing now that your v2 patches & above mail crossed. afzal
Re: [PATCH 00/16] remove eight obsolete architectures
Hi,

On Thu, Mar 15, 2018 at 10:56:48AM +0100, Arnd Bergmann wrote:
> On Thu, Mar 15, 2018 at 10:42 AM, David Howells wrote:
> > Do we have anything left that still implements NOMMU?

Please don't kill !MMU.

> Yes, plenty.
> I've made an overview of the remaining architectures for my own reference[1].
> The remaining NOMMU architectures are:
>
> - arch/arm has ARMv7-M (Cortex-M microcontroller), which is actually
> gaining traction

ARMv7-R as well, and it also seems ARM is coming up with more !MMU's - v8-M, v8-R. In addition, though only of academic interest, ARM MMU capable platforms can run !MMU Linux.

afzal

> - arch/sh has an open-source J2 core that was added not that long ago,
> it seems to be the only SH compatible core that anyone is working on.
> - arch/microblaze supports both MMU/NOMMU modes (most use an MMU)
> - arch/m68k supports several NOMMU targets, both the coldfire SoCs and the
> classic processors
> - c6x has no MMU
Re: arm64: allwinner: Add support for TERES I laptop
Hi,

On Mon, Mar 12, 2018 at 04:10:45PM +, Harald Geyer wrote:
> This series adds support for the TERES I open hardware laptop produced
> by olimex. With these patches and a bootloader capable of setting up
> simple framebuffer the laptop is quite useable.

Thanks for the patches.

w/ defconfig, could reach the prompt via serial console using the audio jack. And just by enabling PWM_SUN4I & FB_SIMPLE, the laptop could function standalone as well.

Suggestions (feel free to ignore):

1. seems currently the only review comment pending is on simple framebuffer; perhaps you can proceed removing just that, so that a basic bootable system can be achieved at the earliest (iiuc, anyway drm would be the final solution for display)

2. in the next revision (if), maybe you can put the keywords DIY and/or Open Hardware (irrespective of whatever exactly that means) Laptop in the subject itself, that might bring more interest/eyeballs, especially at this time of ME & so on.

Regards
afzal
Re: [tip:x86/pti] x86/speculation: Use IBRS if available before calling into firmware
Hi,

On Sun, Feb 11, 2018 at 11:19:10AM -0800, tip-bot for David Woodhouse wrote:
> x86/speculation: Use IBRS if available before calling into firmware
>
> Retpoline means the kernel is safe because it has no indirect branches.
> But firmware isn't, so use IBRS for firmware calls if it's available.

afaui, retpoline alone means the mitigation is still not enough. Also David W has mentioned [1] that even with retpoline, IBPB is also required (except Sky Lake). If IBPB & IBRS are not supported by ucode, shouldn't the below indicate something on the lines of "Mitigation not enough" ?

> - return sprintf(buf, "%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
> + return sprintf(buf, "%s%s%s%s\n",
> spectre_v2_strings[spectre_v2_enabled],
> boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
> + boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
> spectre_v2_module_string());

On 4.16-rc1, w/ GCC 7.3.0,

/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: __user pointer sanitization
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Full generic retpoline

Here, for the user (at least for me), it is not clear whether the mitigation is enough. In the present system (Ivy Bridge), as a ucode update is not available, IBPB is not printed along with "spectre_v2:Mitigation", so unless i am missing something, till then this system should be considered vulnerable; but for a user not familiar with the details of the issue, that cannot be deduced. Perhaps an additional status field [OKAY,PARTIAL] to Mitigation in sysfs might be helpful.

All these changes are in the air for me, this is from a user perspective, sorry if my feedback seems idiotic.

afzal

[1] lkml.kernel.org/r/1516638426.9521.20.ca...@infradead.org
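[ To make the [OKAY,PARTIAL] suggestion concrete, a toy classifier could look like the below — the rule (OKAY only when both retpoline and IBPB are reported) and the function name are invented here purely for illustration; the kernel implements nothing like this: ]

```c
#include <string.h>

/*
 * Toy classifier for a spectre_v2 sysfs line: report OKAY only when
 * both retpoline and IBPB show up, PARTIAL otherwise.  The rule is a
 * made-up illustration of the "[OKAY,PARTIAL]" idea, not kernel code.
 */
static const char *spectre_v2_status(const char *line)
{
	int retpoline = strstr(line, "retpoline") != NULL;
	int ibpb = strstr(line, "IBPB") != NULL;

	return retpoline && ibpb ? "OKAY" : "PARTIAL";
}
```

[ On the Ivy Bridge system above, the spectre_v2 line lacks ", IBPB", so such a field would have read PARTIAL, which is the point of the suggestion. ]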
Re: [PATCH] doc: memory-barriers: reStructure Text
Hi,

On Thu, Jan 04, 2018 at 11:27:55AM +0100, Markus Heiser wrote:
> IMO symlinks are mostly ending in a mess, URLs are never stable.
> There is a
>
> https://www.kernel.org/doc/html/latest/objects.inv
>
> to handle such requirements. Take a look at *intersphinx*:
>
> http://www.sphinx-doc.org/en/stable/ext/intersphinx.html
>
> to see how it works: Each Sphinx HTML build creates a file named objects.inv
> that contains a mapping from object names to URIs relative to the HTML set's root.
>
> This means articles from external (like lwn articles) has to be recompiled.
> Not perfect, but a first solution.

Thanks for the details.

> I really like them

Initially i was sceptical of rst &, once, instead of hitting the fly, hit "make htmldocs" on the keyboard :), and my opinion about it changed. It was easy to navigate through the various docs & then realized that various (& many) topics were present (yes, they were there earlier also, but one had to dive inside Documentation & search, while viewing the toplevel index.html made them stand out). It was like earlier you had to go after the docs, but now it is the docs coming after you, that is my opinion.

Later, while fighting with memory-barriers.txt, felt that it might be good for it as well to be in that company. And the readability as text is not hurt either. It was thought that the rst conversion could be done quickly, but since this was my first attempt with rst, had to put in some effort to get a not so bad output; even if this patch dies, i am happy to have learnt rst conversion to some extent.

> > Upon trying to understand memory-barriers.txt, i felt that it might be
> > better to have it in PDF/HTML format, thus attempted to convert it to
> > rst. And i see it not being welcomed, hence shelving the conversion.
>
> I think that's a pity.

When one of the authors of the original document objected, i felt it is better to back off. But if there is a consensus, i will proceed.

afzal
Re: [PATCH] doc: memory-barriers: reStructure Text
Hi,

On Thu, Jan 04, 2018 at 09:48:50AM +0800, Boqun Feng wrote:
> > The location chosen is "Documentation/kernel-hacking", i was unsure
> > where this should reside & there was no .rst file in top-level directory
> > "Documentation", so put it into one of the existing folder that seemed
> > to me as not that unsuitable.
> >
> > Other files refer to memory-barrier.txt, those also needs to be
> > adjusted based on where .rst can reside.
>
> How do you plan to handle the external references? For example, the
> following LWN articles has a link this file:
>
> https://lwn.net/Articles/718628/
>
> And changing the name and/or location will break that link, AFAIK.

If it is necessary to handle these, a symlink might help here, i believe.

Upon trying to understand memory-barriers.txt, i felt that it might be better to have it in PDF/HTML format, thus attempted to convert it to rst. And i see it not being welcomed, hence shelving the conversion.

afzal
Re: [PATCH] doc: memory-barriers: reStructure Text
Hi,

On Thu, Jan 04, 2018 at 12:48:28AM +0100, Peter Zijlstra wrote:
> > Let PDF & HTML's be created out of memory-barriers Text by
> > reStructuring.
>
> So I hate this rst crap with a passion, so NAK from me.

Okay, the outcome is exactly as was feared. Abandoning the patch, let this be > /dev/null

afzal
[PATCH] doc: memory-barriers: reStructure Text
Let PDF & HTML's be created out of memory-barriers Text by reStructuring.

The reStructuring done was:

1. Section headers modification, lower header case except start
2. Removal of manual index (contents section), since it now gets created automatically for html/pdf
3. Internal cross references for easy navigation
4. Alignment adjustments
5. Strong emphasis made wherever there was emphasis earlier (through other ways); strong was chosen as normal emphasis showed in italics, which was felt to be not enough, & strong showed it in bold
6. ASCII text & code snippets in literal blocks
7. Backquotes for inline instances in the paragraphs where they are expressed not in English, but in C, pseudo-code, file paths etc.
8. Notes section created out of the earlier notes
9. Manual numbering replaced by auto-numbering
10. Bibliography (References section) made such that it can be cross-linked

Signed-off-by: afzal mohammed
---
Hi,

With this change, pdf & html could be generated. There certainly are improvements to be made, but thought of first knowing whether migrating memory-barriers from txt to rst is welcome.

The location chosen is "Documentation/kernel-hacking"; i was unsure where this should reside & there was no .rst file in the top-level directory "Documentation", so put it into one of the existing folders that seemed to me not that unsuitable.

Other files refer to memory-barriers.txt; those also need to be adjusted based on where the .rst can reside.
afzal Documentation/kernel-hacking/index.rst |1 + .../memory-barriers.rst} | 1707 ++-- 2 files changed, 837 insertions(+), 871 deletions(-) rename Documentation/{memory-barriers.txt => kernel-hacking/memory-barriers.rst} (63%) diff --git a/Documentation/kernel-hacking/index.rst b/Documentation/kernel-hacking/index.rst index fcb0eda3cca3..20eb56d02ea5 100644 --- a/Documentation/kernel-hacking/index.rst +++ b/Documentation/kernel-hacking/index.rst @@ -7,3 +7,4 @@ Kernel Hacking Guides hacking locking + memory-barriers diff --git a/Documentation/memory-barriers.txt b/Documentation/kernel-hacking/memory-barriers.rst similarity index 63% rename from Documentation/memory-barriers.txt rename to Documentation/kernel-hacking/memory-barriers.rst index 479ecec80593..60b6a8be8a09 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/kernel-hacking/memory-barriers.rst @@ -1,14 +1,13 @@ - -LINUX KERNEL MEMORY BARRIERS - + +Linux kernel memory barriers + -By: David Howells -Paul E. McKenney -Will Deacon -Peter Zijlstra +:Authors: David Howells , + Paul E. McKenney , + Will Deacon , + Peter Zijlstra -== -DISCLAIMER +Disclaimer == This document is not a specification; it is intentionally (for the sake of @@ -21,10 +20,9 @@ hardware. The purpose of this document is twofold: - (1) to specify the minimum functionality that one can rely on for any - particular barrier, and - - (2) to provide a guide as to how to use the barriers that are available. +* to specify the minimum functionality that one can rely on for any + particular barrier +* to provide a guide as to how to use the barriers that are available Note that an architecture can provide more than the minimum requirement for any particular barrier, but if the architecture provides less than @@ -35,78 +33,10 @@ architecture because the way that arch works renders an explicit barrier unnecessary in that case. - -CONTENTS - - - (*) Abstract memory access model. - - - Device operations. - - Guarantees. 
- - (*) What are memory barriers? - - - Varieties of memory barrier. - - What may not be assumed about memory barriers? - - Data dependency barriers. - - Control dependencies. - - SMP barrier pairing. - - Examples of memory barrier sequences. - - Read memory barriers vs load speculation. - - Multicopy atomicity. - - (*) Explicit kernel barriers. - - - Compiler barrier. - - CPU memory barriers. - - MMIO write barrier. - - (*) Implicit kernel memory barriers. - - - Lock acquisition functions. - - Interrupt disabling functions. - - Sleep and wake-up functions. - - Miscellaneous functions. - - (*) Inter-CPU acquiring barrier effects. - - - Acquires vs memory accesses. - - Acquires vs I/O accesses. - - (*) Where are memory barriers needed? - - - Interprocessor interaction. - - Atomic operations. - - Accessing devices. - - Interrupts. - - (*) Kernel I/O barrier effects. - - (*) Assumed minimum execution ordering model. - - (*) The effects of the cpu cache. - - - Cache coherency. - - Cache coherency vs DMA
Re: Prototype patch for Linux-kernel memory model
Hi,

On Fri, Dec 22, 2017 at 09:41:32AM +0530, afzal mohammed wrote:
> On Thu, Dec 21, 2017 at 08:15:02AM -0800, Paul E. McKenney wrote:
> > Have you installed and run the herd tool? Doing so would allow you
> > to experiment with changes to the litmus tests.
>
> Yes, i installed herd tool and then i was at a loss :(, so started
> re-reading the documentation, yet to run any of the tests.

Above was referring to "opam install herdtools7" & the pre-requisites. With the current HEAD of herd, the build fails as below, but it builds fine with the latest tag - 7.47. Could run a couple of tests as well now, thanks.

afzal

herdtools7(master)$ make all
sh ./build.sh $HOME
+ /usr/bin/ocamldep.opt -modules gen/RISCVCompile_gen.ml > gen/RISCVCompile_gen.ml.depends
File "gen/RISCVCompile_gen.ml", line 94, characters 8-9:
Error: Syntax error
Command exited with code 2.
Compilation unsuccessful after building 1439 targets (0 cached) in 00:00:59.
Makefile:4: recipe for target 'all' failed
make: *** [all] Error 10
Re: Prototype patch for Linux-kernel memory model
Hi,

On Thu, Dec 21, 2017 at 08:15:02AM -0800, Paul E. McKenney wrote:
> On Thu, Dec 21, 2017 at 09:00:55AM +0530, afzal mohammed wrote:
> > Since it is now mentioned that r1 can have final value of 0, though it
> > is understood, it might make things crystal clear and for the sake of
> > completeness to also show the non-automatic variable x being
> > initialized to 0.
>
> Here we rely on the C-language and Linux-kernel convention that global
> variables that are not explicitly initialized are initialized to zero.
> (Also the documented behavior of the litmus tests and the herd tool that
> uses them.) So that part should be OK as is.

Okay, that was suggested to bring parity with some of the examples in explanation.txt, where global variables are explicitly initialized to zero; that unconsciously made me feel that litmus tests also follow that pattern, but checking again i realize that litmus tests are not so.

> Nevertheless, thank you for your review and comments!

Thanks for taking the effort to reply.

> Have you installed and run the herd tool? Doing so would allow you
> to experiment with changes to the litmus tests.

Yes, i installed the herd tool and then i was at a loss :(, so started re-reading the documentation, yet to run any of the tests.

afzal
Re: Prototype patch for Linux-kernel memory model
Hi, On Wed, Dec 20, 2017 at 08:45:38AM -0800, Paul E. McKenney wrote: > On Wed, Dec 20, 2017 at 05:01:45PM +0530, afzal mohammed wrote: > > > +It is tempting to assume that CPU0()'s store to x is globally ordered > > > +before CPU1()'s store to z, but this is not the case: > > > + > > > + /* See Z6.0+pooncerelease+poacquirerelease+mbonceonce.litmus. */ > > > + void CPU0(void) > > > + { > > > + WRITE_ONCE(x, 1); > > > + smp_store_release(&y, 1); > > > + } > > > + > > > + void CPU1(void) > > > + { > > > + r1 = smp_load_acquire(y); > > > + smp_store_release(&z, 1); > > > + } > > > + > > > + void CPU2(void) > > > + { > > > + WRITE_ONCE(z, 2); > > > + smp_mb(); > > > + r2 = READ_ONCE(x); > > > + } > > > + > > > +One might hope that if the final value of r1 is 1 and the final value > > > +of z is 2, then the final value of r2 must also be 1, but the opposite > > > +outcome really is possible. > > > > As there are 3 variables to have the values, perhaps, it might be > > clearer to have instead of "the opposite.." - "the final value need > > not be 1" or was that a read between the lines left as an exercise to > > the idiots ;) > > Heh! Good catch, thank you! How about the following for the paragraph > immediately after that litmus test? > > One might hope that if the final value of r0 is 1 and the final > value of z is 2, then the final value of r1 must also be 1, > but it really is possible for r1 to have the final value of 0. > The reason, of course, is that in this version, CPU2() is not > part of the release-acquire chain. This situation is accounted > for in the rules of thumb below. > > I also fixed r1 and r2 to match the names in the actual litmus test. Since it is now mentioned that r1 can have final value of 0, though it is understood, it might make things crystal clear and for the sake of completeness to also show the non-automatic variable x being initialized to 0. Thanks for taking into account my opinion. afzal
Re: Prototype patch for Linux-kernel memory model
Hi, Is this patch not destined for the HEAD of Torvalds? Got that feeling as this was in flight around the merge window & has not yet made it there. On Wed, Nov 15, 2017 at 08:37:49AM -0800, Paul E. McKenney wrote: > diff --git a/tools/memory-model/Documentation/recipes.txt > b/tools/memory-model/Documentation/recipes.txt > +Taking off the training wheels > +============================== > +Release-acquire chains > +---------------------- > +It is tempting to assume that CPU0()'s store to x is globally ordered > +before CPU1()'s store to z, but this is not the case: > + > + /* See Z6.0+pooncerelease+poacquirerelease+mbonceonce.litmus. */ > + void CPU0(void) > + { > + WRITE_ONCE(x, 1); > + smp_store_release(&y, 1); > + } > + > + void CPU1(void) > + { > + r1 = smp_load_acquire(y); > + smp_store_release(&z, 1); > + } > + > + void CPU2(void) > + { > + WRITE_ONCE(z, 2); > + smp_mb(); > + r2 = READ_ONCE(x); > + } > + > +One might hope that if the final value of r1 is 1 and the final value > +of z is 2, then the final value of r2 must also be 1, but the opposite > +outcome really is possible. As there are 3 variables to have the values, perhaps, it might be clearer to have instead of "the opposite.." - "the final value need not be 1" or was that a read between the lines left as an exercise to the idiots ;) afzal > The reason, of course, is that in this > +version, CPU2() is not part of the release-acquire chain. This > +situation is accounted for in the rules of thumb below.
Re: Prototype patch for Linux-kernel memory model
Hi, A trivial & late (sorry) comment, On Wed, Nov 15, 2017 at 08:37:49AM -0800, Paul E. McKenney wrote: > +THE HAPPENS-BEFORE RELATION: hb > +------------------------------- > +Less trivial examples of prop all involve fences. Unlike the simple > +examples above, they can require that some instructions are executed > +out of program order. This next one should look familiar: > + > + int buf = 0, flag = 0; > + > + P0() > + { > + WRITE_ONCE(buf, 1); > + smp_wmb(); > + WRITE_ONCE(flag, 1); > + } > + > + P1() > + { > + int r1; > + int r2; > + > + r1 = READ_ONCE(flag); > + r2 = READ_ONCE(buf); > + } > + > +This is the MP pattern again, with an smp_wmb() fence between the two > +stores. If r1 = 1 and r2 = 0 at the end then there is a prop link > +from P1's second load to its first (backwards!). The reason is > +similar to the previous examples: The value P1 loads from buf gets > +overwritten by P1's store to buf, s/P1's store to buf/P0's store to buf/ afzal > the fence guarantees that the store > +to buf will propagate to P1 before the store to flag does, and the > +store to flag propagates to P1 before P1 reads flag. > + > +The prop link says that in order to obtain the r1 = 1, r2 = 0 result, > +P1 must execute its second load before the first. Indeed, if the load > +from flag were executed first, then the buf = 1 store would already > +have propagated to P1 by the time P1's load from buf executed, so r2 > +would have been 1 at the end, not 0. (The reasoning holds even for > +Alpha, although the details are more complicated and we will not go > +into them.) > + > +But what if we put an smp_rmb() fence between P1's loads? The fence > +would force the two loads to be executed in program order, and it > +would generate a cycle in the hb relation: The fence would create a ppo > +link (hence an hb link) from the first load to the second, and the > +prop relation would give an hb link from the second load to the first. > +Since an instruction can't execute before itself, we are forced to > +conclude that if an smp_rmb() fence is added, the r1 = 1, r2 = 0 > +outcome is impossible -- as it should be.
Re: [PATCH 1/6] ARM: stm32: prepare stm32 family to welcome armv7 architecture
Hi, On Mon, Dec 11, 2017 at 02:40:43PM +0100, Arnd Bergmann wrote: > On Mon, Dec 11, 2017 at 11:25 AM, Linus Walleij > >> This patch prepares the STM32 machine for the integration of Cortex-A > >> based microprocessor (MPU), on top of the existing Cortex-M > >> microcontroller family (MCU). Since both MCUs and MPUs are sharing > >> common hardware blocks we can keep using ARCH_STM32 flag for most of > >> them. If a hardware block is specific to one family we can use either > >> ARCH_STM32_MCU or ARCH_STM32_MPU flag. > To what degree do we need to treat them as separate families > at all then? I wonder if the MCU/MPU distinction is always that > clear along the Cortex-M/Cortex-A separation, > What > exactly would we miss if we do away with the ARCH_STM32_MCU > symbol here? Based on this patch series, the only difference seems to be w.r.t ARM components, not peripherals outside the ARM subsystem. Vybrid VF610 is a similar case, though not identical (it can have both instead of either), and deals w/o extra symbols, 8064887e02fd6 (ARM: vf610: enable Cortex-M4 configuration on Vybrid SoC) > especially if > we ever get to a chip that has both types of cores. Your wish fulfilled, Vybrid VF610 has both A5 & M4F and mainline Linux boots on both (simultaneously as well), and the second Linux support, i.e. on M4, went thr' your keyboard, see above commit :) There are quite a few others as well, TI's AM335x (A8 + M3), AM437x (A9 + M3), AM57x (A15 + M4), but of these Cortex M's, only the one in AM57x can be Linux'able. On the others they are meant for PM with limited resources. > > So yesterdays application processors are todays MCU processors. > > > > I said this on a lecture for control systems a while back and > > stated it as a reason I think RTOSes are not really seeing a bright > > future compared to Linux. > I think there is still lots of room for smaller RTOS in the long run, Me being an electrical engineer & having worked to some extent in motor control on RTOS/no OS (the value of my opinion is questionable though), the thought of handling the same in Linux (even RT) sends shivers down my spine. Here, the case being considered is the type of motor (like permanent magnet ones) where each phase of the motor has to be properly excited during every PWM period (say every 100us, depending on the feedback, algorithm, other synchronization) w/o which the motor that has been told to run might try to fly. This is different from a stepper motor where, if control misbehaves/stops, nothing harmful normally happens. But my opinion is a kind of knee-jerk reaction and based on the prevalent attitude in that field, hmm..., probably i should attempt it first. Regards afzal
Re: vger.kernel.org mail queue issue?
Hi, On Mon, May 01, 2017 at 10:50:57AM -0400, David Miller wrote: > From: afzal mohammed > > On Wed, Jan 11, 2017 at 09:07:35PM -0500, David Miller wrote: > >> From: Florian Fainelli > >> > I am seeing emails being received right now from @vger.kernel.org that > >> > seem to be from this morning according to my mailer. Has anything > >> > changed on vger.kernel.org that could cause that? Other mailing-lists > >> > (e.g: infradead.org) seems to be fine. > > > >> Nope, in fact I've been aggressively removing bouncers lately > >> and trying to keep the system running efficiently. > >> > >> I kind of suspect that google has ramped up their rate limiting > >> settings a little bit on gmail. > >> > >> I'll try to keep an eye out. > > > > Seems gmail again is receiving mails with a delay, the last received > > lk mail has date as 30 Apr 2017 08:23:50 +0300, while here it is > > around 01 May 2017 17:10 +0530. And lkml archives has a lot of mails > > after that. > There is really nothing I can do about this. > > The problem is that GMAIL has extremely restrictive rate limiting. It > really is insufficient for absorbing the rate at which postings are > made on the lists during the busiest times of the day. And when the > rate it exceeded, the gmail accounts in question simply drop postings > for a certain period of time. > > So I have to intentionally back off the rate at which vger.kernel.org > queues up to GMAIL accounts. > > If I let it go at full speed then half of the postings would get > dropped and people would miss content. Thanks much for handling it the way you do now, it at least helps in getting all mails instead of missing some. The last time, even before Florian reported the issue, i was seeing it, but initially thought it was a problem related to my account; tried unsubscribe & subscribe, contacting the list owner etc., none of which helped. Only upon seeing Florian's mail did i realize that it was a generic GMAIL issue. > Complain to GMAIL if you dislike this but I have tried in the past and > they have no intention of increasing their default posting rate > limits. Don't know whether you had made some changes, but now i am able to get mails in realtime. Next time upon seeing this kind of issue i will request GMAIL; irrespective of the outcome, i will do my part. And thanks for taking the time to reply. Regards afzal
Re: vger.kernel.org mail queue issue?
Hi, On Wed, Jan 11, 2017 at 09:07:35PM -0500, David Miller wrote: > From: Florian Fainelli > > I am seeing emails being received right now from @vger.kernel.org that > > seem to be from this morning according to my mailer. Has anything > > changed on vger.kernel.org that could cause that? Other mailing-lists > > (e.g: infradead.org) seems to be fine. > Nope, in fact I've been aggressively removing bouncers lately > and trying to keep the system running efficiently. > > I kind of suspect that google has ramped up their rate limiting > settings a little bit on gmail. > > I'll try to keep an eye out. Seems gmail again is receiving mails with a delay, the last received lk mail has date as 30 Apr 2017 08:23:50 +0300, while here it is around 01 May 2017 17:10 +0530. And lkml archives have a lot of mails after that. With filters based on TO|CC (my problem) & cross posted mails, it took some time to realize the issue; seems the issue has been there for the last few days. Regards afzal
Re: [PATCH] ARM: nommu: access ID_PFR1 only if CPUID scheme
Hi, On Thu, Mar 23, 2017 at 09:37:48PM +1000, Greg Ungerer wrote: > Tested-by: Greg Ungerer Thanks Greg Since there was no negative feedback yet, the change has been deposited in rmk's patch system as 8665/1 Regards afzal
Re: [PATCH] ARM: nommu: access ID_PFR1 only if CPUID scheme
Hi, On Fri, Mar 17, 2017 at 10:10:34PM +0530, afzal mohammed wrote: > Greg upon trying to boot no-MMU Kernel on ARM926EJ reported boot > failure. He root caused it to ID_PFR1 access introduced by the > commit mentioned in the fixes tag below. > > All CP15 processors need not have processor feature registers, only > for architectures defined by CPUID scheme would have it. Hence check > for it before accessing processor feature register, ID_PFR1. > > Fixes: f8300a0b5de0 ("ARM: 8647/2: nommu: dynamic exception base address > setting") > Reported-by: Greg Ungerer > Signed-off-by: afzal mohammed Greg, can i add your Tested-by ? Regards afzal > --- > > Hi Russell, > > It would be good to have the fix go in during -rc, as, > > 1. Culprit commit went in during the last merge window > 2. Though nothing supported in mainline is known to be broken, the > original change needs to be modified to be reliable
[PATCH] ARM: nommu: access ID_PFR1 only if CPUID scheme
Greg upon trying to boot no-MMU Kernel on ARM926EJ reported boot failure. He root caused it to the ID_PFR1 access introduced by the commit mentioned in the fixes tag below. Not all CP15 processors have processor feature registers; only architectures defined by the CPUID scheme have them. Hence check for it before accessing the processor feature register, ID_PFR1. Fixes: f8300a0b5de0 ("ARM: 8647/2: nommu: dynamic exception base address setting") Reported-by: Greg Ungerer Signed-off-by: afzal mohammed --- Hi Russell, It would be good to have the fix go in during -rc, as, 1. Culprit commit went in during the last merge window 2. Though nothing supported in mainline is known to be broken, the original change needs to be modified to be reliable Vladimir, this is being posted as the issue is taken care of at run time. Regards afzal --- arch/arm/mm/nommu.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c index 3b5c7aaf9c76..33a45bd96860 100644 --- a/arch/arm/mm/nommu.c +++ b/arch/arm/mm/nommu.c @@ -303,7 +303,10 @@ static inline void set_vbar(unsigned long val) */ static inline bool security_extensions_enabled(void) { - return !!cpuid_feature_extract(CPUID_EXT_PFR1, 4); + /* Check CPUID Identification Scheme before ID_PFR1 read */ + if ((read_cpuid_id() & 0x000f0000) == 0x000f0000) + return !!cpuid_feature_extract(CPUID_EXT_PFR1, 4); + return 0; } static unsigned long __init setup_vectors_base(void) -- 2.12.0
Re: [PATCH RESEND] ARM: ep93xx: Disable TS-72xx watchdog before uncompressing
Hi, On Thu, Feb 02, 2017 at 12:12:26PM -0800, Florian Fainelli wrote: > The TS-72xx/73xx boards have a CPLD watchdog which is configured to > reset the board after 8 seconds, if the kernel is large enough that this > takes about this time to decompress the kernel, we will encounter a > spurious reboot. so once it reaches Kernel proper, that dog is being killed, right ? iirc, TI AM335x's & AM43x's ROM code too leaves the on-chip watchdog enabled & the bootloader disables it (else once it boots to prompt, it reboots always unless watchdog driver [if present] takes care of it), Lokesh, right ? But yes, that brings a bootloader dependency. Regards afzal
Re: [PATCH v3 0/3] ARM: !MMU: v7-A support, dynamic vectors base handling
Hi, On Wed, Feb 01, 2017 at 10:33:17AM +0000, Vladimir Murzin wrote: > On 31/01/17 19:24, Russell King - ARM Linux wrote: > > On Tue, Jan 31, 2017 at 06:34:46PM +0530, afzal mohammed wrote: > >> ARM core changes to support !MMU Kernel on v7-A MMU processors. > >> > >> Based on the feedback from Russell, it was decided to handle vector > >> base dynamically in C for no-MMU & work towards the goal of > >> removing VECTORS_BASE from Kconfig. > > > > Looks good from my perspective. If Vladimir can reply about patch 2, > > then I think we'll be good to go with these. Thanks. Patch system has been updated with this series along with Vladimir's Tested-by on patch 2. Thanks > My R-class and M-class setups continue to work with this series applied on > top of next-20170201 plus > following fixup for PATCH 2/3 Yes, Russell has applied another patch and the context changes a little. > > -#define VECTORS_BASE UL(0xffff0000) > - > - /* > - * We fix the TCM memories max 32 KiB ITCM resp DTCM at these > - * locations > + #ifdef CONFIG_XIP_KERNEL > + #define KERNEL_START _sdata > + #else > > FWIW: Tested-by: Vladimir Murzin Thanks Regards afzal
[PATCH v3 2/3] ARM: nommu: display vectors base
VECTORS_BASE displays the exception base address. Now on no-MMU, as the exception base address is dynamically estimated, define VECTORS_BASE to the variable holding it. As it is the case, limit the VECTORS_BASE constant definition to MMU. Suggested-by: Russell King Signed-off-by: afzal mohammed --- v3: Simplify by defining VECTORS_BASE to vectors_base v2: A change to accommodate bisectability resolution on patch 1/4 arch/arm/include/asm/memory.h | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index 0b5416fe7709..780549a78937 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -83,8 +83,15 @@ #define IOREMAP_MAX_ORDER 24 #endif +#define VECTORS_BASE UL(0xffff0000) + #else /* CONFIG_MMU */ +#ifndef __ASSEMBLY__ +extern unsigned long vectors_base; +#define VECTORS_BASE vectors_base +#endif + /* * The limitation of user task size can grow up to the end of free ram region. * It is difficult to define and perhaps will never meet the original meaning @@ -111,8 +118,6 @@ #endif /* !CONFIG_MMU */ -#define VECTORS_BASE UL(0xffff0000) - /* * We fix the TCM memories max 32 KiB ITCM resp DTCM at these * locations -- 2.11.0
[PATCH v3 3/3] ARM: nommu: remove Hivecs configuration in asm
Now that the exception base address is handled dynamically for processors with CP15, remove the Hivecs configuration in assembly. Signed-off-by: afzal mohammed Tested-by: Vladimir Murzin --- v3: Vladimir's Tested-by arch/arm/kernel/head-nommu.S | 5 ----- 1 file changed, 5 deletions(-) diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S index 6b4eb27b8758..2e21e08de747 100644 --- a/arch/arm/kernel/head-nommu.S +++ b/arch/arm/kernel/head-nommu.S @@ -152,11 +152,6 @@ __after_proc_init: #ifdef CONFIG_CPU_ICACHE_DISABLE bic r0, r0, #CR_I #endif -#ifdef CONFIG_CPU_HIGH_VECTOR - orr r0, r0, #CR_V -#else - bic r0, r0, #CR_V -#endif mcr p15, 0, r0, c1, c0, 0 @ write control reg #elif defined (CONFIG_CPU_V7M) /* For V7M systems we want to modify the CCR similarly to the SCTLR */ -- 2.11.0
[PATCH v3 1/3] ARM: nommu: dynamic exception base address setting
No-MMU dynamic exception base address configuration on CP15 processors. In the case of low vectors, the decision is based on whether security extensions are enabled & whether the remap vectors to RAM CONFIG option is selected. For no-MMU without CP15, the current default value of 0x0 is retained. Signed-off-by: afzal mohammed Tested-by: Vladimir Murzin --- v3: Vladimir's Tested-by v2: Use existing helpers to detect security extensions Rewrite a CPP step to C for readability arch/arm/mm/nommu.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 50 insertions(+), 2 deletions(-) diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c index 2740967727e2..20ac52579952 100644 --- a/arch/arm/mm/nommu.c +++ b/arch/arm/mm/nommu.c @@ -11,6 +11,7 @@ #include #include +#include #include #include #include @@ -22,6 +23,8 @@ #include "mm.h" +unsigned long vectors_base; + #ifdef CONFIG_ARM_MPU struct mpu_rgn_info mpu_rgn_info; @@ -278,15 +281,60 @@ static void sanity_check_meminfo_mpu(void) {} static void __init mpu_setup(void) {} #endif /* CONFIG_ARM_MPU */ +#ifdef CONFIG_CPU_CP15 +#ifdef CONFIG_CPU_HIGH_VECTOR +static unsigned long __init setup_vectors_base(void) +{ + unsigned long reg = get_cr(); + + set_cr(reg | CR_V); + return 0xffff0000; +} +#else /* CONFIG_CPU_HIGH_VECTOR */ +/* Write exception base address to VBAR */ +static inline void set_vbar(unsigned long val) +{ + asm("mcr p15, 0, %0, c12, c0, 0" : : "r" (val) : "cc"); +} + +/* + * Security extensions, bits[7:4], permitted values, + * 0b0000 - not implemented, 0b0001/0b0010 - implemented + */ +static inline bool security_extensions_enabled(void) +{ + return !!cpuid_feature_extract(CPUID_EXT_PFR1, 4); +} + +static unsigned long __init setup_vectors_base(void) +{ + unsigned long base = 0, reg = get_cr(); + + set_cr(reg & ~CR_V); + if (security_extensions_enabled()) { + if (IS_ENABLED(CONFIG_REMAP_VECTORS_TO_RAM)) + base = CONFIG_DRAM_BASE; + set_vbar(base); + } else if (IS_ENABLED(CONFIG_REMAP_VECTORS_TO_RAM)) { + if (CONFIG_DRAM_BASE != 0) + pr_err("Security extensions not enabled, vectors cannot be remapped to RAM, vectors base will be 0x00000000\n"); + } + + return base; +} +#endif /* CONFIG_CPU_HIGH_VECTOR */ +#endif /* CONFIG_CPU_CP15 */ + void __init arm_mm_memblock_reserve(void) { #ifndef CONFIG_CPU_V7M + vectors_base = IS_ENABLED(CONFIG_CPU_CP15) ? setup_vectors_base() : 0; /* * Register the exception vector page. * some architectures which the DRAM is the exception vector to trap, * alloc_page breaks with error, although it is not NULL, but "0." */ - memblock_reserve(CONFIG_VECTORS_BASE, 2 * PAGE_SIZE); + memblock_reserve(vectors_base, 2 * PAGE_SIZE); #else /* ifndef CONFIG_CPU_V7M */ /* * There is no dedicated vector page on V7-M. So nothing needs to be @@ -310,7 +358,7 @@ void __init sanity_check_meminfo(void) */ void __init paging_init(const struct machine_desc *mdesc) { - early_trap_init((void *)CONFIG_VECTORS_BASE); + early_trap_init((void *)vectors_base); mpu_setup(); bootmem_init(); } -- 2.11.0
[PATCH v3 0/3] ARM: !MMU: v7-A support, dynamic vectors base handling
Hi, ARM core changes to support !MMU Kernel on v7-A MMU processors. Based on the feedback from Russell, it was decided to handle vector base dynamically in C for no-MMU & work towards the goal of removing VECTORS_BASE from Kconfig. Exception base address is dynamically found out in C & configured. This series also does the preparation for CONFIG_VECTORS_BASE removal. Once vector region setup, used by Cortex-R, is made devoid of VECTORS_BASE, it can be removed from Kconfig. [2] already decouples it from Kconfig for MMU. Vladimir's Tested-by on v2 has been removed from [PATCH 2/3] as it has been changed. And as it doesn't affect functionality, the Tested-by has been retained on the other two patches, Vladimir, let me know if not okay. This series has been verified over current mainline plus [1,2] on 1. Vybrid Cosmic+ a. Cortex-M4 - !MMU Kernel b. Cortex-A5 - MMU Kernel. This series also has been verified over Vladimir's series [3] along with [1,2] on 1. Vybrid Cosmic+ a. Cortex-M4 !MMU Kernel b. Cortex-A5 MMU Kernel c. Cortex-A5 !MMU Kernel 2. AM437x IDK a. Cortex-A9 MMU Kernel b. Cortex-A9 !MMU Kernel Regards afzal v3: => Removed [PATCH 1/4] of v2 as it is in -next => Simplify by defining VECTORS_BASE to the variable holding the dynamically calculated exception base address v2: => Fix bisectability issue on !MMU builds => UL suffix on VECTORS_BASE definition => Use existing helpers to detect security extensions => Rewrite a CPP step to C for readability [1] "[PATCH 1/2] ARM: nommu: allow enabling REMAP_VECTORS_TO_RAM" http://lists.infradead.org/pipermail/linux-arm-kernel/2016-December/473593.html (in -next) [2] "[PATCH v2 1/4] ARM: mmu: decouple VECTORS_BASE from Kconfig" http://lists.infradead.org/pipermail/linux-arm-kernel/2017-January/481904.html (in -next) [3] "[RFC v2 PATCH 00/23] Allow NOMMU for MULTIPLATFORM", http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/470966.html (git://linux-arm.org/linux-vm.git nommu-rfc-v2) afzal mohammed (3): ARM: nommu: dynamic exception base address setting ARM: nommu: display vectors base ARM: nommu: remove Hivecs configuration in asm arch/arm/include/asm/memory.h | 9 +++++++-- arch/arm/kernel/head-nommu.S | 5 ----- arch/arm/mm/nommu.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++-- 3 files changed, 57 insertions(+), 9 deletions(-) -- 2.11.0
Re: [PATCH v2 3/4] ARM: nommu: display vectors base
Hi, On Mon, Jan 30, 2017 at 02:03:26PM +0000, Russell King - ARM Linux wrote: > On Sun, Jan 22, 2017 at 08:52:12AM +0530, afzal mohammed wrote: > > The exception base address is now dynamically estimated for no-MMU, > > display it. As it is the case, now limit VECTORS_BASE usage to MMU > > scenario. > > +#define VECTORS_BASE UL(0xffff0000) > > + > > #else /* CONFIG_MMU */ > > > > /* > > @@ -111,8 +113,6 @@ > > > > #endif /* !CONFIG_MMU */ > > > > -#define VECTORS_BASE UL(0xffff0000) > I think adding a definition for VECTORS_BASE in asm/memory.h for nommu: > > extern unsigned long vectors_base; > #define VECTORS_BASE vectors_base The above was required to be enclosed by the below, #ifndef __ASSEMBLY__ #endif Putting it inside the existing #ifndef __ASSEMBLY__ (which encloses other externs) in asm/memory.h would put it alongside not so related definitions as compared to the existing location. And the existing #ifndef __ASSEMBLY__ in asm/memory.h is a bit down, which makes the above stand separately, > > +#ifdef CONFIG_MMU > > MLK(VECTORS_BASE, VECTORS_BASE + PAGE_SIZE), > > +#else > > + MLK(vectors_base, vectors_base + PAGE_SIZE), > > +#endif > > will mean that this conditional becomes unnecessary. > > -#endif > > +#else /* CONFIG_MMU */ > > +extern unsigned long vectors_base; > > +#endif /* CONFIG_MMU */ > > and you don't need this here either. but the above improvements make the patch simpler. Regards afzal
[PATCH] ARM: vf610m4: defconfig: enable EXT4 filesystem
Enable EXT4_FS to have rootfs in EXT[2-4]. Other changes are the result of savedefconfig keeping a minimal config (even without enabling EXT4_FS, these would be present). Signed-off-by: afzal mohammed --- Hi Shawn, i am not sure about the route for this patch, so sending it to you as the Vybrid maintainer. The last (& the only) change to this file was picked by Arnd. Regards afzal arch/arm/configs/vf610m4_defconfig | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/arm/configs/vf610m4_defconfig b/arch/arm/configs/vf610m4_defconfig index aeb2482c492e..b7ecb83a95b6 100644 --- a/arch/arm/configs/vf610m4_defconfig +++ b/arch/arm/configs/vf610m4_defconfig @@ -7,7 +7,6 @@ CONFIG_BLK_DEV_INITRD=y CONFIG_KALLSYMS_ALL=y CONFIG_EMBEDDED=y # CONFIG_MMU is not set -CONFIG_ARM_SINGLE_ARMV7M=y CONFIG_ARCH_MXC=y CONFIG_SOC_VF610=y CONFIG_SET_MEM_PARAM=y @@ -38,5 +37,5 @@ CONFIG_SERIAL_FSL_LPUART_CONSOLE=y CONFIG_MFD_SYSCON=y # CONFIG_HID is not set # CONFIG_USB_SUPPORT is not set +CONFIG_EXT4_FS=y # CONFIG_MISC_FILESYSTEMS is not set -# CONFIG_FTRACE is not set -- 2.11.0
Re: [PATCH 2/4] ARM: nommu: dynamic exception base address setting
Hi, On Fri, Jan 20, 2017 at 09:50:22PM +0530, Afzal Mohammed wrote: > On Thu, Jan 19, 2017 at 01:59:09PM +0000, Vladimir Murzin wrote: > > You can use > > > > cpuid_feature_extract(CPUID_EXT_PFR1, 4) > > > > and add a comment explaining what we are looking for and why. W.r.t comments, i tried to keep them concise, with the C tokens doing a part of it. > Yes, that is better, was not aware of this, did see CPUID_EXT_PFR1 as > an unused macro. > > > +#ifdef CONFIG_CPU_CP15 > > > + vectors_base = setup_vectors_base(); > > > +#endif > > > > alternatively it can be > > > > unsigned long vector_base = IS_ENABLED(CONFIG_CPU_CP15) ? setup_vbar() > > : 0; > > Yes that certainly is better. Have kept the function name as setup_vectors_base() as, in addition to setting up VBAR, the V bit also has to be configured by it - so that the function name remains true to its name. v2 with the changes has been posted. Regards afzal
Re: [PATCH 1/4] ARM: mmu: decouple VECTORS_BASE from Kconfig
Hi, On Thu, Jan 19, 2017 at 02:24:24PM +0000, Russell King - ARM Linux wrote: > On Thu, Jan 19, 2017 at 02:07:39AM +0530, afzal mohammed wrote: > > +#define VECTORS_BASE 0xffff0000 > This should be UL(0xffff0000) This has been taken care of in v2. Regards afzal
[PATCH v2 4/4] ARM: nommu: remove Hivecs configuration in asm
Now that the exception base address is handled dynamically for processors with CP15, remove the Hivecs configuration in assembly. Signed-off-by: afzal mohammed --- arch/arm/kernel/head-nommu.S | 5 ----- 1 file changed, 5 deletions(-) diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S index 6b4eb27b8758..2e21e08de747 100644 --- a/arch/arm/kernel/head-nommu.S +++ b/arch/arm/kernel/head-nommu.S @@ -152,11 +152,6 @@ __after_proc_init: #ifdef CONFIG_CPU_ICACHE_DISABLE bic r0, r0, #CR_I #endif -#ifdef CONFIG_CPU_HIGH_VECTOR - orr r0, r0, #CR_V -#else - bic r0, r0, #CR_V -#endif mcr p15, 0, r0, c1, c0, 0 @ write control reg #elif defined (CONFIG_CPU_V7M) /* For V7M systems we want to modify the CCR similarly to the SCTLR */ -- 2.11.0
[PATCH v2 3/4] ARM: nommu: display vectors base
The exception base address is now dynamically estimated for no-MMU, display it. As it is the case, now limit VECTORS_BASE usage to the MMU scenario. Signed-off-by: afzal mohammed --- v2: A change to accommodate bisectability resolution on patch 1/4 arch/arm/include/asm/memory.h | 4 ++-- arch/arm/mm/init.c | 5 +++++ arch/arm/mm/mm.h | 5 +++-- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index 0b5416fe7709..9ae474bf84fc 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -83,6 +83,8 @@ #define IOREMAP_MAX_ORDER 24 #endif +#define VECTORS_BASE UL(0xffff0000) + #else /* CONFIG_MMU */ /* @@ -111,8 +113,6 @@ #endif /* !CONFIG_MMU */ -#define VECTORS_BASE UL(0xffff0000) - /* * We fix the TCM memories max 32 KiB ITCM resp DTCM at these * locations diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 823e119a5daa..9c68e3aba87c 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -522,7 +522,12 @@ void __init mem_init(void) " .data : 0x%p" " - 0x%p" " (%4td kB)\n" " .bss : 0x%p" " - 0x%p" " (%4td kB)\n", +#ifdef CONFIG_MMU MLK(VECTORS_BASE, VECTORS_BASE + PAGE_SIZE), +#else + MLK(vectors_base, vectors_base + PAGE_SIZE), +#endif + #ifdef CONFIG_HAVE_TCM MLK(DTCM_OFFSET, (unsigned long) dtcm_end), MLK(ITCM_OFFSET, (unsigned long) itcm_end), diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h index ce727d47275c..546f09437fca 100644 --- a/arch/arm/mm/mm.h +++ b/arch/arm/mm/mm.h @@ -79,8 +79,9 @@ struct static_vm { extern struct list_head static_vmlist; extern struct static_vm *find_static_vm_vaddr(void *vaddr); extern __init void add_static_vm_early(struct static_vm *svm); - -#endif +#else /* CONFIG_MMU */ +extern unsigned long vectors_base; +#endif /* CONFIG_MMU */ #ifdef CONFIG_ZONE_DMA extern phys_addr_t arm_dma_limit; -- 2.11.0
[PATCH v2 2/4] ARM: nommu: dynamic exception base address setting
No-MMU dynamic exception base address configuration on CP15 processors. In the case of low vectors, the decision is based on whether security extensions are enabled & whether the remap vectors to RAM CONFIG option is selected. For no-MMU without CP15, the current default value of 0x0 is retained. Signed-off-by: afzal mohammed --- v2: Use existing helpers to detect security extensions Rewrite a CPP step to C for readability arch/arm/mm/nommu.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 50 insertions(+), 2 deletions(-) diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c index 2740967727e2..20ac52579952 100644 --- a/arch/arm/mm/nommu.c +++ b/arch/arm/mm/nommu.c @@ -11,6 +11,7 @@ #include #include +#include #include #include #include @@ -22,6 +23,8 @@ #include "mm.h" +unsigned long vectors_base; + #ifdef CONFIG_ARM_MPU struct mpu_rgn_info mpu_rgn_info; @@ -278,15 +281,60 @@ static void sanity_check_meminfo_mpu(void) {} static void __init mpu_setup(void) {} #endif /* CONFIG_ARM_MPU */ +#ifdef CONFIG_CPU_CP15 +#ifdef CONFIG_CPU_HIGH_VECTOR +static unsigned long __init setup_vectors_base(void) +{ + unsigned long reg = get_cr(); + + set_cr(reg | CR_V); + return 0xffff0000; +} +#else /* CONFIG_CPU_HIGH_VECTOR */ +/* Write exception base address to VBAR */ +static inline void set_vbar(unsigned long val) +{ + asm("mcr p15, 0, %0, c12, c0, 0" : : "r" (val) : "cc"); +} + +/* + * Security extensions, bits[7:4], permitted values, + * 0b0000 - not implemented, 0b0001/0b0010 - implemented + */ +static inline bool security_extensions_enabled(void) +{ + return !!cpuid_feature_extract(CPUID_EXT_PFR1, 4); +} + +static unsigned long __init setup_vectors_base(void) +{ + unsigned long base = 0, reg = get_cr(); + + set_cr(reg & ~CR_V); + if (security_extensions_enabled()) { + if (IS_ENABLED(CONFIG_REMAP_VECTORS_TO_RAM)) + base = CONFIG_DRAM_BASE; + set_vbar(base); + } else if (IS_ENABLED(CONFIG_REMAP_VECTORS_TO_RAM)) { + if (CONFIG_DRAM_BASE != 0) + pr_err("Security extensions not enabled, vectors cannot be remapped to RAM, vectors base will be 0x00000000\n"); + } + + return base; +} +#endif /* CONFIG_CPU_HIGH_VECTOR */ +#endif /* CONFIG_CPU_CP15 */ + void __init arm_mm_memblock_reserve(void) { #ifndef CONFIG_CPU_V7M + vectors_base = IS_ENABLED(CONFIG_CPU_CP15) ? setup_vectors_base() : 0; /* * Register the exception vector page. * some architectures which the DRAM is the exception vector to trap, * alloc_page breaks with error, although it is not NULL, but "0." */ - memblock_reserve(CONFIG_VECTORS_BASE, 2 * PAGE_SIZE); + memblock_reserve(vectors_base, 2 * PAGE_SIZE); #else /* ifndef CONFIG_CPU_V7M */ /* * There is no dedicated vector page on V7-M. So nothing needs to be @@ -310,7 +358,7 @@ void __init sanity_check_meminfo(void) */ void __init paging_init(const struct machine_desc *mdesc) { - early_trap_init((void *)CONFIG_VECTORS_BASE); + early_trap_init((void *)vectors_base); mpu_setup(); bootmem_init(); } -- 2.11.0
[PATCH v2 1/4] ARM: mmu: decouple VECTORS_BASE from Kconfig
For MMU configurations, VECTORS_BASE is always 0xffff0000, a macro definition will suffice. For no-MMU, the exception base address is dynamically determined in subsequent patches. To preserve bisectability, now make the macro applicable for the no-MMU scenario too. Thanks to the 0-DAY kernel test infrastructure that found the bisectability issue. This macro will be restricted to the MMU case upon dynamically determining the exception base address for no-MMU. Once the exception address is handled dynamically for no-MMU, VECTORS_BASE can be removed from Kconfig. Suggested-by: Russell King Signed-off-by: afzal mohammed --- v2: Fix bisectability issue on !MMU builds UL suffix on VECTORS_BASE definition arch/arm/include/asm/memory.h | 2 ++ arch/arm/mach-berlin/platsmp.c | 3 ++- arch/arm/mm/dump.c | 5 +++-- arch/arm/mm/init.c | 4 ++-- 4 files changed, 9 insertions(+), 5 deletions(-) diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index 76cbd9c674df..0b5416fe7709 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -111,6 +111,8 @@ #endif /* !CONFIG_MMU */ +#define VECTORS_BASE UL(0xffff0000) + /* * We fix the TCM memories max 32 KiB ITCM resp DTCM at these * locations diff --git a/arch/arm/mach-berlin/platsmp.c b/arch/arm/mach-berlin/platsmp.c index 93f90688db18..578d41031abf 100644 --- a/arch/arm/mach-berlin/platsmp.c +++ b/arch/arm/mach-berlin/platsmp.c @@ -15,6 +15,7 @@ #include #include +#include #include #include @@ -75,7 +76,7 @@ static void __init berlin_smp_prepare_cpus(unsigned int max_cpus) if (!cpu_ctrl) goto unmap_scu; - vectors_base = ioremap(CONFIG_VECTORS_BASE, SZ_32K); + vectors_base = ioremap(VECTORS_BASE, SZ_32K); if (!vectors_base) goto unmap_scu; diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c index 9fe8e241335c..21192d6eda40 100644 --- a/arch/arm/mm/dump.c +++ b/arch/arm/mm/dump.c @@ -18,6 +18,7 @@ #include #include +#include #include struct addr_marker { @@ -31,8 +32,8 @@ static struct addr_marker address_markers[] = { { 0, "vmalloc() Area" }, { VMALLOC_END, "vmalloc() End" }, { FIXADDR_START, "Fixmap Area" }, - { CONFIG_VECTORS_BASE, "Vectors" }, - { CONFIG_VECTORS_BASE + PAGE_SIZE * 2, "Vectors End" }, + { VECTORS_BASE, "Vectors" }, + { VECTORS_BASE + PAGE_SIZE * 2, "Vectors End" }, { -1, NULL }, }; diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 370581aeb871..823e119a5daa 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #include #include @@ -521,8 +522,7 @@ void __init mem_init(void) " .data : 0x%p" " - 0x%p" " (%4td kB)\n" " .bss : 0x%p" " - 0x%p" " (%4td kB)\n", - MLK(UL(CONFIG_VECTORS_BASE), UL(CONFIG_VECTORS_BASE) + - (PAGE_SIZE)), + MLK(VECTORS_BASE, VECTORS_BASE + PAGE_SIZE), #ifdef CONFIG_HAVE_TCM MLK(DTCM_OFFSET, (unsigned long) dtcm_end), MLK(ITCM_OFFSET, (unsigned long) itcm_end), -- 2.11.0
[PATCH v2 0/4] ARM: v7-A !MMU support, prepare CONFIG_VECTORS_BASE removal
Hi, ARM core changes to support !MMU Kernel on v7-A MMU processors. This series also does the preparation for CONFIG_VECTORS_BASE removal. Based on the feedback from Russell, it was decided to handle vector base dynamically in C for no-MMU & work towards the goal of removing VECTORS_BASE from Kconfig. MMU platforms always have exception base address at 0xffff0000, hence a macro was defined and it was decoupled from Kconfig. No-MMU CP15 scenario is handled dynamically in C. Once vector region setup, used by Cortex-R, is made devoid of VECTORS_BASE, it can be removed from Kconfig. This series has been verified over current mainline plus [2] on Vybrid Cosmic+, Cortex-M4 - !MMU Kernel and Cortex-A5 - MMU Kernel. This series also has been verified over Vladimir's series [1] plus [2] on 1. Vybrid Cosmic+ a. Cortex-M4 !MMU Kernel b. Cortex-A5 MMU Kernel c. Cortex-A5 !MMU Kernel 2. AM437x IDK a. Cortex-A9 MMU Kernel b. Cortex-A9 !MMU Kernel Regards afzal v2: => Fix bisectability issue on !MMU builds => UL suffix on VECTORS_BASE definition => Use existing helpers to detect security extensions => Rewrite a CPP step to C for readability [1] "[RFC v2 PATCH 00/23] Allow NOMMU for MULTIPLATFORM", http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/470966.html (git://linux-arm.org/linux-vm.git nommu-rfc-v2) [2] "[PATCH 1/2] ARM: nommu: allow enabling REMAP_VECTORS_TO_RAM" http://lists.infradead.org/pipermail/linux-arm-kernel/2016-December/473593.html (in -next) afzal mohammed (4): ARM: mmu: decouple VECTORS_BASE from Kconfig ARM: nommu: dynamic exception base address setting ARM: nommu: display vectors base ARM: nommu: remove Hivecs configuration in asm arch/arm/include/asm/memory.h | 2 ++ arch/arm/kernel/head-nommu.S | 5 arch/arm/mach-berlin/platsmp.c | 3 ++- arch/arm/mm/dump.c | 5 ++-- arch/arm/mm/init.c | 9 ++-- arch/arm/mm/mm.h | 5 ++-- arch/arm/mm/nommu.c| 52 -- 7 files changed, 67 insertions(+), 14 deletions(-) -- 2.11.0
Re: [PATCH 2/4] ARM: nommu: dynamic exception base address setting
Hi, On Thu, Jan 19, 2017 at 01:59:09PM +0000, Vladimir Murzin wrote: > On 18/01/17 20:38, afzal mohammed wrote: > > +#define ID_PFR1_SE (0x3 << 4) /* Security extension enable bits */ > > This bitfield is 4 bits wide. Since only 2 LSb's out of the 4 were enough to detect whether security extensions were enabled, it was done so. i am going to use your below suggestion & this would be taken care by that. > > + if (security_extensions_enabled()) { > > You can use > > cpuid_feature_extract(CPUID_EXT_PFR1, 4) > > and add a comment explaining what we are looking for and why. Yes, that is better, was not aware of this, did see CPUID_EXT_PFR1 as an unused macro. > > +#ifdef CONFIG_CPU_CP15 > > + vectors_base = setup_vectors_base(); > > +#endif > > alternatively it can be > > unsigned long vector_base = IS_ENABLED(CONFIG_CPU_CP15) ? setup_vbar() > : 0; Yes that certainly is better. Regards afzal
Re: [PATCH 1/4] ARM: mmu: decouple VECTORS_BASE from Kconfig
Hi, On Thu, Jan 19, 2017 at 02:24:24PM +0000, Russell King - ARM Linux wrote: > On Thu, Jan 19, 2017 at 02:07:39AM +0530, afzal mohammed wrote: > > +++ b/arch/arm/include/asm/memory.h > > +#define VECTORS_BASE 0xffff0000 > > This should be UL(0xffff0000) > > - MLK(UL(CONFIG_VECTORS_BASE), UL(CONFIG_VECTORS_BASE) + > > - (PAGE_SIZE)), > > + MLK(UL(VECTORS_BASE), UL(VECTORS_BASE) + (PAGE_SIZE)), > > which means you don't need it here, which will then fix the build error > reported by the 0-day builder. Seems there is some confusion here, VECTORS_BASE definition above in memory.h is enclosed within CONFIG_MMU. Robot used a no-MMU defconfig, it didn't get a VECTORS_BASE definition at this patch, causing the build error. Our dear robot mentioned that my HEAD didn't break build, but bisectability is broken at this point. With "PATCH 3/4 ARM: nommu: display vectors base", the above is changed to #ifdef CONFIG_MMU MLK(UL(VECTORS_BASE), UL(VECTORS_BASE) + (PAGE_SIZE)), #else ... #endif thus making the series build again for no-MMU. One option to keep bisectability would be to squash this with PATCH 3/4, but i think a better & natural solution would be to define VECTORS_BASE outside of #ifdef CONFIG_MMU ... #else ... #endif and then in PATCH 3/4, move VECTORS_BASE to be inside #ifdef CONFIG_MMU ... #else Regards afzal
Re: [PATCH 1/4] ARM: mmu: decouple VECTORS_BASE from Kconfig
+ Marvell Berlin SoC maintainers - Sebastian, Jisheng On Thu, Jan 19, 2017 at 02:07:39AM +0530, afzal mohammed wrote: > For MMU configurations, VECTORS_BASE is always 0xffff0000, a macro > definition will suffice. > > Once exception address is handled dynamically for no-MMU also (this > would involve taking care of region setup too), VECTORS_BASE can be > removed from Kconfig. > > Suggested-by: Russell King > Signed-off-by: afzal mohammed > --- > > Though there was no build error without inclusion of asm/memory.h, to > be on the safer side it has been added, to reduce chances of build > breakage in random configurations. > > arch/arm/include/asm/memory.h | 2 ++ > arch/arm/mach-berlin/platsmp.c | 3 ++- > arch/arm/mm/dump.c | 5 +++-- > arch/arm/mm/init.c | 4 ++-- > 4 files changed, 9 insertions(+), 5 deletions(-) > > diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h > index 76cbd9c674df..9cc9f1dbc88e 100644 > --- a/arch/arm/include/asm/memory.h > +++ b/arch/arm/include/asm/memory.h > @@ -83,6 +83,8 @@ > #define IOREMAP_MAX_ORDER 24 > #endif > > +#define VECTORS_BASE 0xffff0000 > + > #else /* CONFIG_MMU */ > > /* > diff --git a/arch/arm/mach-berlin/platsmp.c b/arch/arm/mach-berlin/platsmp.c > index 93f90688db18..578d41031abf 100644 > --- a/arch/arm/mach-berlin/platsmp.c > +++ b/arch/arm/mach-berlin/platsmp.c > @@ -15,6 +15,7 @@ > > #include > #include > +#include > #include > #include > > @@ -75,7 +76,7 @@ static void __init berlin_smp_prepare_cpus(unsigned int > max_cpus) > if (!cpu_ctrl) > goto unmap_scu; > > - vectors_base = ioremap(CONFIG_VECTORS_BASE, SZ_32K); > + vectors_base = ioremap(VECTORS_BASE, SZ_32K); > if (!vectors_base) > goto unmap_scu; > > diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c > index 9fe8e241335c..21192d6eda40 100644 > --- a/arch/arm/mm/dump.c > +++ b/arch/arm/mm/dump.c > @@ -18,6 +18,7 @@ > #include > > #include > +#include > #include > > struct addr_marker { > @@ -31,8 +32,8 @@ static struct addr_marker address_markers[] = { 
> { 0,"vmalloc() Area" }, > { VMALLOC_END, "vmalloc() End" }, > { FIXADDR_START,"Fixmap Area" }, > - { CONFIG_VECTORS_BASE, "Vectors" }, > - { CONFIG_VECTORS_BASE + PAGE_SIZE * 2, "Vectors End" }, > + { VECTORS_BASE, "Vectors" }, > + { VECTORS_BASE + PAGE_SIZE * 2, "Vectors End" }, > { -1, NULL }, > }; > > diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c > index 370581aeb871..cf47f86f79ed 100644 > --- a/arch/arm/mm/init.c > +++ b/arch/arm/mm/init.c > @@ -27,6 +27,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -521,8 +522,7 @@ void __init mem_init(void) > " .data : 0x%p" " - 0x%p" " (%4td kB)\n" > " .bss : 0x%p" " - 0x%p" " (%4td kB)\n", > > - MLK(UL(CONFIG_VECTORS_BASE), UL(CONFIG_VECTORS_BASE) + > - (PAGE_SIZE)), > + MLK(UL(VECTORS_BASE), UL(VECTORS_BASE) + (PAGE_SIZE)), > #ifdef CONFIG_HAVE_TCM > MLK(DTCM_OFFSET, (unsigned long) dtcm_end), > MLK(ITCM_OFFSET, (unsigned long) itcm_end), > -- > 2.11.0
Re: [PATCH 3/4] ARM: nommu: display vectors base
Hi, On Wed, Jan 18, 2017 at 10:13:15PM +0000, Russell King - ARM Linux wrote: > On Thu, Jan 19, 2017 at 02:08:37AM +0530, afzal mohammed wrote: > > + MLK_ROUNDUP(vectors_base, vectors_base + PAGE_SIZE), > > I think MLK() will do here - no need to use the rounding-up version > as PAGE_SIZE is a multiple of 1k. Yes, i will replace it. Earlier, used MLK(), got some build error; now checking again, no build error, i should have messed up something at that time. Regards afzal
[PATCH 3/4] ARM: nommu: display vectors base
The exception base address is now dynamically estimated for no-MMU case, display it. Signed-off-by: afzal mohammed --- arch/arm/mm/init.c | 5 + arch/arm/mm/mm.h | 5 +++-- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index cf47f86f79ed..9e11f255c3bf 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -522,7 +522,12 @@ void __init mem_init(void) " .data : 0x%p" " - 0x%p" " (%4td kB)\n" " .bss : 0x%p" " - 0x%p" " (%4td kB)\n", +#ifdef CONFIG_MMU MLK(UL(VECTORS_BASE), UL(VECTORS_BASE) + (PAGE_SIZE)), +#else + MLK_ROUNDUP(vectors_base, vectors_base + PAGE_SIZE), +#endif + #ifdef CONFIG_HAVE_TCM MLK(DTCM_OFFSET, (unsigned long) dtcm_end), MLK(ITCM_OFFSET, (unsigned long) itcm_end), diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h index ce727d47275c..546f09437fca 100644 --- a/arch/arm/mm/mm.h +++ b/arch/arm/mm/mm.h @@ -79,8 +79,9 @@ struct static_vm { extern struct list_head static_vmlist; extern struct static_vm *find_static_vm_vaddr(void *vaddr); extern __init void add_static_vm_early(struct static_vm *svm); - -#endif +#else /* CONFIG_MMU */ +extern unsigned long vectors_base; +#endif /* CONFIG_MMU */ #ifdef CONFIG_ZONE_DMA extern phys_addr_t arm_dma_limit; -- 2.11.0
[PATCH 4/4] ARM: nommu: remove Hivecs configuration in asm
Now that exception base address is handled dynamically for processors with CP15, remove Hivecs configuration in assembly. Signed-off-by: afzal mohammed --- arch/arm/kernel/head-nommu.S | 5 - 1 file changed, 5 deletions(-) diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S index 6b4eb27b8758..2e21e08de747 100644 --- a/arch/arm/kernel/head-nommu.S +++ b/arch/arm/kernel/head-nommu.S @@ -152,11 +152,6 @@ __after_proc_init: #ifdef CONFIG_CPU_ICACHE_DISABLE bic r0, r0, #CR_I #endif -#ifdef CONFIG_CPU_HIGH_VECTOR - orr r0, r0, #CR_V - #else - bic r0, r0, #CR_V -#endif mcr p15, 0, r0, c1, c0, 0 @ write control reg #elif defined (CONFIG_CPU_V7M) /* For V7M systems we want to modify the CCR similarly to the SCTLR */ -- 2.11.0
[PATCH 2/4] ARM: nommu: dynamic exception base address setting
No-MMU dynamic exception base address configuration on CP15 processors. In the case of low vectors, decision based on whether security extensions are enabled & whether remap vectors to RAM CONFIG option is selected. For no-MMU without CP15, current default value of 0x0 is retained. Signed-off-by: afzal mohammed --- arch/arm/mm/nommu.c | 64 +++-- 1 file changed, 62 insertions(+), 2 deletions(-) diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c index 2740967727e2..db8e784f20f3 100644 --- a/arch/arm/mm/nommu.c +++ b/arch/arm/mm/nommu.c @@ -11,6 +11,7 @@ #include #include +#include #include #include #include @@ -22,6 +23,8 @@ #include "mm.h" +unsigned long vectors_base; + #ifdef CONFIG_ARM_MPU struct mpu_rgn_info mpu_rgn_info; @@ -278,15 +281,72 @@ static void sanity_check_meminfo_mpu(void) {} static void __init mpu_setup(void) {} #endif /* CONFIG_ARM_MPU */ +#ifdef CONFIG_CPU_CP15 +#ifdef CONFIG_CPU_HIGH_VECTOR +static unsigned long __init setup_vectors_base(void) +{ + unsigned long reg = get_cr(); + + set_cr(reg | CR_V); + return 0xffff0000; +} +#else /* CONFIG_CPU_HIGH_VECTOR */ +/* + * ID_PFR1 bits (CP#15 ID_PFR1) + */ +#define ID_PFR1_SE (0x3 << 4) /* Security extension enable bits */ + +/* Read processor feature register ID_PFR1 */ +static unsigned long get_id_pfr1(void) +{ + unsigned long val; + + asm("mrc p15, 0, %0, c0, c1, 1" : "=r" (val) : : "cc"); + return val; +} + +/* Write exception base address to VBAR */ +static void set_vbar(unsigned long val) +{ + asm("mcr p15, 0, %0, c12, c0, 0" : : "r" (val) : "cc"); } + +static bool __init security_extensions_enabled(void) +{ + return !!(get_id_pfr1() & ID_PFR1_SE); +} + +static unsigned long __init setup_vectors_base(void) +{ + unsigned long base = 0, reg = get_cr(); + + set_cr(reg & ~CR_V); + if (security_extensions_enabled()) { + if (IS_ENABLED(CONFIG_REMAP_VECTORS_TO_RAM)) + base = CONFIG_DRAM_BASE; + set_vbar(base); + } else if (IS_ENABLED(CONFIG_REMAP_VECTORS_TO_RAM)) { + if (CONFIG_DRAM_BASE != 0) + 
pr_err("Security extensions not enabled, vectors cannot be remapped to RAM, vectors base will be 0x00000000\n"); + } + + return base; +} +#endif /* CONFIG_CPU_HIGH_VECTOR */ +#endif /* CONFIG_CPU_CP15 */ + void __init arm_mm_memblock_reserve(void) { #ifndef CONFIG_CPU_V7M +#ifdef CONFIG_CPU_CP15 + vectors_base = setup_vectors_base(); +#endif /* * Register the exception vector page. * some architectures which the DRAM is the exception vector to trap, * alloc_page breaks with error, although it is not NULL, but "0." */ - memblock_reserve(CONFIG_VECTORS_BASE, 2 * PAGE_SIZE); + memblock_reserve(vectors_base, 2 * PAGE_SIZE); #else /* ifndef CONFIG_CPU_V7M */ /* * There is no dedicated vector page on V7-M. So nothing needs to be @@ -310,7 +370,7 @@ void __init sanity_check_meminfo(void) */ void __init paging_init(const struct machine_desc *mdesc) { - early_trap_init((void *)CONFIG_VECTORS_BASE); + early_trap_init((void *)vectors_base); mpu_setup(); bootmem_init(); } -- 2.11.0
[PATCH 1/4] ARM: mmu: decouple VECTORS_BASE from Kconfig
For MMU configurations, VECTORS_BASE is always 0xffff0000, a macro definition will suffice. Once exception address is handled dynamically for no-MMU also (this would involve taking care of region setup too), VECTORS_BASE can be removed from Kconfig. Suggested-by: Russell King Signed-off-by: afzal mohammed --- Though there was no build error without inclusion of asm/memory.h, to be on the safer side it has been added, to reduce chances of build breakage in random configurations. arch/arm/include/asm/memory.h | 2 ++ arch/arm/mach-berlin/platsmp.c | 3 ++- arch/arm/mm/dump.c | 5 +++-- arch/arm/mm/init.c | 4 ++-- 4 files changed, 9 insertions(+), 5 deletions(-) diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index 76cbd9c674df..9cc9f1dbc88e 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -83,6 +83,8 @@ #define IOREMAP_MAX_ORDER 24 #endif +#define VECTORS_BASE 0xffff0000 + #else /* CONFIG_MMU */ /* diff --git a/arch/arm/mach-berlin/platsmp.c b/arch/arm/mach-berlin/platsmp.c index 93f90688db18..578d41031abf 100644 --- a/arch/arm/mach-berlin/platsmp.c +++ b/arch/arm/mach-berlin/platsmp.c @@ -15,6 +15,7 @@ #include #include +#include #include #include @@ -75,7 +76,7 @@ static void __init berlin_smp_prepare_cpus(unsigned int max_cpus) if (!cpu_ctrl) goto unmap_scu; - vectors_base = ioremap(CONFIG_VECTORS_BASE, SZ_32K); + vectors_base = ioremap(VECTORS_BASE, SZ_32K); if (!vectors_base) goto unmap_scu; diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c index 9fe8e241335c..21192d6eda40 100644 --- a/arch/arm/mm/dump.c +++ b/arch/arm/mm/dump.c @@ -18,6 +18,7 @@ #include #include +#include #include struct addr_marker { @@ -31,8 +32,8 @@ static struct addr_marker address_markers[] = { { 0,"vmalloc() Area" }, { VMALLOC_END, "vmalloc() End" }, { FIXADDR_START,"Fixmap Area" }, - { CONFIG_VECTORS_BASE, "Vectors" }, - { CONFIG_VECTORS_BASE + PAGE_SIZE * 2, "Vectors End" }, + { VECTORS_BASE, "Vectors" }, + { VECTORS_BASE + PAGE_SIZE * 2, 
"Vectors End" }, { -1, NULL }, }; diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 370581aeb871..cf47f86f79ed 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #include #include @@ -521,8 +522,7 @@ void __init mem_init(void) " .data : 0x%p" " - 0x%p" " (%4td kB)\n" " .bss : 0x%p" " - 0x%p" " (%4td kB)\n", - MLK(UL(CONFIG_VECTORS_BASE), UL(CONFIG_VECTORS_BASE) + - (PAGE_SIZE)), + MLK(UL(VECTORS_BASE), UL(VECTORS_BASE) + (PAGE_SIZE)), #ifdef CONFIG_HAVE_TCM MLK(DTCM_OFFSET, (unsigned long) dtcm_end), MLK(ITCM_OFFSET, (unsigned long) itcm_end), -- 2.11.0
[PATCH 0/4] ARM: v7-A !MMU support, CONFIG_VECTORS_BASE removal (almost)
Hi, ARM core changes to support !MMU Kernel on v7-A MMU processors. This series also does the preparation for CONFIG_VECTORS_BASE removal. Based on the feedback from Russell on the initial patches (part RFC), it was decided to handle vector base dynamically in C & work towards the goal of removing VECTORS_BASE from Kconfig. MMU platforms always have exception base address at 0xffff0000, while no-MMU CP15 scenario was handled dynamically in C. Hivecs handling for no-MMU CP15 that was done in asm has been moved to C as part of dynamic handling. This now leaves only vector region setup, used by Cortex-R, to be made devoid of VECTORS_BASE so as to remove it from Kconfig. Vladimir is planning to rework MPU code, so it has been left untouched. VECTORS_BASE is to be removed from Kconfig after the MPU region rework. This series has been tested on top of mainline on, 1. Vybrid CM4 (!MMU) 2. Vybrid CA5 (MMU) and on top of Vladimir's series[1] on, 1. Vybrid CM4 (!MMU) 2. Vybrid CA5 (MMU & !MMU) 3. AM437x IDK (MMU & !MMU) Both above had an additional patch [2] as well, which is in next now. Regards afzal [1] "[RFC v2 PATCH 00/23] Allow NOMMU for MULTIPLATFORM", http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/470966.html (git://linux-arm.org/linux-vm.git nommu-rfc-v2) [2] "[PATCH 1/2] ARM: nommu: allow enabling REMAP_VECTORS_TO_RAM" http://lists.infradead.org/pipermail/linux-arm-kernel/2016-December/473593.html afzal mohammed (4): ARM: mmu: decouple VECTORS_BASE from Kconfig ARM: nommu: dynamic exception base address setting ARM: nommu: display vectors base ARM: nommu: remove Hivecs configuration in asm arch/arm/include/asm/memory.h | 2 ++ arch/arm/kernel/head-nommu.S | 5 arch/arm/mach-berlin/platsmp.c | 3 +- arch/arm/mm/dump.c | 5 ++-- arch/arm/mm/init.c | 9 -- arch/arm/mm/mm.h | 5 ++-- arch/arm/mm/nommu.c| 64 -- 7 files changed, 79 insertions(+), 14 deletions(-) -- 2.11.0
Re: [PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM
Hi, On Mon, Jan 16, 2017 at 09:53:41AM +0000, Vladimir Murzin wrote: > On 15/01/17 11:47, Afzal Mohammed wrote: > > mpu_setup_region() in arch/arm/mm/nommu.c that takes care of > > MPU_RAM_REGION only. And that seems to be a kind of redundant as it is > > also done in asm at __setup_mpu(). Git blames asm & C to consecutive > > commits, that makes me a little shaky about the conclusion on it being > > redundant. > > It is not redundant. MPU setup is done in two steps. The first step done in > asm to enable caches, there only kernel image is covered; the second step > takes > care of the whole RAM given via dt or "mem=" parameter. Okay, thanks for the details. > > Thinking of invoking mpu_setup() from secondary_start_kernel() in > > arch/arm/kernel/smp.c, with mpu_setup() being slightly modified to > > avoid storing region details again when invoked by secondary cpu's. > I have wip patches on reworking MPU setup code. The idea is to start using > mpu_rgn_info[] actively, so asm part for secondariness would just sync-up > content of that array. Additionally, it seems that we can reuse free MPU slots > to cover memory which is discarded due to MPU alignment restrictions... > > > Vladimir, once changes are done after a revisit, i would need your > > help to test on Cortex-R. > > I'm more than happy to help, but currently I have limited bandwidth, so if it > can wait till the next dev cycle I'd try to make MPU rework finished by that > time. Okay, please feel free to do MPU rework the way you were planning, you know more details & have the platform to achieve it with much higher efficiency than me. Regards afzal
Re: [PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM
Hi, On Sat, Jan 07, 2017 at 10:43:39PM +0530, Afzal Mohammed wrote: > On Tue, Dec 13, 2016 at 10:02:26AM +0000, Russell King - ARM Linux wrote: > > Also, if the region setup for the vectors was moved as well, it would > > then be possible to check the ID registers to determine whether this > > is supported, and make the decision where to locate the vectors base > > more dynamically. > > This would affect Cortex-R's, which is a bit concerning due to lack of > those platforms with me, let me try to get it right. QEMU too doesn't seem to provide a Cortex-R target. > Seems translating __setup_mpu() altogether to C afaics, a kind of C translation is already present as mpu_setup_region() in arch/arm/mm/nommu.c that takes care of MPU_RAM_REGION only. And that seems to be a kind of redundant as it is also done in asm at __setup_mpu(). Git blames asm & C to consecutive commits, that makes me a little shaky about the conclusion on it being redundant. > & installing at a later, but suitable place might be better. But looks like enabling MPU can't be moved to C & that would necessitate keeping at least some portion of __setup_mpu() in asm. Instead, moving region setup only for vectors to C as Russell suggested at first would have to be done. A kind of diff at the end is in my mind, with additional changes to handle the similar during secondary cpu bringup too. Thinking of invoking mpu_setup() from secondary_start_kernel() in arch/arm/kernel/smp.c, with mpu_setup() being slightly modified to avoid storing region details again when invoked by secondary cpu's. Vladimir, once changes are done after a revisit, i would need your help to test on Cortex-R. As an aside, wasn't aware of the fact that Cortex-R supports SMP Linux, had thought that, of !MMU one's, only Blackfin & J2 had it. 
> Also !MMU Kernel could boot on 3 ARM v7-A platforms - AM335x Beagle > Bone (A8), AM437x IDK (A9) & Vybrid VF610 (on A5 core, note that it > has M4 core too) Talking about Cortex-M, AMx3's too have it, to be specific M3, but they are not Linux-able unlike the one in VF610. Regards afzal --->8--- diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S index e0565d73e49e..f8ac79b6136d 100644 --- a/arch/arm/kernel/head-nommu.S +++ b/arch/arm/kernel/head-nommu.S @@ -249,20 +249,6 @@ ENTRY(__setup_mpu) setup_region r0, r5, r6, MPU_INSTR_SIDE @ 0x0, BG region, enabled 2: isb - /* Vectors region */ - set_region_nr r0, #MPU_VECTORS_REGION - isb - /* Shared, inaccessible to PL0, rw PL1 */ - mov r0, #CONFIG_VECTORS_BASE@ Cover from VECTORS_BASE - ldr r5,=(MPU_AP_PL1RW_PL0NA | MPU_RGN_NORMAL) - /* Writing N to bits 5:1 (RSR_SZ) --> region size 2^N+1 */ - mov r6, #(((2 * PAGE_SHIFT - 1) << MPU_RSR_SZ) | 1 << MPU_RSR_EN) - - setup_region r0, r5, r6, MPU_DATA_SIDE @ VECTORS_BASE, PL0 NA, enabled - beq 3f @ Memory-map not unified - setup_region r0, r5, r6, MPU_INSTR_SIDE @ VECTORS_BASE, PL0 NA, enabled -3: isb - /* Enable the MPU */ mrc p15, 0, r0, c1, c0, 0 @ Read SCTLR bic r0, r0, #CR_BR @ Disable the 'default mem-map' diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c index e82056df0635..7fe8906322d5 100644 --- a/arch/arm/mm/nommu.c +++ b/arch/arm/mm/nommu.c @@ -269,12 +269,19 @@ void __init mpu_setup(void) ilog2(memblock.memory.regions[0].size), MPU_AP_PL1RW_PL0RW | MPU_RGN_NORMAL); if (region_err) { - panic("MPU region initialization failure! %d", region_err); + panic("MPU RAM region initialization failure! %d", region_err); } else { - pr_info("Using ARMv7 PMSA Compliant MPU. " -"Region independence: %s, Max regions: %d\n", - mpu_iside_independent() ? 
"Yes" : "No", - mpu_max_regions()); + region_err = mpu_setup_region(MPU_VECTORS_REGION, vectors_base, + ilog2(memblock.memory.regions[0].size), + MPU_AP_PL1RW_PL0NA | MPU_RGN_NORMAL); + if (region_err) { + panic("MPU VECTOR region initialization failure! %d", + region_err); + } else { + pr_info("Using ARMv7 PMSA Compliant MPU. " + "Region independence: %s, Max regions: %d\n", + mpu_iside_independent() ? "Yes" : "No", + mpu_max_regions()); } } #else
Re: [PATCH WIP 4/4] ARM: remove compile time vector base for CP15 case
Hi, On Sat, Jan 07, 2017 at 06:24:15PM +0000, Russell King - ARM Linux wrote: > As I've said, CONFIG_VECTORS_BASE is _always_ 0xffff0000 on MMU, so > this always displays 0xffff0000 - 0xffff1000 here. > Older ARM CPUs without the V bit (ARMv3 and early ARMv4) expect the > vectors to be at virtual address zero. > > Most of these systems place ROM at physical address 0, so when the CPU > starts from reset (with the MMU off) it starts executing from ROM. Once > the MMU is initialised, RAM can be placed there and the ROM vectors > replaced. The side effect of this is that NULL pointer dereferences > are not always caught... of course, it makes sense that the page at > address 0 is write protected even from the kernel, so a NULL pointer > write dereference doesn't corrupt the vectors. > > How we handle it in Linux is that we always map the page for the vectors > at 0xffff0000, and then only map that same page at 0x00000000 if we have > a CPU that needs it there. Thanks for the information, i was not aware, seems that simplifies MMU case handling. arch/arm/mm/mmu.c: if (!vectors_high()) { map.virtual = 0; map.length = PAGE_SIZE * 2; map.type = MT_LOW_VECTORS; create_mapping(&map); } arch/arm/include/asm/cp15.h: #if __LINUX_ARM_ARCH__ >= 4 #define vectors_high() (get_cr() & CR_V) #else #define vectors_high() (0) #endif Deducing from your reply & above code snippets that for __LINUX_ARM_ARCH__ >= 4, in all practical cases, vectors_high() returns true. Regards afzal
Re: [PATCH WIP 4/4] ARM: remove compile time vector base for CP15 case
Hi, On Sat, Jan 07, 2017 at 11:32:27PM +0530, Afzal Mohammed wrote: > i had thought that for MMU case if Hivecs is not enabled, > CONFIG_VECTOR_BASE has to be considered as 0x at least for the s/CONFIG_VECTOR_BASE/exception base address > purpose of displaying exception base address. Regards afzal
Re: [PATCH WIP 4/4] ARM: remove compile time vector base for CP15 case
Hi, On Sat, Jan 07, 2017 at 05:38:32PM +0000, Russell King - ARM Linux wrote: > On Sat, Jan 07, 2017 at 10:52:28PM +0530, afzal mohammed wrote: > > TODO: > > Kill off VECTORS_BASE completely - this would require to handle MMU > > case as well as ARM_MPU scenario dynamically. > Why do you think MMU doesn't already handle it? i meant here w.r.t displaying vector base address in arch/arm/mm/init.c, i.e. dynamically get it based on Hivecs setting as either 0xffff0000 or 0x00000000 > > > config VECTORS_BASE > > hex > > - default 0xffff0000 if MMU || CPU_HIGH_VECTOR > > - default DRAM_BASE if REMAP_VECTORS_TO_RAM > > + default 0xffff0000 if MMU > > default 0x00000000 > > When MMU=y, the resulting VECTORS_BASE is always 0xffff0000. The only > case where this ends up zero after your change is when MMU=n. > The MMU case does have to cater for CPUs wanting vectors at 0x00000000 > and at 0xffff0000, and this is handled via the page tables - but this > has nothing to do with CONFIG_VECTORS_BASE. CONFIG_VECTORS_BASE > exists primarily for noMMU. i had thought that for MMU case if Hivecs is not enabled, CONFIG_VECTOR_BASE has to be considered as 0x00000000 at least for the purpose of displaying exception base address. One thing i have not yet understood is how CPU can take exception with its base address as 0x00000000 (for Hivecs not enabled case) virtual address as it is below Kernel memory map. > For the Berlin and mm/dump code, we could very easily just have a > #define VECTORS_BASE 0xffff0000 in a header file and drop the CONFIG_ prefix. Okay, thanks for the tip. Regards afzal
[PATCH WIP 4/4] ARM: remove compile time vector base for CP15 case
vectors base is now dynamically updated for Hivecs as well as for REMAP_VECTORS_TO_RAM case to DRAM_START. Hence remove these CP15 cases. TODO: Kill off VECTORS_BASE completely - this would require to handle MMU case as well as ARM_MPU scenario dynamically. Signed-off-by: afzal mohammed --- arch/arm/Kconfig | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index bc6f4065840e..720ee62b4955 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -232,8 +232,7 @@ config ARCH_MTD_XIP config VECTORS_BASE hex - default 0xffff0000 if MMU || CPU_HIGH_VECTOR - default DRAM_BASE if REMAP_VECTORS_TO_RAM + default 0xffff0000 if MMU default 0x00000000 help The base address of exception vectors. This must be two pages -- 2.11.0
[PATCH WIP 3/4] ARM: mm: nommu: display dynamic exception base
Display dynamically estimated nommu exception base. TODO: Dynamically update MMU case too. Signed-off-by: afzal mohammed --- arch/arm/mm/init.c | 9 + 1 file changed, 9 insertions(+) diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 370581aeb871..1777ee23a6a2 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -39,6 +39,10 @@ #include "mm.h" +#ifndef CONFIG_MMU +extern unsigned long vectors_base; +#endif + #ifdef CONFIG_CPU_CP15_MMU unsigned long __init __clear_cr(unsigned long mask) { @@ -521,8 +525,13 @@ void __init mem_init(void) " .data : 0x%p" " - 0x%p" " (%4td kB)\n" " .bss : 0x%p" " - 0x%p" " (%4td kB)\n", +#ifdef CONFIG_MMU MLK(UL(CONFIG_VECTORS_BASE), UL(CONFIG_VECTORS_BASE) + (PAGE_SIZE)), +#else + MLK_ROUNDUP(vectors_base, vectors_base + PAGE_SIZE), +#endif + #ifdef CONFIG_HAVE_TCM MLK(DTCM_OFFSET, (unsigned long) dtcm_end), MLK(ITCM_OFFSET, (unsigned long) itcm_end), -- 2.11.0
[PATCH WIP 2/4] ARM: nommu: remove Hivecs configuration in asm
Now that exception base address is handled dynamically for processors with CP15, remove Hivecs configuration in assembly. Signed-off-by: afzal mohammed --- arch/arm/kernel/head-nommu.S | 5 - 1 file changed, 5 deletions(-) diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S index 2ab026ffc270..e0565d73e49e 100644 --- a/arch/arm/kernel/head-nommu.S +++ b/arch/arm/kernel/head-nommu.S @@ -162,11 +162,6 @@ ENDPROC(secondary_startup_arm) #ifdef CONFIG_CPU_ICACHE_DISABLE bic r0, r0, #CR_I #endif -#ifdef CONFIG_CPU_HIGH_VECTOR - orr r0, r0, #CR_V -#else - bic r0, r0, #CR_V -#endif mcr p15, 0, r0, c1, c0, 0 @ write control reg #elif defined (CONFIG_CPU_V7M) /* For V7M systems we want to modify the CCR similarly to the SCTLR */ -- 2.11.0
[PATCH WIP 1/4] ARM: nommu: dynamic exception base address setting
No-MMU dynamic exception base address configuration on processors with CP15. TODO: Handle MMU case as well as ARM_MPU scenario dynamically Signed-off-by: afzal mohammed --- arch/arm/mm/nommu.c | 62 +++-- 1 file changed, 60 insertions(+), 2 deletions(-) diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c index 681cec879caf..e82056df0635 100644 --- a/arch/arm/mm/nommu.c +++ b/arch/arm/mm/nommu.c @@ -11,6 +11,7 @@ #include #include +#include #include #include #include @@ -23,6 +24,8 @@ #include "mm.h" +unsigned long vectors_base; + #ifdef CONFIG_ARM_MPU struct mpu_rgn_info mpu_rgn_info; @@ -279,15 +282,70 @@ static void sanity_check_meminfo_mpu(void) {} static void __init mpu_setup(void) {} #endif /* CONFIG_ARM_MPU */ +#ifdef CONFIG_CPU_CP15 +/* + * ID_PFR1 bits (CP#15 ID_PFR1) + */ +#define ID_PFR1_SE (0x3 << 4) /* Security extension enable bits */ + +#ifndef CONFIG_CPU_HIGH_VECTOR +static inline unsigned long get_id_pfr1(void) +{ + unsigned long val; + asm("mrc p15, 0, %0, c0, c1, 1" : "=r" (val) : : "cc"); + return val; +} + +static inline void set_vbar(unsigned long val) +{ + asm("mcr p15, 0, %0, c12, c0, 0" : : "r" (val) : "cc"); +} + +static bool __init security_extensions_enabled(void) +{ + return !!(get_id_pfr1() & ID_PFR1_SE); +} +#endif + +static unsigned long __init setup_vector_base(void) +{ + unsigned long reg, base; + + reg = get_cr(); + +#ifdef CONFIG_CPU_HIGH_VECTOR + set_cr(reg | CR_V); + base = 0xffff0000; +#else + set_cr(reg & ~CR_V); + base = 0; + if (security_extensions_enabled()) { +#ifdef CONFIG_REMAP_VECTORS_TO_RAM + base = CONFIG_DRAM_BASE; +#endif + set_vbar(base); + } +#endif /* CONFIG_CPU_HIGH_VECTOR */ + + return base; +} +#endif /* CONFIG_CPU_CP15 */ + void __init arm_mm_memblock_reserve(void) { #ifndef CONFIG_CPU_V7M + +#ifdef CONFIG_CPU_CP15 + vectors_base = setup_vector_base(); +#else + vectors_base = CONFIG_VECTORS_BASE; +#endif /* * Register the exception vector page. 
* some architectures which the DRAM is the exception vector to trap, * alloc_page breaks with error, although it is not NULL, but "0." */ - memblock_reserve(CONFIG_VECTORS_BASE, 2 * PAGE_SIZE); + memblock_reserve(vectors_base, 2 * PAGE_SIZE); #else /* ifndef CONFIG_CPU_V7M */ /* * There is no dedicated vector page on V7-M. So nothing needs to be @@ -311,7 +369,7 @@ void __init sanity_check_meminfo(void) */ void __init paging_init(const struct machine_desc *mdesc) { - early_trap_init((void *)CONFIG_VECTORS_BASE); + early_trap_init((void *)vectors_base); mpu_setup(); bootmem_init(); } -- 2.11.0
Re: [PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM
Hi,

On Tue, Dec 13, 2016 at 10:02:26AM +0000, Russell King - ARM Linux wrote:

> Is there really any need to do this in head.S ?  I believe it's
> entirely possible to do it later - arch/arm/mm/nommu.c:paging_init().

As memblock_reserve() for the exception address was done before
paging_init(), it seems it has to be done by arm_mm_memblock_reserve() in
arch/arm/mm/nommu.c; a WIP patch follows, but i am not that happy with
it - the conditional compilation makes it not so readable, though it is
still better to see it in C.

> Also, if the region setup for the vectors was moved as well, it would
> then be possible to check the ID registers to determine whether this
> is supported, and make the decision where to locate the vectors base
> more dynamically.

This would affect Cortex-Rs, which is a bit concerning due to the lack
of those platforms with me; let me try to get it right. It seems that
translating __setup_mpu() altogether to C & installing it at a later,
but suitable, place might be better.

And i am feeling something strange about Cortex-R support in mainline -
don't know whether it boots out of the box; there are no Cortex-R cpu
compatibles in the dts(i), though the devicetree documentation documents
it. Still, wrecking Cortex-Rs could get counted as a regression, as dts
is not considered part of the Kernel. Looks like there is a Cortex-R
mafia around mainline ;)

> That leaves one pr_notice() call using the CONFIG_VECTORS_BASE
> constant...

Seems you want to completely kick out CONFIG_VECTORS_BASE. Saw 2
interesting MMU cases,

1. in devicemaps_init(), if Hivecs is not set, the vectors are mapped to
   virtual address zero; was wondering how an MMU Kernel can handle
   exceptions with a zero address base (& it still prints 0xffff0000 as
   the vector base)

2. One of the platforms does an ioremap of CONFIG_VECTORS_BASE

Once i take care of the above, the ugly conditional compilation in
patch 3/4 (@arch/arm/mm/init.c) of the WIP patch series that follows
will be removed.

Please let know if you have any comments on the above.
Also, the !MMU Kernel could boot on 3 ARM v7-A platforms - AM335x Beagle
Bone (A8), AM437x IDK (A9) & Vybrid VF610 (on the A5 core; note that it
has an M4 core too) - with the same Kernel image*. Vybrid did not need
any platform specific tweaks, just patch 1/2 (put in the patch system as
8635/1) & the WIP series over Vladimir's one, while TI Sitara AMx3's
needed one w.r.t. remap.

Please bear with my delay - to fill the stomach, work not on Linux, and
then the vacations.

Regards
afzal

* Since initramfs was used, the tty port had to be changed in the
initramfs build for Vybrid, but the Kernel, except for the above
initramfs change, was identical.
Re: [PATCH 33/37] ARM: dts: vf610m4-cosmic: Correct license text
Hi,

On Thu, Dec 15, 2016 at 12:57:42AM +0100, Alexandre Belloni wrote:

> The license test has been mangled at some point then copy pasted across

The patch text has been mangled at this point ... ;)

> multiple files. Restore it to what it should be.
> Note that this is not intended as a license change.

Acked-by: Afzal Mohammed

Regards
afzal
Re: [PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM
Hi,

On Tue, Dec 13, 2016 at 09:38:21AM +0000, Vladimir Murzin wrote:

> On 11/12/16 13:12, Afzal Mohammed wrote:
> > this probably would have to be made robust so as to not cause issue on
> > other v7-A's upon trying to do !MMU (this won't affect normal MMU boot),
> > or specifically where security extensions are not enabled. Also effect
> > of hypervisor extension also need to be considered. Please let know if
> > any better ways to handle this.
>
> You might need to check ID_PFR1 for that.

Had been searching the ARM ARM for this kind of a thing, thanks.

> > +#ifdef CONFIG_REMAP_VECTORS_TO_RAM
> > +	mov	r3, #CONFIG_VECTORS_BASE	@ read VECTORS_BASE
>
> 	ldr	r3, =CONFIG_VECTORS_BASE
>
> would be more robust. I hit this in [1]
>
> [1] https://www.spinics.net/lists/arm-kernel/msg546825.html

Russell suggested doing it in paging_init(), so probably the assembly
circus can be avoided.

Regards
afzal
Re: [PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM
Hi,

On Tue, Dec 13, 2016 at 10:02:26AM +0000, Russell King - ARM Linux wrote:

> On Sun, Dec 11, 2016 at 06:42:55PM +0530, Afzal Mohammed wrote:
> >  	bic	r0, r0, #CR_V
> >  #endif
> >  	mcr	p15, 0, r0, c1, c0, 0		@ write control reg
> > +
> > +#ifdef CONFIG_REMAP_VECTORS_TO_RAM
> > +	mov	r3, #CONFIG_VECTORS_BASE	@ read VECTORS_BASE
> > +	mcr	p15, 0, r3, c12, c0, 0		@ write to VBAR
> > +#endif
> > +
>
> Is there really any need to do this in head.S ?

Seeing the high vector configuration done here, pounced upon it :)

> I believe it's
> entirely possible to do it later - arch/arm/mm/nommu.c:paging_init().
>
> Also, if the region setup for the vectors was moved as well, it would
> then be possible to check the ID registers to determine whether this
> is supported, and make the decision where to locate the vectors base
> more dynamically.

i will look into it.

Regards
afzal
linux-kernel@vger.kernel.org
Hi,

On Sun, Dec 11, 2016 at 06:40:28PM +0530, Afzal Mohammed wrote:

> Kernel reached the stage of invoking user space init & panicked, though
> it could not reach till prompt for want of user space executables
>
> So far i have not come across a toolchain (or a way to create toolchain)
> to create !MMU user space executables for Cortex-A.

Now able to reach the prompt using a buildroot initramfs. Thanks to
Peter Korsgaard for suggesting the way to create user space executables
for !MMU Cortex-A.

> multi_v7_defconfig was used & all platforms except TI OMAP/AM/DM/DRA &
> Freescale i.MX family was deselected. ARM_MPU option was disabled as
> Vladimir had given an early warning. DRAM_BASE was set to 0x80000000.
> During the course of bringup, futex was causing issues, hence FUTEX was
> removed. L1 & L2 caches were disabled in config. High vectors were
> disabled & vectors were made to remap to base of RAM. An additional OMAP
> specific change to avoid one ioremap was also required.

For the sake of completeness: SMP was disabled & flat binary support was
enabled in the Kernel.

Regards
afzal
Re: [PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM
Hi,

On Sun, Dec 11, 2016 at 06:42:55PM +0530, Afzal Mohammed wrote:

> Kernel text start at an offset of at least 32K to account for page
> tables in MMU case.

Proper way to put it might have been "32K (to account for the 16K
initial page tables & the old atags)", unless i missed something.

Regards
afzal
[PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM
Remap the exception base address to the start of RAM in the Kernel in
!MMU mode. Based on the existing Kconfig help, the Kernel was expecting
it to be configured by external support. Also, earlier it was not
possible to copy the exception table to the start of RAM due to a
Kconfig dependency, which has been fixed by a change prior to this.

Kernel text starts at an offset of at least 32K to account for page
tables in the MMU case. On a !MMU build too this space is kept aside,
and since 2 pages (8K) is the maximum for the exceptions plus stubs,
they can be placed at the start of RAM.

Signed-off-by: Afzal Mohammed
---
i am a bit shaky about this change; though it works here on Cortex-A9,
this probably would have to be made robust so as to not cause issues on
other v7-A's upon trying to do !MMU (this won't affect normal MMU boot),
or specifically where security extensions are not enabled. The effect of
the hypervisor extension also needs to be considered. Please let know if
there are any better ways to handle this.

 arch/arm/Kconfig-nommu       | 6 +++---
 arch/arm/kernel/head-nommu.S | 6 ++++++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/arm/Kconfig-nommu b/arch/arm/Kconfig-nommu
index b7576349528c..f57fbe3d5eb0 100644
--- a/arch/arm/Kconfig-nommu
+++ b/arch/arm/Kconfig-nommu
@@ -46,9 +46,9 @@ config REMAP_VECTORS_TO_RAM
 	  If your CPU provides a remap facility which allows the exception
 	  vectors to be mapped to writable memory, say 'n' here.
 
-	  Otherwise, say 'y' here.  In this case, the kernel will require
-	  external support to redirect the hardware exception vectors to
-	  the writable versions located at DRAM_BASE.
+	  Otherwise, say 'y' here.  In this case, the kernel will
+	  redirect the hardware exception vectors to the writable
+	  versions located at DRAM_BASE.
 
 config ARM_MPU
 	bool 'Use the ARM v7 PMSA Compliant MPU'
diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S
index 6b4eb27b8758..ac31c9647830 100644
--- a/arch/arm/kernel/head-nommu.S
+++ b/arch/arm/kernel/head-nommu.S
@@ -158,6 +158,12 @@ __after_proc_init:
 	bic	r0, r0, #CR_V
 #endif
 	mcr	p15, 0, r0, c1, c0, 0		@ write control reg
+
+#ifdef CONFIG_REMAP_VECTORS_TO_RAM
+	mov	r3, #CONFIG_VECTORS_BASE	@ read VECTORS_BASE
+	mcr	p15, 0, r3, c12, c0, 0		@ write to VBAR
+#endif
+
 #elif defined (CONFIG_CPU_V7M)
 	/* For V7M systems we want to modify the CCR similarly to the SCTLR */
 #ifdef CONFIG_CPU_DCACHE_DISABLE
-- 
2.11.0
[PATCH 1/2] ARM: nommu: allow enabling REMAP_VECTORS_TO_RAM
REMAP_VECTORS_TO_RAM depends on DRAM_BASE, but since DRAM_BASE is a hex
symbol, REMAP_VECTORS_TO_RAM could never get enabled. Also, depending on
DRAM_BASE is redundant: whenever REMAP_VECTORS_TO_RAM makes itself
available to Kconfig, DRAM_BASE is also available, as the Kconfig gets
sourced on !MMU.

Signed-off-by: Afzal Mohammed
---
 arch/arm/Kconfig-nommu | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm/Kconfig-nommu b/arch/arm/Kconfig-nommu
index aed66d5df7f1..b7576349528c 100644
--- a/arch/arm/Kconfig-nommu
+++ b/arch/arm/Kconfig-nommu
@@ -34,8 +34,7 @@ config PROCESSOR_ID
 	  used instead of the auto-probing which utilizes the register.
 
 config REMAP_VECTORS_TO_RAM
-	bool 'Install vectors to the beginning of RAM' if DRAM_BASE
-	depends on DRAM_BASE
+	bool 'Install vectors to the beginning of RAM'
 	help
 	  The kernel needs to change the hardware exception vectors. In
 	  nommu mode, the hardware exception vectors are normally
-- 
2.11.0
linux-kernel@vger.kernel.org
Hi,

ARM core fixes required to bring up a !MMU Kernel on v7 Cortex-A. This
was done on top of Vladimir Murzin's !MMU multiplatform series[1]. The
platform used was Cortex-A9, AM437x IDK.

The Kernel reached the stage of invoking user space init & panicked;
though it could not reach till prompt for want of user space
executables, it went as far as the Kernel can by itself. But that is an
issue independent of the Kernel, hence posting the series (also thought
of at least posting the existing patches before the merge window
starts).

So far i have not come across a toolchain (or a way to create a
toolchain) to create !MMU user space executables for Cortex-A. It is
being hoped that a Cortex-R toolchain might help here (Thanks Arnd).
This is being looked into.

multi_v7_defconfig was used & all platforms except the TI OMAP/AM/DM/DRA
& Freescale i.MX families were deselected. The ARM_MPU option was
disabled as Vladimir had given an early warning. DRAM_BASE was set to
0x80000000. During the course of bringup, futex was causing issues,
hence FUTEX was removed. L1 & L2 caches were disabled in config. High
vectors were disabled & vectors were made to remap to the base of RAM.
An additional OMAP specific change to avoid one ioremap was also
required.

Patch 2/2 has been stuck with the RFC label as, though it works, it
might have to be made robust so as to not cause issues on other v7-A's
upon trying to do !MMU (this won't affect normal MMU boot), or
specifically where security extensions are not enabled. The effect of
the hypervisor extension also needs to be considered. Please let know if
there are any better ways to handle this.

Boot logs at the end.
Afzal Mohammed (2):
  ARM: nommu: allow enabling REMAP_VECTORS_TO_RAM
  ARM: nommu: remap exception base address to RAM

 arch/arm/Kconfig-nommu       | 9 ++++-----
 arch/arm/kernel/head-nommu.S | 6 ++++++
 2 files changed, 10 insertions(+), 5 deletions(-)

[1] "[RFC v2 PATCH 00/23] Allow NOMMU for MULTIPLATFORM",
    http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/470966.html
    (git://linux-arm.org/linux-vm.git nommu-rfc-v2)

[2] Boot log

[0.00] Booting Linux on physical CPU 0x0
[0.00] Linux version 4.9.0-rc7-00026-g7a142ca8231b (afzal@debian) (gcc version 6.2.0 (GCC) ) #23 Sun Dec 11 14:59:57 IST 2016
[0.00] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=00c50478
[0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[0.00] OF: fdt:Machine model: TI AM437x Industrial Development Kit
[0.00] bootconsole [earlycon0] enabled
[0.00] AM437x ES1.2 (sgx neon)
[0.00] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 260096
[0.00] Kernel command line: console=ttyO0,115200n8 root=/dev/ram0 rw initrd=0x8180,8M earlyprintk
[0.00] PID hash table entries: 4096 (order: 2, 16384 bytes)
[0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[0.00] Memory: 1021276K/1048576K available (6558K kernel code, 523K rwdata, 2096K rodata, 444K init, 274K bss, 27300K reserved, 0K cma-reserved)
[0.00] Virtual kernel memory layout:
[0.00]     vector  : 0x8000 - 0x80001000   (   4 kB)
[0.00]     fixmap  : 0xffc0 - 0xfff0   (3072 kB)
[0.00]     vmalloc : 0x - 0x   (4095 MB)
[0.00]     lowmem  : 0x8000 - 0xc000   (1024 MB)
[0.00]     modules : 0x8000 - 0xc000   (1024 MB)
[0.00]       .text : 0x80008000 - 0x8066f948   (6559 kB)
[0.00]       .init : 0x8087d000 - 0x808ec000   ( 444 kB)
[0.00]       .data : 0x808ec000 - 0x8096ef60   ( 524 kB)
[0.00]        .bss : 0x8096ef60 - 0x809b3a9c   ( 275 kB)
[0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[0.00] NR_IRQS:16 nr_irqs:16 16
[0.00] OMAP clockevent source: timer1 at 32786 Hz
[0.000255] sched_clock: 64 bits at 500MHz, resolution 2ns, wraps every 4398046511103ns
[0.009514] clocksource: arm_global_timer: mask: 0x max_cycles: 0xe6a171a037, max_idle_ns: 881590485102 ns
[0.021986] Switching to timer-based delay loop, resolution 2ns
[0.140838] clocksource: 32k_counter: mask: 0x max_cycles: 0x, max_idle_ns: 58327039986419 ns
[0.151820] OMAP clocksource: 32k_counter at 32768 Hz
[0.230698] Console: colour dummy device 80x30
[0.236205] Calibrating delay loop (skipped), value calculated using timer frequency.. 1000.00 BogoMIPS (lpj=500)
[0.248268] pid_max: default: 32768 minimum: 301
[0.255822] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes)
[0.263618] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes)
[0.322900] devtmpfs: initialized
[0.936367] VFP support v0.3: implementor 41 architecture 3 part 30 va
Re: RFC: documentation of the autogroup feature [v2]
Hi,

On Thu, Nov 24, 2016 at 10:41:29PM +0100, Michael Kerrisk (man-pages) wrote:

>        Suppose that there are two autogroups competing for the same
>        CPU.  The first group contains ten CPU-bound processes from a
>        kernel build started with make -j10.  The other contains a sin‐
>        gle CPU-bound process: a video player.  The effect of auto‐
>        grouping is that the two groups will each receive half of the
>        CPU cycles.  That is, the video player will receive 50% of the
>        CPU cycles, rather just 9% of the cycles, which would likely

than ?

Regards
afzal

>        lead to degraded video playback.  Or to put things another way:
>        an autogroup that contains a large number of CPU-bound pro‐
>        cesses does not end up overwhelming the CPU at the expense of
>        the other jobs on the system.
Re: [PATCH v2 08/10] ARM: dts: nuc900: Add nuc970 dts files
Hi, On Wed, Jul 13, 2016 at 03:26:40PM +0800, Wan Zongshun wrote: > Do you mean I should add cpus into soc yes Regards afzal
Re: [PATCH v2 08/10] ARM: dts: nuc900: Add nuc970 dts files
Hi, On Sun, Jul 10, 2016 at 03:42:20PM +0800, Wan Zongshun wrote: > This patch is to add dts support for nuc970 platform. cpu ! in soc ? lost in fab ? ;) Regards afzal
Re: [PATCH] net: ethernet: ti: cpdma: switch to use genalloc
Hi,

On Fri, Jun 24, 2016 at 12:15:41PM -0400, Lennart Sorensen wrote:

> although the style does require using brackets for the else if the
> if required them.

As an aside, though most of the style rationale is K & R, K & R
consistently uses unbalanced braces for if-else-*.

For one that learns C unadulterated from K & R, the Kernel coding style
probably comes naturally, except for trivial things like the above.

...a brick for the shed.

Regards
afzal
Re: [PATCH] net: ethernet: ti: cpdma: switch to use genalloc
Hi,

On Fri, Jun 24, 2016 at 11:35:15AM +0530, Mugunthan V N wrote:

> On Thursday 23 June 2016 06:26 PM, Ivan Khoronzhuk wrote:
> >> +	if (pool->cpumap) {
> >> +		dma_free_coherent(pool->dev, pool->mem_size, pool->cpumap,
> >> +				  pool->phys);
> >> +	} else {
> >> +		iounmap(pool->iomap);
> >> +	}
> >
> > single if, brackets?
>
> if() has multiple line statement, so brackets are must.

Another paint to the bikeshed, seems documented coding style mentions
otherwise.

Regards
afzal
Re: [PATCH 01/48] clk: at91: replace usleep() by udelay() calls
Hi,

On Mon, Jun 13, 2016 at 05:24:09PM +0200, Alexandre Belloni wrote:

> On 11/06/2016 at 00:30:36 +0200, Arnd Bergmann wrote :
> > Does this have to be called that early? It seems wasteful to always
> > call udelay() here, when these are functions that are normally
> > allowed to sleep.
>
> So I've tested it and something like that would work:
>
> 	if (system_state < SYSTEM_RUNNING)
> 		udelay(osc->startup_usec);
> 	else
> 		usleep_range(osc->startup_usec, osc->startup_usec + 1);
>
> But I'm afraid it would be the first driver to actually do something
> like that (however, it is already the only driver trying to sleep).

tglx has suggested modifying the clock core to handle a somewhat similar
kind of scenario (probably should work here too) and avoid driver
changes,

http://lkml.kernel.org/r/alpine.DEB.2.11.1606061448010.28031@nanos

Regards
afzal
Re: [PATCH v3 02/12] of: add J-Core cpu bindings
Hi, On Thu, May 26, 2016 at 04:44:02PM -0500, Rob Landley wrote: > As far as I know, we're the first nommu SMP implementation in Linux. According to hearsay, thou shall be called Buzz Aldrin, Blackfin is Neil Armstrong. Regards afzal