Re: ARM: vmsplit 4g/4g

2020-06-15 Thread afzal mohammed
Hi Linus,

On Mon, Jun 15, 2020 at 11:11:04AM +0200, Linus Walleij wrote:

> OK I would be very happy to look at it so I can learn a bit about the
> hands-on and general approach here. Just send it to this address
> directly and I will look!

Have sent it

> > For the next 3 weeks, right now, i cannot say whether i would be able
> > to spend time on it, perhaps might be possible, but only during that
> > time i will know.
> 
> I'm going for vacation the next 2 weeks or so, but then it'd be great if
> we can start looking at this in-depth!

Yes for me too

Regards
afzal


Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()

2020-06-14 Thread afzal mohammed
Hi,

On Sun, Jun 14, 2020 at 06:51:43PM +0530, afzal mohammed wrote:

> It is MB/s for copying one file to another via user space buffer, i.e.
> the value coreutils 'dd' shows w/ status=progress (here it was busybox
> 'dd', so instead it was enabling a compile time option)

Just for correctness: status=progress is not required; the value is there in the default third line of coreutils 'dd' output.

Regards
afzal


Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()

2020-06-14 Thread afzal mohammed
Hi,

On Sat, Jun 13, 2020 at 10:45:33PM +0200, Arnd Bergmann wrote:

> 4% boot time increase sounds like a lot, especially if that is only for
> copy_from_user/copy_to_user. In the end it really depends on how well
> get_user()/put_user() and small copies can be optimized in the end.

i mentioned the worst case (it happened only once); normally it was in
the range of 2-3%.

> From the numbers you
> measured, it seems the beaglebone currently needs an extra ~6µs or
> 3µs per copy_to/from_user() call with your patch, depending on what
> your benchmark was (MB/s for just reading or writing vs MB/s for
> copying from one file to another through a user space buffer).

It is MB/s for copying one file to another via user space buffer, i.e.
the value coreutils 'dd' shows w/ status=progress (here it was busybox
'dd', so instead it was enabling a compile time option)

> but if you want to test what the overhead is, you could try changing
> /dev/zero (or a different chardev like it) to use a series of
> put_user(0, u32uptr++) in place of whatever it has, and then replace the
> 'str' instruction with dummy writes to ttbr0 using the value it already
> has, like:
> 
>   mcr p15, 0, %0, c2, c0, 0  /* set_ttbr0() */
>   isb  /* prevent speculative access to kernel table */
>   str%1, [%2],0 /* write 32 bit to user space */
>   mcr p15, 0, %0, c2, c0, 0  /* set_ttbr0() */
>   isb  /* prevent speculative access to user table */

> It would be interesting to compare it to the overhead of a
> get_user_page_fast() based implementation.

i have to relocate & be in quarantine for a couple of weeks, so i will
temporarily stop here; otherwise i might end up on the roadside.

Reading the feedback from everyone, some of it i could grasp only in
bits & pieces; familiarizing more w/ mm & vfs would help me add value
better to the goal/discussion. Linus Walleij, if you wish to explore
things, feel free; right now i don't know how my connectivity will be
for the next 3 weeks.

Regards
afzal


Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()

2020-06-14 Thread afzal mohammed
Hi,

On Sat, Jun 13, 2020 at 02:15:52PM +0100, Russell King - ARM Linux admin wrote:
> On Sat, Jun 13, 2020 at 05:34:32PM +0530, afzal mohammed wrote:

> > i think C
> > library cuts any size read, write to page size (if it exceeds) &
> > invokes the system call.

> You can't make that assumption about read(2).  stdio in the C library
> may read a page size of data at a time, but programs are allowed to
> call read(2) directly, and the C library will pass such a call straight
> through to the kernel.  So, if userspace requests a 16k read via
> read(2), then read(2) will be invoked covering 16k.
> 
> As an extreme case, for example:
> 
> $ strace -e read dd if=/dev/zero of=/dev/null bs=1048576 count=1
> read(0, 
> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
> 1048576) = 1048576

Okay. Yes, observed that 'dd' passes whatever 'bs' is to the Kernel,
and from the 'dd' sources (of busybox), it invokes the read system
call directly passing 'bs'; so it is the tmpfs read that splits it to
page size, as mentioned by Arnd.

Regards
afzal


Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()

2020-06-13 Thread afzal mohammed
Hi,

On Sat, Jun 13, 2020 at 01:56:15PM +0100, Al Viro wrote:

> Incidentally, what about get_user()/put_user()?  _That_ is where it's
> going to really hurt...

All other uaccess routines are also planned to be added, posting only
copy_{from,to}_user() was to get early feedback (mentioned in the
cover letter)

Regards
afzal


Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()

2020-06-13 Thread afzal mohammed
Hi,

On Sat, Jun 13, 2020 at 02:08:11PM +0300, Andy Shevchenko wrote:
> On Fri, Jun 12, 2020 at 1:20 PM afzal mohammed  wrote:

> > +// Started from arch/um/kernel/skas/uaccess.c
> 
> Does it mean you will deduplicate it there?

What i meant was that the file was taken only as a template & nothing
more; at the same time i wanted to give credit to that file, so i will
mention it explicitly next time.

It is not meant to deduplicate it; the functionality here is
completely different.

In the case here, the CPU will see a different virtual address mapping
once in Kernel mode as compared to user mode.

Here a facility is provided to access user pages when the current
virtual address mapping of the CPU excludes them. This is for
providing a full 4G virtual address space to both user & kernel on
32-bit ARM, to avoid using highmem or reduce the impact of highmem,
i.e. so that the Kernel can address up to (almost) 4GiB as lowmem.

The assumption here is that the user mapping is not a subset of the
virtual address space mapped by the CPU, but a separate one. Upon
Kernel entry ttbr0 is changed to Kernel lowmem, while upon Kernel exit
it is changed back to the user pages (ttbrx in ARM is, iiuc, the
equivalent of cr3 in x86).

i realize now that i am unable to coherently put across the problem
being attempted here to someone not familiar w/ the issue w/o taking
considerable time. If the above explanation is not enough, i will try
to explain it better later.

> > +#include 
> > +#include 
> > +#include 
> > +#include 
> 
> Perhaps ordered?

will take care

> > +static int do_op_one_page(unsigned long addr, int len,
> > +int (*op)(unsigned long addr, int len, void *arg), void 
> > *arg,
> > +struct page *page)
> 
> Maybe typedef for the func() ?

will take care

> > +{
> > +   int n;
> > +
> > +   addr = (unsigned long) kmap_atomic(page) + (addr & ~PAGE_MASK);
> 
> I don't remember about this one...

i am not following you here; in my case, only the !CONFIG_64BIT path
in that file was required, hence only it was picked (or rather, not
deleted).

> > +   size = min(PAGE_ALIGN(addr) - addr, (unsigned long) len);
> 
> ...but here seems to me you can use helpers (offset_in_page() or how
> it's called).

i was not aware of it, will use it as required.

> 
> Also consider to use macros like PFN_DOWN(), PFN_UP(), etc in your code.

Okay

> 
> > +   remain = len;
> > +   if (size == 0)
> > +   goto page_boundary;
> > +
> > +   n = do_op_one_page(addr, size, op, arg, *pages);
> > +   if (n != 0) {
> 
> > +   remain = (n < 0 ? remain : 0);
> 
> Why duplicate three times (!) this line, if you can move it to under 'out'?

yes better to move there

> 
> > +   goto out;
> > +   }
> > +
> > +   pages++;
> > +   addr += size;
> > +   remain -= size;
> > +
> > +page_boundary:
> > +   if (remain == 0)
> > +   goto out;
> > +   while (addr < ((addr + remain) & PAGE_MASK)) {
> > +   n = do_op_one_page(addr, PAGE_SIZE, op, arg, *pages);
> > +   if (n != 0) {
> > +   remain = (n < 0 ? remain : 0);
> > +   goto out;
> > +   }
> > +
> > +   pages++;
> > +   addr += PAGE_SIZE;
> > +   remain -= PAGE_SIZE;
> > +   }
> 
> Sounds like this can be refactored to iterate over pages rather than 
> addresses.

Okay, i will check

> > +static int copy_chunk_from_user(unsigned long from, int len, void *arg)
> > +{
> > +   unsigned long *to_ptr = arg, to = *to_ptr;
> > +
> > +   memcpy((void *) to, (void *) from, len);
> 
> What is the point in the casting to void *?

The cast was there because of copy-paste :). Passing an unsigned long
as 'void *' or 'const void *' requires a cast, right? Or did you mean
something else?

Now i checked removing the cast, and the compiler is abusing me :),
saying 'makes pointer from integer without a cast'

> > +   num_pages = DIV_ROUND_UP((unsigned long)from + n, PAGE_SIZE) -
> > +(unsigned long)from / PAGE_SIZE;
> 
> PFN_UP() ?

Okay

> I think you can clean up the code a bit after you will get the main
> functionality working.

Yes, surely. The intention was to post a proof-of-concept ASAP;
perhaps the contents will change drastically in the next version, so
that any resemblance to arch/um/kernel/skas/uaccess.c might not
remain.

Regards
afzal


Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()

2020-06-13 Thread afzal mohammed
Hi,

On Fri, Jun 12, 2020 at 10:07:28PM +0200, Arnd Bergmann wrote:

> I think a lot
> of usercopy calls are only for a few bytes, though this is of course
> highly workload dependent and you might only care about the large
> ones.

The observation is that the max. number of pages reaching
copy_{from,to}_user() is 2, with the observed maximum of n (number of
bytes) being 1 page size. i think C library cuts any size read, write
to page size (if it exceeds) & invokes the system call. Max. pages
reaching 2 happens when 'n' crosses a page boundary; this has been
observed w/ small-size requests as well as w/ ones of exactly page
size (but not page aligned).

Even w/ 'dd' of various sizes >4K, the number of pages required to be
mapped never goes greater than 2 (even w/ 'bs=1M')

i have a worry (don't know whether it is an unnecessary one): even if
we improve performance w/ large copy sizes, it might end up in
sluggishness w.r.t. user experience, since most (hence a high amount)
of user copy calls are a few bytes & the penalty is higher there.
And a benchmark would not be able to detect anything abnormal, since
usercopy is tested on large sizes.

Quickly comparing boot time on BeagleBone White, boot time increases
by only 4%; perhaps this worry is irrelevant, but just thought i would
put it across.

> There is also still hope of optimizing small aligned copies like
> 
> set_ttbr0(user_ttbr);
> ldm();
> set_ttbr0(kernel_ttbr);
> stm();

Hmm, more needs to be done before i am in a position to test it.

Regards
afzal


Re: [RFC 0/3] ARM: copy_{from,to}_user() for vmsplit 4g/4g

2020-06-12 Thread afzal mohammed
Hi,

On Fri, Jun 12, 2020 at 09:31:12PM +0530, afzal mohammed wrote:

>                   512   1K   4K   16K  32K  64K  1M
> 
> normal            30    46   89   95   90   85   65
> 
> uaccess_w_memcpy  28.5  45   85   92   91   85   65
> 
> w/ series         22    36   72   79   78   75   61

For the sake of completeness: all values are in MB/s, w/ various 'dd' 'bs' sizes.

Regards
afzal


Re: [RFC 0/3] ARM: copy_{from,to}_user() for vmsplit 4g/4g

2020-06-12 Thread afzal mohammed
Hi,

On Fri, Jun 12, 2020 at 11:19:23AM -0400, Nicolas Pitre wrote:
> On Fri, 12 Jun 2020, afzal mohammed wrote:

> > Performance wise, results are not encouraging, 'dd' on tmpfs results,

> Could you compare with CONFIG_UACCESS_WITH_MEMCPY as well?

                  512   1K   4K   16K  32K  64K  1M

normal            30    46   89   95   90   85   65

uaccess_w_memcpy  28.5  45   85   92   91   85   65

w/ series         22    36   72   79   78   75   61

There are variations in the range of +/-2 in some readings when
repeated; they are not shown above, to keep the comparison simple.

Regards
afzal


Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()

2020-06-12 Thread afzal mohammed
Hi,

On Fri, Jun 12, 2020 at 02:02:13PM +0200, Arnd Bergmann wrote:
> On Fri, Jun 12, 2020 at 12:18 PM afzal mohammed  wrote:

> > Roughly a one-third drop in performance. Disabling highmem improves
> > performance only slightly.

> There are probably some things that can be done to optimize it,
> but I guess most of the overhead is from the page table operations
> and cannot be avoided.

Ingo's series did a follow_page() first, and only as a fallback
invoked get_user_pages(); i will try that way as well.

Yes, i too feel the get_user_pages_fast() path is the most time
consuming; i will instrument & check.

> What was the exact 'dd' command you used, in particular the block size?
> Note that by default, 'dd' will request 512 bytes at a time, so you usually
> only access a single page. It would be interesting to see the overhead with
> other typical or extreme block sizes, e.g. '1', '64', '4K', '64K' or '1M'.

It was the default(512), more test results follows (in MB/s),

            512   1K    4K    16K   32K   64K   1M

w/o series  30    46    89    95    90    85    65

w/ series   22    36    72    79    78    75    61

perf drop   26%   21%   19%   16%   13%   12%   6%

Hmm, results ain't that bad :)

> If you want to drill down into where exactly the overhead is (i.e.
> get_user_pages or kmap_atomic, or something different), using
> 'perf record dd ..', and 'perf report' may be helpful.

Let me dig deeper & try to find out where the major overhead is, and
figure out ways to reduce it.

One reason to disable highmem & test (results mentioned earlier) was
to make kmap_atomic() very lightweight; that was not making much
difference, around 3% only.

> > +static int copy_chunk_from_user(unsigned long from, int len, void *arg)
> > +{
> > +   unsigned long *to_ptr = arg, to = *to_ptr;
> > +
> > +   memcpy((void *) to, (void *) from, len);
> > +   *to_ptr += len;
> > +   return 0;
> > +}
> > +
> > +static int copy_chunk_to_user(unsigned long to, int len, void *arg)
> > +{
> > +   unsigned long *from_ptr = arg, from = *from_ptr;
> > +
> > +   memcpy((void *) to, (void *) from, len);
> > +   *from_ptr += len;
> > +   return 0;
> > +}
> 
> Will gcc optimize away the indirect function call and inline everything?
> If not, that would be a small part of the overhead.

i think not, based on objdump; i will make these, & wherever else
possible, inline & see the difference.

> > +   num_pages = DIV_ROUND_UP((unsigned long)from + n, PAGE_SIZE) -
> > +(unsigned long)from / PAGE_SIZE;
> 
> Make sure this doesn't turn into actual division operations but uses shifts.
> It might even be clearer here to open-code the shift operation so readers
> can see what this is meant to compile into.

Okay

> 
> > +   pages = kmalloc_array(num_pages, sizeof(*pages), GFP_KERNEL | 
> > __GFP_ZERO);
> > +   if (!pages)
> > +   goto end;
> 
> Another micro-optimization may be to avoid the kmalloc for the common case,
> e.g. anything with "num_pages <= 64", using an array on the stack.

Okay

> > +   ret = get_user_pages_fast((unsigned long)from, num_pages, 0, pages);
> > +   if (ret < 0)
> > +   goto free_pages;
> > +
> > +   if (ret != num_pages) {
> > +   num_pages = ret;
> > +   goto put_pages;
> > +   }
> 
> I think this is technically incorrect: if get_user_pages_fast() only
> gets some of the
> pages, you should continue with the short buffer and return the number
> of remaining
> bytes rather than not copying anything. I think you did that correctly
> for a failed
> kmap_atomic(), but this has to use the same logic.

yes, will fix that.


Regards
afzal


Re: ARM: vmsplit 4g/4g

2020-06-12 Thread afzal mohammed
Hi,

On Wed, Jun 10, 2020 at 12:10:21PM +0200, Linus Walleij wrote:
> On Mon, Jun 8, 2020 at 1:09 PM afzal mohammed  wrote:

> > Not yet. Yes, i will do the performance evaluation.
> >
> > i am also worried about the impact on performance as these
> > [ get_user_pages() or friends, kmap_atomic() ] are additionally
> > invoked in the copy_{from,to}_user() path now.
> 
> I am happy to help!

Thanks Linus

> I am anyway working on MMU-related code (KASan) so I need to be on
> top of this stuff.

i earlier went through the KASan series secretly & did learn a thing
or two from it!

> What test is appropriate for this? I would intuitively think hackbench?

'dd', i think; as you mentioned 'hackbench', i will use that as well.

> > Note that this was done on a topic branch for user copy. Changes for
> > kernel static mapping to vmalloc has not been merged with these.
> > Also having kernel lowmem w/ a separate asid & switching at kernel
> > entry/exit b/n user & kernel lowmem by changing ttbr0 is yet to be
> > done. Quite a few things remaining to be done to achieve vmsplit 4g/4g
> 
> I will be very excited to look at patches or a git branch once you have
> something you want to show. Also to just understand how you go about
> this.

Don't put too much expectation on me, this is more of a learning
exercise for me. For user copy, the baby steps have been posted (To'ed
you). On the static kernel mapping on vmalloc front, i do not want to
post the patches in their current shape; though git-ized, it would
result in me getting mercilessly thrashed in public :). Many of the
other platforms would fail and it is not multi-platform friendly. i do
not yet have a public git branch; i can send you the (ugly) patches
separately, just let me know.

> I have several elder systems under my roof

i have only a few low-RAM & low-CPU systems, so that is certainly helpful.

> so my contribution could hopefully be to help and debug any issues

If you would like, we can work together; at the same time keep in mind
that my time spent on it will be intermittent & erratic (though i am
trying to keep a consistent, but slow, pace), perhaps making it
difficult to coordinate. Or else i will continue the same way & request
your help when required.

For the next 3 weeks, right now, i cannot say whether i would be able
to spend time on it, perhaps might be possible, but only during that
time i will know.

Regards
afzal


[RFC 3/3] ARM: provide CONFIG_VMSPLIT_4G_DEV for development

2020-06-12 Thread afzal mohammed
Select UACCESS_GUP_KMAP_MEMCPY initially.

Signed-off-by: afzal mohammed 
---
 arch/arm/Kconfig | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index c77c93c485a08..ae2687679d7c8 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1326,6 +1326,15 @@ config PAGE_OFFSET
default 0xB0000000 if VMSPLIT_3G_OPT
default 0xC0000000
 
+config VMSPLIT_4G_DEV
+   bool "Experimental changes for 4G/4G user/kernel split"
+   depends on ARM_LPAE
+   select UACCESS_GUP_KMAP_MEMCPY
+   help
+ Experimental changes during 4G/4G user/kernel split development.
+ Existing vmsplit config option is used, once development is done,
+ this would be put as a new choice & _DEV suffix removed.
+
 config NR_CPUS
int "Maximum number of CPUs (2-32)"
range 2 32
-- 
2.26.2



[RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()

2020-06-12 Thread afzal mohammed
copy_{from,to}_user() uaccess helpers are implemented by user page
pinning, followed by temporary kernel mapping & then memcpy(). This
helps to achieve user page copy when current virtual address mapping
of the CPU excludes user pages.

Performance wise, results are not encouraging, 'dd' on tmpfs results,

ARM Cortex-A8, BeagleBone White (256MiB RAM):
w/o series - ~29.5 MB/s
w/ series - ~20.5 MB/s
w/ series & highmem disabled - ~21.2 MB/s

On Cortex-A15(2GiB RAM) in QEMU:
w/o series - ~4 MB/s
w/ series - ~2.6 MB/s

Roughly a one-third drop in performance. Disabling highmem improves
performance only slightly.

'hackbench' also showed a similar pattern.

uaccess routines using page pinning & temporary kernel mapping are not
something new; it was done long ago by Ingo [1] as part of the 4G/4G
user/kernel mapping implementation on x86, though not merged in
mainline.

[1] https://lore.kernel.org/lkml/Pine.LNX.4.44.0307082332450.17252-10@localhost.localdomain/

Signed-off-by: afzal mohammed 
---
 lib/Kconfig   |   4 +
 lib/Makefile  |   3 +
 lib/uaccess_gup_kmap_memcpy.c | 162 ++
 3 files changed, 169 insertions(+)
 create mode 100644 lib/uaccess_gup_kmap_memcpy.c

diff --git a/lib/Kconfig b/lib/Kconfig
index 5d53f9609c252..dadf4f6cc391d 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -622,6 +622,10 @@ config ARCH_HAS_MEMREMAP_COMPAT_ALIGN
 config UACCESS_MEMCPY
bool
 
+# pin page + kmap_atomic + memcpy for user copies, intended for vmsplit 4g/4g
+config UACCESS_GUP_KMAP_MEMCPY
+   bool
+
 config ARCH_HAS_UACCESS_FLUSHCACHE
bool
 
diff --git a/lib/Makefile b/lib/Makefile
index 685aee60de1d5..bc457f85e391a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -309,3 +309,6 @@ obj-$(CONFIG_OBJAGG) += objagg.o
 
 # KUnit tests
 obj-$(CONFIG_LIST_KUNIT_TEST) += list-test.o
+
+# uaccess
+obj-$(CONFIG_UACCESS_GUP_KMAP_MEMCPY) += uaccess_gup_kmap_memcpy.o
diff --git a/lib/uaccess_gup_kmap_memcpy.c b/lib/uaccess_gup_kmap_memcpy.c
new file mode 100644
index 0..1536762df1fd5
--- /dev/null
+++ b/lib/uaccess_gup_kmap_memcpy.c
@@ -0,0 +1,162 @@
+// SPDX-License-Identifier: GPL-2.0
+// Started from arch/um/kernel/skas/uaccess.c
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+static int do_op_one_page(unsigned long addr, int len,
+int (*op)(unsigned long addr, int len, void *arg), void *arg,
+struct page *page)
+{
+   int n;
+
+   addr = (unsigned long) kmap_atomic(page) + (addr & ~PAGE_MASK);
+   n = (*op)(addr, len, arg);
+   kunmap_atomic((void *)addr);
+
+   return n;
+}
+
+static long buffer_op(unsigned long addr, int len,
+ int (*op)(unsigned long, int, void *), void *arg,
+ struct page **pages)
+{
+   long size, remain, n;
+
+   size = min(PAGE_ALIGN(addr) - addr, (unsigned long) len);
+   remain = len;
+   if (size == 0)
+   goto page_boundary;
+
+   n = do_op_one_page(addr, size, op, arg, *pages);
+   if (n != 0) {
+   remain = (n < 0 ? remain : 0);
+   goto out;
+   }
+
+   pages++;
+   addr += size;
+   remain -= size;
+
+page_boundary:
+   if (remain == 0)
+   goto out;
+   while (addr < ((addr + remain) & PAGE_MASK)) {
+   n = do_op_one_page(addr, PAGE_SIZE, op, arg, *pages);
+   if (n != 0) {
+   remain = (n < 0 ? remain : 0);
+   goto out;
+   }
+
+   pages++;
+   addr += PAGE_SIZE;
+   remain -= PAGE_SIZE;
+   }
+   if (remain == 0)
+   goto out;
+
+   n = do_op_one_page(addr, remain, op, arg, *pages);
+   if (n != 0) {
+   remain = (n < 0 ? remain : 0);
+   goto out;
+   }
+
+   return 0;
+out:
+   return remain;
+}
+
+static int copy_chunk_from_user(unsigned long from, int len, void *arg)
+{
+   unsigned long *to_ptr = arg, to = *to_ptr;
+
+   memcpy((void *) to, (void *) from, len);
+   *to_ptr += len;
+   return 0;
+}
+
+static int copy_chunk_to_user(unsigned long to, int len, void *arg)
+{
+   unsigned long *from_ptr = arg, from = *from_ptr;
+
+   memcpy((void *) to, (void *) from, len);
+   *from_ptr += len;
+   return 0;
+}
+
+unsigned long gup_kmap_copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+   struct page **pages;
+   int num_pages, ret, i;
+
+   if (uaccess_kernel()) {
+   memcpy(to, (__force void *)from, n);
+   return 0;
+   }
+
+   num_pages = DIV_ROUND_UP((unsigned long)from + n, PAGE_SIZE) -
+   (unsigned long)from / PAGE_SIZE;
+   pages = kmalloc_array(num_pages, sizeof(*pages), GFP_KERNEL | __GFP_ZERO);
+  

[RFC 2/3] ARM: uaccess: let UACCESS_GUP_KMAP_MEMCPY enabling

2020-06-12 Thread afzal mohammed
Turn off existing raw_copy_{from,to}_user() using
arm_copy_{from,to}_user() when CONFIG_UACCESS_GUP_KMAP_MEMCPY is
enabled.

Signed-off-by: afzal mohammed 
---
 arch/arm/include/asm/uaccess.h | 20 
 arch/arm/kernel/armksyms.c |  2 ++
 arch/arm/lib/Makefile  |  7 +--
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 98c6b91be4a8a..4a16ae52d4978 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -512,6 +512,15 @@ do {								\
 extern unsigned long __must_check
 arm_copy_from_user(void *to, const void __user *from, unsigned long n);
 
+#ifdef CONFIG_UACCESS_GUP_KMAP_MEMCPY
+extern unsigned long __must_check
+gup_kmap_copy_from_user(void *to, const void __user *from, unsigned long n);
+static inline __must_check unsigned long
+raw_copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+   return gup_kmap_copy_from_user(to, from, n);
+}
+#else
 static inline unsigned long __must_check
 raw_copy_from_user(void *to, const void __user *from, unsigned long n)
 {
@@ -522,12 +531,22 @@ raw_copy_from_user(void *to, const void __user *from, unsigned long n)
uaccess_restore(__ua_flags);
return n;
 }
+#endif
 
 extern unsigned long __must_check
 arm_copy_to_user(void __user *to, const void *from, unsigned long n);
 extern unsigned long __must_check
 __copy_to_user_std(void __user *to, const void *from, unsigned long n);
 
+#ifdef CONFIG_UACCESS_GUP_KMAP_MEMCPY
+extern unsigned long __must_check
+gup_kmap_copy_to_user(void __user *to, const void *from, unsigned long n);
+static inline __must_check unsigned long
+raw_copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+   return gup_kmap_copy_to_user(to, from, n);
+}
+#else
 static inline unsigned long __must_check
 raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 {
@@ -541,6 +560,7 @@ raw_copy_to_user(void *to, const void *from, unsigned long n)
return arm_copy_to_user(to, from, n);
 #endif
 }
+#endif
 
 extern unsigned long __must_check
 arm_clear_user(void __user *addr, unsigned long n);
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index 98bdea51089d5..8c92fe30d1559 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -96,8 +96,10 @@ EXPORT_SYMBOL(mmiocpy);
 #ifdef CONFIG_MMU
 EXPORT_SYMBOL(copy_page);
 
+#ifndef CONFIG_UACCESS_GUP_KMAP_MEMCPY
 EXPORT_SYMBOL(arm_copy_from_user);
 EXPORT_SYMBOL(arm_copy_to_user);
+#endif
 EXPORT_SYMBOL(arm_clear_user);
 
 EXPORT_SYMBOL(__get_user_1);
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 6d2ba454f25b6..1aeff2cd7b4b3 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -16,8 +16,11 @@ lib-y		:= changebit.o csumipv6.o csumpartial.o	\
   io-readsb.o io-writesb.o io-readsl.o io-writesl.o  \
   call_with_stack.o bswapsdi2.o
 
-mmu-y  := clear_user.o copy_page.o getuser.o putuser.o   \
-  copy_from_user.o copy_to_user.o
+mmu-y  := clear_user.o copy_page.o getuser.o putuser.o
+
+ifndef CONFIG_UACCESS_GUP_KMAP_MEMCPY
+  mmu-y+= copy_from_user.o copy_to_user.o
+endif
 
 ifdef CONFIG_CC_IS_CLANG
   lib-y+= backtrace-clang.o
-- 
2.26.2



[RFC 0/3] ARM: copy_{from,to}_user() for vmsplit 4g/4g

2020-06-12 Thread afzal mohammed
Hi,

copy_{from,to}_user() uaccess helpers are implemented by user page
pinning, followed by temporary kernel mapping & then memcpy(). This
helps to achieve user page copy when current virtual address mapping
of the CPU excludes user pages.

Other uaccess routines are also planned to be modified to make use of
pinning plus kmap_atomic(), based on the feedback here.

This is done as one of the initial steps to achieve 4G virtual
address mapping for user as well as Kernel on ARMv7 w/ LPAE.

The motive behind this is to enable Kernel access to (almost) 4GiB as
lowmem, thus helping to remove highmem support for platforms having up
to 4GiB RAM. For platforms having >4GiB, highmem is still required for
the Kernel to be able to access the whole RAM.

Performance wise, results are not encouraging, 'dd' on tmpfs results,

ARM Cortex-A8, BeagleBone White (256MiB RAM):
w/o series - ~29.5 MB/s
w/ series - ~20.5 MB/s
w/ series & highmem disabled - ~21.2 MB/s

On Cortex-A15(2GiB RAM) in QEMU:
w/o series - ~4 MB/s
w/ series - ~2.6 MB/s

Roughly a one-third drop in performance. Disabling highmem improves
performance only slightly.

'hackbench' also showed a similar pattern.

Ways to improve the performance has to be explored, if any one has
thoughts on it, please share.

uaccess routines using page pinning & temporary kernel mapping are not
something new; it was done long ago by Ingo [1] as part of the 4G/4G
user/kernel mapping implementation on x86, though not merged in
mainline.

Arnd has outlined basic design for vmsplit 4g/4g, uaccess routines
using user page pinning plus kmap_atomic() is one part of that.

[1] https://lore.kernel.org/lkml/Pine.LNX.4.44.0307082332450.17252-10@localhost.localdomain/

Last 2 patches are only meant for testing first patch.

Regards
afzal

afzal mohammed (3):
  lib: copy_{from,to}_user using gup & kmap_atomic()
  ARM: uaccess: let UACCESS_GUP_KMAP_MEMCPY enabling
  ARM: provide CONFIG_VMSPLIT_4G_DEV for development

 arch/arm/Kconfig   |   9 ++
 arch/arm/include/asm/uaccess.h |  20 
 arch/arm/kernel/armksyms.c |   2 +
 arch/arm/lib/Makefile  |   7 +-
 lib/Kconfig|   4 +
 lib/Makefile   |   3 +
 lib/uaccess_gup_kmap_memcpy.c  | 162 +
 7 files changed, 205 insertions(+), 2 deletions(-)
 create mode 100644 lib/uaccess_gup_kmap_memcpy.c

-- 
2.26.2



Re: ARM: vmsplit 4g/4g

2020-06-09 Thread afzal mohammed
Hi,

On Mon, Jun 08, 2020 at 08:47:27PM +0530, afzal mohammed wrote:
> On Mon, Jun 08, 2020 at 04:43:57PM +0200, Arnd Bergmann wrote:

> > There is another difference: get_user_pages_fast() does not return
> > a  vm_area_struct pointer, which is where you would check the access
> > permissions. I suppose those pointers could not be returned to callers
> > that don't already hold the mmap_sem.
> 
> Ok, thanks for the details, i need to familiarize better with mm.

i was & am now more confused w.r.t. checking access permissions using
vm_area_struct to deny writes to a read-only user page.

i have been using get_user_pages_fast() w/ FOLL_WRITE in copy_to_user.
Isn't that sufficient? afaiu, get_user_pages_fast() w/ FOLL_WRITE will
ensure that the pte has write permission, else no struct page * is
handed back to the caller.

One simplified path, which could be relevant in the majority of cases,
that i figured out is as follows,

 get_user_pages_fast
  internal_user_pages_fast
   gup_pgd_range [ no mmap_sem acquire path]
gup_p4d_range 
 gup_pud_range
  gup_pmd_range
   gup_pte_range
if (!pte_access_permitted(pte, flags & FOLL_WRITE))
 [ causes to return NULL page if access violation ]

   __gup_longterm_unlocked [ mmap_sem acquire path]
get_user_pages_unlocked
 __get_user_pages_locked
  __get_user_pages
   follow_page_mask
follow_p4d_mask
 follow_pud_mask
  follow_pmd_mask
   follow_page_pte
if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, flags))
 [ causes to return NULL page if access violation ]

As far as i could see, none of the get_user_pages() callers pass
struct vm_area_struct ** to get it populated.

And Ingo's series eons ago didn't either pass it or check permissions
using it (it was passing a 'write' argument, which i believe
corresponds to FOLL_WRITE)

Am i missing something or wrong in the analysis ?

Regards
afzal


Re: ARM: vmsplit 4g/4g

2020-06-08 Thread afzal mohammed
Hi,

On Mon, Jun 08, 2020 at 04:43:57PM +0200, Arnd Bergmann wrote:

> There is another difference: get_user_pages_fast() does not return
> a  vm_area_struct pointer, which is where you would check the access
> permissions. I suppose those pointers could not be returned to callers
> that don't already hold the mmap_sem.

Ok, thanks for the details, i need to familiarize better with mm.

Regards
afzal


Re: ARM: vmsplit 4g/4g

2020-06-08 Thread afzal mohammed
Hi,

On Sun, Jun 07, 2020 at 09:26:26PM +0200, Arnd Bergmann wrote:

> I think you have to use get_user_pages() though instead of
> get_user_pages_fast(),
> in order to be able to check the permission bits to prevent doing a
> copy_to_user()
> into read-only mappings.

i was not aware of this; is it documented somewhere? afaiu, the
difference b/n get_user_pages_fast() & get_user_pages() is that the
fast version will try to pin pages w/o acquiring mmap_sem if possible.

> Do you want me to review the uaccess patch to look for any missing
> corner cases, or do you want to do the whole set of user access helpers
> first?

i will clean up and probably post an RFC initially for the changes
handling copy_{from,to}_user(), to get feedback.

Regards
afzal


Re: ARM: vmsplit 4g/4g

2020-06-08 Thread afzal mohammed
Hi,

[ my previous mail did not make it into the linux-arm-kernel mailing
 list, got a mail saying it has a suspicious header and that it is
 awaiting moderator approval ]

On Sun, Jun 07, 2020 at 05:11:16PM +0100, Russell King - ARM Linux admin wrote:
> On Sun, Jun 07, 2020 at 06:29:32PM +0530, afzal mohammed wrote:

> > get_user_pages_fast() followed by kmap_atomic() & then memcpy() seems
> > to work in principle for user copy.
> 
> Have you done any performance evaluation of the changes yet? I think
> it would be a good idea to keep that in the picture. If there's any
> significant regression, then that will need addressing.

Not yet. Yes, i will do the performance evaluation.

i am also worried about the impact on performance as these
[ get_user_pages() or friends, kmap_atomic() ] are additionally
invoked in the copy_{from,to}_user() path now.

Note that this was done on a topic branch for user copy. Changes for
kernel static mapping to vmalloc have not been merged with these.
Also having kernel lowmem w/ a separate asid & switching at kernel
entry/exit b/n user & kernel lowmem by changing ttbr0 is yet to be
done. Quite a few things remain to be done to achieve vmsplit 4g/4g

Regards
afzal


ARM: vmsplit 4g/4g

2020-06-07 Thread afzal mohammed
Hi,

On Sat, May 16, 2020 at 09:35:57AM +0200, Arnd Bergmann wrote:
> On Sat, May 16, 2020 at 8:06 AM afzal mohammed  wrote:

> > Okay, so the conclusion i take is,
> > 1. VMSPLIT 4G/4G have to live alongside highmem
> > 2. For user space copy, do pinning followed by kmap

> Right, though kmap_atomic() should be sufficient here
> because it is always a short-lived mapping.

get_user_pages_fast() followed by kmap_atomic() & then memcpy() seems
to work in principle for user copy.

Verified in a crude way by pointing TTBR0 to a location that has user
pgd's cleared upon entry to copy_to_user() & restoring TTBR0 to
earlier value after user copying was done and ensuring boot.

Meanwhile more testing w/ kernel static mapping in vmalloc space
revealed a major issue: w/ LPAE it was not booting. There were issues
related to pmd handling; w/ !LPAE those issues were not present, as pmd
is in effect equivalent to pgd. The issues have been fixed, and LPAE
now boots, but the fix feels kind of fragile, will probably have to
revisit it.

Regards
afzal


Re: ARM: static kernel in vmalloc space

2020-05-15 Thread afzal mohammed
Hi,

On Thu, May 14, 2020 at 05:32:41PM +0200, Arnd Bergmann wrote:

> Typical distros currently offer two kernels, with and without LPAE,
> and they probably don't want to add a third one for LPAE with
> either highmem or vmsplit-4g-4g. Having extra user address
> space and more lowmem is both going to help users that
> still have 8GB configurations.

Okay, so the conclusion i take is,

1. VMSPLIT 4G/4G have to live alongside highmem
2. For user space copy, do pinning followed by kmap

Regards
afzal


Re: ARM: static kernel in vmalloc space

2020-05-14 Thread afzal mohammed
Hi,

On Thu, May 14, 2020 at 07:05:45PM +0530, afzal mohammed wrote:

> So if we make VMSPLIT_4G_4G, depends on !HIGH_MEMORY (w/ mention of
> caveat in Kconfig help that this is meant for platforms w/ <=4GB), then
> we can do copy_{from,to}_user the same way currently do, and no need to
> do the user page pinning & kmap, right ?

i think user page pinning is still required, but kmap can be avoided
by using lowmem corresponding to that page, right ?, or am i
completely wrong ?

Regards
afzal


Re: ARM: static kernel in vmalloc space

2020-05-14 Thread afzal mohammed
Hi,

On Thu, May 14, 2020 at 02:41:11PM +0200, Arnd Bergmann wrote:
> On Thu, May 14, 2020 at 1:18 PM afzal mohammed  wrote:

> > 1. SoC w/ LPAE
> > 2. TTBR1 (top 256MB) for static kernel, modules, io mappings, vmalloc,
> > kmap, fixmap & vectors

> Right, these kind of go together because pre-LPAE cannot do the
> same TTBR1 split, and they more frequently have conflicting
> static mappings.
> 
> It's clearly possible to do something very similar for older chips
> (v6 or v7 without LPAE, possibly even v5), it just gets harder
> while providing less benefit.

Yes, lets have it only for LPAE

> > 3. TTBR0 (low 3768MB) for user space & lowmem (kernel lowmem to have

> hardcoded 3840/256 split is likely the best compromise of all the

hmm, i swallowed 72MB ;)

> > 4. for user space to/from copy
> >  a. pin user pages
> >  b. kmap user page (can't corresponding lowmem be used instead ?)

> - In the long run, there is no need for kmap()/kmap_atomic() after
>   highmem gets removed from the kernel, but for the next few years
>   we should still assume that highmem can be used, in order to support
>   systems like the 8GB highbank, armadaxp, keystone2 or virtual
>   machines. For lowmem pages (i.e. all pages when highmem is
>   disabled), kmap_atomic() falls back to page_address() anyway,
>   so there is no much overhead.

Here i have some confusion - iiuc, VMSPLIT_4G_4G is meant to help
platforms having RAM > 768M and <= 4GB disable high memory and still
be able to access the full RAM, so high memory shouldn't come into the
picture, right ? And the above platforms can continue with the current
VMSPLIT option (the default 3G/1G), no ?, as VMSPLIT_4G_4G can't help
the complete 8G to be accessible from lowmem.

So if we make VMSPLIT_4G_4G, depends on !HIGH_MEMORY (w/ mention of
caveat in Kconfig help that this is meant for platforms w/ <=4GB), then
we can do copy_{from,to}_user the same way currently do, and no need to
do the user page pinning & kmap, right ?

Only problem i see is that a kernel compiled w/ VMSPLIT_4G_4G is not
suitable for >4GB machines, but anyway iiuc, it was not meant for those
machines. And it is not going to affect our current multiplatform
setup as LPAE is not defined in multi_v7.
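A sketch of the dependency being suggested - the symbol name, prompt
and help text below are assumptions (and note the tree's actual config
symbol is HIGHMEM):

```
config VMSPLIT_4G_4G
	bool "4G/4G user/kernel split (LPAE only)"
	depends on ARM_LPAE && !HIGHMEM
	help
	  Give user space and kernel lowmem each an (almost) full 4GB
	  address space. Only useful on LPAE machines with at most 4GB
	  of RAM; larger machines should keep highmem instead.
```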

Regards
afzal


Re: ARM: static kernel in vmalloc space

2020-05-14 Thread afzal mohammed
Hi,

On Tue, May 12, 2020 at 09:49:59PM +0200, Arnd Bergmann wrote:

> Any idea which bit you want to try next?

My plan has been to next post patches for the static kernel migration
to vmalloc space (currently the code is rigid, taking easy route
wherever possible & not of high quality) as that feature has an
independent existence & adds value by itself.  And then start working
on other steps towards VMSPLIT_4G_4G.

Now that you mentioned about other things, i will slowly start those
as well.

> Creating a raw_copy_{from,to}_user()
> based on get_user_pages()/kmap_atomic()/memcpy() is probably a good
> next thing to do. I think it can be done one page at a time with only
> checking for
> get_fs(), access_ok(), and page permissions, while get_user()/put_user()
> need to handle a few more corner cases.

Before starting w/ other things, i would like to align on the high
level design,

My understanding (mostly based on your comments) is as follows
(i currently do not have a firm grip on these things, hope to have
one once i start w/ the implementation),

1. SoC w/ LPAE 
2. TTBR1 (top 256MB) for static kernel, modules, io mappings, vmalloc,
kmap, fixmap & vectors
3. TTBR0 (low 3768MB) for user space & lowmem (kernel lowmem to have
separate ASID)
4. for user space to/from copy
 a. pin user pages
 b. kmap user page (can't corresponding lowmem be used instead ?)
 c. copy

Main points are as above, right ? Anything missed, or anything more
you want to add ? Let me know your opinion.

Regards
afzal


Re: ARM: static kernel in vmalloc space

2020-05-12 Thread afzal mohammed
Hi,

On Mon, May 11, 2020 at 05:29:29PM +0200, Arnd Bergmann wrote:

> What do you currently do with the module address space?

In the current setup, module address space was untouched, i.e. virtual
address difference b/n text & module space is far greater than 32MB, at
least > (2+768+16)MB and modules can't be loaded unless ARM_MODULE_PLTS
is enabled (this was checked now)

> easiest way is to just always put modules into vmalloc space, as we already
> do with CONFIG_ARM_MODULE_PLTS when the special area gets full,
> but that could be optimized once the rest works.

Okay

Regards
afzal


ARM: static kernel in vmalloc space (was Re: [PATCH 0/3] Highmem support for 32-bit RISC-V)

2020-05-11 Thread afzal mohammed
Hi,

Kernel now boots to prompt w/ static kernel mapping moved to vmalloc
space.

Changes currently done have a couple of platform specific things; this
has to be modified to make it multiplatform friendly (also to be taken
care of is the ARM_PATCH_PHYS_VIRT case). Module address space has to
be taken care of as well.

Logs follows

Regards
afzal

[0.00] Booting Linux on physical CPU 0x0
[0.00] Linux version 5.7.0-rc1-00043-ge8ffd99475b9c (afzal@afzalpc) (gcc version 8.2.0 (GCC_MA), GNU ld (GCC_MA) 2.31.1) #277 SMP Mon May 11 18:16:51 IST 2020
[0.00] CPU: ARMv7 Processor [412fc0f1] revision 1 (ARMv7), cr=10c5387d
[0.00] CPU: div instructions available: patching division code
[0.00] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[0.00] OF: fdt: Machine model: V2P-CA15
[0.00] printk: bootconsole [earlycon0] enabled
[0.00] Memory policy: Data cache writealloc
[0.00] efi: UEFI not found.
[0.00] Reserved memory: created DMA memory pool at 0x1800, size 8 MiB
[0.00] OF: reserved mem: initialized node vram@1800, compatible id shared-dma-pool
[0.00] percpu: Embedded 20 pages/cpu s49164 r8192 d24564 u81920
[0.00] Built 1 zonelists, mobility grouping on.  Total pages: 522751
[0.00] Kernel command line: console=ttyAMA0,115200 rootwait root=/dev/mmcblk0 earlyprintk
[0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
[0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
[0.00] mem auto-init: stack:off, heap alloc:off, heap free:off
[0.00] Memory: 2057032K/2097148K available (12288K kernel code, 1785K rwdata, 5188K rodata, 2048K init, 403K bss, 40116K reserved, 0K cma-reserved, 1310716K highmem)
[0.00] Virtual kernel memory layout:
[0.00] vector  : 0x - 0x1000   (   4 kB)
[0.00] fixmap  : 0xffc0 - 0xfff0   (3072 kB)
[0.00] vmalloc : 0xf100 - 0xff80   ( 232 MB)
[0.00] lowmem  : 0xc000 - 0xf000   ( 768 MB)
[0.00] pkmap   : 0xbfe0 - 0xc000   (   2 MB)
[0.00] modules : 0xbf00 - 0xbfe0   (  14 MB)
[0.00]   .text : 0xf1208000 - 0xf1f0   (13280 kB)
[0.00]   .init : 0xf250 - 0xf270   (2048 kB)
[0.00]   .data : 0xf270 - 0xf28be558   (1786 kB)
[0.00]    .bss : 0xf28be558 - 0xf29231a8   ( 404 kB)
[0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[0.00] rcu: Hierarchical RCU implementation.
[0.00] rcu: RCU event tracing is enabled.
[0.00] rcu: RCU restricting CPUs from NR_CPUS=16 to nr_cpu_ids=2.
[0.00] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[0.00] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
[0.00] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
[0.00] random: get_random_bytes called from start_kernel+0x304/0x49c with crng_init=0
[0.000311] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
[0.006788] clocksource: arm,sp804: mask: 0x max_cycles: 0x, max_idle_ns: 1911260446275 ns
[0.008479] Failed to initialize '/bus@800/motherboard/iofpga@3,/timer@12': -22
[0.013414] arch_timer: cp15 timer(s) running at 62.50MHz (virt).
[0.013875] clocksource: arch_sys_counter: mask: 0xff max_cycles: 0x1cd42e208c, max_idle_ns: 881590405314 ns
[0.014610] sched_clock: 56 bits at 62MHz, resolution 16ns, wraps every 4398046511096ns
[0.015199] Switching to timer-based delay loop, resolution 16ns
[0.020168] Console: colour dummy device 80x30
[0.022219] Calibrating delay loop (skipped), value calculated using timer frequency.. 125.00 BogoMIPS (lpj=625000)
[0.026998] pid_max: default: 32768 minimum: 301
[0.028835] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
[0.029319] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
[0.044484] CPU: Testing write buffer coherency: ok
[0.045452] CPU0: Spectre v2: firmware did not set auxiliary control register IBE bit, system vulnerable
[0.057536] /cpus/cpu@0 missing clock-frequency property
[0.058065] /cpus/cpu@1 missing clock-frequency property
[0.058538] CPU0: thread -1, cpu 0, socket 0, mpidr 8000
[0.066972] Setting up static identity map for 0x8030 - 0x803000ac
[0.074772] rcu: Hierarchical SRCU implementation.
[0.083336] EFI services will not be available.
[0.085605] smp: Bringing up secondary CPUs ...
[0.090454] CPU1: thread -1, cpu 1, socket 0, mpidr 8001
[0.090560] CPU1: Spectre v2: firmware did not set auxiliary control register IBE bit, system vulnerable
[0.096711] smp: Brought up 1 node, 2 CPUs
[0.097132] SMP: Total of 2 processors activa

Re: [PATCH] ARM: omap1: fix irq setup

2020-05-05 Thread afzal mohammed
Hi,

On Tue, May 05, 2020 at 04:13:48PM +0200, Arnd Bergmann wrote:

> A recent cleanup introduced a bug on any omap1 machine that has
> no wakeup IRQ, i.e. omap15xx:

> Move this code into a separate function to deal with it cleanly.
> 
> Fixes: b75ca5217743 ("ARM: OMAP: replace setup_irq() by request_irq()")
> Signed-off-by: Arnd Bergmann 

Sorry for the mistake and thanks for the fix,

Acked-by: afzal mohammed 

Regards
afzal


Re: [PATCH 0/3] Highmem support for 32-bit RISC-V

2020-05-04 Thread afzal mohammed
[ +linux-arm-kernel

  Context: This is regarding VMSPLIT_4G_4G support for 32-bit ARM as a
  possible replacement to highmem. For that, initially, it is being
  attempted to move static kernel mapping from lowmem to vmalloc space.

  in next reply, i will remove everyone/list !ARM related ]

Hi,

On Sun, May 03, 2020 at 10:20:39PM +0200, Arnd Bergmann wrote:

> Which SoC platform are you running this on? Just making
> sure that this won't conflict with static mappings later.

Versatile Express V2P-CA15 on qemu, qemu options include --smp 2 &
2GB memory.

BTW, i could not convince myself why, except for DEBUG_LL, static io
mappings are used.

> 
> One problem I see immediately in arm_memblock_init()

Earlier it went past arm_memblock_init(); the issue was clearing the
page tables from VMALLOC_START in devicemaps_init() thr' paging_init(),
which was like cutting the branch one is sitting on.

Now it is crashing at debug_ll_io_init() of devicemaps_init(), and
printascii/earlycon was & is being used to debug :). Things are going
wrong when it tries to create the mapping for debug_ll. It looks like
a conflict with a static mapping, which you mentioned above; at the
same time i am not seeing the kernel static mapping at the same
virtual address, need to dig deeper.

Also tried removing DEBUG_LL, there is a deafening silence in the
console ;)

> is that it uses
> __pa() to convert from virtual address in the linear map to physical,
> but now you actually pass an address that is in vmalloc rather than
> the linear map.

__virt_to_phys_nodebug(), which does the actual work on __pa()
invocation, has been modified to handle that case (ideas lifted from
ARM64's implementation), though currently it is a hack as below (and
applicable only for the ARM_PATCH_PHYS_VIRT disabled case), other
hacks being VMALLOC_OFFSET set to 0 and adjusting the vmalloc size.

static inline phys_addr_t __virt_to_phys_nodebug(unsigned long x)
{
phys_addr_t __x = (phys_addr_t)x;

if (__x >= 0xf000)
return __x - KIMAGE_OFFSET + PHYS_OFFSET;
else
return __x - PAGE_OFFSET + PHYS_OFFSET;
}

Regards
afzal


Re: [PATCH 0/3] Highmem support for 32-bit RISC-V

2020-05-03 Thread afzal mohammed
Hi Arnd,

> On Tue, Apr 14, 2020 at 09:29:46PM +0200, Arnd Bergmann wrote:

> > Another thing to try early is to move the vmlinux virtual address
> > from the linear mapping into vmalloc space. This does not require
> > LPAE either, but it only works on relatively modern platforms that
> > don't have conflicting fixed mappings there.

i have started by attempting to move the static kernel mapping from
lowmem to vmalloc space. At boot, execution so far has gone past
assembly & reached C, to be specific arm_memblock_init() [in
setup_arch()]; currently debugging the hang that happens after that
point. To make things easier in the beginning, ARM_PATCH_PHYS_VIRT is
disabled & a platform specific PHYS_OFFSET is fed; this is planned to
be fixed once it boots.

[ i will probably start a new thread or hopefully RFC on LAKML ]

Regards
afzal


Re: [PATCHv2 5/5] arm64: allwinner: a64: Add support for TERES-I laptop

2018-03-19 Thread afzal mohammed
Hi Maxime,

On Sun, Mar 18, 2018 at 09:22:51PM +0100, Maxime Ripard wrote:
> The first part is supposed to be the name of the boards. I did sed
> s/leds/teres-i/, and applied, together with all the patches but the
> PWM (so I had to drop the backlight node as well).
> 
> Please coordinate with Andre about who should send the PWM support.

Assuming that these patches were applied to your sunxi/dt64-for-4.17
branch, since PWM support patch is missing, there is a build error,

arch/arm64/boot/dts/allwinner/sun50i-a64-teres-i.dts:129.1-5 Label or path pwm not found

Diff at the end cures it.

(there is another H6 Pine64 DT build error related to a missing header
file)

afzal


--->8---
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-teres-i.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-teres-i.dts
index b3c7ef6b6fe5..d9baab3dc96b 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-teres-i.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-teres-i.dts
@@ -126,12 +126,6 @@
status = "okay";
 };
 
-&pwm {
-   pinctrl-names = "default";
-   pinctrl-0 = <&pwm_pin>;
-   status = "okay";
-};
-
 &ohci1 {
status = "okay";
 };



Re: [PATCHv2 5/5] arm64: allwinner: a64: Add support for TERES-I laptop

2018-03-16 Thread afzal mohammed
Hi,

On Fri, Mar 16, 2018 at 12:07:53PM +0530, afzal mohammed wrote:

> Received only patches 4 & 5 in my inbox, the receive path was via
> linux-kernel rather than linux-arm-kernel, but in both archives all
> patches are seen (though threading seems not right), probably the
> missing patches are due to the issue gmail has with LKML,

The cover letter plus patches 1-3 were swallowed by the spam filter;
even your reply to me on the v1 cover letter subthread was, dunno
whether it has something to do with your mail header contents.

afzal


Re: [PATCHv2 5/5] arm64: allwinner: a64: Add support for TERES-I laptop

2018-03-15 Thread afzal mohammed
Hi,

On Thu, Mar 15, 2018 at 04:25:10PM +, Harald Geyer wrote:
> The TERES-I is an open hardware laptop built by Olimex using the
> Allwinner A64 SoC.
> 
> Add the board specific .dts file, which includes the A64 .dtsi and
> enables the peripherals that we support so far.
> 
> Signed-off-by: Harald Geyer 

Received only patches 4 & 5 in my inbox, the receive path was via
linux-kernel rather than linux-arm-kernel, but in both archives all
patches are seen (though threading seems not right), probably the
missing patches are due to the issue gmail has with LKML,

so had to pull the series from patchwork, for the series,

Tested-by: afzal mohammed 

afzal


Re: arm64: allwinner: Add support for TERES I laptop

2018-03-15 Thread afzal mohammed
Hi,

On Thu, Mar 15, 2018 at 10:36:06PM +0530, afzal mohammed wrote:
> Thanks for the patches
> 
> w/ defconfig could reach to prompt via serial console using audio
> jack.
> 
> And just by enabling PWM_SUN4I & FB_SIMPLE, laptop could function
> standalone as well.
> 
> Suggestions (feel free to ignore):
> 
> 1. seems currently only review comment pending is on simple
> framebuffer, perhaps you can proceed removing just that so that a
> basic bootable system can be achieved at the earliest (iiuc, anyway
> drm would be the final solution for display)
> 
> 2. in next revision (if), may be you can put keywords DIY and/or Open
> Hardware (irrespective of whatever exactly that means) Laptop in the
> subject itself, that might bring more interest/eyeballs, especially at
> this time of ME & so on.

Realizing now that your v2 patches & above mail crossed.

afzal



Re: [PATCH 00/16] remove eight obsolete architectures

2018-03-15 Thread afzal mohammed
Hi,

On Thu, Mar 15, 2018 at 10:56:48AM +0100, Arnd Bergmann wrote:
> On Thu, Mar 15, 2018 at 10:42 AM, David Howells  wrote:

> > Do we have anything left that still implements NOMMU?

Please don't kill !MMU.

> Yes, plenty.

> I've made an overview of the remaining architectures for my own reference[1].
> The remaining NOMMU architectures are:
> 
> - arch/arm has ARMv7-M (Cortex-M microcontroller), which is actually
> gaining traction

ARMv7-R as well; also it seems ARM is coming up with more !MMU cores -
v8-M, v8-R. In addition, though only of academic interest, ARM MMU
capable platforms can run !MMU Linux.

afzal

> - arch/sh has an open-source J2 core that was added not that long ago,
> it seems to
>   be the only SH compatible core that anyone is working on.
> - arch/microblaze supports both MMU/NOMMU modes (most use an MMU)
> - arch/m68k supports several NOMMU targets, both the coldfire SoCs and the
>   classic processors
> - c6x has no MMU


Re: arm64: allwinner: Add support for TERES I laptop

2018-03-15 Thread afzal mohammed
Hi,

On Mon, Mar 12, 2018 at 04:10:45PM +, Harald Geyer wrote:
> This series adds support for the TERES I open hardware laptop produced
> by olimex. With these patches and a bootloader capable of setting up
> simple framebuffer the laptop is quite useable.

Thanks for the patches

w/ defconfig could reach to prompt via serial console using audio
jack.

And just by enabling PWM_SUN4I & FB_SIMPLE, laptop could function
standalone as well.

Suggestions (feel free to ignore):

1. seems currently only review comment pending is on simple
framebuffer, perhaps you can proceed removing just that so that a
basic bootable system can be achieved at the earliest (iiuc, anyway
drm would be the final solution for display)

2. in next revision (if), may be you can put keywords DIY and/or Open
Hardware (irrespective of whatever exactly that means) Laptop in the
subject itself, that might bring more interest/eyeballs, especially at
this time of ME & so on.

Regards
afzal


Re: [tip:x86/pti] x86/speculation: Use IBRS if available before calling into firmware

2018-02-11 Thread afzal mohammed
Hi,

On Sun, Feb 11, 2018 at 11:19:10AM -0800, tip-bot for David Woodhouse wrote:

> x86/speculation: Use IBRS if available before calling into firmware
> 
> Retpoline means the kernel is safe because it has no indirect branches.
> But firmware isn't, so use IBRS for firmware calls if it's available.

afaiu, retpoline alone means the mitigation is still not enough.

Also David W has mentioned [1] that even with retpoline, IBPB is also
required (except Sky Lake).

If IBPB & IBRS are not supported by the ucode, shouldn't the below
indicate something on the lines of "Mitigation not enough" ?

> - return sprintf(buf, "%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
> + return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
>  boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
> +boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
>  spectre_v2_module_string());

On 4.16-rc1, w/ GCC 7.3.0,

/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: __user pointer sanitization
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Full generic retpoline

Here for the user (at least for me), it is not clear whether the
mitigation is enough. In the present system (Ivy Bridge), as ucode
update is not available, IBPB is not printed along with
"spectre_v2:Mitigation", so unless i am missing something, till then
this system should be considered vulnerable, but for a user not
familiar with details of the issue, it cannot be deduced.

Perhaps an additional status field [OKAY,PARTIAL] to Mitigation in
sysfs might be helpful. All these changes are in the air for me, this
is from a user perspective, sorry if my feedback seems idiotic.

afzal


[1] lkml.kernel.org/r/1516638426.9521.20.ca...@infradead.org


Re: [PATCH] doc: memory-barriers: reStructure Text

2018-01-04 Thread afzal mohammed
Hi,

On Thu, Jan 04, 2018 at 11:27:55AM +0100, Markus Heiser wrote:

> IMO symlinks are mostly ending in a mess, URLs are never stable.
> There is a 
> 
>  https://www.kernel.org/doc/html/latest/objects.inv
> 
> to handle such requirements. Take a look at *intersphinx* :
> 
>  http://www.sphinx-doc.org/en/stable/ext/intersphinx.html
> 
> to see how it works:  Each Sphinx HTML build creates a file named objects.inv 
> that
> contains a mapping from object names to URIs relative to the HTML set’s root.
> 
> This means articles from external (like lwn articles) has to be recompiled.
> Not perfect, but a first solution. 

Thanks for the details.

> I really like them

Initially i was sceptical of rst, & then, once i hit "make htmldocs"
on the keyboard :), the opinion about it changed. It was easy to
navigate through the various docs & i realized that various topics (&
many of them) were present (yes, they were there earlier also, but one
had to dive inside Documentation & search, while viewing the toplevel
index.html made them stand out). It was like earlier you had to go
after the docs, but now the docs come after you, that is my opinion.

Later, while fighting with memory-barriers.txt, i felt that it might
be good for it to be in that company as well.

And its readability as plain text is not hurt either.

It was thought that the rst conversion could be done quickly, but
since this was my first attempt with rst, i had to put in some effort
to get a not so bad output; even if this patch dies, i am happy to
have learnt rst conversion to some extent.

> > Upon trying to understand memory-barriers.txt, i felt that it might be
> > better to have it in PDF/HTML format, thus attempted to convert it to
> > rst. And i see it not being welcomed, hence shelving the conversion.
> 
> I think that's a pity.

When one of the authors of the original document objected, i felt it
was better to back off. But if there is a consensus, i will proceed.

afzal


Re: [PATCH] doc: memory-barriers: reStructure Text

2018-01-03 Thread afzal mohammed
Hi,

On Thu, Jan 04, 2018 at 09:48:50AM +0800, Boqun Feng wrote:

> > The location chosen is "Documentation/kernel-hacking", i was unsure
> > where this should reside & there was no .rst file in top-level directory
> > "Documentation", so put it into one of the existing folder that seemed
> > to me as not that unsuitable.
> > 
> > Other files refer to memory-barrier.txt, those also needs to be
> > adjusted based on where .rst can reside.

> How do you plan to handle the external references? For example, the
> following LWN articles has a link this file:
> 
>   https://lwn.net/Articles/718628/
> 
> And changing the name and/or location will break that link, AFAIK.

If necessary to handle these, symlink might help here i believe.

Upon trying to understand memory-barriers.txt, i felt that it might be
better to have it in PDF/HTML format, thus attempted to convert it to
rst. And i see it not being welcomed, hence shelving the conversion.

afzal


Re: [PATCH] doc: memory-barriers: reStructure Text

2018-01-03 Thread afzal mohammed
Hi,

On Thu, Jan 04, 2018 at 12:48:28AM +0100, Peter Zijlstra wrote:

> > Let PDF & HTML's be created out of memory-barriers Text by
> > reStructuring.

> So I hate this rst crap with a passion, so NAK from me.

Okay, the outcome is exactly as was feared.

Abondoning the patch, let this be > /dev/null

afzal


[PATCH] doc: memory-barriers: reStructure Text

2018-01-03 Thread afzal mohammed
Let PDF & HTML's be created out of memory-barriers Text by
reStructuring.

reStructuring done were,
1. Section headers modification, lower header case except start
2. Removal of manual index(contents section), since it now gets created
   automatically for html/pdf
3. Internal cross reference for easy navigation
4. Alignment adjustments
5. Strong emphasis made wherever there was emphasis earlier (through
   other ways), strong was chosen as normal emphasis showed in italics,
   which was felt to be not enough & strong showed it in bold
6. ASCII text & code snippets in literal blocks
7. Backquotes for inline instances in the paragraph's where they are
   expressed not in English, but in C, pseudo-code, file path etc.
8. Notes section created out of the earlier notes
9. Manual numbering replaced by auto-numbering
10.Bibliography (References section) made such that it can be
   cross-linked

Signed-off-by: afzal mohammed 
---

Hi,

With this change, pdf & html could be generated. There certainly are
improvements to be made, but thought of first knowing whether migrating
memory-barriers from txt to rst is welcome.

The location chosen is "Documentation/kernel-hacking"; i was unsure
where this should reside & there was no .rst file in the top-level
directory "Documentation", so put it into one of the existing folders
that seemed to me not that unsuitable.

Other files refer to memory-barriers.txt, those also need to be
adjusted based on where the .rst can reside.

afzal


 Documentation/kernel-hacking/index.rst |1 +
 .../memory-barriers.rst}   | 1707 ++--
 2 files changed, 837 insertions(+), 871 deletions(-)
 rename Documentation/{memory-barriers.txt => kernel-hacking/memory-barriers.rst} (63%)

diff --git a/Documentation/kernel-hacking/index.rst b/Documentation/kernel-hacking/index.rst
index fcb0eda3cca3..20eb56d02ea5 100644
--- a/Documentation/kernel-hacking/index.rst
+++ b/Documentation/kernel-hacking/index.rst
@@ -7,3 +7,4 @@ Kernel Hacking Guides
 
hacking
locking
+   memory-barriers
diff --git a/Documentation/memory-barriers.txt b/Documentation/kernel-hacking/memory-barriers.rst
similarity index 63%
rename from Documentation/memory-barriers.txt
rename to Documentation/kernel-hacking/memory-barriers.rst
index 479ecec80593..60b6a8be8a09 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/kernel-hacking/memory-barriers.rst
@@ -1,14 +1,13 @@
-
-LINUX KERNEL MEMORY BARRIERS
-
+
+Linux kernel memory barriers
+
 
-By: David Howells 
-Paul E. McKenney 
-Will Deacon 
-Peter Zijlstra 
+:Authors: David Howells ,
+  Paul E. McKenney ,
+  Will Deacon ,
+  Peter Zijlstra 
 
-==
-DISCLAIMER
+Disclaimer
 ==
 
 This document is not a specification; it is intentionally (for the sake of
@@ -21,10 +20,9 @@ hardware.
 
 The purpose of this document is twofold:
 
- (1) to specify the minimum functionality that one can rely on for any
- particular barrier, and
-
- (2) to provide a guide as to how to use the barriers that are available.
+* to specify the minimum functionality that one can rely on for any
+  particular barrier
+* to provide a guide as to how to use the barriers that are available
 
 Note that an architecture can provide more than the minimum requirement
 for any particular barrier, but if the architecture provides less than
@@ -35,78 +33,10 @@ architecture because the way that arch works renders an 
explicit barrier
 unnecessary in that case.
 
 
-
-CONTENTS
-
-
- (*) Abstract memory access model.
-
- - Device operations.
- - Guarantees.
-
- (*) What are memory barriers?
-
- - Varieties of memory barrier.
- - What may not be assumed about memory barriers?
- - Data dependency barriers.
- - Control dependencies.
- - SMP barrier pairing.
- - Examples of memory barrier sequences.
- - Read memory barriers vs load speculation.
- - Multicopy atomicity.
-
- (*) Explicit kernel barriers.
-
- - Compiler barrier.
- - CPU memory barriers.
- - MMIO write barrier.
-
- (*) Implicit kernel memory barriers.
-
- - Lock acquisition functions.
- - Interrupt disabling functions.
- - Sleep and wake-up functions.
- - Miscellaneous functions.
-
- (*) Inter-CPU acquiring barrier effects.
-
- - Acquires vs memory accesses.
- - Acquires vs I/O accesses.
-
- (*) Where are memory barriers needed?
-
- - Interprocessor interaction.
- - Atomic operations.
- - Accessing devices.
- - Interrupts.
-
- (*) Kernel I/O barrier effects.
-
- (*) Assumed minimum execution ordering model.
-
- (*) The effects of the cpu cache.
-
- - Cache coherency.
- - Cache coherency vs DMA

Re: Prototype patch for Linux-kernel memory model

2017-12-22 Thread afzal mohammed
Hi,

On Fri, Dec 22, 2017 at 09:41:32AM +0530, afzal mohammed wrote:
> On Thu, Dec 21, 2017 at 08:15:02AM -0800, Paul E. McKenney wrote:

> > Have you installed and run the herd tool?  Doing so would allow you
> > to experiment with changes to the litmus tests.
> 
> Yes, i installed herd tool and then i was at a loss :(, so started
> re-reading the documentation, yet to run any of the tests.

Above was referring to "opam install herdtools7" & the pre-requisites,
with the current HEAD of herd, build fails as below, but builds fine
with the latest tag - 7.47.

Could run a couple of tests as well now, thanks.

afzal


herdtools7(master)$ make all
sh ./build.sh $HOME
+ /usr/bin/ocamldep.opt -modules gen/RISCVCompile_gen.ml > gen/RISCVCompile_gen.ml.depends
File "gen/RISCVCompile_gen.ml", line 94, characters 8-9:
Error: Syntax error
Command exited with code 2.
Compilation unsuccessful after building 1439 targets (0 cached) in 00:00:59.
Makefile:4: recipe for target 'all' failed
make: *** [all] Error 10


Re: Prototype patch for Linux-kernel memory model

2017-12-21 Thread afzal mohammed
Hi,

On Thu, Dec 21, 2017 at 08:15:02AM -0800, Paul E. McKenney wrote:
> On Thu, Dec 21, 2017 at 09:00:55AM +0530, afzal mohammed wrote:

> > Since it is now mentioned that r1 can have final value of 0, though it
> > is understood, it might make things crystal clear and for the sake of
> > completeness to also show the non-automatic variable x being
> > initialized to 0.
> 
> Here we rely on the C-language and Linux-kernel convention that global
> variables that are not explicitly initialized are initialized to zero.
> (Also the documented behavior of the litmus tests and the herd tool that
> uses them.)  So that part should be OK as is.

Okay, that was suggested to bring parity with some of the examples in
explanation.txt, where global variables are explicitly initialized to
zero, that unconsciously made me feel that litmus tests also follow
that pattern, but checking again realize that litmus tests are not so.

> 
> Nevertheless, thank you for your review and comments!

Thanks for taking the effort to reply.

> Have you installed and run the herd tool?  Doing so would allow you
> to experiment with changes to the litmus tests.

Yes, i installed herd tool and then i was at a loss :(, so started
re-reading the documentation, yet to run any of the tests.

afzal


Re: Prototype patch for Linux-kernel memory model

2017-12-20 Thread afzal mohammed
Hi,

On Wed, Dec 20, 2017 at 08:45:38AM -0800, Paul E. McKenney wrote:
> On Wed, Dec 20, 2017 at 05:01:45PM +0530, afzal mohammed wrote:

> > > +It is tempting to assume that CPU0()'s store to x is globally ordered
> > > +before CPU1()'s store to z, but this is not the case:
> > > +
> > > + /* See Z6.0+pooncerelease+poacquirerelease+mbonceonce.litmus. */
> > > + void CPU0(void)
> > > + {
> > > + WRITE_ONCE(x, 1);
> > > + smp_store_release(&y, 1);
> > > + }
> > > +
> > > + void CPU1(void)
> > > + {
> > > + r1 = smp_load_acquire(y);
> > > + smp_store_release(&z, 1);
> > > + }
> > > +
> > > + void CPU2(void)
> > > + {
> > > + WRITE_ONCE(z, 2);
> > > + smp_mb();
> > > + r2 = READ_ONCE(x);
> > > + }
> > > +
> > > +One might hope that if the final value of r1 is 1 and the final value
> > > +of z is 2, then the final value of r2 must also be 1, but the opposite
> > > +outcome really is possible.
> > 
> > As there are 3 variables to have the values, perhaps, it might be
> > clearer to have instead of "the opposite.." - "the final value need
> > not be 1" or was that a read between the lines left as an exercise to
> > the idiots ;)
> 
> Heh!  Good catch, thank you!  How about the following for the paragraph
> immediately after that litmus test?
> 
>   One might hope that if the final value of r0 is 1 and the final
>   value of z is 2, then the final value of r1 must also be 1,
>   but it really is possible for r1 to have the final value of 0.
>   The reason, of course, is that in this version, CPU2() is not
>   part of the release-acquire chain.  This situation is accounted
>   for in the rules of thumb below.
> 
> I also fixed r1 and r2 to match the names in the actual litmus test.

Since it is now mentioned that r1 can have final value of 0, though it
is understood, it might make things crystal clear and for the sake of
completeness to also show the non-automatic variable x being
initialized to 0.

Thanks for taking into account my opinion.

afzal
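
[Editor's note: for anyone experimenting with the herd tool as discussed in this thread, the three-CPU example translates into a litmus file. A rough sketch, reconstructed from the C snippets above; the in-tree file under tools/memory-model/litmus-tests/ may differ in register naming and detail:

```
C Z6.0+pooncerelease+poacquirerelease+mbonceonce

{}

P0(int *x, int *y)
{
	WRITE_ONCE(*x, 1);
	smp_store_release(y, 1);
}

P1(int *y, int *z)
{
	int r0;

	r0 = smp_load_acquire(y);
	smp_store_release(z, 1);
}

P2(int *x, int *z)
{
	int r1;

	WRITE_ONCE(*z, 2);
	smp_mb();
	r1 = READ_ONCE(*x);
}

exists (1:r0=1 /\ z=2 /\ 2:r1=0)
```

Run against the linux-kernel model, herd7 should report the exists clause as satisfiable (Sometimes), matching the point made above that the r2 = 0 outcome really is possible.]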


Re: Prototype patch for Linux-kernel memory model

2017-12-20 Thread afzal mohammed
Hi,

Is this patch not destined for the HEAD of Torvalds? Got that feeling
as this was in flight around the merge window & has not yet made it there.

On Wed, Nov 15, 2017 at 08:37:49AM -0800, Paul E. McKenney wrote:

> diff --git a/tools/memory-model/Documentation/recipes.txt b/tools/memory-model/Documentation/recipes.txt

> +Taking off the training wheels
> +==
:
> +Release-acquire chains
> +--
:
> +It is tempting to assume that CPU0()'s store to x is globally ordered
> +before CPU1()'s store to z, but this is not the case:
> +
> + /* See Z6.0+pooncerelease+poacquirerelease+mbonceonce.litmus. */
> + void CPU0(void)
> + {
> + WRITE_ONCE(x, 1);
> + smp_store_release(&y, 1);
> + }
> +
> + void CPU1(void)
> + {
> + r1 = smp_load_acquire(y);
> + smp_store_release(&z, 1);
> + }
> +
> + void CPU2(void)
> + {
> + WRITE_ONCE(z, 2);
> + smp_mb();
> + r2 = READ_ONCE(x);
> + }
> +
> +One might hope that if the final value of r1 is 1 and the final value
> +of z is 2, then the final value of r2 must also be 1, but the opposite
> +outcome really is possible.

As there are 3 variables to have the values, perhaps, it might be
clearer to have instead of "the opposite.." - "the final value need
not be 1" or was that a read between the lines left as an exercise to
the idiots ;)

afzal


>  The reason, of course, is that in this
> +version, CPU2() is not part of the release-acquire chain.  This
> +situation is accounted for in the rules of thumb below.


Re: Prototype patch for Linux-kernel memory model

2017-12-19 Thread afzal mohammed
Hi,

A trivial & late (sorry) comment,

On Wed, Nov 15, 2017 at 08:37:49AM -0800, Paul E. McKenney wrote:

> +THE HAPPENS-BEFORE RELATION: hb
> +---

> +Less trivial examples of prop all involve fences.  Unlike the simple
> +examples above, they can require that some instructions are executed
> +out of program order.  This next one should look familiar:
> +
> + int buf = 0, flag = 0;
> +
> + P0()
> + {
> + WRITE_ONCE(buf, 1);
> + smp_wmb();
> + WRITE_ONCE(flag, 1);
> + }
> +
> + P1()
> + {
> + int r1;
> + int r2;
> +
> + r1 = READ_ONCE(flag);
> + r2 = READ_ONCE(buf);
> + }
> +
> +This is the MP pattern again, with an smp_wmb() fence between the two
> +stores.  If r1 = 1 and r2 = 0 at the end then there is a prop link
> +from P1's second load to its first (backwards!).  The reason is
> +similar to the previous examples: The value P1 loads from buf gets
> +overwritten by P1's store to buf,

  P0's store to buf

afzal

> the fence guarantees that the store
> +to buf will propagate to P1 before the store to flag does, and the
> +store to flag propagates to P1 before P1 reads flag.
> +
> +The prop link says that in order to obtain the r1 = 1, r2 = 0 result,
> +P1 must execute its second load before the first.  Indeed, if the load
> +from flag were executed first, then the buf = 1 store would already
> +have propagated to P1 by the time P1's load from buf executed, so r2
> +would have been 1 at the end, not 0.  (The reasoning holds even for
> +Alpha, although the details are more complicated and we will not go
> +into them.)
> +
> +But what if we put an smp_rmb() fence between P1's loads?  The fence
> +would force the two loads to be executed in program order, and it
> +would generate a cycle in the hb relation: The fence would create a ppo
> +link (hence an hb link) from the first load to the second, and the
> +prop relation would give an hb link from the second load to the first.
> +Since an instruction can't execute before itself, we are forced to
> +conclude that if an smp_rmb() fence is added, the r1 = 1, r2 = 0
> +outcome is impossible -- as it should be.
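
[Editor's note: the MP-with-fences pattern quoted above also translates naturally into a litmus test. A sketch, reconstructed from the quoted snippet (the in-tree file name and exact form may differ):

```
C MP+wmb+rmb

{}

P0(int *buf, int *flag)
{
	WRITE_ONCE(*buf, 1);
	smp_wmb();
	WRITE_ONCE(*flag, 1);
}

P1(int *buf, int *flag)
{
	int r1;
	int r2;

	r1 = READ_ONCE(*flag);
	smp_rmb();
	r2 = READ_ONCE(*buf);
}

exists (1:r1=1 /\ 1:r2=0)
```

With the smp_rmb() in place, herd7 should report Never for the exists clause, matching the hb-cycle argument above; remove the smp_rmb() and the r1 = 1, r2 = 0 outcome becomes reachable.]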


Re: [PATCH 1/6] ARM: stm32: prepare stm32 family to welcome armv7 architecture

2017-12-12 Thread afzal mohammed
Hi,

On Mon, Dec 11, 2017 at 02:40:43PM +0100, Arnd Bergmann wrote:
> On Mon, Dec 11, 2017 at 11:25 AM, Linus Walleij

> >> This patch prepares the STM32 machine for the integration of Cortex-A
> >> based microprocessor (MPU), on top of the existing Cortex-M
> >> microcontroller family (MCU). Since both MCUs and MPUs are sharing
> >> common hardware blocks we can keep using ARCH_STM32 flag for most of
> >> them. If a hardware block is specific to one family we can use either
> >> ARCH_STM32_MCU or ARCH_STM32_MPU flag.

> To what degree do we need to treat them as separate families
> at all then? I wonder if the MCU/MPU distinction is always that
> clear along the Cortex-M/Cortex-A separation,

> What
> exactly would we miss if we do away with the ARCH_STM32_MCU
> symbol here?

Based on this patch series, the only difference seems to be w.r.t ARM
components, not peripherals outside the ARM subsystem. Vybrid VF610 is a
similar case, though not identical (it can have both instead of
either), deals w/o extra symbols,

8064887e02fd6 (ARM: vf610: enable Cortex-M4 configuration on Vybrid SoC)

> especially if
> we ever get to a chip that has both types of cores.

Your wish fulfilled, Vybrid VF610 has both A5 & M4F and mainline Linux
boots on both (simultaneously as well), and the second Linux support,
i.e. on M4 went thr' your keyboard, see above commit :)

There are quite a few others as well, TI's AM335x (A8 + M3), AM437x
(A9 + M3), AM57x (A15 + M4), but of these Cortex-Ms, only the one in
AM57x is Linux'able. On the others they are meant for PM with limited
resources.

> > So yesterdays application processors are todays MCU processors.
> >
> > I said this on a lecture for control systems a while back and
> > stated it as a reason I think RTOSes are not really seeing a bright
> > future compared to Linux.

> I think there is still lots of room for smaller RTOS in the long run,

Me being an electrical engineer & worked to some extent in motor
control on RTOS/no OS (the value of my opinion is questionable
though), the thought of handling the same in Linux (even RT) sends
shivers down my spine. Here, case being considered is the type of
motor (like permanent magnet ones) where each phase of the motor has
to be properly excited during every PWM period (say every 100us,
depending on the feedback, algorithm, other synchronization) w/o which
the motor that has been told to run might try to fly. This is
different from stepper motor where if control misbehaves/stops nothing
harmful normally happens.

But my opinion is a kind of knee-jerk reaction and based on prevalent
attitude in that field, hmm.., probably i should attempt it first.

Regards
afzal


Re: vger.kernel.org mail queue issue?

2017-05-02 Thread afzal mohammed
Hi,

On Mon, May 01, 2017 at 10:50:57AM -0400, David Miller wrote:
> From: afzal mohammed 
> > On Wed, Jan 11, 2017 at 09:07:35PM -0500, David Miller wrote:
> >> From: Florian Fainelli 

> >> > I am seeing emails being received right now from @vger.kernel.org that
> >> > seem to be from this morning according to my mailer. Has anything
> >> > changed on vger.kernel.org that could cause that? Other mailing-lists
> >> > (e.g: infradead.org) seems to be fine.
> > 
> >> Nope, in fact I've been aggressively removing bouncers lately
> >> and trying to keep the system running efficiently.
> >> 
> >> I kind of suspect that google has ramped up their rate limiting
> >> settings a little bit on gmail.
> >> 
> >> I'll try to keep an eye out.
> > 
> > Seems gmail again is receiving mails with a delay, the last received
> > lk mail has date as 30 Apr 2017 08:23:50 +0300, while here it is
> > around 01 May 2017 17:10 + 0530. And lkml archives has a lot of mails
> > after that.

> There is really nothing I can do about this.
> 
> The problem is that GMAIL has extremely restrictive rate limiting.  It
> really is insufficient for absorbing the rate at which postings are
> made on the lists during the busiest times of the day.  And when the
> rate it exceeded, the gmail accounts in question simply drop postings
> for a certain period of time.
> 
> So I have to intentionally back off the rate at which vger.kernel.org
> queues up to GMAIL accounts.
> 
> If I let it go at full speed then half of the postings would get
> dropped and people would miss content.

Thanks much for handling it the way you are now, it at least helps in
getting all mails instead of missing.

The last time even before Florian reported the issue, i was seeing it,
but initially thought it was a problem related to my account, tried
unsubscribe & subscribe, contacting list owner etc, none did help,
only upon seeing Florian's mail, did realize that it was a generic
GMAIL issue.

> Complain to GMAIL if you dislike this but I have tried in the past and
> they have no intention of increasing their default posting rate
> limits.

Don't know whether you had done some changes, now able to get mails in
realtime. Next time upon seeing this kind of issue will request GMAIL,
irrespective of the outcome, i will do my part.

And thanks for taking the time to reply.

Regards
afzal


Re: vger.kernel.org mail queue issue?

2017-05-01 Thread afzal mohammed
Hi,

On Wed, Jan 11, 2017 at 09:07:35PM -0500, David Miller wrote:
> From: Florian Fainelli 

> > I am seeing emails being received right now from @vger.kernel.org that
> > seem to be from this morning according to my mailer. Has anything
> > changed on vger.kernel.org that could cause that? Other mailing-lists
> > (e.g: infradead.org) seems to be fine.

> Nope, in fact I've been aggressively removing bouncers lately
> and trying to keep the system running efficiently.
> 
> I kind of suspect that google has ramped up their rate limiting
> settings a little bit on gmail.
> 
> I'll try to keep an eye out.

Seems gmail again is receiving mails with a delay, the last received
lk mail has date as 30 Apr 2017 08:23:50 +0300, while here it is
around 01 May 2017 17:10 + 0530. And lkml archives has a lot of mails
after that.

With filters based on TO|CC (my problem) & due to cross posted mails,
to realize the issue it takes some time, seems the issue is there for
last few days.

Regards
afzal


Re: [PATCH] ARM: nommu: access ID_PFR1 only if CPUID scheme

2017-03-23 Thread afzal mohammed
Hi,

On Thu, Mar 23, 2017 at 09:37:48PM +1000, Greg Ungerer wrote:
> Tested-by: Greg Ungerer 

Thanks Greg

Since there was no negative feedback yet, change has been deposited in
rmk's patch system as 8665/1

Regards
afzal 


Re: [PATCH] ARM: nommu: access ID_PFR1 only if CPUID scheme

2017-03-23 Thread afzal mohammed
Hi,

On Fri, Mar 17, 2017 at 10:10:34PM +0530, afzal mohammed wrote:
> Greg upon trying to boot no-MMU Kernel on ARM926EJ reported boot
> failure. He root caused it to ID_PFR1 access introduced by the
> commit mentioned in the fixes tag below.
> 
> All CP15 processors need not have processor feature registers, only
> for architectures defined by CPUID scheme would have it. Hence check
> for it before accessing processor feature register, ID_PFR1.
> 
> Fixes: f8300a0b5de0 ("ARM: 8647/2: nommu: dynamic exception base address setting")
> Reported-by: Greg Ungerer 
> Signed-off-by: afzal mohammed 

Greg, can i add your Tested-by ?

Regards
afzal

> ---
> 
> Hi Russell,
> 
> It would be good to have the fix go in during -rc, as,
> 
> 1. Culprit commit went in during the last merge window
> 2. Though nothing supported in mainline is known to be broken, the
> original change needs to be modified to be reliable


[PATCH] ARM: nommu: access ID_PFR1 only if CPUID scheme

2017-03-17 Thread afzal mohammed
Greg upon trying to boot no-MMU Kernel on ARM926EJ reported boot
failure. He root caused it to ID_PFR1 access introduced by the
commit mentioned in the fixes tag below.

All CP15 processors need not have processor feature registers, only
for architectures defined by CPUID scheme would have it. Hence check
for it before accessing processor feature register, ID_PFR1.

Fixes: f8300a0b5de0 ("ARM: 8647/2: nommu: dynamic exception base address setting")
Reported-by: Greg Ungerer 
Signed-off-by: afzal mohammed 
---

Hi Russell,

It would be good to have the fix go in during -rc, as,

1. Culprit commit went in during the last merge window
2. Though nothing supported in mainline is known to be broken, the
original change needs to be modified to be reliable

Vladimir, this is being posted as the issue is taken care of at run time.

Regards
afzal

---
 arch/arm/mm/nommu.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c
index 3b5c7aaf9c76..33a45bd96860 100644
--- a/arch/arm/mm/nommu.c
+++ b/arch/arm/mm/nommu.c
@@ -303,7 +303,10 @@ static inline void set_vbar(unsigned long val)
  */
 static inline bool security_extensions_enabled(void)
 {
-   return !!cpuid_feature_extract(CPUID_EXT_PFR1, 4);
+   /* Check CPUID Identification Scheme before ID_PFR1 read */
+   if ((read_cpuid_id() & 0x000f0000) == 0x000f0000)
+   return !!cpuid_feature_extract(CPUID_EXT_PFR1, 4);
+   return 0;
 }
 
 static unsigned long __init setup_vectors_base(void)
-- 
2.12.0



Re: [PATCH RESEND] ARM: ep93xx: Disable TS-72xx watchdog before uncompressing

2017-02-08 Thread Afzal Mohammed
Hi,

On Thu, Feb 02, 2017 at 12:12:26PM -0800, Florian Fainelli wrote:
> The TS-72xx/73xx boards have a CPLD watchdog which is configured to
> reset the board after 8 seconds, if the kernel is large enough that this
> takes about this time to decompress the kernel, we will encounter a
> spurious reboot.

so once it reaches Kernel proper, that dog is being killed, right ?

iirc, TI AM335x's & AM43x's ROM code too leaves the on-chip watchdog
enabled & the bootloader disables it (else once it boots to prompt, it
reboots always unless watchdog driver [if present] takes care of it),
Lokesh, right ?

But yes, that brings a bootloader dependency.

Regards
afzal


Re: [PATCH v3 0/3] ARM: !MMU: v7-A support, dynamic vectors base handling

2017-02-01 Thread Afzal Mohammed
Hi,

On Wed, Feb 01, 2017 at 10:33:17AM +, Vladimir Murzin wrote:
> On 31/01/17 19:24, Russell King - ARM Linux wrote:
> > On Tue, Jan 31, 2017 at 06:34:46PM +0530, afzal mohammed wrote:

> >> ARM core changes to support !MMU Kernel on v7-A MMU processors.
> >>
> >> Based on the feedback from Russell, it was decided to handle vector
> >> base dynamically in C for no-MMU & work towards the goal of
> >> removing VECTORS_BASE from Kconfig.
> > 
> > Looks good from my perspective.  If Vladimir can reply about patch 2,
> > then I think we'll be good to go with these.  Thanks.

Patch system has been updated with this series along with Vladimir's
Tested-by on patch 2.

Thanks

> My R-class and M-class setups continue to work with this series applied on
> top of next-20170201 plus

> following fixup for PATCH 2/3

Yes, Russell has applied another patch and the context changes a little.

> 
>  -#define VECTORS_BASE  UL(0xffff0000)
>  -
> - /*
> -  * We fix the TCM memories max 32 KiB ITCM resp DTCM at these
> -  * locations
> + #ifdef CONFIG_XIP_KERNEL
> + #define KERNEL_START  _sdata
> + #else
> 
> FWIW: Tested-by: Vladimir Murzin 

Thanks

Regards
afzal


[PATCH v3 2/3] ARM: nommu: display vectors base

2017-01-31 Thread afzal mohammed
VECTORS_BASE displays the exception base address. Now on no-MMU as
the exception base address is dynamically estimated, define
VECTORS_BASE to the variable holding it.

As it is the case, limit VECTORS_BASE constant definition to MMU.

Suggested-by: Russell King 
Signed-off-by: afzal mohammed 
---

v3:
 Simplify by defining VECTORS_BASE to vectors_base
v2:
 A change to accomodate bisectability resolution on patch 1/4

 arch/arm/include/asm/memory.h | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 0b5416fe7709..780549a78937 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -83,8 +83,15 @@
 #define IOREMAP_MAX_ORDER  24
 #endif
 
+#define VECTORS_BASE   UL(0xffff0000)
+
 #else /* CONFIG_MMU */
 
+#ifndef __ASSEMBLY__
+extern unsigned long vectors_base;
+#define VECTORS_BASE   vectors_base
+#endif
+
 /*
  * The limitation of user task size can grow up to the end of free ram region.
  * It is difficult to define and perhaps will never meet the original meaning
@@ -111,8 +118,6 @@
 
 #endif /* !CONFIG_MMU */
 
-#define VECTORS_BASE   UL(0xffff0000)
-
 /*
  * We fix the TCM memories max 32 KiB ITCM resp DTCM at these
  * locations
-- 
2.11.0



[PATCH v3 3/3] ARM: nommu: remove Hivecs configuration is asm

2017-01-31 Thread afzal mohammed
Now that the exception base address is handled dynamically for
processors with CP15, remove Hivecs configuration in assembly.

Signed-off-by: afzal mohammed 
Tested-by: Vladimir Murzin 
---

v3:
 Vladimir's Tested-by

 arch/arm/kernel/head-nommu.S | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S
index 6b4eb27b8758..2e21e08de747 100644
--- a/arch/arm/kernel/head-nommu.S
+++ b/arch/arm/kernel/head-nommu.S
@@ -152,11 +152,6 @@ __after_proc_init:
 #ifdef CONFIG_CPU_ICACHE_DISABLE
bic r0, r0, #CR_I
 #endif
-#ifdef CONFIG_CPU_HIGH_VECTOR
-   orr r0, r0, #CR_V
-#else
-   bic r0, r0, #CR_V
-#endif
mcr p15, 0, r0, c1, c0, 0   @ write control reg
 #elif defined (CONFIG_CPU_V7M)
/* For V7M systems we want to modify the CCR similarly to the SCTLR */
-- 
2.11.0



[PATCH v3 1/3] ARM: nommu: dynamic exception base address setting

2017-01-31 Thread afzal mohammed
No-MMU dynamic exception base address configuration on CP15
processors. In the case of low vectors, decision based on whether
security extensions are enabled & whether remap vectors to RAM
CONFIG option is selected.

For no-MMU without CP15, current default value of 0x0 is retained.

Signed-off-by: afzal mohammed 
Tested-by: Vladimir Murzin 
---

v3:
 Vladimir's Tested-by
v2:
 Use existing helpers to detect security extensions
 Rewrite a CPP step to C for readability

 arch/arm/mm/nommu.c | 52 ++--
 1 file changed, 50 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c
index 2740967727e2..20ac52579952 100644
--- a/arch/arm/mm/nommu.c
+++ b/arch/arm/mm/nommu.c
@@ -11,6 +11,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -22,6 +23,8 @@
 
 #include "mm.h"
 
+unsigned long vectors_base;
+
 #ifdef CONFIG_ARM_MPU
 struct mpu_rgn_info mpu_rgn_info;
 
@@ -278,15 +281,60 @@ static void sanity_check_meminfo_mpu(void) {}
 static void __init mpu_setup(void) {}
 #endif /* CONFIG_ARM_MPU */
 
+#ifdef CONFIG_CPU_CP15
+#ifdef CONFIG_CPU_HIGH_VECTOR
+static unsigned long __init setup_vectors_base(void)
+{
+   unsigned long reg = get_cr();
+
+   set_cr(reg | CR_V);
+   return 0xffff0000;
+}
+#else /* CONFIG_CPU_HIGH_VECTOR */
+/* Write exception base address to VBAR */
+static inline void set_vbar(unsigned long val)
+{
+   asm("mcr p15, 0, %0, c12, c0, 0" : : "r" (val) : "cc");
+}
+
+/*
+ * Security extensions, bits[7:4], permitted values,
+ * 0b0000 - not implemented, 0b0001/0b0010 - implemented
+ */
+static inline bool security_extensions_enabled(void)
+{
+   return !!cpuid_feature_extract(CPUID_EXT_PFR1, 4);
+}
+
+static unsigned long __init setup_vectors_base(void)
+{
+   unsigned long base = 0, reg = get_cr();
+
+   set_cr(reg & ~CR_V);
+   if (security_extensions_enabled()) {
+   if (IS_ENABLED(CONFIG_REMAP_VECTORS_TO_RAM))
+   base = CONFIG_DRAM_BASE;
+   set_vbar(base);
+   } else if (IS_ENABLED(CONFIG_REMAP_VECTORS_TO_RAM)) {
+   if (CONFIG_DRAM_BASE != 0)
+   pr_err("Security extensions not enabled, vectors cannot be remapped to RAM, vectors base will be 0x00000000\n");
+   }
+
+   return base;
+}
+#endif /* CONFIG_CPU_HIGH_VECTOR */
+#endif /* CONFIG_CPU_CP15 */
+
 void __init arm_mm_memblock_reserve(void)
 {
 #ifndef CONFIG_CPU_V7M
+   vectors_base = IS_ENABLED(CONFIG_CPU_CP15) ? setup_vectors_base() : 0;
/*
 * Register the exception vector page.
 * some architectures which the DRAM is the exception vector to trap,
 * alloc_page breaks with error, although it is not NULL, but "0."
 */
-   memblock_reserve(CONFIG_VECTORS_BASE, 2 * PAGE_SIZE);
+   memblock_reserve(vectors_base, 2 * PAGE_SIZE);
 #else /* ifndef CONFIG_CPU_V7M */
/*
 * There is no dedicated vector page on V7-M. So nothing needs to be
@@ -310,7 +358,7 @@ void __init sanity_check_meminfo(void)
  */
 void __init paging_init(const struct machine_desc *mdesc)
 {
-   early_trap_init((void *)CONFIG_VECTORS_BASE);
+   early_trap_init((void *)vectors_base);
mpu_setup();
bootmem_init();
 }
-- 
2.11.0



[PATCH v3 0/3] ARM: !MMU: v7-A support, dynamic vectors base handling

2017-01-31 Thread afzal mohammed
Hi,

ARM core changes to support !MMU Kernel on v7-A MMU processors.

Based on the feedback from Russell, it was decided to handle vector
base dynamically in C for no-MMU & work towards the goal of
removing VECTORS_BASE from Kconfig.

Exception base address is dynamically found out in C & configured.

This series also does the preparation for CONFIG_VECTORS_BASE removal.
Once vector region setup, used by Cortex-R, is made devoid of
VECTORS_BASE, it can be removed from Kconfig. [2] already decouples it
from Kconfig for MMU.

Vladimir's Tested-by on v2 has been removed from [PATCH 2/3] as it has
been changed. And as it doesn't affect functionality, Tested-by has been
retained on the other two patches, Vladimir, let me know if not okay.

This series has been verified over current mainline plus [1,2] on
1. Vybrid Cosmic+
 a. Cortex-M4 - !MMU Kernel
 b. Cortex-A5 - MMU Kernel.

This series also has been verified over Vladimir's series [3] along
with [1,2] on
1. Vybrid Cosmic+
 a. Cortex-M4 !MMU Kernel
 b. Cortex-A5 MMU Kernel
 c. Cortex-A5 !MMU Kernel
2. AM437x IDK
 a. Cortex-A9 MMU Kernel
 b. Cortex-A9 !MMU Kernel

Regards
afzal

v3:
=> Removed [PATCH 1/4] of v2 as it is in -next
=> Simplify by defining VECTORS_BASE to variable holding dynamically
calculated exception base address

v2:
=> Fix bisectability issue on !MMU builds
=> UL suffix on VECTORS_BASE definition
=> Use existing helpers to detect security extensions
=> Rewrite a CPP step to C for readability

[1] "[PATCH 1/2] ARM: nommu: allow enabling REMAP_VECTORS_TO_RAM"

http://lists.infradead.org/pipermail/linux-arm-kernel/2016-December/473593.html
(in -next)

[2] "[PATCH v2 1/4] ARM: mmu: decouple VECTORS_BASE from Kconfig"

http://lists.infradead.org/pipermail/linux-arm-kernel/2017-January/481904.html
(in -next)

[3] "[RFC v2 PATCH 00/23] Allow NOMMU for MULTIPLATFORM",

http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/470966.html
(git://linux-arm.org/linux-vm.git nommu-rfc-v2)

afzal mohammed (3):
  ARM: nommu: dynamic exception base address setting
  ARM: nommu: display vectors base
  ARM: nommu: remove Hivecs configuration is asm

 arch/arm/include/asm/memory.h |  9 ++--
 arch/arm/kernel/head-nommu.S  |  5 -
 arch/arm/mm/nommu.c   | 52 +--
 3 files changed, 57 insertions(+), 9 deletions(-)

-- 
2.11.0



Re: [PATCH v2 3/4] ARM: nommu: display vectors base

2017-01-30 Thread Afzal Mohammed
Hi,

On Mon, Jan 30, 2017 at 02:03:26PM +, Russell King - ARM Linux wrote:
> On Sun, Jan 22, 2017 at 08:52:12AM +0530, afzal mohammed wrote:

> > The exception base address is now dynamically estimated for no-MMU,
> > display it. As it is the case, now limit VECTORS_BASE usage to MMU
> > scenario.

> > +#define VECTORS_BASE   UL(0xffff0000)
> > +
> >  #else /* CONFIG_MMU */
> >  
> >  /*
> > @@ -111,8 +113,6 @@
> >  
> >  #endif /* !CONFIG_MMU */
> >  
> > -#define VECTORS_BASE   UL(0xffff0000)
> 
> I think adding a definition for VECTORS_BASE in asm/memory.h for nommu:
> 
> extern unsigned long vectors_base;
> #define VECTORS_BASE  vectors_base

Above was required to be enclosed by below,

 #ifndef __ASSEMBLY__
 #endif

Putting it inside the existing #ifndef __ASSEMBLY__ (which encloses
other externs) in asm/memory.h would put it alongside not so related
definitions as compared to the existing location.

And the existing #ifndef __ASSEMBLY__ in asm/memory.h is a bit further
down, which makes the above stand separately,

> > +#ifdef CONFIG_MMU
> > MLK(VECTORS_BASE, VECTORS_BASE + PAGE_SIZE),
> > +#else
> > +   MLK(vectors_base, vectors_base + PAGE_SIZE),
> > +#endif
> 
> will mean that this conditional becomes unnecessary.

> > -#endif
> > +#else /* CONFIG_MMU */
> > +extern unsigned long vectors_base;
> > +#endif /* CONFIG_MMU */
> 
> and you don't need this here either.

but the above improvements make the patch simpler.

Regards
afzal


[PATCH] ARM: vf610m4: defconfig: enable EXT4 filesystem

2017-01-23 Thread afzal mohammed
Enable EXT4_FS to have rootfs in EXT[2-4].

Other changes are result of savedefconfig keeping minimal config (even
without enabling EXT4_FS, these would be present).

Signed-off-by: afzal mohammed 
---

Hi Shawn,

i am not sure about the route for this patch, sending it you as the
Vybrid maintainer. Last (& the only) change to this file was picked
by Arnd.

Regards
afzal

 arch/arm/configs/vf610m4_defconfig | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm/configs/vf610m4_defconfig b/arch/arm/configs/vf610m4_defconfig
index aeb2482c492e..b7ecb83a95b6 100644
--- a/arch/arm/configs/vf610m4_defconfig
+++ b/arch/arm/configs/vf610m4_defconfig
@@ -7,7 +7,6 @@ CONFIG_BLK_DEV_INITRD=y
 CONFIG_KALLSYMS_ALL=y
 CONFIG_EMBEDDED=y
 # CONFIG_MMU is not set
-CONFIG_ARM_SINGLE_ARMV7M=y
 CONFIG_ARCH_MXC=y
 CONFIG_SOC_VF610=y
 CONFIG_SET_MEM_PARAM=y
@@ -38,5 +37,5 @@ CONFIG_SERIAL_FSL_LPUART_CONSOLE=y
 CONFIG_MFD_SYSCON=y
 # CONFIG_HID is not set
 # CONFIG_USB_SUPPORT is not set
+CONFIG_EXT4_FS=y
 # CONFIG_MISC_FILESYSTEMS is not set
-# CONFIG_FTRACE is not set
-- 
2.11.0



Re: [PATCH 2/4] ARM: nommu: dynamic exception base address setting

2017-01-21 Thread Afzal Mohammed
Hi,

On Fri, Jan 20, 2017 at 09:50:22PM +0530, Afzal Mohammed wrote:
> On Thu, Jan 19, 2017 at 01:59:09PM +, Vladimir Murzin wrote:

> > You can use
> > 
> > cpuid_feature_extract(CPUID_EXT_PFR1, 4)
> > 
> > and add a comment explaining what we are looking for and why.

W.r.t comments, tried to keep it concise, C tokens doing a part of it.

> Yes, that is better, was not aware of this, did saw CPUID_EXT_PFR1 as
> an unused macro.

> > > +#ifdef CONFIG_CPU_CP15
> > > + vectors_base = setup_vectors_base();
> > > +#endif
> > 
> > alternatively it can be
> > 
> > unsigned long vector_base = IS_ENABLED(CONFIG_CPU_CP15) ? setup_vbar() 
> > : 0;
> 
> Yes that certainly is better.

Have kept the function name as setup_vectors_base(), as in addition to
setting up VBAR, the V bit also has to be configured by it - so that
the function remains true to its name.

v2 with changes has been posted.

Regards
afzal


Re: [PATCH 1/4] ARM: mmu: decouple VECTORS_BASE from Kconfig

2017-01-21 Thread Afzal Mohammed
Hi,

On Thu, Jan 19, 2017 at 02:24:24PM +, Russell King - ARM Linux wrote:
> On Thu, Jan 19, 2017 at 02:07:39AM +0530, afzal mohammed wrote:

> > +#define VECTORS_BASE   0xffff0000
> 
> This should be UL(0x)

This has been taken care in v2.

Regards
afzal


[PATCH v2 4/4] ARM: nommu: remove Hivecs configuration is asm

2017-01-21 Thread afzal mohammed
Now that the exception base address is handled dynamically for
processors with CP15, remove Hivecs configuration in assembly.

Signed-off-by: afzal mohammed 
---
 arch/arm/kernel/head-nommu.S | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S
index 6b4eb27b8758..2e21e08de747 100644
--- a/arch/arm/kernel/head-nommu.S
+++ b/arch/arm/kernel/head-nommu.S
@@ -152,11 +152,6 @@ __after_proc_init:
 #ifdef CONFIG_CPU_ICACHE_DISABLE
bic r0, r0, #CR_I
 #endif
-#ifdef CONFIG_CPU_HIGH_VECTOR
-   orr r0, r0, #CR_V
-#else
-   bic r0, r0, #CR_V
-#endif
mcr p15, 0, r0, c1, c0, 0   @ write control reg
 #elif defined (CONFIG_CPU_V7M)
/* For V7M systems we want to modify the CCR similarly to the SCTLR */
-- 
2.11.0



[PATCH v2 3/4] ARM: nommu: display vectors base

2017-01-21 Thread afzal mohammed
The exception base address is now dynamically determined for no-MMU,
so display it. That being the case, now limit VECTORS_BASE usage to
the MMU scenario.

Signed-off-by: afzal mohammed 
---

v2:
 A change to accommodate bisectability resolution on patch 1/4

 arch/arm/include/asm/memory.h | 4 ++--
 arch/arm/mm/init.c| 5 +
 arch/arm/mm/mm.h  | 5 +++--
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 0b5416fe7709..9ae474bf84fc 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -83,6 +83,8 @@
 #define IOREMAP_MAX_ORDER  24
 #endif
 
+#define VECTORS_BASE   UL(0x)
+
 #else /* CONFIG_MMU */
 
 /*
@@ -111,8 +113,6 @@
 
 #endif /* !CONFIG_MMU */
 
-#define VECTORS_BASE   UL(0x)
-
 /*
  * We fix the TCM memories max 32 KiB ITCM resp DTCM at these
  * locations
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 823e119a5daa..9c68e3aba87c 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -522,7 +522,12 @@ void __init mem_init(void)
"  .data : 0x%p" " - 0x%p" "   (%4td kB)\n"
"   .bss : 0x%p" " - 0x%p" "   (%4td kB)\n",
 
+#ifdef CONFIG_MMU
MLK(VECTORS_BASE, VECTORS_BASE + PAGE_SIZE),
+#else
+   MLK(vectors_base, vectors_base + PAGE_SIZE),
+#endif
+
 #ifdef CONFIG_HAVE_TCM
MLK(DTCM_OFFSET, (unsigned long) dtcm_end),
MLK(ITCM_OFFSET, (unsigned long) itcm_end),
diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
index ce727d47275c..546f09437fca 100644
--- a/arch/arm/mm/mm.h
+++ b/arch/arm/mm/mm.h
@@ -79,8 +79,9 @@ struct static_vm {
 extern struct list_head static_vmlist;
 extern struct static_vm *find_static_vm_vaddr(void *vaddr);
 extern __init void add_static_vm_early(struct static_vm *svm);
-
-#endif
+#else /* CONFIG_MMU */
+extern unsigned long vectors_base;
+#endif /* CONFIG_MMU */
 
 #ifdef CONFIG_ZONE_DMA
 extern phys_addr_t arm_dma_limit;
-- 
2.11.0



[PATCH v2 2/4] ARM: nommu: dynamic exception base address setting

2017-01-21 Thread afzal mohammed
No-MMU dynamic exception base address configuration on CP15
processors. In the case of low vectors, the decision is based on
whether security extensions are enabled & whether the remap vectors
to RAM CONFIG option is selected.

For no-MMU without CP15, current default value of 0x0 is retained.

Signed-off-by: afzal mohammed 
---

v2:
 Use existing helpers to detect security extensions
 Rewrite a CPP step to C for readability

 arch/arm/mm/nommu.c | 52 ++--
 1 file changed, 50 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c
index 2740967727e2..20ac52579952 100644
--- a/arch/arm/mm/nommu.c
+++ b/arch/arm/mm/nommu.c
@@ -11,6 +11,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -22,6 +23,8 @@
 
 #include "mm.h"
 
+unsigned long vectors_base;
+
 #ifdef CONFIG_ARM_MPU
 struct mpu_rgn_info mpu_rgn_info;
 
@@ -278,15 +281,60 @@ static void sanity_check_meminfo_mpu(void) {}
 static void __init mpu_setup(void) {}
 #endif /* CONFIG_ARM_MPU */
 
+#ifdef CONFIG_CPU_CP15
+#ifdef CONFIG_CPU_HIGH_VECTOR
+static unsigned long __init setup_vectors_base(void)
+{
+   unsigned long reg = get_cr();
+
+   set_cr(reg | CR_V);
+   return 0x;
+}
+#else /* CONFIG_CPU_HIGH_VECTOR */
+/* Write exception base address to VBAR */
+static inline void set_vbar(unsigned long val)
+{
+   asm("mcr p15, 0, %0, c12, c0, 0" : : "r" (val) : "cc");
+}
+
+/*
+ * Security extensions, bits[7:4], permitted values,
+ * 0b - not implemented, 0b0001/0b0010 - implemented
+ */
+static inline bool security_extensions_enabled(void)
+{
+   return !!cpuid_feature_extract(CPUID_EXT_PFR1, 4);
+}
+
+static unsigned long __init setup_vectors_base(void)
+{
+   unsigned long base = 0, reg = get_cr();
+
+   set_cr(reg & ~CR_V);
+   if (security_extensions_enabled()) {
+   if (IS_ENABLED(CONFIG_REMAP_VECTORS_TO_RAM))
+   base = CONFIG_DRAM_BASE;
+   set_vbar(base);
+   } else if (IS_ENABLED(CONFIG_REMAP_VECTORS_TO_RAM)) {
+   if (CONFIG_DRAM_BASE != 0)
+   pr_err("Security extensions not enabled, vectors cannot be remapped to RAM, vectors base will be 0x\n");
+   }
+
+   return base;
+}
+#endif /* CONFIG_CPU_HIGH_VECTOR */
+#endif /* CONFIG_CPU_CP15 */
+
 void __init arm_mm_memblock_reserve(void)
 {
 #ifndef CONFIG_CPU_V7M
+   vectors_base = IS_ENABLED(CONFIG_CPU_CP15) ? setup_vectors_base() : 0;
/*
 * Register the exception vector page.
 * some architectures which the DRAM is the exception vector to trap,
 * alloc_page breaks with error, although it is not NULL, but "0."
 */
-   memblock_reserve(CONFIG_VECTORS_BASE, 2 * PAGE_SIZE);
+   memblock_reserve(vectors_base, 2 * PAGE_SIZE);
 #else /* ifndef CONFIG_CPU_V7M */
/*
 * There is no dedicated vector page on V7-M. So nothing needs to be
@@ -310,7 +358,7 @@ void __init sanity_check_meminfo(void)
  */
 void __init paging_init(const struct machine_desc *mdesc)
 {
-   early_trap_init((void *)CONFIG_VECTORS_BASE);
+   early_trap_init((void *)vectors_base);
mpu_setup();
bootmem_init();
 }
-- 
2.11.0



[PATCH v2 1/4] ARM: mmu: decouple VECTORS_BASE from Kconfig

2017-01-21 Thread afzal mohammed
For MMU configurations, VECTORS_BASE is always 0x, a macro
definition will suffice.

For no-MMU, exception base address is dynamically determined in
subsequent patches. To preserve bisectability, now make the
macro applicable for no-MMU scenario too.

Thanks to the 0-DAY kernel test infrastructure that found the
bisectability issue. This macro will be restricted to the MMU case upon
dynamically determining the exception base address for no-MMU.

Once exception address is handled dynamically for no-MMU,
VECTORS_BASE can be removed from Kconfig.

Suggested-by: Russell King 
Signed-off-by: afzal mohammed 
---

v2: 
 Fix bisectability issue on !MMU builds
 UL suffix on VECTORS_BASE definition

 arch/arm/include/asm/memory.h  | 2 ++
 arch/arm/mach-berlin/platsmp.c | 3 ++-
 arch/arm/mm/dump.c | 5 +++--
 arch/arm/mm/init.c | 4 ++--
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 76cbd9c674df..0b5416fe7709 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -111,6 +111,8 @@
 
 #endif /* !CONFIG_MMU */
 
+#define VECTORS_BASE   UL(0x)
+
 /*
  * We fix the TCM memories max 32 KiB ITCM resp DTCM at these
  * locations
diff --git a/arch/arm/mach-berlin/platsmp.c b/arch/arm/mach-berlin/platsmp.c
index 93f90688db18..578d41031abf 100644
--- a/arch/arm/mach-berlin/platsmp.c
+++ b/arch/arm/mach-berlin/platsmp.c
@@ -15,6 +15,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -75,7 +76,7 @@ static void __init berlin_smp_prepare_cpus(unsigned int max_cpus)
if (!cpu_ctrl)
goto unmap_scu;
 
-   vectors_base = ioremap(CONFIG_VECTORS_BASE, SZ_32K);
+   vectors_base = ioremap(VECTORS_BASE, SZ_32K);
if (!vectors_base)
goto unmap_scu;
 
diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
index 9fe8e241335c..21192d6eda40 100644
--- a/arch/arm/mm/dump.c
+++ b/arch/arm/mm/dump.c
@@ -18,6 +18,7 @@
 #include 
 
 #include 
+#include 
 #include 
 
 struct addr_marker {
@@ -31,8 +32,8 @@ static struct addr_marker address_markers[] = {
{ 0,"vmalloc() Area" },
{ VMALLOC_END,  "vmalloc() End" },
{ FIXADDR_START,"Fixmap Area" },
-   { CONFIG_VECTORS_BASE,  "Vectors" },
-   { CONFIG_VECTORS_BASE + PAGE_SIZE * 2, "Vectors End" },
+   { VECTORS_BASE, "Vectors" },
+   { VECTORS_BASE + PAGE_SIZE * 2, "Vectors End" },
{ -1,   NULL },
 };
 
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 370581aeb871..823e119a5daa 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -521,8 +522,7 @@ void __init mem_init(void)
"  .data : 0x%p" " - 0x%p" "   (%4td kB)\n"
"   .bss : 0x%p" " - 0x%p" "   (%4td kB)\n",
 
-   MLK(UL(CONFIG_VECTORS_BASE), UL(CONFIG_VECTORS_BASE) +
-   (PAGE_SIZE)),
+   MLK(VECTORS_BASE, VECTORS_BASE + PAGE_SIZE),
 #ifdef CONFIG_HAVE_TCM
MLK(DTCM_OFFSET, (unsigned long) dtcm_end),
MLK(ITCM_OFFSET, (unsigned long) itcm_end),
-- 
2.11.0



[PATCH v2 0/4] ARM: v7-A !MMU support, prepare CONFIG_VECTORS_BASE removal

2017-01-21 Thread afzal mohammed
Hi,

ARM core changes to support !MMU Kernel on v7-A MMU processors. This
series also does the preparation for CONFIG_VECTORS_BASE removal.

Based on the feedback from Russell, it was decided to handle vector
base dynamically in C for no-MMU & work towards the goal of
removing VECTORS_BASE from Kconfig. MMU platforms always have the
exception base address at 0x, hence a macro was defined and
it was decoupled from Kconfig. No-MMU CP15 scenario is handled
dynamically in C. Once vector region setup, used by Cortex-R, is
made devoid of VECTORS_BASE, it can be removed from Kconfig.

This series has been verified over current mainline plus [2] on
Vybrid Cosmic+, Cortex-M4 - !MMU Kernel and Cortex-A5 - MMU Kernel.

This series also has been verified over Vladimir's series plus [2] on
1. Vybrid Cosmic+
 a. Cortex-M4 !MMU Kernel
 b. Cortex-A5 MMU Kernel
 c. Cortex-A5 !MMU Kernel
2. AM437x IDK
 a. Cortex-A9 MMU Kernel
 b. Cortex-A9 !MMU Kernel

Regards
afzal


v2:
=> Fix bisectability issue on !MMU builds
=> UL suffix on VECTORS_BASE definition
=> Use existing helpers to detect security extensions
=> Rewrite a CPP step to C for readability

[1] "[RFC v2 PATCH 00/23] Allow NOMMU for MULTIPLATFORM",

http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/470966.html
(git://linux-arm.org/linux-vm.git nommu-rfc-v2)

[2] "[PATCH 1/2] ARM: nommu: allow enabling REMAP_VECTORS_TO_RAM"

http://lists.infradead.org/pipermail/linux-arm-kernel/2016-December/473593.html
(in -next)

afzal mohammed (4):
  ARM: mmu: decouple VECTORS_BASE from Kconfig
  ARM: nommu: dynamic exception base address setting
  ARM: nommu: display vectors base
  ARM: nommu: remove Hivecs configuration in asm

 arch/arm/include/asm/memory.h  |  2 ++
 arch/arm/kernel/head-nommu.S   |  5 
 arch/arm/mach-berlin/platsmp.c |  3 ++-
 arch/arm/mm/dump.c |  5 ++--
 arch/arm/mm/init.c |  9 ++--
 arch/arm/mm/mm.h   |  5 ++--
 arch/arm/mm/nommu.c| 52 --
 7 files changed, 67 insertions(+), 14 deletions(-)

-- 
2.11.0



Re: [PATCH 2/4] ARM: nommu: dynamic exception base address setting

2017-01-20 Thread Afzal Mohammed
Hi,

On Thu, Jan 19, 2017 at 01:59:09PM +, Vladimir Murzin wrote:
> On 18/01/17 20:38, afzal mohammed wrote:

> > +#define ID_PFR1_SE (0x3 << 4)  /* Security extension enable bits */
> 
> This bitfiled is 4 bits wide.

Since only the 2 LSbs out of the 4 were enough to detect whether
security extensions were enabled, it was done so. i am going to use
your suggestion below & this would be taken care of by that.

> > +   if (security_extensions_enabled()) {
> 
> You can use
> 
> cpuid_feature_extract(CPUID_EXT_PFR1, 4)
> 
> and add a comment explaining what we are looking for and why.

Yes, that is better, was not aware of this, did see CPUID_EXT_PFR1 as
an unused macro.

> > +#ifdef CONFIG_CPU_CP15
> > +   vectors_base = setup_vectors_base();
> > +#endif
> 
> alternatively it can be
> 
>   unsigned long vector_base = IS_ENABLED(CONFIG_CPU_CP15) ? setup_vbar() : 0;

Yes that certainly is better.

Regards
afzal


Re: [PATCH 1/4] ARM: mmu: decouple VECTORS_BASE from Kconfig

2017-01-20 Thread Afzal Mohammed
Hi,

On Thu, Jan 19, 2017 at 02:24:24PM +, Russell King - ARM Linux wrote:
> On Thu, Jan 19, 2017 at 02:07:39AM +0530, afzal mohammed wrote:

> > +++ b/arch/arm/include/asm/memory.h

> > +#define VECTORS_BASE   0x
> 
> This should be UL(0x)

> > -   MLK(UL(CONFIG_VECTORS_BASE), UL(CONFIG_VECTORS_BASE) +
> > -   (PAGE_SIZE)),
> > +   MLK(UL(VECTORS_BASE), UL(VECTORS_BASE) + (PAGE_SIZE)),
> 
> which means you don't need it here, which will then fix the build error
> reported by the 0-day builder.

Seems there is some confusion here,

VECTORS_BASE definition above in memory.h is enclosed within
CONFIG_MMU. The robot used a no-MMU defconfig, so it didn't get a
VECTORS_BASE definition at this patch, causing the build error. Our
dear robot mentioned that my HEAD didn't break the build, but
bisectability is broken at this point.

With "PATCH 3/4 ARM: nommu: display vectors base", the above is
changed to
#ifdef CONFIG_MMU
MLK(UL(VECTORS_BASE), UL(VECTORS_BASE) + (PAGE_SIZE)),
#else
...
#endif
thus making the series build again for no-MMU

One option to keep bisectability would be to squash this with PATCH
3/4, but i think a better & natural solution would be to define
VECTORS_BASE outside of
#ifdef CONFIG_MMU
...
#else
...
#endif
and then in PATCH 3/4, move VECTORS_BASE to be inside
#ifdef CONFIG_MMU
...
#else

Regards
afzal


Re: [PATCH 1/4] ARM: mmu: decouple VECTORS_BASE from Kconfig

2017-01-19 Thread Afzal Mohammed
+ Marvell Berlin SoC maintainers - Sebastian, Jisheng

On Thu, Jan 19, 2017 at 02:07:39AM +0530, afzal mohammed wrote:
> For MMU configurations, VECTORS_BASE is always 0x, a macro
> definition will suffice.
> 
> Once exception address is handled dynamically for no-MMU also (this
> would involve taking care of region setup too), VECTORS_BASE can be
> removed from Kconfig.
> 
> Suggested-by: Russell King 
> Signed-off-by: afzal mohammed 
> ---
> 
> Though there was no build error without inclusion of asm/memory.h, to
> be on the safer side it has been added, to reduce chances of build
> breakage in random configurations.
> 
>  arch/arm/include/asm/memory.h  | 2 ++
>  arch/arm/mach-berlin/platsmp.c | 3 ++-
>  arch/arm/mm/dump.c | 5 +++--
>  arch/arm/mm/init.c | 4 ++--
>  4 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> index 76cbd9c674df..9cc9f1dbc88e 100644
> --- a/arch/arm/include/asm/memory.h
> +++ b/arch/arm/include/asm/memory.h
> @@ -83,6 +83,8 @@
>  #define IOREMAP_MAX_ORDER24
>  #endif
>  
> +#define VECTORS_BASE 0x
> +
>  #else /* CONFIG_MMU */
>  
>  /*
> diff --git a/arch/arm/mach-berlin/platsmp.c b/arch/arm/mach-berlin/platsmp.c
> index 93f90688db18..578d41031abf 100644
> --- a/arch/arm/mach-berlin/platsmp.c
> +++ b/arch/arm/mach-berlin/platsmp.c
> @@ -15,6 +15,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -75,7 +76,7 @@ static void __init berlin_smp_prepare_cpus(unsigned int max_cpus)
>   if (!cpu_ctrl)
>   goto unmap_scu;
>  
> - vectors_base = ioremap(CONFIG_VECTORS_BASE, SZ_32K);
> + vectors_base = ioremap(VECTORS_BASE, SZ_32K);
>   if (!vectors_base)
>   goto unmap_scu;
>  
> diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
> index 9fe8e241335c..21192d6eda40 100644
> --- a/arch/arm/mm/dump.c
> +++ b/arch/arm/mm/dump.c
> @@ -18,6 +18,7 @@
>  #include 
>  
>  #include 
> +#include 
>  #include 
>  
>  struct addr_marker {
> @@ -31,8 +32,8 @@ static struct addr_marker address_markers[] = {
>   { 0,"vmalloc() Area" },
>   { VMALLOC_END,  "vmalloc() End" },
>   { FIXADDR_START,"Fixmap Area" },
> - { CONFIG_VECTORS_BASE,  "Vectors" },
> - { CONFIG_VECTORS_BASE + PAGE_SIZE * 2, "Vectors End" },
> + { VECTORS_BASE, "Vectors" },
> + { VECTORS_BASE + PAGE_SIZE * 2, "Vectors End" },
>   { -1,   NULL },
>  };
>  
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 370581aeb871..cf47f86f79ed 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -27,6 +27,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -521,8 +522,7 @@ void __init mem_init(void)
>   "  .data : 0x%p" " - 0x%p" "   (%4td kB)\n"
>   "   .bss : 0x%p" " - 0x%p" "   (%4td kB)\n",
>  
> - MLK(UL(CONFIG_VECTORS_BASE), UL(CONFIG_VECTORS_BASE) +
> - (PAGE_SIZE)),
> + MLK(UL(VECTORS_BASE), UL(VECTORS_BASE) + (PAGE_SIZE)),
>  #ifdef CONFIG_HAVE_TCM
>   MLK(DTCM_OFFSET, (unsigned long) dtcm_end),
>   MLK(ITCM_OFFSET, (unsigned long) itcm_end),
> -- 
> 2.11.0


Re: [PATCH 3/4] ARM: nommu: display vectors base

2017-01-19 Thread Afzal Mohammed
Hi,

On Wed, Jan 18, 2017 at 10:13:15PM +, Russell King - ARM Linux wrote:
> On Thu, Jan 19, 2017 at 02:08:37AM +0530, afzal mohammed wrote:

> > +   MLK_ROUNDUP(vectors_base, vectors_base + PAGE_SIZE),
> 
> I think MLK() will do here - no need to use the rounding-up version
> as PAGE_SIZE is a multiple of 1k.

Yes, i will replace it.

Earlier, i used MLK() & got some build error; now checking again, no
build error, i should have messed up something at that time.

Regards
afzal


[PATCH 3/4] ARM: nommu: display vectors base

2017-01-18 Thread afzal mohammed
The exception base address is now dynamically determined for the
no-MMU case, so display it.

Signed-off-by: afzal mohammed 
---
 arch/arm/mm/init.c | 5 +
 arch/arm/mm/mm.h   | 5 +++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index cf47f86f79ed..9e11f255c3bf 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -522,7 +522,12 @@ void __init mem_init(void)
"  .data : 0x%p" " - 0x%p" "   (%4td kB)\n"
"   .bss : 0x%p" " - 0x%p" "   (%4td kB)\n",
 
+#ifdef CONFIG_MMU
MLK(UL(VECTORS_BASE), UL(VECTORS_BASE) + (PAGE_SIZE)),
+#else
+   MLK_ROUNDUP(vectors_base, vectors_base + PAGE_SIZE),
+#endif
+
 #ifdef CONFIG_HAVE_TCM
MLK(DTCM_OFFSET, (unsigned long) dtcm_end),
MLK(ITCM_OFFSET, (unsigned long) itcm_end),
diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
index ce727d47275c..546f09437fca 100644
--- a/arch/arm/mm/mm.h
+++ b/arch/arm/mm/mm.h
@@ -79,8 +79,9 @@ struct static_vm {
 extern struct list_head static_vmlist;
 extern struct static_vm *find_static_vm_vaddr(void *vaddr);
 extern __init void add_static_vm_early(struct static_vm *svm);
-
-#endif
+#else /* CONFIG_MMU */
+extern unsigned long vectors_base;
+#endif /* CONFIG_MMU */
 
 #ifdef CONFIG_ZONE_DMA
 extern phys_addr_t arm_dma_limit;
-- 
2.11.0



[PATCH 4/4] ARM: nommu: remove Hivecs configuration in asm

2017-01-18 Thread afzal mohammed
Now that the exception base address is handled dynamically for
processors with CP15, remove Highvecs configuration in assembly.

Signed-off-by: afzal mohammed 
---
 arch/arm/kernel/head-nommu.S | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S
index 6b4eb27b8758..2e21e08de747 100644
--- a/arch/arm/kernel/head-nommu.S
+++ b/arch/arm/kernel/head-nommu.S
@@ -152,11 +152,6 @@ __after_proc_init:
 #ifdef CONFIG_CPU_ICACHE_DISABLE
bic r0, r0, #CR_I
 #endif
-#ifdef CONFIG_CPU_HIGH_VECTOR
-   orr r0, r0, #CR_V
-#else
-   bic r0, r0, #CR_V
-#endif
mcr p15, 0, r0, c1, c0, 0   @ write control reg
 #elif defined (CONFIG_CPU_V7M)
/* For V7M systems we want to modify the CCR similarly to the SCTLR */
-- 
2.11.0



[PATCH 2/4] ARM: nommu: dynamic exception base address setting

2017-01-18 Thread afzal mohammed
No-MMU dynamic exception base address configuration on CP15
processors. In the case of low vectors, decision based on whether
security extensions are enabled & whether remap vectors to RAM
CONFIG option is selected.

For no-MMU without CP15, current default value of 0x0 is retained.

Signed-off-by: afzal mohammed 
---
 arch/arm/mm/nommu.c | 64 +++--
 1 file changed, 62 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c
index 2740967727e2..db8e784f20f3 100644
--- a/arch/arm/mm/nommu.c
+++ b/arch/arm/mm/nommu.c
@@ -11,6 +11,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -22,6 +23,8 @@
 
 #include "mm.h"
 
+unsigned long vectors_base;
+
 #ifdef CONFIG_ARM_MPU
 struct mpu_rgn_info mpu_rgn_info;
 
@@ -278,15 +281,72 @@ static void sanity_check_meminfo_mpu(void) {}
 static void __init mpu_setup(void) {}
 #endif /* CONFIG_ARM_MPU */
 
+#ifdef CONFIG_CPU_CP15
+#ifdef CONFIG_CPU_HIGH_VECTOR
+static unsigned long __init setup_vectors_base(void)
+{
+   unsigned long reg = get_cr();
+
+   set_cr(reg | CR_V);
+   return 0x;
+}
+#else /* CONFIG_CPU_HIGH_VECTOR */
+/*
+ * ID_PRF1 bits (CP#15 ID_PFR1)
+ */
+#define ID_PFR1_SE (0x3 << 4)  /* Security extension enable bits */
+
+/* Read processor feature register ID_PFR1 */
+static unsigned long get_id_pfr1(void)
+{
+   unsigned long val;
+
+   asm("mrc p15, 0, %0, c0, c1, 1" : "=r" (val) : : "cc");
+   return val;
+}
+
+/* Write exception base address to VBAR */
+static void set_vbar(unsigned long val)
+{
+   asm("mcr p15, 0, %0, c12, c0, 0" : : "r" (val) : "cc");
+}
+
+static bool __init security_extensions_enabled(void)
+{
+   return !!(get_id_pfr1() & ID_PFR1_SE);
+}
+
+static unsigned long __init setup_vectors_base(void)
+{
+   unsigned long base = 0, reg = get_cr();
+
+   set_cr(reg & ~CR_V);
+   if (security_extensions_enabled()) {
+   if (IS_ENABLED(CONFIG_REMAP_VECTORS_TO_RAM))
+   base = CONFIG_DRAM_BASE;
+   set_vbar(base);
+   } else if (IS_ENABLED(CONFIG_REMAP_VECTORS_TO_RAM)) {
+   if (CONFIG_DRAM_BASE != 0)
+   pr_err("Security extensions not enabled, vectors cannot be remapped to RAM, vectors base will be 0x\n");
+   }
+
+   return base;
+}
+#endif /* CONFIG_CPU_HIGH_VECTOR */
+#endif /* CONFIG_CPU_CP15 */
+
 void __init arm_mm_memblock_reserve(void)
 {
 #ifndef CONFIG_CPU_V7M
+#ifdef CONFIG_CPU_CP15
+   vectors_base = setup_vectors_base();
+#endif
/*
 * Register the exception vector page.
 * some architectures which the DRAM is the exception vector to trap,
 * alloc_page breaks with error, although it is not NULL, but "0."
 */
-   memblock_reserve(CONFIG_VECTORS_BASE, 2 * PAGE_SIZE);
+   memblock_reserve(vectors_base, 2 * PAGE_SIZE);
 #else /* ifndef CONFIG_CPU_V7M */
/*
 * There is no dedicated vector page on V7-M. So nothing needs to be
@@ -310,7 +370,7 @@ void __init sanity_check_meminfo(void)
  */
 void __init paging_init(const struct machine_desc *mdesc)
 {
-   early_trap_init((void *)CONFIG_VECTORS_BASE);
+   early_trap_init((void *)vectors_base);
mpu_setup();
bootmem_init();
 }
-- 
2.11.0



[PATCH 1/4] ARM: mmu: decouple VECTORS_BASE from Kconfig

2017-01-18 Thread afzal mohammed
For MMU configurations, VECTORS_BASE is always 0x, a macro
definition will suffice.

Once exception address is handled dynamically for no-MMU also (this
would involve taking care of region setup too), VECTORS_BASE can be
removed from Kconfig.

Suggested-by: Russell King 
Signed-off-by: afzal mohammed 
---

Though there was no build error without inclusion of asm/memory.h, to
be on the safer side it has been added, to reduce chances of build
breakage in random configurations.

 arch/arm/include/asm/memory.h  | 2 ++
 arch/arm/mach-berlin/platsmp.c | 3 ++-
 arch/arm/mm/dump.c | 5 +++--
 arch/arm/mm/init.c | 4 ++--
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 76cbd9c674df..9cc9f1dbc88e 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -83,6 +83,8 @@
 #define IOREMAP_MAX_ORDER  24
 #endif
 
+#define VECTORS_BASE   0x
+
 #else /* CONFIG_MMU */
 
 /*
diff --git a/arch/arm/mach-berlin/platsmp.c b/arch/arm/mach-berlin/platsmp.c
index 93f90688db18..578d41031abf 100644
--- a/arch/arm/mach-berlin/platsmp.c
+++ b/arch/arm/mach-berlin/platsmp.c
@@ -15,6 +15,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -75,7 +76,7 @@ static void __init berlin_smp_prepare_cpus(unsigned int max_cpus)
if (!cpu_ctrl)
goto unmap_scu;
 
-   vectors_base = ioremap(CONFIG_VECTORS_BASE, SZ_32K);
+   vectors_base = ioremap(VECTORS_BASE, SZ_32K);
if (!vectors_base)
goto unmap_scu;
 
diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
index 9fe8e241335c..21192d6eda40 100644
--- a/arch/arm/mm/dump.c
+++ b/arch/arm/mm/dump.c
@@ -18,6 +18,7 @@
 #include 
 
 #include 
+#include 
 #include 
 
 struct addr_marker {
@@ -31,8 +32,8 @@ static struct addr_marker address_markers[] = {
{ 0,"vmalloc() Area" },
{ VMALLOC_END,  "vmalloc() End" },
{ FIXADDR_START,"Fixmap Area" },
-   { CONFIG_VECTORS_BASE,  "Vectors" },
-   { CONFIG_VECTORS_BASE + PAGE_SIZE * 2, "Vectors End" },
+   { VECTORS_BASE, "Vectors" },
+   { VECTORS_BASE + PAGE_SIZE * 2, "Vectors End" },
{ -1,   NULL },
 };
 
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 370581aeb871..cf47f86f79ed 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -521,8 +522,7 @@ void __init mem_init(void)
"  .data : 0x%p" " - 0x%p" "   (%4td kB)\n"
"   .bss : 0x%p" " - 0x%p" "   (%4td kB)\n",
 
-   MLK(UL(CONFIG_VECTORS_BASE), UL(CONFIG_VECTORS_BASE) +
-   (PAGE_SIZE)),
+   MLK(UL(VECTORS_BASE), UL(VECTORS_BASE) + (PAGE_SIZE)),
 #ifdef CONFIG_HAVE_TCM
MLK(DTCM_OFFSET, (unsigned long) dtcm_end),
MLK(ITCM_OFFSET, (unsigned long) itcm_end),
-- 
2.11.0



[PATCH 0/4] ARM: v7-A !MMU support, CONFIG_VECTORS_BASE removal (almost)

2017-01-18 Thread afzal mohammed
Hi,

ARM core changes to support !MMU Kernel on v7-A MMU processors. This
series also does the preparation for CONFIG_VECTORS_BASE removal.

Based on the feedback from Russell on the initial patches (part RFC),
it was decided to handle vector base dynamically in C & work towards
the goal of removing VECTORS_BASE from Kconfig. MMU platforms
always have the exception base address at 0x, while the no-MMU CP15
scenario was handled dynamically in C. Hivecs handling for no-MMU CP15
that was done in asm has been moved to C as part of dynamic handling.
This now leaves only vector region setup, used by Cortex-R, to be made
devoid of VECTORS_BASE so as to remove it from Kconfig.

Vladimir is planning to rework MPU code, so it has been left untouched.
VECTORS_BASE is to be removed from Kconfig after the MPU region rework.

This series has been tested on top of mainline on,
1. Vybrid CM4 (!MMU)
2. Vybrid CA5 (MMU)

and on top of Vladimir's series[1] on,
1. Vybrid CM4 (!MMU)
2. Vybrid CA5 (MMU & !MMU)
3. AM437x IDK (MMU & !MMU)

Both above had an additional patch [2] as well, which is in next now.

Regards
afzal

[1] "[RFC v2 PATCH 00/23] Allow NOMMU for MULTIPLATFORM",

http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/470966.html
(git://linux-arm.org/linux-vm.git nommu-rfc-v2)

[2] "[PATCH 1/2] ARM: nommu: allow enabling REMAP_VECTORS_TO_RAM"

http://lists.infradead.org/pipermail/linux-arm-kernel/2016-December/473593.html

afzal mohammed (4):
  ARM: mmu: decouple VECTORS_BASE from Kconfig
  ARM: nommu: dynamic exception base address setting
  ARM: nommu: display vectors base
  ARM: nommu: remove Hivecs configuration in asm

 arch/arm/include/asm/memory.h  |  2 ++
 arch/arm/kernel/head-nommu.S   |  5 
 arch/arm/mach-berlin/platsmp.c |  3 +-
 arch/arm/mm/dump.c |  5 ++--
 arch/arm/mm/init.c |  9 --
 arch/arm/mm/mm.h   |  5 ++--
 arch/arm/mm/nommu.c| 64 --
 7 files changed, 79 insertions(+), 14 deletions(-)

-- 
2.11.0



Re: [PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM

2017-01-16 Thread Afzal Mohammed
Hi,

On Mon, Jan 16, 2017 at 09:53:41AM +, Vladimir Murzin wrote:
> On 15/01/17 11:47, Afzal Mohammed wrote:

> > mpu_setup_region() in arch/arm/mm/nommu.c that takes care of
> > MPU_RAM_REGION only. And that seems to be kind of redundant, as it is
> > also done in asm at __setup_mpu(). Git blames asm & C to consecutive
> > commits, which makes me a little shaky about the conclusion that it is
> > redundant.
> 
> It is not redundant. MPU setup is done in two steps. The first step is done
> in asm to enable caches, there only the kernel image is covered; the second
> step takes care of the whole RAM given via dt or "mem=" parameter.

Okay, thanks for the details.

> > Thinking of invoking mpu_setup() from secondary_start_kernel() in
> > arch/arm/kernel/smp.c, with mpu_setup() being slightly modified to
> > avoid storing region details again when invoked by secondary cpu's.
> 
> I have wip patches on reworking MPU setup code. The idea is to start using
> mpu_rgn_info[] actively, so asm part for secondariness would just sync-up
> content of that array. Additionally, it seems that we can reuse free MPU slots
> to cover memory which is discarded due to MPU alignment restrictions... 
> 
> > Vladimir, once changes are done after a revisit, i would need your
> > help to test on Cortex-R.
> 
> I'm more than happy to help, but currently I have limited bandwidth, so if it
> can wait till the next dev cycle I'd try to make MPU rework finished by that
> time.

Okay, please feel free to do MPU rework the way you were planning, you
know more details & have the platform to achieve it with much higher
efficiency than me.

Regards
afzal


Re: [PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM

2017-01-15 Thread Afzal Mohammed
Hi,

On Sat, Jan 07, 2017 at 10:43:39PM +0530, Afzal Mohammed wrote:
> On Tue, Dec 13, 2016 at 10:02:26AM +, Russell King - ARM Linux wrote:

> > Also, if the region setup for the vectors was moved as well, it would
> > then be possible to check the ID registers to determine whether this
> > is supported, and make the decision where to locate the vectors base
> > more dynamically.
> 
> This would affect Cortex-R's, which is a bit concerning due to lack of
> those platforms with me, let me try to get it right.

QEMU too doesn't seem to provide a Cortex-R target

> Seems translating __setup_mpu() altogether to C

afaics, a kind of C translation is already present as
mpu_setup_region() in arch/arm/mm/nommu.c that takes care of
MPU_RAM_REGION only. And that seems to be kind of redundant, as it is
also done in asm at __setup_mpu(). Git blames asm & C to consecutive
commits, which makes me a little shaky about the conclusion that it is
redundant.

> & installing at a later, but suitable place might be better.

But it looks like enabling the MPU can't be moved to C & that would
necessitate keeping at least some portion of __setup_mpu() in asm.

Instead, moving region setup only for vectors to C as Russell
suggested at first would have to be done.

A kind of diff at the end is in my mind, with additional changes to
handle the similar during secondary cpu bringup too.

Thinking of invoking mpu_setup() from secondary_start_kernel() in
arch/arm/kernel/smp.c, with mpu_setup() being slightly modified to
avoid storing region details again when invoked by secondary CPUs.

Vladimir, once changes are done after a revisit, i would need your
help to test on Cortex-R.

As an aside, i wasn't aware of the fact that Cortex-R supports SMP
Linux; had thought that, of the !MMU ones, only Blackfin & J2 had it.


> Also !MMU Kernel could boot on 3 ARM v7-A platforms - AM335x Beagle
> Bone (A8), AM437x IDK (A9) & Vybrid VF610 (on A5 core, note that it
> has M4 core too)

Talking about Cortex-M, AMx3's too have it, to be specific M3, but
they are not Linux-able, unlike the one in VF610.

Regards
afzal

--->8---

diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S
index e0565d73e49e..f8ac79b6136d 100644
--- a/arch/arm/kernel/head-nommu.S
+++ b/arch/arm/kernel/head-nommu.S
@@ -249,20 +249,6 @@ ENTRY(__setup_mpu)
setup_region r0, r5, r6, MPU_INSTR_SIDE @ 0x0, BG region, enabled
 2: isb
 
-   /* Vectors region */
-   set_region_nr r0, #MPU_VECTORS_REGION
-   isb
-   /* Shared, inaccessible to PL0, rw PL1 */
-   mov r0, #CONFIG_VECTORS_BASE@ Cover from VECTORS_BASE
-   ldr r5,=(MPU_AP_PL1RW_PL0NA | MPU_RGN_NORMAL)
-   /* Writing N to bits 5:1 (RSR_SZ) --> region size 2^N+1 */
-   mov r6, #(((2 * PAGE_SHIFT - 1) << MPU_RSR_SZ) | 1 << MPU_RSR_EN)
-
-   setup_region r0, r5, r6, MPU_DATA_SIDE  @ VECTORS_BASE, PL0 NA, enabled
-   beq 3f  @ Memory-map not unified
-   setup_region r0, r5, r6, MPU_INSTR_SIDE @ VECTORS_BASE, PL0 NA, enabled
-3: isb
-
/* Enable the MPU */
mrc p15, 0, r0, c1, c0, 0   @ Read SCTLR
bic r0, r0, #CR_BR  @ Disable the 'default mem-map'
diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c
index e82056df0635..7fe8906322d5 100644
--- a/arch/arm/mm/nommu.c
+++ b/arch/arm/mm/nommu.c
@@ -269,12 +269,19 @@ void __init mpu_setup(void)
ilog2(memblock.memory.regions[0].size),
MPU_AP_PL1RW_PL0RW | MPU_RGN_NORMAL);
if (region_err) {
-   panic("MPU region initialization failure! %d", region_err);
+   panic("MPU RAM region initialization failure! %d", region_err);
} else {
-   pr_info("Using ARMv7 PMSA Compliant MPU. "
-"Region independence: %s, Max regions: %d\n",
-   mpu_iside_independent() ? "Yes" : "No",
-   mpu_max_regions());
+   region_err = mpu_setup_region(MPU_VECTORS_REGION, vectors_base,
+   ilog2(memblock.memory.regions[0].size),
+   MPU_AP_PL1RW_PL0NA | MPU_RGN_NORMAL);
+   if (region_err) {
+   panic("MPU VECTOR region initialization failure! %d",
+ region_err);
+   } else {
+   pr_info("Using ARMv7 PMSA Compliant MPU. "
+   "Region independence: %s, Max regions: %d\n",
+   mpu_iside_independent() ? "Yes" : "No",
+   mpu_max_regions());
}
 }
 #else


Re: [PATCH WIP 4/4] ARM: remove compile time vector base for CP15 case

2017-01-08 Thread Afzal Mohammed
Hi,

On Sat, Jan 07, 2017 at 06:24:15PM +, Russell King - ARM Linux wrote:

> As I've said, CONFIG_VECTORS_BASE is _always_ 0xffff0000 on MMU, so
> this always displays 0xffff0000 - 0xffff1000 here.

> Older ARM CPUs without the V bit (ARMv3 and early ARMv4) expect the
> vectors to be at virtual address zero.
> 
> Most of these systems place ROM at physical address 0, so when the CPU
> starts from reset (with the MMU off) it starts executing from ROM.  Once
> the MMU is initialised, RAM can be placed there and the ROM vectors
> replaced.  The side effect of this is that NULL pointer dereferences
> are not always caught... of course, it makes sense that the page at
> address 0 is write protected even from the kernel, so a NULL pointer
> write dereference doesn't corrupt the vectors.
> 
> How we handle it in Linux is that we always map the page for the vectors
> at 0xffff0000, and then only map that same page at 0x00000000 if we have
> a CPU that needs it there.

Thanks for the information, i was not aware, seems that simplifies MMU
case handling.

arch/arm/mm/mmu.c:

if (!vectors_high()) {
map.virtual = 0;
map.length = PAGE_SIZE * 2;
map.type = MT_LOW_VECTORS;
create_mapping(&map);
}



arch/arm/include/asm/cp15.h:

#if __LINUX_ARM_ARCH__ >= 4
#define vectors_high()  (get_cr() & CR_V)
#else
#define vectors_high()  (0)
#endif

Deducing from your reply & the above code snippets that for
__LINUX_ARM_ARCH__ >= 4, in all practical cases, vectors_high() returns
true.

Regards
afzal


Re: [PATCH WIP 4/4] ARM: remove compile time vector base for CP15 case

2017-01-07 Thread Afzal Mohammed
Hi,

On Sat, Jan 07, 2017 at 11:32:27PM +0530, Afzal Mohammed wrote:

> i had thought that for MMU case if Hivecs is not enabled,
> CONFIG_VECTOR_BASE has to be considered as 0x00000000 at least for the

s/CONFIG_VECTOR_BASE/exception base address

> purpose of displaying exception base address.

Regards
afzal


Re: [PATCH WIP 4/4] ARM: remove compile time vector base for CP15 case

2017-01-07 Thread Afzal Mohammed
Hi,

On Sat, Jan 07, 2017 at 05:38:32PM +, Russell King - ARM Linux wrote:
> On Sat, Jan 07, 2017 at 10:52:28PM +0530, afzal mohammed wrote:

> > TODO:
> > Kill off VECTORS_BASE completely - this would require to handle MMU
> >  case as well as ARM_MPU scenario dynamically.

> Why do you think MMU doesn't already handle it?

i meant here w.r.t. displaying the vector base address in
arch/arm/mm/init.c, i.e. dynamically getting it based on the Hivecs
setting as either 0x00000000 or 0xffff0000
> 
> >  config VECTORS_BASE
> > hex
> > -   default 0xffff0000 if MMU || CPU_HIGH_VECTOR
> > -   default DRAM_BASE if REMAP_VECTORS_TO_RAM
> > +   default 0xffff0000 if MMU
> > default 0x00000000
> 
> When MMU=y, the resulting VECTORS_BASE is always 0xffff0000.  The only
> case where this ends up zero after your change is when MMU=n.

> The MMU case does have to cater for CPUs wanting vectors at 0x00000000
> and at 0xffff0000, and this is handled via the page tables - but this
> has nothing to do with CONFIG_VECTORS_BASE.  CONFIG_VECTORS_BASE
> exists primarily for noMMU.

i had thought that for MMU case if Hivecs is not enabled,
CONFIG_VECTOR_BASE has to be considered as 0x00000000 at least for the
purpose of displaying exception base address.

One thing i have not yet understood is how the CPU can take an
exception with its base address as 0x00000000 (for the Hivecs not
enabled case), a virtual address that is below the Kernel memory map.

> For the Berlin and mm/dump code, we could very easily just have a
> #define VECTORS_BASE 0xffff0000 in a header file and drop the CONFIG_
> prefix.

Okay, thanks for the tip.

Regards
afzal


[PATCH WIP 4/4] ARM: remove compile time vector base for CP15 case

2017-01-07 Thread afzal mohammed
vectors base is now dynamically updated for Hivecs as well as for
REMAP_VECTORS_TO_RAM case to DRAM_START. Hence remove these CP15
cases.

TODO:
Kill off VECTORS_BASE completely - this would require to handle MMU
 case as well as ARM_MPU scenario dynamically.

Signed-off-by: afzal mohammed 
---
 arch/arm/Kconfig | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index bc6f4065840e..720ee62b4955 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -232,8 +232,7 @@ config ARCH_MTD_XIP
 
 config VECTORS_BASE
hex
-   default 0xffff0000 if MMU || CPU_HIGH_VECTOR
-   default DRAM_BASE if REMAP_VECTORS_TO_RAM
+   default 0xffff0000 if MMU
default 0x00000000
help
  The base address of exception vectors.  This must be two pages
-- 
2.11.0



[PATCH WIP 3/4] ARM: mm: nommu: display dynamic exception base

2017-01-07 Thread afzal mohammed
Display dynamically estimated nommu exception base.

TODO: Dynamically update MMU case too.

Signed-off-by: afzal mohammed 
---
 arch/arm/mm/init.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 370581aeb871..1777ee23a6a2 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -39,6 +39,10 @@
 
 #include "mm.h"
 
+#ifndef CONFIG_MMU
+extern unsigned long vectors_base;
+#endif
+
 #ifdef CONFIG_CPU_CP15_MMU
 unsigned long __init __clear_cr(unsigned long mask)
 {
@@ -521,8 +525,13 @@ void __init mem_init(void)
"  .data : 0x%p" " - 0x%p" "   (%4td kB)\n"
"   .bss : 0x%p" " - 0x%p" "   (%4td kB)\n",
 
+#ifdef CONFIG_MMU
MLK(UL(CONFIG_VECTORS_BASE), UL(CONFIG_VECTORS_BASE) +
(PAGE_SIZE)),
+#else
+   MLK_ROUNDUP(vectors_base, vectors_base + PAGE_SIZE),
+#endif
+
 #ifdef CONFIG_HAVE_TCM
MLK(DTCM_OFFSET, (unsigned long) dtcm_end),
MLK(ITCM_OFFSET, (unsigned long) itcm_end),
-- 
2.11.0



[PATCH WIP 2/4] ARM: nommu: remove Hivecs configuration in asm

2017-01-07 Thread afzal mohammed
Now that the exception base address is handled dynamically for
processors with CP15, remove the Hivecs configuration in assembly.

Signed-off-by: afzal mohammed 
---
 arch/arm/kernel/head-nommu.S | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S
index 2ab026ffc270..e0565d73e49e 100644
--- a/arch/arm/kernel/head-nommu.S
+++ b/arch/arm/kernel/head-nommu.S
@@ -162,11 +162,6 @@ ENDPROC(secondary_startup_arm)
 #ifdef CONFIG_CPU_ICACHE_DISABLE
bic r0, r0, #CR_I
 #endif
-#ifdef CONFIG_CPU_HIGH_VECTOR
-   orr r0, r0, #CR_V
-#else
-   bic r0, r0, #CR_V
-#endif
mcr p15, 0, r0, c1, c0, 0   @ write control reg
 #elif defined (CONFIG_CPU_V7M)
/* For V7M systems we want to modify the CCR similarly to the SCTLR */
-- 
2.11.0



[PATCH WIP 1/4] ARM: nommu: dynamic exception base address setting

2017-01-07 Thread afzal mohammed
No-MMU dynamic exception base address configuration on processors
with CP15.

TODO: Handle MMU case as well as ARM_MPU scenario dynamically

Signed-off-by: afzal mohammed 
---
 arch/arm/mm/nommu.c | 62 +++--
 1 file changed, 60 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c
index 681cec879caf..e82056df0635 100644
--- a/arch/arm/mm/nommu.c
+++ b/arch/arm/mm/nommu.c
@@ -11,6 +11,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -23,6 +24,8 @@
 
 #include "mm.h"
 
+unsigned long vectors_base;
+
 #ifdef CONFIG_ARM_MPU
 struct mpu_rgn_info mpu_rgn_info;
 
@@ -279,15 +282,70 @@ static void sanity_check_meminfo_mpu(void) {}
 static void __init mpu_setup(void) {}
 #endif /* CONFIG_ARM_MPU */
 
+#ifdef CONFIG_CPU_CP15
+/*
+ * ID_PRF1 bits (CP#15 ID_PFR1)
+ */
+#define ID_PFR1_SE (0x3 << 4)  /* Security extension enable bits */
+
+#ifndef CONFIG_CPU_HIGH_VECTOR
+static inline unsigned long get_id_pfr1(void)
+{
+   unsigned long val;
+   asm("mrc p15, 0, %0, c0, c1, 1" : "=r" (val) : : "cc");
+   return val;
+}
+
+static inline void set_vbar(unsigned long val)
+{
+   asm("mcr p15, 0, %0, c12, c0, 0" : : "r" (val) : "cc");
+}
+
+static bool __init security_extensions_enabled(void)
+{
+   return !!(get_id_pfr1() & ID_PFR1_SE);
+}
+#endif
+
+static unsigned long __init setup_vector_base(void)
+{
+   unsigned long reg, base;
+
+   reg = get_cr();
+
+#ifdef CONFIG_CPU_HIGH_VECTOR
+   set_cr(reg | CR_V);
+   base = 0xffff0000;
+#else
+   set_cr(reg & ~CR_V);
+   base = 0;
+   if (security_extensions_enabled()) {
+#ifdef CONFIG_REMAP_VECTORS_TO_RAM
+   base = CONFIG_DRAM_BASE;
+#endif
+   set_vbar(base);
+   }
+#endif /* CONFIG_CPU_HIGH_VECTOR */
+
+   return base;
+}
+#endif /* CONFIG_CPU_CP15 */
+
 void __init arm_mm_memblock_reserve(void)
 {
 #ifndef CONFIG_CPU_V7M
+
+#ifdef CONFIG_CPU_CP15
+   vectors_base = setup_vector_base();
+#else
+   vectors_base = CONFIG_VECTORS_BASE;
+#endif
/*
 * Register the exception vector page.
 * some architectures which the DRAM is the exception vector to trap,
 * alloc_page breaks with error, although it is not NULL, but "0."
 */
-   memblock_reserve(CONFIG_VECTORS_BASE, 2 * PAGE_SIZE);
+   memblock_reserve(vectors_base, 2 * PAGE_SIZE);
 #else /* ifndef CONFIG_CPU_V7M */
/*
 * There is no dedicated vector page on V7-M. So nothing needs to be
@@ -311,7 +369,7 @@ void __init sanity_check_meminfo(void)
  */
 void __init paging_init(const struct machine_desc *mdesc)
 {
-   early_trap_init((void *)CONFIG_VECTORS_BASE);
+   early_trap_init((void *)vectors_base);
mpu_setup();
bootmem_init();
 }
-- 
2.11.0



Re: [PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM

2017-01-07 Thread Afzal Mohammed
Hi,

On Tue, Dec 13, 2016 at 10:02:26AM +, Russell King - ARM Linux wrote:

> Is there really any need to do this in head.S ?  I believe it's
> entirely possible to do it later - arch/arm/mm/nommu.c:paging_init().

As memblock_reserve() for the exception address was done before
paging_init(), seems it has to be done by arm_mm_memblock_reserve() in
arch/arm/mm/nommu.c; WIP patch follows, but i am not that happy -
the conditional compilations make it not so readable, though still
better to see it in C.

> Also, if the region setup for the vectors was moved as well, it would
> then be possible to check the ID registers to determine whether this
> is supported, and make the decision where to locate the vectors base
> more dynamically.

This would affect Cortex-R's, which is a bit concerning due to the lack
of those platforms with me; let me try to get it right. Seems
translating __setup_mpu() altogether to C & installing it at a later,
but suitable, place might be better.

And something feels strange about Cortex-R support in mainline: don't
know whether it boots out of the box, there are no Cortex-R cpu
compatibles in dts(i), yet the devicetree documentation documents it.
Still, wrecking Cortex-R's could get counted as a regression, even
though dts is not considered Kernel. Looks like there is a Cortex-R
mafia around mainline ;)

> That leaves one pr_notice() call using the CONFIG_VECTORS_BASE
> constant...

Seems you want to completely kick out CONFIG_VECTORS_BASE.

Saw 2 interesting MMU cases,
1. in devicemaps_init(), if Hivecs is not set, it is being mapped to
virtual address zero; was wondering how an MMU Kernel can handle
exceptions with a zero base address (& it still prints 0xffff0000 as
the vector base)
2. One of the platforms does an ioremap of CONFIG_VECTORS_BASE

Once i take care of the above, the ugly conditional compilation in
3/4th patch (@arch/arm/mm/init.c) of WIP patch series that follows
will be removed.

Please let know if you have any comments on the above.


Also !MMU Kernel could boot on 3 ARM v7-A platforms - AM335x Beagle
Bone (A8), AM437x IDK (A9) & Vybrid VF610 (on A5 core, note that it
has M4 core too) with the same Kernel image*.

Vybrid did not need any platform specific tweaks, just 1/2th patch
(put in patch system as 8635/1) & WIP series over Vladimir's one,
while TI Sitara AMx3's needed one w.r.t remap.

Please bear with my delay - to fill the stomach, the work is not on
Linux, and then the vacations.

Regards
afzal

* Since initramfs was used, tty port had to be changed in initramfs
build for Vybrid, but Kernel except for above initramfs change, was
identical.


Re: [PATCH 33/37] ARM: dts: vf610m4-cosmic: Correct license text

2016-12-15 Thread Afzal Mohammed
Hi,

On Thu, Dec 15, 2016 at 12:57:42AM +0100, Alexandre Belloni wrote:
> The license test has been mangled at some point then copy pasted across

The patch text has been mangled at this point ...  ;)

> multiple files. Restore it to what it should be.
> Note that this is not intended as a license change.

Acked-by: Afzal Mohammed 

Regards
afzal


Re: [PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM

2016-12-13 Thread Afzal Mohammed
Hi,

On Tue, Dec 13, 2016 at 09:38:21AM +, Vladimir Murzin wrote:
> On 11/12/16 13:12, Afzal Mohammed wrote:

> > this probably would have to be made robust so as to not cause issue on
> > other v7-A's upon trying to do !MMU (this won't affect normal MMU boot),
> > or specifically where security extensions are not enabled. Also effect
> > of hypervisor extension also need to be considered. Please let know if
> > any better ways to handle this.

> You might need to check ID_PFR1 for that.

Had been searching ARM ARM for this kind of a thing, thanks.

> > +#ifdef CONFIG_REMAP_VECTORS_TO_RAM
> > +   mov r3, #CONFIG_VECTORS_BASE@ read VECTORS_BASE

> ldr r3,=CONFIG_VECTORS_BASE
> 
> would be more robust. I hit this in [1]
> 
> [1] https://www.spinics.net/lists/arm-kernel/msg546825.html

Russell suggested doing it in paging_init(), then probably assembly
circus can be avoided.

Regards
afzal


Re: [PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM

2016-12-13 Thread Afzal Mohammed
Hi,

On Tue, Dec 13, 2016 at 10:02:26AM +, Russell King - ARM Linux wrote:
> On Sun, Dec 11, 2016 at 06:42:55PM +0530, Afzal Mohammed wrote:

> > bic r0, r0, #CR_V
> >  #endif
> > mcr p15, 0, r0, c1, c0, 0   @ write control reg
> > +
> > +#ifdef CONFIG_REMAP_VECTORS_TO_RAM
> > +   mov r3, #CONFIG_VECTORS_BASE@ read VECTORS_BASE
> > +   mcr p15, 0, r3, c12, c0, 0  @ write to VBAR
> > +#endif
> > +

> Is there really any need to do this in head.S ?

Seeing the high vector configuration done here, pounced upon it :)

> I believe it's
> entirely possible to do it later - arch/arm/mm/nommu.c:paging_init().
> 
> Also, if the region setup for the vectors was moved as well, it would
> then be possible to check the ID registers to determine whether this
> is supported, and make the decision where to locate the vectors base
> more dynamically.

i will look into it.

Regards
afzal


linux-kernel@vger.kernel.org

2016-12-12 Thread Afzal Mohammed
Hi,

On Sun, Dec 11, 2016 at 06:40:28PM +0530, Afzal Mohammed wrote:

> Kernel reached the stage of invoking user space init & panicked, though
> it could not reach till prompt for want of user space executables
> 
> So far i have not come across a toolchain (or a way to create toolchain)
> to create !MMU user space executables for Cortex-A.

Now able to reach the prompt using a buildroot initramfs; thanks to
Peter Korsgaard for suggesting the way to create user space executables
for !MMU Cortex-A.

> multi_v7_defconfig was used & all platforms except TI OMAP/AM/DM/DRA &
> Freescale i.MX family was deselected. ARM_MPU option was disabled as
> Vladimir had given an early warning. DRAM_BASE was set to 0x80000000.
> During the course of bringup, futex was causing issues, hence FUTEX was
> removed. L1 & L2 caches were disabled in config. High vectors were
> disabled & vectors were made to remap to base of RAM. An additional OMAP
> specific change to avoid one ioremap was also required.

For the sake of completeness,
SMP was disabled & flat binary support enabled in Kernel.

Regards
afzal


Re: [PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM

2016-12-11 Thread Afzal Mohammed
Hi,

On Sun, Dec 11, 2016 at 06:42:55PM +0530, Afzal Mohammed wrote:

> Kernel text start at an offset of at least 32K to account for page
> tables in MMU case.

Proper way to put it might have been "32K (to account for 16K initial
page tables & the old atags)", unless i missed something.

Regards
afzal


[PATCH RFC 2/2] ARM: nommu: remap exception base address to RAM

2016-12-11 Thread Afzal Mohammed
Remap exception base address to start of RAM in Kernel in !MMU mode.

Based on the existing Kconfig help, the Kernel was expecting it to be
configured by external support. Also, earlier it was not possible to
copy the exception table to the start of RAM due to a Kconfig
dependency, which has been fixed by a change prior to this.

Kernel text start at an offset of at least 32K to account for page
tables in MMU case. On a !MMU build too this space is kept aside, and
since 2 pages (8K) is the maximum for the exception vectors plus stubs,
they can be placed at the start of RAM.

Signed-off-by: Afzal Mohammed 
---

i am a bit shaky about this change; though it works here on Cortex-A9,
it probably would have to be made robust so as to not cause issues on
other v7-A's upon trying to do !MMU (this won't affect normal MMU boot),
or specifically where security extensions are not enabled. Also the
effect of the hypervisor extension needs to be considered. Please let
know if there are any better ways to handle this.


 arch/arm/Kconfig-nommu   | 6 +++---
 arch/arm/kernel/head-nommu.S | 6 ++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/arm/Kconfig-nommu b/arch/arm/Kconfig-nommu
index b7576349528c..f57fbe3d5eb0 100644
--- a/arch/arm/Kconfig-nommu
+++ b/arch/arm/Kconfig-nommu
@@ -46,9 +46,9 @@ config REMAP_VECTORS_TO_RAM
  If your CPU provides a remap facility which allows the exception
  vectors to be mapped to writable memory, say 'n' here.
 
- Otherwise, say 'y' here.  In this case, the kernel will require
- external support to redirect the hardware exception vectors to
- the writable versions located at DRAM_BASE.
+ Otherwise, say 'y' here.  In this case, the kernel will
+ redirect the hardware exception vectors to the writable
+ versions located at DRAM_BASE.
 
 config ARM_MPU
bool 'Use the ARM v7 PMSA Compliant MPU'
diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S
index 6b4eb27b8758..ac31c9647830 100644
--- a/arch/arm/kernel/head-nommu.S
+++ b/arch/arm/kernel/head-nommu.S
@@ -158,6 +158,12 @@ __after_proc_init:
bic r0, r0, #CR_V
 #endif
mcr p15, 0, r0, c1, c0, 0   @ write control reg
+
+#ifdef CONFIG_REMAP_VECTORS_TO_RAM
+   mov r3, #CONFIG_VECTORS_BASE@ read VECTORS_BASE
+   mcr p15, 0, r3, c12, c0, 0  @ write to VBAR
+#endif
+
 #elif defined (CONFIG_CPU_V7M)
/* For V7M systems we want to modify the CCR similarly to the SCTLR */
 #ifdef CONFIG_CPU_DCACHE_DISABLE
-- 
2.11.0



[PATCH 1/2] ARM: nommu: allow enabling REMAP_VECTORS_TO_RAM

2016-12-11 Thread Afzal Mohammed
REMAP_VECTORS_TO_RAM depends on DRAM_BASE, but since DRAM_BASE is a
hex, REMAP_VECTORS_TO_RAM could never get enabled. Also depending on
DRAM_BASE is redundant as whenever REMAP_VECTORS_TO_RAM makes itself
available to Kconfig, DRAM_BASE also is available as the Kconfig gets
sourced on !MMU.

Signed-off-by: Afzal Mohammed 
---
 arch/arm/Kconfig-nommu | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm/Kconfig-nommu b/arch/arm/Kconfig-nommu
index aed66d5df7f1..b7576349528c 100644
--- a/arch/arm/Kconfig-nommu
+++ b/arch/arm/Kconfig-nommu
@@ -34,8 +34,7 @@ config PROCESSOR_ID
  used instead of the auto-probing which utilizes the register.
 
 config REMAP_VECTORS_TO_RAM
-   bool 'Install vectors to the beginning of RAM' if DRAM_BASE
-   depends on DRAM_BASE
+   bool 'Install vectors to the beginning of RAM'
help
  The kernel needs to change the hardware exception vectors.
  In nommu mode, the hardware exception vectors are normally
-- 
2.11.0



linux-kernel@vger.kernel.org

2016-12-11 Thread Afzal Mohammed
Hi,

ARM core fixes required to bring up !MMU Kernel on v7 Cortex-A.

This was done on top of Vladimir Murzin's !MMU multiplatform series[1].

Platform used was Cortex-A9, AM437x IDK.

Kernel reached the stage of invoking user space init & panicked; though
it could not reach till the prompt for want of user space executables,
it went as far as the Kernel can by itself. But that is an issue
independent of the Kernel, hence posting the series (also thought of
at least posting the existing patches before the merge window starts).

So far i have not come across a toolchain (or a way to create a
toolchain) to create !MMU user space executables for Cortex-A. It is
being hoped that the Cortex-R toolchain might help here (thanks Arnd).
This is being looked into.

multi_v7_defconfig was used & all platforms except TI OMAP/AM/DM/DRA &
Freescale i.MX family was deselected. ARM_MPU option was disabled as
Vladimir had given an early warning. DRAM_BASE was set to 0x80000000.
During the course of bringup, futex was causing issues, hence FUTEX was
removed. L1 & L2 caches were disabled in config. High vectors were
disabled & vectors were made to remap to base of RAM. An additional OMAP
specific change to avoid one ioremap was also required.

The 2/2th patch has been stuck with an RFC label, as, though it works,
it might have to be made robust so as to not cause issues on other
v7-A's upon trying to do !MMU (this won't affect normal MMU boot), or
specifically where security extensions are not enabled. Also the effect
of the hypervisor extension needs to be considered. Please let know if
there are any better ways to handle this.

Boot logs at the end.


Afzal Mohammed (2):
  ARM: nommu: allow enabling REMAP_VECTORS_TO_RAM
  ARM: nommu: remap exception base address to RAM

 arch/arm/Kconfig-nommu   | 9 -
 arch/arm/kernel/head-nommu.S | 6 ++
 2 files changed, 10 insertions(+), 5 deletions(-)

[1] "[RFC v2 PATCH 00/23] Allow NOMMU for MULTIPLATFORM",

http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/470966.html
(git://linux-arm.org/linux-vm.git nommu-rfc-v2)

[2] Boot log

[0.00] Booting Linux on physical CPU 0x0
[0.00] Linux version 4.9.0-rc7-00026-g7a142ca8231b (afzal@debian) (gcc 
version 6.2.0 (GCC) ) #23 Sun Dec 11 14:59:57 IST 2016
[0.00] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=00c50478
[0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing 
instruction cache
[0.00] OF: fdt:Machine model: TI AM437x Industrial Development Kit
[0.00] bootconsole [earlycon0] enabled
[0.00] AM437x ES1.2 (sgx neon)
[0.00] Built 1 zonelists in Zone order, mobility grouping on.  Total 
pages: 260096
[0.00] Kernel command line: console=ttyO0,115200n8 root=/dev/ram0 rw 
initrd=0x81800000,8M earlyprintk
[0.00] PID hash table entries: 4096 (order: 2, 16384 bytes)
[0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[0.00] Memory: 1021276K/1048576K available (6558K kernel code, 523K 
rwdata, 2096K rodata, 444K init, 274K bss, 27300K reserved, 0K cma-reserved)
[0.00] Virtual kernel memory layout:
[0.00] vector  : 0x80000000 - 0x80001000   (   4 kB)
[0.00] fixmap  : 0xffc00000 - 0xfff00000   (3072 kB)
[0.00] vmalloc : 0x - 0x   (4095 MB)
[0.00] lowmem  : 0x80000000 - 0xc0000000   (1024 MB)
[0.00] modules : 0x80000000 - 0xc0000000   (1024 MB)
[0.00]   .text : 0x80008000 - 0x8066f948   (6559 kB)
[0.00]   .init : 0x8087d000 - 0x808ec000   ( 444 kB)
[0.00]   .data : 0x808ec000 - 0x8096ef60   ( 524 kB)
[0.00].bss : 0x8096ef60 - 0x809b3a9c   ( 275 kB)
[0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[0.00] NR_IRQS:16 nr_irqs:16 16
[0.00] OMAP clockevent source: timer1 at 32786 Hz
[0.000255] sched_clock: 64 bits at 500MHz, resolution 2ns, wraps every 
4398046511103ns
[0.009514] clocksource: arm_global_timer: mask: 0xffffffffffffffff 
max_cycles: 0xe6a171a037, max_idle_ns: 881590485102 ns
[0.021986] Switching to timer-based delay loop, resolution 2ns
[0.140838] clocksource: 32k_counter: mask: 0xffffffff max_cycles: 
0xffffffff, max_idle_ns: 58327039986419 ns
[0.151820] OMAP clocksource: 32k_counter at 32768 Hz
[0.230698] Console: colour dummy device 80x30
[0.236205] Calibrating delay loop (skipped), value calculated using timer 
frequency.. 1000.00 BogoMIPS (lpj=500)
[0.248268] pid_max: default: 32768 minimum: 301
[0.255822] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes)
[0.263618] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes)
[0.322900] devtmpfs: initialized
[0.936367] VFP support v0.3: implementor 41 architecture 3 part 30 va

Re: RFC: documentation of the autogroup feature [v2]

2016-11-25 Thread Afzal Mohammed
Hi,

On Thu, Nov 24, 2016 at 10:41:29PM +0100, Michael Kerrisk (man-pages) wrote:

>Suppose  that  there  are two autogroups competing for the same
>CPU.  The first group contains ten CPU-bound processes  from  a
>kernel build started with make -j10.  The other contains a sin‐
>gle CPU-bound process: a video player.   The  effect  of  auto‐
>grouping  is  that the two groups will each receive half of the
>CPU cycles.  That is, the video player will receive 50% of  the
>CPU  cycles,  rather  just 9% of the cycles, which would likely

than ?

Regards
afzal

>lead to degraded video playback.  Or to put things another way:
>an  autogroup  that  contains  a large number of CPU-bound pro‐
>cesses does not end up overwhelming the CPU at the  expense  of
>the other jobs on the system.


Re: [PATCH v2 08/10] ARM: dts: nuc900: Add nuc970 dts files

2016-07-13 Thread Afzal Mohammed
Hi,

On Wed, Jul 13, 2016 at 03:26:40PM +0800, Wan Zongshun wrote:
> Do you mean I should add cpus into soc

yes

Regards
afzal


Re: [PATCH v2 08/10] ARM: dts: nuc900: Add nuc970 dts files

2016-07-12 Thread Afzal Mohammed
Hi,

On Sun, Jul 10, 2016 at 03:42:20PM +0800, Wan Zongshun wrote:
> This patch is to add dts support for nuc970 platform.

cpu ! in soc ? lost in fab ? ;)

Regards
afzal


Re: [PATCH] net: ethernet: ti: cpdma: switch to use genalloc

2016-06-25 Thread Afzal Mohammed
Hi,

On Fri, Jun 24, 2016 at 12:15:41PM -0400, Lennart Sorensen wrote:

> although the style does require using brackets for the else if the
> if required them.

As an aside, though most of the style rationale is K & R, K & R
consistently uses unbalanced braces for if-else-*

For one that learns C unadulterated from K & R, the Kernel coding
style probably comes naturally, except for trivial things like the above.

...a brick for the shed.

Regards
afzal


Re: [PATCH] net: ethernet: ti: cpdma: switch to use genalloc

2016-06-24 Thread Afzal Mohammed
Hi,

On Fri, Jun 24, 2016 at 11:35:15AM +0530, Mugunthan V N wrote:
> On Thursday 23 June 2016 06:26 PM, Ivan Khoronzhuk wrote:

> >> +if (pool->cpumap) {
> >> +dma_free_coherent(pool->dev, pool->mem_size, pool->cpumap,
> >> +  pool->phys);
> >> +} else {
> >> +iounmap(pool->iomap);
> >> +}

> > single if, brackets?
> 
> if() has multiple line statement, so brackets are must.

Another coat of paint for the bikeshed:

seems the documented coding style mentions otherwise.

Regards
afzal


Re: [PATCH 01/48] clk: at91: replace usleep() by udelay() calls

2016-06-14 Thread Afzal Mohammed
Hi,

On Mon, Jun 13, 2016 at 05:24:09PM +0200, Alexandre Belloni wrote:
> On 11/06/2016 at 00:30:36 +0200, Arnd Bergmann wrote :

> > Does this have to be called that early? It seems wasteful to always
> > call udelay() here, when these are functions that are normally
> > allowed to sleep.

> So I've tested it and something like that would work:
> 
>   if (system_state < SYSTEM_RUNNING)
>   udelay(osc->startup_usec);
>   else
>   usleep_range(osc->startup_usec, osc->startup_usec + 1);
> 
> But I'm afraid it would be the first driver to actually do something
> like that (however, it is already the only driver trying to sleep). 

tglx has suggested modifying the clock core to handle a somewhat
similar kind of scenario (probably should work here too) and avoid
driver changes:

http://lkml.kernel.org/r/alpine.DEB.2.11.1606061448010.28031@nanos

Regards
afzal


Re: [PATCH v3 02/12] of: add J-Core cpu bindings

2016-05-27 Thread Afzal Mohammed
Hi,

On Thu, May 26, 2016 at 04:44:02PM -0500, Rob Landley wrote:

> As far as I know, we're the first nommu SMP implementation in Linux.

According to hearsay, thou shall be called Buzz Aldrin, Blackfin is
Neil Armstrong.

Regards
afzal

