On Wed, Mar 06, 2019 at 10:14:47AM +0000, Guillaume Tucker wrote:
> On 01/03/2019 23:23, Dan Williams wrote:
> > On Fri, Mar 1, 2019 at 1:05 PM Guillaume Tucker
> > <guillaume.tuc...@collabora.com> wrote:
> > 
> > Is there an early-printk facility that can be turned on to see how far
> > we get in the boot?
> 
> Yes, I've done that now by enabling CONFIG_DEBUG_AM33XXUART1 and
> earlyprintk in the command line.  Here's the result, with the
> commit cherry picked on top of next-20190304:
> 
>   https://lava.collabora.co.uk/scheduler/job/1526326
> 
> [    1.379522] ti-sysc 4804a000.target-module: sysc_flags 00000222 != 00000022
> [    1.396718] Unable to handle kernel paging request at virtual address 77bb4003
> [    1.404203] pgd = (ptrval)
> [    1.406971] [77bb4003] *pgd=00000000
> [    1.410650] Internal error: Oops: 5 [#1] ARM
> [...]
> [    1.672310] [<c07051a0>] (clk_hw_create_clk.part.21) from [<c06fea34>] (devm_clk_get+0x4c/0x80)
> [    1.681232] [<c06fea34>] (devm_clk_get) from [<c064253c>] (sysc_probe+0x28c/0xde4)
> 
> It's always failing at that point in the code.  Also when
> enabling "debug" on the kernel command line, the issue goes
> away (exact same binaries etc..):
> 
>   https://lava.collabora.co.uk/scheduler/job/1526327
> 
> For the record, here's the branch I've been using:
> 
>   https://gitlab.collabora.com/gtucker/linux/tree/beaglebone-black-next-20190304-debug
> 
> The board otherwise boots fine with next-20190304 (SMP=n), and
> also with the patch applied but the shuffle configs set to n.
> 
> > Were there any boot *successes* on ARM with shuffling enabled? I.e.
> > clues about what's different about the specific memory setup for
> > beagle-bone-black.
> 
> Looking at the KernelCI results from next-20190215, it looks like
> only the BeagleBone Black with SMP=n failed to boot:
> 
>   https://kernelci.org/boot/all/job/next/branch/master/kernel/next-20190215/
> 
> Of course that's not all the ARM boards that exist out there, but
> it's a fairly large coverage already.
> 
> As the kernel panic always seems to originate in ti-sysc.c,
> there's a chance it's only visible on that platform...  I'm doing
> a KernelCI run now with my test branch to double check that,
> it'll take a few hours so I'll send an update later if I get
> anything useful out of it.
> 
> In the meantime, I'm happy to try out other things with more
> debug configs turned on or any potential fixes someone might
> have.

ARM is the only arch that sets ARCH_HAS_HOLES_MEMORYMODEL to 'y'. Maybe the
failure has something to do with it...
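
To spell out the suspicion: with holes in the memory model, a pfn inside a
zone's span can pass pfn_valid() and still map to a struct page that does not
really describe that pfn in that zone. memmap_valid_within() is the existing
check for that. Below is a rough userspace model of what it tests, written from
memory of mm/mmzone.c. The struct layouts, the flat mem_map array and the
helper bodies are simplified stand-ins I made up for illustration, not the
kernel definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (hypothetical, illustration only). */
struct zone { int id; };
struct page { struct zone *zone; };

#define NR_PAGES 8
static struct page mem_map[NR_PAGES];	/* flat memmap model, pfn == array index */

/* In this model the pfn of a page is simply its index in mem_map. */
static unsigned long page_to_pfn(const struct page *page)
{
	return (unsigned long)(page - mem_map);
}

/* In this model each page records its zone directly. */
static struct zone *page_zone(const struct page *page)
{
	return page->zone;
}

/*
 * A struct page is only trusted if it maps back to the pfn we looked up
 * and claims to belong to the zone we are walking.
 */
static bool memmap_valid_within(unsigned long pfn, const struct page *page,
				const struct zone *zone)
{
	if (page_to_pfn(page) != pfn)
		return false;
	if (page_zone(page) != zone)
		return false;
	return true;
}
```

If the shuffle picks a pfn whose struct page fails either test, swapping it on
the free list could corrupt unrelated memory, which would fit a crash that
shows up later in an unrelated driver.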

Guillaume, can you try this patch:

diff --git a/mm/shuffle.c b/mm/shuffle.c
index 3ce1248..4a04aac 100644
--- a/mm/shuffle.c
+++ b/mm/shuffle.c
@@ -58,7 +58,8 @@ module_param_call(shuffle, shuffle_store, shuffle_show, &shuffle_param, 0400);
  * For two pages to be swapped in the shuffle, they must be free (on a
  * 'free_area' lru), have the same order, and have the same migratetype.
  */
-static struct page * __meminit shuffle_valid_page(unsigned long pfn, int order)
+static struct page * __meminit shuffle_valid_page(unsigned long pfn, int order,
+                                                 struct zone *z)
 {
        struct page *page;
 
@@ -80,6 +81,9 @@ static struct page * __meminit shuffle_valid_page(unsigned long pfn, int order)
        if (!PageBuddy(page))
                return NULL;
 
+       if (!memmap_valid_within(pfn, page, z))
+               return NULL;
+
        /*
         * ...is the page on the same list as the page we will
         * shuffle it with?
@@ -123,7 +127,7 @@ void __meminit __shuffle_zone(struct zone *z)
                 * page_j randomly selected in the span @zone_start_pfn to
                 * @spanned_pages.
                 */
-               page_i = shuffle_valid_page(i, order);
+               page_i = shuffle_valid_page(i, order, z);
                if (!page_i)
                        continue;
 
@@ -137,7 +141,7 @@ void __meminit __shuffle_zone(struct zone *z)
                        j = z->zone_start_pfn +
                                ALIGN_DOWN(get_random_long() % z->spanned_pages,
                                                order_pages);
-                       page_j = shuffle_valid_page(j, order);
+                       page_j = shuffle_valid_page(j, order, z);
                        if (page_j && page_j != page_i)
                                break;
                }
 

-- 
Sincerely yours,
Mike.
