On Sun, Nov 03, 2013 at 09:26:06PM +0000, Peter Maydell wrote:
> On 3 November 2013 20:48, Marcel Apfelbaum <marce...@redhat.com> wrote:
> > The problem appears when a root memory region within an
> > address space with size < UINT64_MAX has overlapping children
> > with the same size. If the size of the root memory region is UINT64_MAX
> > everything is ok.
> >
> > Solved the regression by making the system-memory region
> > of size UINT64_MAX instead of INT64_MAX.
> >
> > Signed-off-by: Marcel Apfelbaum <marce...@redhat.com>
> > ---
> > In the mean time I am investigating why the
> > root memory region has to be UINT64_MAX size in order
> > to have overlapping children
> 
> >      system_memory = g_malloc(sizeof(*system_memory));
> > -    memory_region_init(system_memory, NULL, "system", INT64_MAX);
> > +    memory_region_init(system_memory, NULL, "system", UINT64_MAX);
> >      address_space_init(&address_space_memory, system_memory, "memory");
> 
> As you say above we should investigate why this caused a
> problem, but I was surprised the system memory space isn't
> already maximum size. It turns out that that change was
> introduced in commit 8417cebf in an attempt to avoid overflow
> issues by sticking to signed 64 bit arithmetic. This approach was
> subsequently ditched in favour of using proper 128 bit arithmetic
> in commit 08dafab4, but we never changed the init call for
> the system memory back to UINT64_MAX. So I think this is
> a good change in itself.
> 
> -- PMM

I think I debugged it.

This patch seems to help simply because the only sanity-checking
asserts we have are in the subpage path. With UINT64_MAX the region
is a whole number of full pages, so it avoids
hitting the checks.


I think I see what the issue is: exec.c
assumes that TARGET_PHYS_ADDR_SPACE_BITS bits are enough
to render any section in system memory.
The number of page table levels is calculated from that:

#define P_L2_LEVELS \
        (((TARGET_PHYS_ADDR_SPACE_BITS - TARGET_PAGE_BITS - 1) / L2_BITS) + 1)

Any address bits beyond those levels are simply ignored by the lookup:

    for (i = P_L2_LEVELS - 1; i >= 0 && !lp.is_leaf; i--) {
        if (lp.ptr == PHYS_MAP_NODE_NIL) {
            return &sections[PHYS_SECTION_UNASSIGNED];
        }
        p = nodes[lp.ptr];
        lp = p[(index >> (i * L2_BITS)) & (L2_SIZE - 1)];
    }

So masking with L2_SIZE - 1 means that each round looks at L2_BITS bits,
and there are at most P_L2_LEVELS rounds.
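
To put numbers on that, here is a small sketch of the same arithmetic.
The helper names are hypothetical (they are not in exec.c), and the
example values 52/12/10 in the comment are assumptions chosen for
illustration, not something stated in this thread:

```c
/* Hypothetical helper: same arithmetic as the P_L2_LEVELS macro,
 * parameterised so that different configurations can be compared. */
int p_l2_levels(int phys_addr_space_bits, int page_bits, int l2_bits)
{
    return ((phys_addr_space_bits - page_bits - 1) / l2_bits) + 1;
}

/* Number of low address bits the walk can tell apart: the page offset
 * plus l2_bits per level. E.g. with the assumed values 52/12/10 this is
 * 12 + 4 * 10 = 52. Anything above that never reaches the lookup. */
int walk_covered_bits(int phys_addr_space_bits, int page_bits, int l2_bits)
{
    return page_bits
           + p_l2_levels(phys_addr_space_bits, page_bits, l2_bits) * l2_bits;
}
```

Note that the covered width can be larger than the configured physical
address width when the division does not come out even, but it is fixed
at build time either way.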

Any bits above that range are silently dropped.
This is very wrong and can break in a number of other ways;
for example, I think we will also hit this assert
if we have a non-aligned 64-bit BAR of a PCI device.
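
A minimal standalone sketch of the aliasing this causes, under stated
assumptions: the constants below (52, 12, 10) are illustrative values I
picked, and walk_key() and addresses_alias() are hypothetical helpers
that only mimic the indexing expression from the loop above, not the
real node/section structures:

```c
#include <stdint.h>

/* Assumed, illustrative configuration (not taken from this thread). */
#define TARGET_PHYS_ADDR_SPACE_BITS 52
#define TARGET_PAGE_BITS 12
#define L2_BITS 10
#define L2_SIZE (1 << L2_BITS)

#define P_L2_LEVELS \
        (((TARGET_PHYS_ADDR_SPACE_BITS - TARGET_PAGE_BITS - 1) / L2_BITS) + 1)

/* Reduce an address to exactly the bits the lookup loop consumes:
 * one L2_BITS-wide slot per level, most significant level first. */
uint64_t walk_key(uint64_t addr)
{
    uint64_t index = addr >> TARGET_PAGE_BITS;
    uint64_t key = 0;

    for (int i = P_L2_LEVELS - 1; i >= 0; i--) {
        uint64_t slot = (index >> (i * L2_BITS)) & (L2_SIZE - 1);
        key = (key << L2_BITS) | slot;
    }
    return key;
}

/* Two addresses that produce the same key are indistinguishable
 * to the walk, so they would resolve to the same section. */
int addresses_alias(uint64_t a, uint64_t b)
{
    return walk_key(a) == walk_key(b);
}
```

With these assumed values, addresses_alias(0x1000, 0x1000 | (1ULL << 52))
is true: bit 52 sits above the range the levels cover, so a region placed
up there would be looked up as if it lived at the low address.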

I think the fastest solution is to just limit
the system memory size to what TARGET_PHYS_ADDR_SPACE_BITS can address.
I sent a patch like this.
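
Roughly, the idea can be sketched as below. This is not the patch
itself: it assumes the limit meant is 2^TARGET_PHYS_ADDR_SPACE_BITS
(the quantity the level count is derived from), and phys_space_size()
and region_fits() are hypothetical helpers written only for
illustration:

```c
#include <stdint.h>

/* Hypothetical helper: size in bytes of an address space that is
 * `bits` wide, saturating at UINT64_MAX for widths of 64 or more. */
uint64_t phys_space_size(unsigned bits)
{
    return bits >= 64 ? UINT64_MAX : (1ULL << bits);
}

/* Hypothetical check: does [base, base + size) lie entirely inside an
 * address space of `bits` bits? Written to avoid overflow in base + size. */
int region_fits(uint64_t base, uint64_t size, unsigned bits)
{
    uint64_t limit = phys_space_size(bits);

    return size <= limit && base <= limit - size;
}
```

With a root region capped like this, anything mapped above the limit is
clipped away before it can reach the page table walk.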



-- 
MST
