On 16/01/2021 02:38, Nathan Lynch wrote:
Alexey Kardashevskiy <a...@ozlabs.ru> writes:
On 15/01/2021 09:00, Nathan Lynch wrote:
Memory locations passed as arguments from the OS to RTAS usually need
to be addressable in 32-bit mode and must reside in the Real Mode
Area. On PAPR guests, the RMA starts at logical address 0 and is the
first logical memory block reported in the LPAR’s device tree.

On powerpc targets with RTAS, Linux makes available to user space a
region of memory suitable for arguments to be passed to RTAS via
sys_rtas(). This region (rtas_rmo_buf) is allocated via the memblock
API during boot in order to ensure that it satisfies the requirements
described above.

With radix MMU, the upper limit supplied to the memblock allocation
can exceed the bounds of the first logical memory block, since
ppc64_rma_size is ULONG_MAX and RTAS_INSTANTIATE_MAX is 1GB. (512MB is
a common size of the first memory block according to a small sample of
LPARs I have checked.) This leads to failures when user space invokes
an RTAS function that uses a work area, such as
ibm,configure-connector.

Alter the determination of the upper limit for rtas_rmo_buf's
allocation to consult the device tree directly, ensuring placement
within the RMA regardless of the MMU in use.

Can we tie this with RTAS (which also needs to be in RMA) and simply add
extra 64K in prom_instantiate_rtas() and advertise this address
(ALIGH_UP(rtas-base + rtas-size, PAGE_SIZE)) to the user space? We do
not need this RMO area before that point.

Can you explain more about what advantage that would bring? I'm not
seeing it. It's a more significant change than what I've written
here.


We already allocate space for RTAS and (like RMO) it needs to be in RMA, and RMO is useless without RTAS. We can reuse RTAS allocation code for RMO like this:

===
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index e9d4eb6144e1..d9527d3e01d2 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -1821,7 +1821,8 @@ static void __init prom_instantiate_rtas(void)
        if (size == 0)
                return;

-       base = alloc_down(size, PAGE_SIZE, 0);
+       /* One page for RTAS, one for RMO */
+       base = alloc_down(size, PAGE_SIZE + PAGE_SIZE, 0);
        if (base == 0)
                prom_panic("Could not allocate memory for RTAS\n");

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index d126d71ea5bd..885d95cf4ed3 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1186,6 +1186,7 @@ void __init rtas_initialize(void)
        rtas.size = size;
no_entry = of_property_read_u32(rtas.dev, "linux,rtas-entry", &entry);
        rtas.entry = no_entry ? rtas.base : entry;
+       rtas_rmo_buf = rtas.base + PAGE_SIZE;

        /* If RTAS was found, allocate the RMO buffer for it and look for
         * the stop-self token if any
@@ -1196,11 +1197,6 @@ void __init rtas_initialize(void)
                ibm_suspend_me_token = rtas_token("ibm,suspend-me");
        }
 #endif
-       rtas_rmo_buf = memblock_phys_alloc_range(RTAS_RMOBUF_MAX, PAGE_SIZE,
-                                                0, rtas_region);
-       if (!rtas_rmo_buf)
- panic("ERROR: RTAS: Failed to allocate %lx bytes below %pa\n",
-                     PAGE_SIZE, &rtas_region);
===


May be store in the FDT as "linux,rmo-base" next to "linux,rtas-base", for clarity, as sharing symbols between prom and main kernel is a bit tricky.

The benefit is that we do not do the same thing (== find 64K in RMA) in 2 different ways and if the RMO allocated my way is broken - we'll know it much sooner as RTAS itself will break too.


Would it interact well with kexec?


Good point. For this, the easiest will be setting rtas-size in the FDT to the allocated RTAS space (PAGE_SIZE*2 with the hunk above applied). Probably.


And probably do the same with per-cpu RTAS argument structures mentioned
in the cover letter?

I don't think so, since those need to be allocated with the pacas and
limited to the maximum possible CPUs, which is discovered by the kernel
much later.


The first cell of /proc/device-tree/cpus/ibm,drc-indexes is the number of cores, it is there when RTAS is instantiated, we know SMT after "ibm,client-architecture-support" (if I remember correctly).


But maybe I misunderstand what you're suggesting.


Usually it is me missing the bigger picture :)


--
Alexey

Reply via email to