Hi Drew,

On 6/3/21 2:48 PM, Gavin Shan wrote:
On 6/2/21 9:36 PM, Andrew Jones wrote:
On Wed, Jun 02, 2021 at 11:09:32AM +1000, Gavin Shan wrote:
On 6/1/21 5:50 PM, Andrew Jones wrote:
On Tue, Jun 01, 2021 at 03:30:04PM +0800, Gavin Shan wrote:
We possibly populate empty nodes where memory isn't included and might
be hot added at late time. The FDT memory nodes can't be created due
to conflicts on their names if multiple empty nodes are specified.
For example, the VM fails to start with the following error messages.

    /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64          \
    -accel kvm -machine virt,gic-version=host                        \
    -cpu host -smp 4,sockets=2,cores=2,threads=1 -m 1024M,maxmem=64G \
    -object memory-backend-ram,id=mem0,size=512M                     \
    -object memory-backend-ram,id=mem1,size=512M                     \
    -numa node,nodeid=0,cpus=0-1,memdev=mem0                         \
    -numa node,nodeid=1,cpus=2-3,memdev=mem1                         \
    -numa node,nodeid=2                                              \
    -numa node,nodeid=3                                              \
      :
    -device virtio-balloon-pci,id=balloon0,free-page-reporting=yes

    qemu-system-aarch64: FDT: Failed to create subnode /memory@80000000: \
                         FDT_ERR_EXISTS

This fixes the issue by using NUMA node ID or zero in the memory node
name to avoid the conflicting memory node names. With this applied, the
VM can boot successfully with above command lines.

Signed-off-by: Gavin Shan <gs...@redhat.com>
---
   hw/arm/boot.c | 7 ++++++-
   1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index d7b059225e..3169bdf595 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -432,7 +432,12 @@ static int fdt_add_memory_node(void *fdt, uint32_t acells, 
hwaddr mem_base,
       char *nodename;
       int ret;
-    nodename = g_strdup_printf("/memory@%" PRIx64, mem_base);
+    if (numa_node_id >= 0) {
+        nodename = g_strdup_printf("/memory@%d", numa_node_id);
+    } else {
+        nodename = g_strdup("/memory@0");
+    }
+
       qemu_fdt_add_subnode(fdt, nodename);
       qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
       ret = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", acells, 
mem_base,

[...]


I've sent one separate mail to check with Rob Herring. Hopefully he have
ideas as he is maintaining linux FDT subsystem. You have been included to
that thread. I didn't find something meaningful to this thread after doing
some google search either.

Yes, I agree with you we need to follow the specification strictly. It seems
it's uncertain about the 'physical memory map' bus binding requirements.


I didn't get expected answers from device-tree experts. After rethinking about 
it,
I plan to fix this like this way, but please let me know if it sounds sensible
to you.

The idea is to assign a (not overlapped) dummy base address to each memory
node in the device-tree. The dummy is (last_valid_memory_address + NUMA ID).
The 'length' of the 'reg' property in the device-tree nodes, corresponding
to empty NUMA nodes, is still zero. This ensures the nodes are still invalid
until memory is added to these nodes.

I had the temporary patch for the implementation. It works fine and VM can
boot up successfully.

Thanks,
Gavin


Reply via email to