On Tue, Jun 17, 2014 at 03:51:35PM +1000, Alexey Kardashevskiy wrote:
> On 06/17/2014 06:51 AM, Eduardo Habkost wrote:
> > On Mon, Jun 16, 2014 at 06:16:29PM +1000, Alexey Kardashevskiy wrote:
> >> On 06/16/2014 05:53 PM, Alexey Kardashevskiy wrote:
> >>> c4177479 "spapr: make sure RMA is in first mode of first memory node"
> >>> introduced regression which prevents from running guests with memoryless
> >>> NUMA node#0 which may happen on real POWER8 boxes and which would make
> >>> sense to debug in QEMU.
> >>>
> >>> This patchset aim is to fix that and also fix various code problems in
> >>> memory nodes generation.
> >>>
> >>> These 2 patches could be merged (the resulting patch looks rather ugly):
> >>> spapr: Use DT memory node rendering helper for other nodes
> >>> spapr: Move DT memory node rendering to a helper
> >>>
> >>> Please comment. Thanks!
> >>>
> >>
> >> Sure I forgot to add an example of what I am trying to run without errors
> >> and warnings:
> >>
> >> /home/aik/qemu-system-ppc64 \
> >>     -enable-kvm \
> >>     -machine pseries \
> >>     -nographic \
> >>     -vga none \
> >>     -drive id=id0,if=none,file=virtimg/fc20_24GB.qcow2,format=qcow2 \
> >>     -device scsi-disk,id=id1,drive=id0 \
> >>     -m 2080 \
> >>     -smp 8 \
> >>     -numa node,nodeid=0,cpus=0-7,memory=0 \
> >>     -numa node,nodeid=2,cpus=0-3,mem=1040 \
> >>     -numa node,nodeid=4,cpus=4-7,mem=1040
> >
> > (Note: I will ignore the "cpus" argument for the discussion below.)
>
> The example is quite bad, I should not have used same CPUs in 2 nodes.
> SPAPR allows this but QEMU does not really support this and I am not
> touching this now.
>
>
> >
> > I understand now that the non-contiguous node IDs are guest-visible.
> >
> > But I still would like to understand the motivations for your use case,
> > to understand which solution makes more sense.
>
> One of examples is a 2 CPUs on one die, one of CPUs is connected to memory
> bus, the other is not, instead it is connected to the first CPU (via super
> fast bus) and the first CPU acts as a bridge.
>
> >
> > If you really want 5 nodes, you just need to write this:
> >   -numa node,nodeid=0,cpus=0-7,memory=0 \
> >   -numa node,nodeid=1 \
> >   -numa node,nodeid=2,cpus=0-3,mem=1040 \
> >   -numa node,nodeid=3 \
> >   -numa node,nodeid=4,cpus=4-7,mem=1040
> >
> > If you just want 3 nodes, you can just write this:
> >   -numa node,nodeid=0,cpus=0-7,memory=0 \
> >   -numa node,nodeid=1,cpus=0-3,mem=1040 \
> >   -numa node,nodeid=2,cpus=4-7,mem=1040
> >
> > But you seem to claim you need 3 nodes with non-contiguous IDs. In that
> > case, which exactly is the guest-visible difference you expect to get
> > between:
> >   -numa node,nodeid=0,cpus=0-7,memory=0 \
> >   -numa node,nodeid=1 \
> >   -numa node,nodeid=2,cpus=0-3,mem=1040 \
> >   -numa node,nodeid=3 \
> >   -numa node,nodeid=4,cpus=4-7,mem=1040
> > and
> >   -numa node,nodeid=0,cpus=0-7,memory=0 \
> >   -numa node,nodeid=2,cpus=0-3,mem=1040 \
> >   -numa node,nodeid=4,cpus=4-7,mem=1040
> > ?
> >
> > Because your patch is making both be exactly the same, and I guess you
> > don't want that (otherwise you could simply use the 5-node command-line
> > above and we wouldn't need patch 7/7).
>
> If it is canonical and kosher way of using NUMA in QEMU, ok, we can use it.
> I just fail to see why we need a requirement for nodes to go consequently
> here. And it confuses me as a user a bit if I can add "-numa
> node,nodeid=22" (no memory, no cpus) but do not get to see it in the guest.

I agree with you it is confusing. But before we support that use case, we
need to make sure auto-allocation is handled properly, because it would be
hard to fix it later without breaking compatibility.

We probably just need a "present" field on struct NodeInfo, so
machine-specific code and auto-allocation code can differentiate nodes
that are not present on the command-line from empty nodes that were
specified in the command-line.

In the meantime, people can use the 5-node example above as a workaround.

>
> btw how is it supposed to work with memory hotplug? Current "-numa" does
> not support gaps in memory and I would expect that we will need it. Any
> plans here?

The DIMM device used for memory hotplug has a "node" property, for the
NUMA node ID.

-- 
Eduardo
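
To make the "present" idea concrete, here is a minimal sketch. It is not
QEMU's actual code: the struct layout, the MAX_NODES value, and the helper
names (numa_node_add, numa_present_nodes, numa_node_is_empty) are all
assumptions made up for illustration. It only shows how a flag set while
parsing "-numa node,nodeid=N" would let later code tell an explicitly
specified empty node apart from an ID that was never mentioned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 8 /* hypothetical limit, chosen only for this sketch */

/* Hypothetical stand-in for struct NodeInfo; only the fields used here. */
typedef struct NodeInfo {
    uint64_t node_mem; /* memory assigned to the node, in MiB */
    bool present;      /* true if the node was given on the command line */
} NodeInfo;

static NodeInfo numa_info[MAX_NODES];

/* Record a node seen on the command line; mem_mb may be 0 (empty node). */
static void numa_node_add(int nodeid, uint64_t mem_mb)
{
    assert(nodeid >= 0 && nodeid < MAX_NODES);
    numa_info[nodeid].node_mem = mem_mb;
    numa_info[nodeid].present = true;
}

/* Count the nodes that were explicitly specified, gaps excluded. */
static int numa_present_nodes(void)
{
    int i, n = 0;

    for (i = 0; i < MAX_NODES; i++) {
        if (numa_info[i].present) {
            n++;
        }
    }
    return n;
}

/* An empty node is one that was specified but has no memory assigned. */
static bool numa_node_is_empty(int nodeid)
{
    assert(nodeid >= 0 && nodeid < MAX_NODES);
    return numa_info[nodeid].present && numa_info[nodeid].node_mem == 0;
}
```

With nodes 0, 2 and 4 added (node 0 with no memory), numa_present_nodes()
would report 3, node 0 would count as an empty-but-present node, and IDs 1
and 3 would stay absent, so auto-allocation and machine-specific code could
skip them.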