On Mon, Jun 08, 2015 at 11:51:03AM +0200, Igor Mammedov wrote:
> On Mon, 8 Jun 2015 11:28:18 +0530
> Bharata B Rao <bhar...@linux.vnet.ibm.com> wrote:
>
> > On Mon, May 25, 2015 at 02:42:40PM -0300, Eduardo Habkost wrote:
> > > On Mon, May 25, 2015 at 01:17:57PM +0530, Bharata B Rao wrote:
> > > > On Thu, May 14, 2015 at 11:39:06AM +0200, Paolo Bonzini wrote:
> > > > > On 13/05/2015 20:06, Eduardo Habkost wrote:
> > > > > > Also, this introduces a circular dependency between pc-dimm.c and
> > > > > > numa.c. Instead of that, pc-dimm could simply notify us when a new
> > > > > > device is realized (with just (addr, end, node) as arguments), so
> > > > > > we can save the list of memory ranges inside struct node_info.
> > > > > >
> > > > > > I wonder if the memory API already provides something that would
> > > > > > help us. Paolo, do you see a way we could simply use a MemoryRegion
> > > > > > as input to lookup the NUMA node?
> > > > >
> > > > > No, but I guess you could add a numa_get/set_memory_region_node_id API
> > > > > that uses a hash table. That's a variant of the "pc-dimm could simply
> > > > > notify" numa.c that you propose above.
> > > >
> > > > While you say we can't use MemoryRegion as input to lookup the NUMA
> > > > node, you suggest that we add numa_get/set_memory_region_node_id. Does
> > > > this API get/set NUMA node id for the given MemoryRegion ?
> > >
> > > I was going to suggest that, but it would require changing the
> > > non-memdev code path to create a MemoryRegion for each node, too. So
> > > having a numa_set_mem_node_id(start_addr, end_addr, node_id) API would
> > > be simpler.
> >
> > In order to save the list of memory ranges inside node_info, I tried this
> > approach where I call
> > numa_set_mem_node_id(dimm.addr, dimm.size, dimm.node) from
> > pc_dimm_realize(), but the value of dimm.addr is finalized only later
> > in ->plug().
> >
> > So we would have to call this API from arch code like pc_dimm_plug().
> > Is that acceptable ?

It looks acceptable to me, as pc.c already has all the rest of the
NUMA-specific code for PC. I believe it would be interesting to keep all
numa.o dependencies contained inside machine code.

> Could you query pc_dimms' numa property each time you need mapping
> instead of additionally storing that mapping elsewhere?

The original patch did that, but I suggested the numa_set_mem_node_id()
API for two reasons: 1) not requiring special cases for hotplug inside
numa_get_node(); 2) not introducing a circular dependency between
pc-dimm.c and numa.c.

Having a numa_set_memory_region_node_id(MemoryRegion *mr, int node) API
would probably be better, and make the discussion about pc_dimm.addr
moot. But it would require changing
memory_region_allocate_system_memory() to avoid
allocate_system_memory_nonnuma() even in the !have_memdevs case.

-- 
Eduardo
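
To make the range-based proposal concrete, here is a rough standalone
sketch of what a numa_set_mem_node_id(start_addr, end_addr, node_id)
style API could look like. It is not QEMU code: the sketch_ prefix, the
MemRange struct, the SKETCH_MAX_NODES limit and the half-open
[start, end) convention are all assumptions made for illustration, and
a real version would keep the list inside struct node_info and use
QEMU's own types and list helpers.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical limit for this sketch; not QEMU's own constant. */
#define SKETCH_MAX_NODES 128

/* One guest-physical range [start, end) assigned to a NUMA node. */
typedef struct MemRange {
    uint64_t start;
    uint64_t end;
    struct MemRange *next;
} MemRange;

/* Per-node singly linked list of ranges (stand-in for struct node_info). */
static MemRange *node_ranges[SKETCH_MAX_NODES];

/*
 * Record that [start, end) belongs to node_id.
 * Returns 0 on success, -1 on an invalid argument or allocation failure.
 */
int sketch_numa_set_mem_node_id(uint64_t start, uint64_t end,
                                unsigned node_id)
{
    MemRange *r;

    if (node_id >= SKETCH_MAX_NODES || start >= end) {
        return -1;
    }
    r = malloc(sizeof(*r));
    if (!r) {
        return -1;
    }
    r->start = start;
    r->end = end;
    r->next = node_ranges[node_id];
    node_ranges[node_id] = r;
    return 0;
}

/* Return the node that owns addr, or -1 if no recorded range covers it. */
int sketch_numa_get_node(uint64_t addr)
{
    for (unsigned n = 0; n < SKETCH_MAX_NODES; n++) {
        for (MemRange *r = node_ranges[n]; r; r = r->next) {
            if (addr >= r->start && addr < r->end) {
                return (int)n;
            }
        }
    }
    return -1;
}
```

With something of this shape, the machine's plug handler would make one
call such as sketch_numa_set_mem_node_id(addr, addr + size, node) once
the DIMM address is final, and the lookup would need no special case
for hotplugged memory.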