On Tue, Oct 18, 2016 at 10:43:31PM +0200, Martin Kletzander wrote:
> On Mon, Oct 17, 2016 at 03:45:09PM +1100, Sam Bobroff wrote:
> >On Fri, Oct 14, 2016 at 10:19:42AM +0200, Martin Kletzander wrote:
> >>On Fri, Oct 14, 2016 at 11:52:22AM +1100, Sam Bobroff wrote:
> >>>I did look at the libnuma and cgroups approaches, but I was concerned they
> >>>wouldn't work in this case, because of the way QEMU allocates memory when
> >>>mem-prealloc is used: the memory is allocated in the main process, before
> >>>the CPU threads are created. (This is based only on a bit of hacking and
> >>>debugging in QEMU, but it does seem explain the behaviour I've seen so
> >>>far.)
> >>>
> >>
> >>But we use numactl before QEMU is exec()'d.
> >
> >Sorry, I jumped ahead a bit. I'll try to explain what I mean:
> >
> >I think the problem with using this method would be that the NUMA policy is
> >applied to all allocations by QEMU, not just ones related to the memory
> >backing. I'm not sure if that would cause a serious problem but it seems
> >untidy, and it doesn't happen in other situations (i.e. with separate memory
> >backend objects, QEMU sets up the policy specifically for each one and other
> >allocations aren't affected, AFAIK). Presumably, if memory were very
> >restricted it could prevent the guest from starting.
> >
>
> Yes, it is, that's what <numatune><memory/> does if you don't have any
> other (<memnode/>) specifics set.
>
> >>>I think QEMU could be altered to move the preallocations into the VCPU
> >>>threads but it didn't seem trivial and I suspected the QEMU community would
> >>>point out that there was already a way to do it using backend objects.
> >>>Another option would be to add a -host-nodes parameter to QEMU so that the
> >>>policy can be given without adding a memory backend object. (That seems
> >>>like a more reasonable change to QEMU.)
> >>>
> >>
> >>I think upstream won't like that, mostly because there is already a
> >>way. And that is using memory-backend object. I think we could just
> >>use that and disable changing it live. But upstream will probably want
> >>that to be configurable or something.
> >
> >Right, but isn't this already an issue in the cases where libvirt is already
> >using memory backend objects and NUMA policy? (Or does libvirt already
> >disable changing it live in those situations?)
> >
>
> It is. I'm not trying to say libvirt is perfect. There are bugs,
> e.g. like this one. The problem is that we tried to do *everything*,
> but it's not currently possible. I'm trying to explain how stuff works
> now. It definitely needs some fixing, though.

OK :-)

Well, given our discussion, do you think it's worth a v2 of my original patch
or would it be better to drop it in favour of some broader change?

Cheers,
Sam.