On Fri, 2011-04-15 at 12:17 -0500, Anthony Liguori wrote:
> On 04/15/2011 11:36 AM, Dave Hansen wrote:
> >> Why would we do this in QEMU? This prevents the guest from doing
> >> ballooning reclaim during OOM.
> > What the heck is "ballooning reclaim"? Could you elaborate a bit on how
> > this happens? I think I'm missing some subtlety here.
>
> If you're in OOM and you need memory, you can't ask the host for more
> and wait for a response. You have to reclaim it immediately.

Why not? The call in to the notifier chain in the s390 case is
synchronous. The OOM only affects one task at a time and won't proceed
elsewhere while this is going on.

> See the s390 balloon driver for an example of this.

I could buy that in an OOM situation, but that's not what virtio-balloon
is doing.

> > I think 'tell host first' is the only sane way to do it. Look at the
> > 'tell host second code':
> >
> >     release_pages_by_pfn(); // let other kernel users at the pages
> >     tell_host(); // tell the hypervisor they're used again
> >
> > At the point we've started using the pages again, we haven't *told* the
> > host that we're using them. I think that's potentially a problem. Is
> > qemu somehow cool with the guest touching pages that are supposed to be
> > in the balloon and unusable?
>
> Yes, of course it is. All ballooning does is madvise(MADV_DONTNEED). A
> guest can reclaim the memory as soon as it wants it to.

That's not true on all hypervisors, but as long as it's a guarantee in
qemu, I guess we're OK here. Let's hope nobody else ever starts to use
virtio-balloon.

Why do we even _tell_ qemu, though? The MADV_WILLNEED is nice, but far
from being necessary. We could just skip the entire notification in OOM
situations.

-- Dave
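
For illustration only, here is a minimal userspace sketch of the madvise()
behavior discussed above. It assumes Linux and uses a private anonymous
mapping as a stand-in for guest RAM. The function name dontneed_roundtrip()
is hypothetical; this is not qemu or kernel code, just a way to see that a
range dropped with MADV_DONTNEED can be touched again without telling anyone
first.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not from qemu or the kernel): drop a page the
 * way the thread says the host side of ballooning does, then touch it again
 * the way a guest would when reclaiming it with no notification.
 *
 * Returns 0 if the page was usable again after MADV_DONTNEED, -1 otherwise.
 */
int dontneed_roundtrip(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    /* Private anonymous memory stands in for guest RAM. */
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* The "guest" is using the page. */
    memset(p, 0xaa, len);

    /* "Inflate": the backing for this range is dropped. */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /*
     * The "guest" touches the page again without any notification.
     * On Linux, a private anonymous range reads back zero-filled
     * after MADV_DONTNEED, and writes simply fault in a fresh page.
     */
    int ok = (p[0] == 0 && p[len - 1] == 0);
    p[0] = 0x55;
    ok = ok && (p[0] == 0x55);

    /*
     * The optional hint from the thread. It is advisory only, so the
     * return value is deliberately ignored here.
     */
    (void)madvise(p, len, MADV_WILLNEED);

    munmap(p, len);
    return ok ? 0 : -1;
}
```

Calling dontneed_roundtrip() from a small main() and checking for 0 is
enough to try it. Note that this only shows the Linux madvise() semantics
the thread relies on; it says nothing about hypervisors that back guest
memory differently.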