Re: [Qemu-devel] [PATCH] Permit zero-sized qemu_malloc() & friends

Jamie Lokier Sun, 06 Dec 2009 18:53:00 -0800

Ian Molton wrote:
> Avi Kivity wrote:
> 
> > Init is pretty easy to handle.  I'm worried about runtime where you
> > can't report an error to the guest.  Real hardware doesn't oom.
> 
> In the case of the socket reconnect code I posted recently, if the
> allocation failed, it would give up trying to reconnect and inform the
> user of that chardev that it had closed. OK, this doesn't help the guest,
> but it allows other code to clean up nicely, and we can report the
> failure to the host. IMHO that's better than leaving a sysadmin
> scratching their head wondering why it suddenly just stopped feeding the
> guest entropy and isn't trying to reconnect anymore...


If the system as a whole runs out of memory so that no-overcommit
malloc() fails on a small alloc, there's a good chance that you won't
be able to send a message to the host (how do you format the QMP
message without malloc?), and if you do manage that, there's a good
chance the host won't be able to receive it (it can't malloc either),
and if it does manage to receive the message, you can be almost
certain that it won't be able to run any GUI operations, send mail,
etc. to inform the admin.

The chances of the path "qemu small alloc -> chardev error -> send QMP
message -> receive QMP message -> parse QMP message -> do something
useful (log/email/UI)" having fully preallocated buffers for every
step, including a preallocated emergency pool for the buffers used by
QMP formatting and parsing, so that it gets all the way past the last
step, are very slim indeed.
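
To make "preallocated buffers" concrete, here is a rough sketch of what
just the first step (formatting the message without malloc) might look
like.  Everything in it is an assumption for illustration: the event
name CHARDEV_CLOSED, the "reason" field, the buffer size and the
function name are made up, not real QMP or qemu API.  It also assumes
snprintf() itself does not allocate, which the C standard does not
promise, and it does not escape the id string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: one static buffer reserved up front for the OOM report,
 * so formatting never has to call malloc(). */
#define EMERGENCY_BUF_SIZE 256
static char emergency_buf[EMERGENCY_BUF_SIZE];

/*
 * Format a (made-up) "chardev closed" event into the preallocated buffer.
 * Returns the message length, or -1 if the message would not fit.
 * The id is copied as-is; a real implementation would need JSON escaping.
 */
static int format_oom_event(const char *chardev_id)
{
    int n;

    assert(chardev_id != NULL);
    n = snprintf(emergency_buf, sizeof(emergency_buf),
                 "{\"event\": \"CHARDEV_CLOSED\", "
                 "\"data\": {\"id\": \"%s\", \"reason\": \"oom\"}}",
                 chardev_id);
    if (n < 0 || (size_t)n >= sizeof(emergency_buf)) {
        return -1;   /* truncated or failed: nothing useful to send */
    }
    return n;
}
```

And that only covers one link in the chain.  The send path, the
receiver, its parser and whatever logs or mails the result would each
need the same treatment.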

There's no point writing the code for the first steps if it's
intractable to make the later steps do something useful.

Btw, as an admin I would really rather the socket reconnection code
kept trying in that circumstance, if qemu does not simply fall over
due to alloc failing for something else soon after.  The most likely
scenario, IMHO, on a server like that is that I notice it is running
out of memory, kill the real cause (e.g. another runaway process), then
restart all daemons which have died.  I'm not going to notice a
non-fatal message (in the unlikely event it is propagated all the way
up) because there are plenty of other non-fatal messages in normal
use, multiplied by hundreds of guests (across a cluster).  Or, if you
mean the chardev closing causes qemu to terminate - what's the
difference from the current qemu_malloc() behaviour?

I'd rather it behaves like a broken HWRNG if it can't get host
entropy: Don't provide data, and let the guest decide what to do, just
like it does for a broken HWRNG.  Except virtio-rng can report
unavailability rather than simply being broken :-)
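
Roughly what I have in mind, as a sketch only.  The RngSource struct,
the read_host hook and rng_serve() are hypothetical names, not the
actual virtio-rng or chardev code, and flaky_backend() exists only to
stand in for a host source that fails for a while and then recovers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host entropy backend: returns bytes read, or -1 on failure. */
typedef long (*host_read_fn)(void *buf, size_t len);

typedef struct {
    host_read_fn read_host;   /* assumed backend hook, not a real qemu API */
    unsigned long failures;   /* counted for later diagnosis, never fatal */
} RngSource;

/*
 * Serve one guest request.  On backend failure, report "0 bytes available"
 * and leave the source usable, so the next request simply tries again.
 */
static size_t rng_serve(RngSource *s, void *buf, size_t len)
{
    long n = s->read_host(buf, len);

    if (n <= 0) {
        s->failures++;
        return 0;             /* guest sees no data, like a broken HWRNG */
    }
    assert((size_t)n <= len);
    return (size_t)n;
}

/* Stand-in backend for illustration: fails twice, then recovers. */
static int flaky_calls;

static long flaky_backend(void *buf, size_t len)
{
    if (flaky_calls++ < 2) {
        return -1;
    }
    memset(buf, 0xA5, len);   /* placeholder bytes, obviously not entropy */
    return (long)len;
}
```

The point of the design is that a failure is neither fatal nor sticky:
the guest gets nothing for that request and decides for itself what to
do, and the device starts working again as soon as the host does.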

-- Jamie
