Re: [libvirt] [PATCH] qemu: drop driver lock around sleep

2011-11-07 Thread Michal Privoznik
On 05.11.2011 02:49, Eric Blake wrote:
 qemu/THREADS.txt is explicit that the driver lock must not be
 held for long lengths of time, as it blocks all attempts to
 manage any other vms on the same connection.  We were violating
 this by sleep()ing while waiting for a qemu child process to
 finish execution during actions such as destroy.
 
 * src/qemu/qemu_process.h (qemuProcessKill): Alter signature.
 * src/qemu/qemu_process.c (qemuProcessKill): Add parameter.
 (qemuProcessFakeReboot, qemuProcessShutdownOrReboot)
 (qemuProcessStop): Update callers.
 * src/qemu/qemu_driver.c (qemuDomainDestroyFlags): Likewise.
 ---
 
 Without this patch, problems in halting one domain could lock
 out actions on all other domains for more than 3 seconds, which
 is awfully long.
 
 This doesn't solve all the problems - it is still possible to
 have a stuck NFS server being the reason for difficulties in
 stopping a domain, such as an lstat() call while attempting to
 relabel file systems, and those calls are still done while
 the driver lock is held; but I'll be submitting further patches
 as I try and reduce the critical section sizes.
 
 I'm not sure whether this qualifies for 0.9.7 or should wait
 for post-release.

I'd feel more comfortable to take this in after release. Hopefully,
0.9.7 is going to be stable release (at least by features/bug fixes
ratio) even if this is a small and good looking patch.
 
  src/qemu/qemu_driver.c  |2 +-
  src/qemu/qemu_process.c |   35 ---
  src/qemu/qemu_process.h |3 ++-
  3 files changed, 31 insertions(+), 9 deletions(-)
 

ACK

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] qemu: drop driver lock around sleep

2011-11-07 Thread Daniel P. Berrange
On Fri, Nov 04, 2011 at 07:49:51PM -0600, Eric Blake wrote:
 qemu/THREADS.txt is explicit that the driver lock must not be
 held for long lengths of time, as it blocks all attempts to
 manage any other vms on the same connection.  We were violating
 this by sleep()ing while waiting for a qemu child process to
 finish execution during actions such as destroy.
 
 * src/qemu/qemu_process.h (qemuProcessKill): Alter signature.
 * src/qemu/qemu_process.c (qemuProcessKill): Add parameter.
 (qemuProcessFakeReboot, qemuProcessShutdownOrReboot)
 (qemuProcessStop): Update callers.
 * src/qemu/qemu_driver.c (qemuDomainDestroyFlags): Likewise.
 ---
 
 Without this patch, problems in halting one domain could lock
 out actions on all other domains for more than 3 seconds, which
 is awfully long.
 
 This doesn't solve all the problems - it is still possible to
 have a stuck NFS server being the reason for difficulties in
 stopping a domain, such as an lstat() call while attempting to
 relabel file systems, and those calls are still done while
 the driver lock is held; but I'll be submitting further patches
 as I try and reduce the critical section sizes.
 
 I'm not sure whether this qualifies for 0.9.7 or should wait
 for post-release.
 
  src/qemu/qemu_driver.c  |2 +-
  src/qemu/qemu_process.c |   35 ---
  src/qemu/qemu_process.h |3 ++-
  3 files changed, 31 insertions(+), 9 deletions(-)

Post-0.9.7 IMHO, since it doesn't really fix any bug - its just an
optimization.


I think we need to take a bit of a broader look at the way we
are dealing with the driver lock. Originally we intended that
the driver lock would very rarely be held - only when resolving
a virDomainPtr into a virDomainObjPtr, and when adding/removing
virDomainObjPtr instances to/from the domain list. These days
though we seem to be holding it more  more often due to the
growth in the number of stateful objects in qemu_driver. This
is reducing the effectiveness of our per-VM locks, leading to
quite significant serialization, except when we're asleep on
QEMU monitor calls.

While adding more  more places where we lock and unlock the
driver will reduce prolonged lock outs due to sleeping /
slow code, it is not going to really improve our true level
of concurrency.

We need quite a radical change in driver level locking IMHO.
Take a look at the driver struct members, which is what we
are protecting here really:

struct qemud_driver {
virMutex lock;

virThreadPoolPtr workerPool;

int privileged;

uid_t user;
gid_t group;
int dynamicOwnership;

unsigned int qemuVersion;
int nextvmid;

virCgroupPtr cgroup;
int cgroupControllers;
char **cgroupDeviceACL;

virDomainObjList domains;

brControl *brctl;
/* These four directories are ones libvirtd uses (so must be root:root
 * to avoid security risk from QEMU processes */
char *configDir;
char *autostartDir;
char *logDir;
char *stateDir;
/* These two directories are ones QEMU processes use (so must match
 * the QEMU user/group */
char *libDir;
char *cacheDir;
char *saveDir;
char *snapshotDir;
char *qemuImgBinary;
unsigned int vncAutoUnixSocket : 1;
unsigned int vncTLS : 1;
unsigned int vncTLSx509verify : 1;
unsigned int vncSASL : 1;
char *vncTLSx509certdir;
char *vncListen;
char *vncPassword;
char *vncSASLdir;
unsigned int spiceTLS : 1;
char *spiceTLSx509certdir;
char *spiceListen;
char *spicePassword;
char *hugetlbfs_mount;
char *hugepage_path;

unsigned int macFilter : 1;
ebtablesContext *ebtables;
unsigned int relaxedACS : 1;
unsigned int vncAllowHostAudio : 1;
unsigned int clearEmulatorCapabilities : 1;
unsigned int allowDiskFormatProbing : 1;
unsigned int setProcessName : 1;

int maxProcesses;

int max_queued;

virCapsPtr caps;

virDomainEventStatePtr domainEventState;

char *securityDriverName;
virSecurityManagerPtr securityManager;

char *saveImageFormat;
char *dumpImageFormat;

char *autoDumpPath;
bool autoDumpBypassCache;

bool autoStartBypassCache;

pciDeviceList *activePciHostdevs;

virBitmapPtr reservedVNCPorts;

virSysinfoDefPtr hostsysinfo;

virLockManagerPluginPtr lockManager;

/* Mapping of 'char *uuidstr' - virConnectPtr
 * of guests which will be automatically killed
 * when the virConnectPtr is closed*/
virHashTablePtr autodestroy;
};


Looking at those you can say

 - Some are readonly and can never change for the lifetime of libvirtd
   eg, privileged, configDir, autostartDir, stateDir, etc. We should
   not require any locking to access these.

 - Some are loaded from the config file, and while currently readonly,
   could in theory be be changed. eg vnc/spice parameters, save
   image format, etc. We could move these into a 'qemu_driver_config'
   struct and use userspace RCU to access the config