virt-install has some code which waits for a domain to appear just after it has been created. The loop is attached at the end of this email; it works, but has two problems.

Problem (1) is that self.conn.lookupByName doesn't distinguish between a "Not found" condition and an actual error. For example, there is no way to tell the difference between being unable to contact xend (an actual error) and being able to contact xend but xend being unable to find the domain (not found).

As shown here:

  >>> import libvirt
  >>> conn = libvirt.open ("xen+tls:///")
  >>> d = conn.lookupByName ("Domain-0")
  >>> d = conn.lookupByName ("doesnotexist")
  [...]
  libvirt.libvirtError: virDomainLookupByName() failed

Then I deliberately kill the remote daemon:

  >>> d = conn.lookupByName ("doesnotexist")
  libvir: Remote error : Error in the push function.
  [...]

The first exception is a Not found condition (not an error) whereas the second is an error.

Problem (2) is that virterror is overanxious to print error messages to stderr, even if the caller can handle them and even if (as in the Not found case) they don't indicate errors. In practical terms, this means that the virt-install loop attached below may print one or two error messages even when it is functioning normally. You'll see an error like this [sic]:

  libvir: Xen Daemon error : GET operation failed:

Since it's difficult to change the LookupBy* functions without changing the ABI, I suspect that the best thing to do is going to be to add a new call with better semantics. Therefore I suggest:

  virDomainPtr
    virDomainLookup (virConnectPtr conn, int flags,
                     int id, const char *str, int *error);

  where flags is one of:
    VIR_LOOKUP_BY_ID, VIR_LOOKUP_BY_NAME, VIR_LOOKUP_BY_UUID
    or VIR_LOOKUP_BY_UUID_STRING

The return values are:
  ret = domain, *error = 0 => found it
  ret = NULL, *error = 0 => not found
  ret = NULL, *error = 1 => error (check virterror)
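
To make the three outcomes concrete, here is a rough Python model of the contract. It is only a sketch, not libvirt code: FakeConn, ConnectionDown, lookup and the numeric flag values are all hypothetical names invented for illustration, and the (domain, error) tuple stands in for the C return value plus the *error out-parameter.

```python
# Hypothetical model of the proposed virDomainLookup contract.
# None of these names exist in libvirt; they only illustrate the idea.

VIR_LOOKUP_BY_ID = 1      # flag values are made up for this sketch
VIR_LOOKUP_BY_NAME = 2


class ConnectionDown(Exception):
    """Stand-in for a transport failure, e.g. the daemon is unreachable."""


class FakeConn:
    """Pretend connection holding a name -> id table of domains."""

    def __init__(self, domains, alive=True):
        self.domains = domains
        self.alive = alive

    def _find(self, flags, id, name):
        if not self.alive:
            raise ConnectionDown("cannot contact daemon")
        if flags == VIR_LOOKUP_BY_NAME:
            return name if name in self.domains else None
        if flags == VIR_LOOKUP_BY_ID:
            for dom_name, dom_id in self.domains.items():
                if dom_id == id:
                    return dom_name
            return None
        raise ValueError("unsupported flags")


def lookup(conn, flags, id=-1, name=None):
    """Return (domain, error).

    (dom, 0)  => found it
    (None, 0) => not found
    (None, 1) => real error
    """
    try:
        return conn._find(flags, id, name), 0
    except ConnectionDown:
        return None, 1
```

With a live connection, looking up "doesnotexist" gives (None, 0); with a dead one, the same call gives (None, 1). A caller can then retry on the former and give up on the latter.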

Addition 1: There would be a similar function virNetworkLookup, but without needing the 'id' parameter because networks don't have IDs.

Addition 2: Change the driver internals so that they don't call virterror in the not found case. (This requires quite a bit of rejigging in xend_internal, but is not too hard).

Addition 3: Language bindings could be modified to detect this function and if present change their existing LookupBy* functions to use the new interface.
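
As a rough illustration of Addition 3, a binding could probe for the new call at run time and fall back to the old one when it is missing. Everything in this sketch is hypothetical: OldLibrary, NewLibrary, LookupFailed and lookup_by_name are invented stand-ins, not part of the real Python bindings, and returning None for "not found" is just one possible choice for how a binding might surface that case.

```python
# Hypothetical sketch of Addition 3. "lookup" is the imagined new call;
# neither class below is part of the real libvirt Python bindings.

class LookupFailed(Exception):
    """Raised for real errors (and, on old libraries, for 'not found' too)."""


class OldLibrary:
    """Library without the new call: only the old-style entry point."""

    def lookupByName(self, name):
        raise LookupFailed("virDomainLookupByName() failed")


class NewLibrary(OldLibrary):
    """Library that also exports the new three-outcome call."""

    def lookup(self, name):
        return None, 0  # (domain, error): not found, but no error


def lookup_by_name(lib, name):
    """Binding-level LookupByName: return the domain, or None if not found."""
    if hasattr(lib, "lookup"):          # new interface detected
        dom, error = lib.lookup(name)
        if error:
            raise LookupFailed("lookup of %r failed" % name)
        return dom
    return lib.lookupByName(name)       # old behaviour, unchanged
```

Against NewLibrary the wrapper returns None for a missing domain; against OldLibrary it still raises, exactly as before.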

Thoughts?

Rich.

----------------------------------------------------

This is the troublesome loop:

  logging.debug("Created guest, looking to see if it is running")
  # sleep in .25 second increments until either a) we find
  # our domain or b) it's been 5 seconds.  this is so that
  # we can try to gracefully handle domain creation failures
  num = 0
  d = None
  while num < (5 / .25): # 5 seconds, .25 second sleeps
      try:
          d = self.conn.lookupByName(self.name)
          break
      except libvirt.libvirtError as e:
          # retried for any libvirtError, not just "not found"
          logging.debug("No guest running yet " + str(e))
      num += 1
      time.sleep(0.25)
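
For comparison, here is a sketch of how the loop might look if "not found" could be told apart from a real error. It assumes a hypothetical DomainNotFound exception, which the bindings do not provide today, and takes the lookup function and the sleep function as parameters so that it can be exercised without a running daemon.

```python
import time


class DomainNotFound(Exception):
    """Hypothetical: the daemon answered, but the domain does not exist yet."""


def wait_for_domain(lookup, name, timeout=5.0, interval=0.25,
                    sleep=time.sleep):
    """Poll lookup(name) until it succeeds or roughly timeout seconds pass.

    Only DomainNotFound is retried; any other exception propagates at once,
    so a dead daemon is reported immediately instead of after 5 seconds.
    Returns the domain, or None if it never appeared.
    """
    attempts = int(timeout / interval)
    for _ in range(attempts):
        try:
            return lookup(name)
        except DomainNotFound:
            sleep(interval)
    return None
```

In virt-install this would be called as something like wait_for_domain(self.conn.lookupByName, self.name), provided lookupByName raised the hypothetical exception in the not found case.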


--
Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/
