On Wed, Apr 09, 2014 at 01:11:22PM +0200, Michal Skrivanek wrote: > > On Apr 8, 2014, at 18:57 , Dan Kenigsberg <dan...@redhat.com> wrote: > > > On Tue, Apr 08, 2014 at 06:52:50AM -0400, Francesco Romani wrote: > >> Hello VDSM developers, > > > > Please use devel@ovirt, and mention "vdsm" at the subject. This thread > > in particular involves Engine as well. > > > >> > >> I'd like to discuss and explain the plans for cleaning up Vm.getStats() > >> in vdsm/virt/vm.py, and how it affects a bug we have: > >> https://bugzilla.redhat.com/show_bug.cgi?id=1073478 > >> > >> Let's start from the bug. > >> > >> To make a long story short, there is a (small) race in VDSM, probably > >> introduced by commit > >> 8fedf8bde3c28edb07add23c3e9b72681cb48e49 > >> The race can actually be triggered only if the VM is shutting down, so it > >> is not that bad. > >> > >> Fixing the reported issue in the specific case can be done with a trivial > >> if, and that it what I did > >> in my initial http://gerrit.ovirt.org/#/c/25803/1/vdsm/vm.py,cm > > > > Could you explain how an AttributeError there moved the VM to Down? > > > >> > >> However, this is just a bandaid and a proper solution is needed. This is > >> the reason why > >> the following versions of http://gerrit.ovirt.org/#/c/25803 changed > >> direction toward a more comprehensive > >> approach. > >> > >> And this is the core of the issue. > >> My initial idea is to either return success and a complete, well formed > >> statistics set, or return an error. > >> However current engine seems to not cope properly with this, and we cannot > >> break backward compatibility. > > > > Would you be more precise? If getAllVmStats returns an errCode for one > > of the Vms, what happens? > > VM moves to Unknown after some time > > > > >> > >> Looks like the only way to go is to always return success and to add a > >> field to describe the content of the > >> statistics (partial, complete...). THis is admittedly a far cry from the > >> ideal solution, but it is dictated > >> by the need to preserve the compatibility with current/old engines. > > > > I don't think that I understand your suggestion, but it does not sound > > right to send a partial dictionary and to "appologize" about its > > paritiality elsewhere. The dictionary may be "partial" for engine-4.0 > > yet "complete" for engine-3.5. It's not for Vdsm to grade its own > > output. > > > >> > >> Moreover, I would like to take the chance and cleanup/refactor the > >> statistics collection. I already started > >> adding the test infrastructure: http://gerrit.ovirt.org/#/c/26536/ > >> > >> To summarize, what I suggest to do is: > >> > >> * fix https://bugzilla.redhat.com/show_bug.cgi?id=1073478 using a simple > >> ugly fix like the original > >> http://gerrit.ovirt.org/#/c/25803/1/vdsm/vm.py,cm (which I'll resubmit as > >> new change) > >> * refactor and clean getStats() and friends > >> * on the cleaner base, properly fix the statistics collection by let > >> getStats() always succeed and return > >> possibly partial stats, with a new field describing the content > >> > >> please note that I'm not really happy about this solution, but, given the > >> constraint, I don't see better > >> alternatives. > > > > Please explain the benefits of describing the partial content, as I do > > not see them. > > the main reason for me is that currently if we fail to fetch some parts of > the stats we still return what we could get. E.g. > try: > stats.update(self.guestAgent.getGuestInfo()) > except Exception: > return stats > > currently in engine we can only tell it's partial by not seeing the missing > bits… > it makes sense since for lifecycle decisions we care for the VM status, but > we don't necessarily care for other stuff
Engine knows that the data is partial, since it sees only parts of it. What is the benefit of Vdsm declaring it as partial? _______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel