-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 12/11/14 16:22, Dan Smith wrote: >> An initial inconsistency I have noticed is that some objects >> refresh themselves from the database when calling save(), but >> others don't. > > I agree that it would be ideal for all objects to behave the same > in this regard. I expect that in practice, it's not necessary for > all objects to do this, and since it is an extra step/query in some > places, it's not without cost to just always refresh (there is no > reason to refresh a Flavor object AFAIK, for instance). > >> The lack of consistency in behaviour is obviously a problem, and >> I can't think of any good reason for a second select for objects >> which do that. However, I don't think it is good design for >> save() to refresh the object at all, and the reason is >> concurrency. The cached contents of a Nova object are *always* >> potentially stale. A refresh does nothing to change that, because >> the contents are again potentially stale as soon as it returns. >> Handling this requires concurrency primitives which we don't >> currently have (see the larger context I mentioned above). >> Refreshing an object's contents might reduce the probability of a >> race, but it doesn't fix it. Callers who want a refresh after >> save can always call object.refresh(), but for others it's just >> wasted hits on the db. > > Doing a refresh() after a save() is more than just another DB hit, > it's another RPC round-trip, which is far more expensive. > > There is a lot of code in the compute manager (and elsewhere I'm > sure) that expects to get back the refreshed object after a save > (our instance update functions have always exhibited this behavior, > so there is code built to expect it). Any time something could > change async on the object that might affect what we're about to do > will benefit from this implicit refreshing. A good example is when > we're doing something long-running on an instance and it is deleted > underneath us. If we didn't get the deleted=True change after a > save(), we might go continue doing a lot of extra work before we > notice it the next time. > > It's not that doing so prevents the object's contents from being > stale, it's that it reduces the amount of time before we notice a > change, and avoids us needing to explicitly check. Any code we have > that can't tolerate the object being stale is broken anyway. > >> Refresh on save() is also arbitrary. Why should the object be >> updated then rather than at any other time? The timing of an >> update in thread X is unrelated to the timing of an update in >> thread Y, but it's a problem whenever it happens. > > Because it's not about cache consistency, it's about the expense > required to do any of these things. To save(), we *have* to make > the round-trip, so why not get a refresh at the same time? In cases > where we explicitly want to refresh, we call refresh(), but > otherwise we use natural synchronization points like save() to do > that.
Ok, it makes more sense in this context. >> Can anybody see a problem if we didn't fetch the row at all, and >> simply updated it? Absent locking or compare-and-swap this is >> effectively what we're already doing, and it reduces the db cost >> of save to a single update statement. The difference would be >> that the object would remain stale without an explicit refresh(). >> Value munging would remain unaffected. > > At least for instance, I don't want to do away with the implicit > refreshing. I would be up for any of these options: > > 1. Documenting which objects do and don't auto-refresh 2. Making > the case for non-Instance objects to not auto-refresh 3. Making > them all auto-refresh for consistency, with the appropriate > re-tooling of the db/api code to minimize performance impact. I personally favour having consistent behaviour across the board. How about updating them all to auto-refresh by default for consistency, but adding an additional option to save() to disable it for particular calls? Ideally we'd have the opposite default and fix all callers, but I suspect that's more likely to bite us in the short term unless we're confident we can identify all the critical callers. I also suggest a tactical fix to any object which fetches itself twice on update (e.g. Aggregate). > >> Additionally, Instance, InstanceGroup, and Flavor perform >> multiple updates on save(). I would apply the same rule to the >> sub-updates, and also move them into a single transaction such >> that the updates are atomic. > > Yep, no complaints about fixing these non-atomic updates, of course > :) Thanks, Matt - -- Matthew Booth Red Hat Engineering, Virtualisation Team Phone: +442070094448 (UK) GPG ID: D33C3490 GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEARECAAYFAlRjkEcACgkQNEHqGdM8NJC5hACfYimQH2MqcMTnvHc7loqi1QAZ R2EAoIOSVe83htncBWBIDBxBwFdANajG =veuA -----END PGP SIGNATURE----- _______________________________________________ OpenStack-dev mailing list [email protected] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
