> However, it presents a problem when we consider NovaObjects, and > dependencies between them.
I disagree with this assertion, because: > For example, take Instance.save(). An > Instance has relationships with several other object types, one of which > is InstanceInfoCache. Consider the following code, which is amongst what > happens in spawn(): > > instance = Instance.get_by_uuid(uuid) > instance.vm_state = vm_states.ACTIVE > instance.info_cache.network_info = new_nw_info > instance.save() > > instance.save() does (simplified): > self.info_cache.save() > self._db_save() > > Both of these saves happen in separate db transactions. This has always been two DB calls, and for a while recently, it was two RPCs, each of which did one call. > This has at least 2 undesirable effects: > > 1. A failure can result in an inconsistent database. i.e. info_cache > having been persisted, but instance.vm_state not having been persisted. > > 2. Even in the absence of a failure, an external reader can see the new > info_cache but the old instance. I think you might want to pick a different example. We update the info_cache all the time asynchronously, due to "time has passed" and other non-user-visible reasons. > New features continue to add to the problem, > including numa topology and pci requests. NUMA and PCI information are now created atomically with the instance (or at least, passed to SQLA in a way I expect does the insert as a single transaction). We don't yet do that in save(), I think because we didn't actually change this information after creation until recently. Definitely agree that we should not save the PCI part without the base instance part. > I don't think we can reasonably remove the cascading save() above due to > the deliberate design of objects. Objects don't correspond directly to > their datamodels, so save() does more work than just calling out to the > DB. We need a way to allow cascading object saves to happen within a > single DB transaction. This will mean: > > 1. A change will be persisted either entirely or not at all in the event > of a failure. > > 2. A reader will see either the whole change or none of it. This is definitely what we should strive for in cases where the updates are related, but as I said above, for things (like info cache) where it doesn't matter, we should be fine. > Note that there is this recently approved oslo.db spec to make > transactions more manageable: > > https://review.openstack.org/#/c/125181/11/specs/kilo/make-enginefacade-a-facade.rst,cm > > Again, while this will be a significant benefit to the DB api, it will > not solve the problem of cascading object saves without allowing > transaction management at the level of NovaObject.save(): we need to > allow something to call a db api with an existing session, and we need > to allow something to pass an existing db transaction to NovaObject.save(). I don't agree that we need to be concerned about this at the NovaObject.save() level. I do agree that Instance.save() needs to have a relationship to its sub-objects that facilitates atomicity (where appropriate), and that such a pattern can be used for other such hierarchies. > An obvious precursor to that is removing N309 from hacking, which > specifically tests for db apis which accept a session argument. We then > need to consider how NovaObject.save() should manage and propagate db > transactions. Right, so I believe that we had more consistent handling of transactions in the past. We had a mechanism for passing around the session between chained db/api methods to ensure they happened atomically. I think Boris led the charge to eliminate that, culminating with the hacking rule you mentioned. Maybe getting back to the justification for removing that facility would help us understand the challenges we face going forward? > [1] At a slight tangent, this looks like an artifact of some premature > generalisation a few years ago. It seems unlikely that anybody is going > to rewrite the db api using an ORM other than sqlalchemy, so we should > probably ditch it and promote it to db/api.py. We've had a few people ask about it, in terms of rewriting some or all of our DB API to talk to a totally non-SQL backend. Further, AFAIK, RAX rewrites a few of the DB API calls to use raw SQL queries for performance (or did, at one point). I'm quite happy to have the implementation of Instance.save() make use of primitives to ensure atomicity where appropriate. I don't think that's something that needs or deserves generalization at this point, and I'm not convinced it needs to be in the save method itself. Right now we update several things atomically by passing something to db/api that gets turned into properly-related SQLA objects. I think we could do the same for any that we're currently cascading separately, even if the db/api update method uses a transaction to ensure safety. --Dan
signature.asc
Description: OpenPGP digital signature
_______________________________________________ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev