On 04/21/2017 01:16 PM, Austin Macdonald wrote: > I think we have arrived at a consensus around these points:
+1 > > * Django's ModelForm validations that are used during `full_clean` is an > anti-pattern in DjangoRestFramework > and should not be used > * Validation, by our definition, will be done by the serializers in the API > layer > * The Task layer should run business logic checks at the time of use > > We have not arrived at a consensus on whether Updates/Deletes for > Importers/Publishers will be synchronous or > asynchronous. I've tried to think through each way, and what it would need to > work. > > *Synchronous Update and Delete: > * > _Validation Heuristic_ > The serializers validate in the API layer only, and it is desirable to have > that validation occur at the time > of the write. > * > * > _Concurrent Writes:_ > Let's assume synchronous Updates and Deletes. There is a possibility of > concurrent writes because sync/publish > tasks write to the importer/publisher tables respectively and these tables > can also be written by the Update > API at any time. If there is a collision the last write wins. > > Workaround: > Concurrent writes may not be a big deal because they should be not affect > each other. In platform, sync and > publish only write to the last_sync/last_publish fields, which should never > be written by the update API. If > the writes are restricted to only the affected fields, then there would be no > clobbering. As long as plugin > writers' implementations of sync and publish do not write to the same fields > allowed in an API update, this > concurrency is safe. We would obviously need to be clearly documented for > plugin writers. > > _Dealing With Broken Configurations_ > For example, while a sync task is WAITING, the user updates an importer in a > way that breaks sync. The > importer configuration should be read only once at the start of a sync. If > this configuration has been deleted > or is not syncable, business logic checks catch this and fail immediately. > Even if the configuration is > changed during the sync task, the sync task will continue with the original > configuration. The sync task will > update last_sync, but will not write to other fields. I don't see a problem > here, as long as this is > documented for the plugin writer. > > _Summary_: > I don't see any concurrency problems here. The downside of this design is > that sync and publish, (implemented > by plugin writers), should not write to the same fields as API updates. > > * > * > *Asynchronous Update and Delete* > > _Validation Heuristic_ > Assuming async updates, the data comes into the API layer where it is > validated by the serializers. The update > reserves a task (reservation is based on repository) and waits to be > performed. It is usually undesirable to > separate validation from use. There could be concern that data validated in > the API layer could become > unacceptable by the time the Task layer actually performs the write. > > Practically, this separation may not be an issue. Valid data becoming invalid > involves validations that are > dependent on state. We can split these validations into two categories, > uniqueness validations and multi-field > validations. Fortunately, the Django ORM will enforce uniqueness validation, > so the database is protected, but > we might have to handle write failures. Multi-field validations are probably > not validations by our > definition, but are business logic, which could be checked in the Task layer. > _ > _ > _Concurrent write prevention_ > Leveraging the reserved resource system, concurrent writes could not be a > problem because writes could never > be concurrent at all. Syncs, Updates, Deletes, Publishes all make > reservations based on the repository (this > is an assumption based on what we do in Pulp 2), so each would run in the > order that they were called. > > _User Controlled Order_ > A user can issue a [sync][update][sync] and be confident that the first sync > will use the original importer > configuration, and the second sync will use the updated configuration. This is not deterministic (from user point of view) even with task reservation. You still have a race condition queuing tasks. In the following example, the update made by user B will win and the update made by user A will not be used for the 2nd sync. By making this async, seems we have done is moved the race condition from the DB to the tasking queue. queue ------- 1. user-A [sync] 2. user-A [update] 3. user-B [update] (winner) 4. user-A [sync] > > _Uncertain State_ > The configuration of an importer is unpredictable if there is a waiting > update task from another user. > > > > > > > > _______________________________________________ > Pulp-dev mailing list > Pulp-dev@redhat.com > https://www.redhat.com/mailman/listinfo/pulp-dev >
signature.asc
Description: OpenPGP digital signature
_______________________________________________ Pulp-dev mailing list Pulp-dev@redhat.com https://www.redhat.com/mailman/listinfo/pulp-dev