On 04/14/2017 10:23 AM, Brian Bouterse wrote: > Did we ever decide on if delete and modify operations on an importer would be > synchronous API operations or > tasking system tasks? > > From what I'm reading, there isn't a MVP need (and maybe ever) to verify the > feed is set at the sync API call > and in the task. We definitely need to do that business logic, i.e. verify > the feed != None, in the task for > the MVP. If we don't need to do it in two places now, then I think doing it > procedurally in the sync task here > [0] would be good enough. That means we would not add it to the model as a > method. I think this business logic > in the sync task would be: > > ``` > if importer.feed is None: > raise ValueError("The 'feed' attribute on the importer cannot be none > when syncing") # <--- could should > probably be a descendent of PulpException since it's a predictable/common > error > ```
+1 > > [0]: > https://github.com/pulp/pulp/pull/2991/files#diff-b9ddc3416cf3909687419960d59316bbR30 > > -Brian > > > > On Wed, Apr 12, 2017 at 10:11 AM, Jeff Ortel <jor...@redhat.com > <mailto:jor...@redhat.com>> wrote: > > > > On 04/11/2017 04:08 PM, Michael Hrivnak wrote: > > I like thinking about this as business logic. Data may be valid, but it > may not be usable in a particular context. > > > > To help figure out where such logic should live, it may help to think > about where the check is most important. > > I've described that time as "at the time of use" earlier in this > discussion (maybe just on IRC). With sync as > > an example, a workflow will load an importer's config from the > database, check it for problems, and then > > immediately use the values it just inspected. This is the critical > moment where it must gracefully handle > > unusable data. This check ensures correct behavior and avoids an > unhandled exception or crash. > > +1 > > > > > We can and should also check for problems at earlier opportunities, > such as at the time a user tries to queue > > a sync. This improves the user experience, but it is not required for > correct and safe operation. > > The data can change between the time when the task is queued and when it > is executed. This adds complexity > and the value added is not deterministic. -1 for a check here. > > > > > Given that, I think it makes sense to put the check close to the data. > A method on the model seems reasonable. > > In terms of polluting the model with business logic, it isn't that > different from defining a custom query set > > on the model, which django encourages. > > Agreed, but only for data integrity. Each field should be validated. > And, the record (cross field) should be > validated. We don't want to store things in the DB that are invalid. > Seems like the django ORM already > provides for this, right? But, the model should not have methods to > validate itself for a particular > operation (usage). For example, ensure_syncable() that validates the > feed URL would not be appropriate. This > should be checked at run-time by the object doing the sync. Perhaps a > double dispatch model in > plugin.Importer could do: > > def synchronize(): > # do validation and preparation. > return self._synchronize() > > def _synchronize(): > # overridden by plugins. > raise NotImplementedError() > > > > > > As a slight tangent, some applications take this sort of checking even > further. An admirable approach in REST > > API design, which may not be a good idea for us at this time but is > interesting to note, is to make a behavior > > such as "sync" only available via a link accessed via a known name in > an object's representation. That's a > > mouthful, so here's an example: > > > > { > > "id": "foo", > > "feed": "http://cdn.redhat.com/stuff/", > > "_links": > > { > > "self": "http://here.com/importers/foo", > > "sync": "http://here.com/importers/foo/sync > <http://here.com/importers/foo/sync>" > > } > > } > > > > Consider that the link for starting a sync is not part of the published > API, except that it must be obtained > > from this representation. There are two advantages here. > > > > The main advantage I'm pointing out is that when the server creates > this representation of an Importer, it > > would only include the "sync" link if the current state of the object > would allow for a sync. If there were no > > feed, there would be no sync link, and thus the client would be unable > to even try starting one. So this is a > > third opportunity to check whether the object's state is suitable for a > sync. It even allows the client to > > show or hide a "sync" button without having to re-implement the > business logic that's already present on the > > server side. Neat, huh? > > Interesting. > > However, the duration of how long the 'sync' link is valid is not > deterministic. 1 ms after the link is > included in the returned document, a DB commit could set it to NULL. The > link could be invalid by the time > the user even receives the REST reply. > > > > > Another advantage to this kind of approach is a smaller API surface > area. We could theoretically change the > > sync URL schema at any time. We could even move it to a new, separate > service. We'd still need to document how > > to use it, but it's actual location can change. In practice I don't > think this aspect is all that valuable > > unless you are 100% bought in to this design. But it's fun to think > about. > > > > And to re-state, this kind of thing may not be worth our time to > actually do right now, and I'm not proposing > > it. I don't know to what extent DRF would make this easy. But I wanted > to bring it up for interest's sake as > > yet another place in the workflow, even closer to the end user than the > other two we've discussed, where > > applications have an opportunity to utilize context checking of data. > > > > On Tue, Apr 11, 2017 at 3:49 PM, Austin Macdonald <amacd...@redhat.com > <mailto:amacd...@redhat.com> <mailto:amacd...@redhat.com > <mailto:amacd...@redhat.com>>> wrote: > > > > Where should business logic live? As an example, I want to consider > the sync task [0] and the need to > > ensure that an importer is syncable. For now, let's say that an > importer is syncable if it has a feed_url. > > > > Since the call to sync an importer comes from the API, the earliest > we can check that the configuration is > > syncable is in a not-yet-written SyncView. Checking syncability > here is ideal because it allows us to > > immediately return a meaningful http response instead of happily > returning information about a Task that > > we already know is doomed to fail. > > > > Performing the check in the API layer is not enough. We have > discussed edge cases that lead to an > > importer's feed_url being removed while the sync task is in the > WAITING state. To make an assertion to the > > plugin that the feed_url exists, we have to check syncability again > when the Task moves into an active state. > > > > My thinking is that we should put this business logic on the model. > > > > Admittedly, it is not a clean fit with the separation of concerns > philosophy but we have already violated > > the concept by putting the sync method on the model. If sync is on > the model, it seems like > > ensure_syncable should be too. > > > > If we write the platform API layer to use this kind of business > logic, then the plugins can add double > > checking business logic without modifying the API and Task layers. > > > > > > [0]: https://pulp.plan.io/issues/2399 > <https://pulp.plan.io/issues/2399> > <https://pulp.plan.io/issues/2399 <https://pulp.plan.io/issues/2399>> > > > > On Fri, Apr 7, 2017 at 2:14 PM, Sean Myers <sean.my...@redhat.com > <mailto:sean.my...@redhat.com> > <mailto:sean.my...@redhat.com <mailto:sean.my...@redhat.com>>> wrote: > > > > On 04/07/2017 12:08 PM, Brian Bouterse wrote: > > > == questions == > > > * Where should ^ terms be documented? > > > > I'm not really sure, but recommend the wiki as a good starting > point for > > putting information that we should probably "officially" > document *somewhere*, > > but at the moment we aren't quite sure where. > > > > https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Developer_Notes > <https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Developer_Notes> > > <https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Developer_Notes > <https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Developer_Notes>> > > > > > * Take the case of a sync which has overrides provided? This > isn't in the > > > MVP, but in the future it could be. In that case, does the > serializer > > > associated with the importer validate the database data with > the overrides > > > "added" on top of it? > > > > My question here is "validate against what?". It makes sense to > validate > > against database data, but as long as the overrides aren't > themselves stored > > in the database, what does this really stop? > > > > For example, what prevents two simultaneous syncs of repos > using overrides > > that would trigger a constraint violation if they were saved, > but don't do > > this because we don't save the overrides? > > > > > * For plugin writers writing a serializer for a subclassed > Importer, do > > > they also need to express validations for fields defined on > the base > > > Importer? > > > > It depends on the validation. If it's just validating an > individual field, > > no. If it's validation of a combination of values in multiple > fields, and > > one of those fields in this case was defined in a subclass, the > subclass > > will need to add the proper validation support. > > > > > * The database still rejects data that doesn't adhere to the > data layer > > > definition right? That occurs even without the DRF serializer > correct? > > > > Again, this depends. For example, attempting to store an int in > a charfield > > will work, because Django will coerce that int to string on > save. Attempting > > to store a string in an IntegerField will fail, though, because > Django is > > not able to coerce str to int prior to saving. Generally, > though, your > > understanding is correct. Anything that the database can't > handle will be > > rejected. > > > > > * In cases where data is created in the backend, do we need > to validate > > > that as a general practice? If we do, do we call the DRF > serializer > > > regularly in the backend or just let the database reject > "bad" data at the > > > db level? > > > > As a general practice, I don't think so. Specifically, though, > when we're > > passing data around, like when a bit of platform code is taking > incoming > > plugin data and passing it into some standard workflow that > platform provides > > (like running sync on an importer, say) I think it's going to > be a good idea > > and an all-around gentlemenly thing to do to validate that data > in some way > > that appropriate to the process/workflow being invoked. > > > > I'm concerned about finding the balance between making things > user-friendly > > for plugin writers and having our checking code that provides > that user- > > friendly-ness itself be difficult to maintain and end up being > pulp-developer- > > unfriendly. > > > > > > _______________________________________________ > > Pulp-dev mailing list > > Pulp-dev@redhat.com <mailto:Pulp-dev@redhat.com> > <mailto:Pulp-dev@redhat.com > <mailto:Pulp-dev@redhat.com>> > > https://www.redhat.com/mailman/listinfo/pulp-dev > <https://www.redhat.com/mailman/listinfo/pulp-dev> > <https://www.redhat.com/mailman/listinfo/pulp-dev > <https://www.redhat.com/mailman/listinfo/pulp-dev>> > > > > > > > > _______________________________________________ > > Pulp-dev mailing list > > Pulp-dev@redhat.com <mailto:Pulp-dev@redhat.com> > <mailto:Pulp-dev@redhat.com > <mailto:Pulp-dev@redhat.com>> > > https://www.redhat.com/mailman/listinfo/pulp-dev > <https://www.redhat.com/mailman/listinfo/pulp-dev> > <https://www.redhat.com/mailman/listinfo/pulp-dev > <https://www.redhat.com/mailman/listinfo/pulp-dev>> > > > > > > > > > > -- > > > > Michael Hrivnak > > > > Principal Software Engineer, RHCE > > > > Red Hat > > > > > > > > _______________________________________________ > > Pulp-dev mailing list > > Pulp-dev@redhat.com <mailto:Pulp-dev@redhat.com> > > https://www.redhat.com/mailman/listinfo/pulp-dev > <https://www.redhat.com/mailman/listinfo/pulp-dev> > > > > > _______________________________________________ > Pulp-dev mailing list > Pulp-dev@redhat.com <mailto:Pulp-dev@redhat.com> > https://www.redhat.com/mailman/listinfo/pulp-dev > <https://www.redhat.com/mailman/listinfo/pulp-dev> > >
signature.asc
Description: OpenPGP digital signature
_______________________________________________ Pulp-dev mailing list Pulp-dev@redhat.com https://www.redhat.com/mailman/listinfo/pulp-dev